Challenge Data

Few-shot learning of anatomic and oncologic structures in radiology
by Raidium


The data was updated on February 8, 2024. The number of images in the y_train file has been reduced to 2000 to match the x_train images provided.



Description


Competitive challenge
Biology
Health
Segmentation
Images
More than 1GB
Advanced level

Dates

Started on Jan. 10, 2024


Challenge context

The goal is to segment structures using their shape, without exhaustive annotations. The training data is composed of two types of images:

  • CT images with anatomical and oncological segmentation masks of individual structures
    • They act as the ground-truth definition of what anatomical structures and tumors are.
    • They are not intended to be representative of all possible structures and their diversity, but can still be used as training material.
    • This makes the problem a mix of a zero-shot learning problem (some structures in the test set are not found in the train set) and a few-shot learning problem (some structures are common to the train and test sets, but with limited examples).
  • Raw CT images, without any segmented structures
    • They can be used as additional training material, in an unsupervised setting.

The test set is made of new images with their corresponding segmented structures, and the metric measures the capacity to correctly segment and separate the different structures on an image.

Note: The segmented structures do not cover the entire image; some pixels are not part of identifiable structures, as seen on the image above. They are thus considered part of the background.




Data description

The input is a list of 2D grayscale images (i.e. a 3D numpy array), each corresponding to a slice of a CT scan (in the transverse plane) of size 512x512 pixels. The slices are shuffled, so there is no 3D information.
The label/output is a list of 2D matrices (i.e. a 3D numpy array) of size 512x512 pixels, with integer (uint8) values. Each position (w,h) of each matrix $Y_{i,w,h}$ identifies a structure.
For example, on Figure 1 above, the 23 colors correspond to 23 different segmented structures, so each pixel label $Y_{i,w,h}$ at position (w,h) takes values in the integer range [0, 23]. 0 is a special value meaning the pixel is not part of any structure and thus belongs to the background.

In practice, the output is encoded as a CSV file containing the transpose of the flattened label matrix. Note: The transpose is used here for performance reasons: pandas is very slow to load CSV files with many columns, but very fast to load CSV files with many rows. The CSV is thus composed of 262144 (512x512) rows, each corresponding to a pixel of the image, and 500 columns, each corresponding to an image.

To get the list of 2D predictions, you must therefore transpose the received CSV, and reshape it:

import pandas as pd
predictions = pd.read_csv(label_csv_path, index_col=0, header=0).T.values.reshape((-1, 512, 512))
# In the end, we get a list of 2D predictions, i.e. 3D numpy array of shape (500, 512, 512)

In order to get the output CSV from a list of 2D predictions, it is necessary to flatten each of the predictions, and concatenate them into a single matrix, then transpose the matrix and finally save it in a CSV file:

import numpy as np
import pandas as pd
# predictions is a list of 2D predictions, i.e. a 3D numpy array of shape (500, 512, 512)
predictions = np.stack([prediction_1, prediction_2, prediction_3, ...])
pd.DataFrame(predictions.reshape((predictions.shape[0], -1))).T.to_csv(output_csv_path)
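As a sanity check, the two snippets above are inverses of each other. Here is a quick round-trip test on tiny 4x4 images (illustrative sizes; the real data is 500 images of 512x512 pixels):

```python
import numpy as np
import pandas as pd
from io import StringIO

# Tiny stand-in for the real data: 3 "images" of 4x4 pixels
preds = np.random.randint(0, 5, size=(3, 4, 4)).astype(np.uint8)

# Save: flatten each image, then transpose so pixels are rows and images are columns
buf = StringIO()
pd.DataFrame(preds.reshape((preds.shape[0], -1))).T.to_csv(buf)

# Load: transpose back and reshape into a list of 2D predictions
buf.seek(0)
loaded = pd.read_csv(buf, index_col=0, header=0).T.values.reshape((-1, 4, 4))

assert (loaded == preds).all()  # the round trip preserves every pixel
```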

This problem can be seen as an image-wise pixel clustering problem, where each structure is a cluster in the image: the pixel label values are structure identifiers on a specific image and are not necessarily coherent between images. For example, the structure associated with the liver can be mapped to label 4 on one image and to label 1 on another.
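A minimal illustration of this identifier invariance with scikit-learn's `rand_score` (assuming the challenge metric matches scikit-learn's Rand index implementation, as stated below): two labelings that group pixels identically but use different identifiers score 1.0.

```python
import numpy as np
from sklearn.metrics import rand_score

# The same segmentation with structure identifiers permuted:
# e.g. the liver is label 4 on one image and label 1 on another.
labels_a = np.array([4, 4, 4, 2, 2, 7])
labels_b = np.array([1, 1, 1, 5, 5, 3])

# The Rand index ignores the identifiers and only compares the grouping.
print(rand_score(labels_a, labels_b))  # → 1.0
```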

The train set is composed of 2000 images, split into two groups:

  • 400 with fully segmented structures.
    • For these images, the corresponding label is a 2D matrix with pixel labels for the segmented structures, and the other pixels set to 0.
  • 1600 with no annotations at all.
    • For these images, the corresponding label is a 2D matrix filled with zeros.

Note: Segmentation maps of structures of the train set (in addition to the CSV) are given in the supplementary materials.

The test set is composed of 500 images with organ and tumor segmentations. For these images, the corresponding label is a 2D matrix with the segmented structures, and the other pixels set to 0. Since an individual image with its label matrix is around 400KB, and there are 2500 images in total (2000 train, 500 test), the dataset weighs about 1GB.

Note: The segmentation map is not dense: some pixels between structures are not segmented, as seen on the image above. These pixels are considered part of the background.

Note: It is not authorized to use additional radiological training data, radiological pre-trained models, or any other external radiological data source. You are, however, allowed to use pre-trained models and data that are not radiological (DINOv2, SAM, ...).

The global metric is computed by averaging the Rand index between each label and its associated prediction, excluding background pixels (i.e. 0 in the label). This clustering metric is invariant to inter-image permutations of the structure numbering and is implemented in scikit-learn.
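A sketch of this metric, consistent with the description above (the exact implementation is in the getting-started notebook; here the background mask is taken from the label, and `rand_score` from scikit-learn is assumed to be the Rand index used):

```python
import numpy as np
from sklearn.metrics import rand_score

def mean_rand_index(labels, predictions):
    """Average Rand index over images, ignoring background pixels (label 0)."""
    scores = []
    for y, p in zip(labels, predictions):
        mask = y.ravel() != 0  # keep only pixels that belong to a structure
        scores.append(rand_score(y.ravel()[mask], p.ravel()[mask]))
    return float(np.mean(scores))
```

A prediction that groups the structure pixels exactly like the label scores 1.0, regardless of which integers it uses as identifiers.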

Notebook

A getting started notebook can be found at the following address:

https://colab.research.google.com/drive/1OOzMtT62OFl_tURo4TWjFKJf6jRMc5_O?usp=share_link

It includes examples of data loading, visualization, metric computation and a baseline.


Benchmark description

The benchmark is based on classical computer vision algorithms, such as the watershed transform and Sobel filters.
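A minimal sketch of such a classical pipeline with scikit-image, not the actual benchmark implementation; the threshold values are hypothetical and would need tuning to the actual CT intensity range:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.segmentation import watershed

def baseline_segmentation(image, low=0.2, high=0.6):
    """Watershed segmentation on a Sobel elevation map.

    `low`/`high` are hypothetical intensity thresholds used to seed markers.
    Returns an integer map where each connected structure gets its own id
    and background pixels are 0, matching the label format of the challenge.
    """
    elevation = sobel(image)                     # edge magnitude as elevation map
    markers = np.zeros(image.shape, dtype=np.int32)
    markers[image < low] = 1                     # background seeds
    markers[image > high] = 2                    # structure seeds
    binary = watershed(elevation, markers) == 2  # flood the elevation map
    labels, _ = ndi.label(binary)                # give each structure its own id
    return labels
```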


Files


Files are accessible when logged in and registered to the challenge


The challenge provider



Pierre Manceron - Head of Science at Raidium