Energy
Started on Jan. 8, 2025
SLB is a global technology company that drives energy innovation for a balanced planet. With a global footprint in more than 100 countries and employees representing almost twice as many nationalities, we work each day on innovating oil and gas, delivering digital at scale, decarbonizing industries, and developing and scaling new energy systems that accelerate the energy transition.
A well consists of several casings to withstand pressure, with cement layers inserted between them to ensure stability. To assess the stability of the cement, SLB uses an isolation scanner, part of the ultrasonic imaging tool. This tool acquires downhole data using flexural waves, which produce peaks from various reflective surfaces. In the figure below, the first peak corresponds to the interface between the casing and the cement, while the second peak, the Third Interface Echo (TIE), corresponds to the interface between the cement and the formation. Detecting both surfaces is crucial for evaluating cement quality and identifying any collapse in the cement.
Signals are acquired at different depths, and these acquisitions are stacked to form an image, which is the focus of this challenge. Accurately detecting these components is essential for monitoring the well’s health during the drilling process. Failure to detect and assess the condition of the casing and TIE can lead to wellbore instability, casing failure, or environmental hazards due to potential leaks or blowouts. Early identification allows drilling teams to address abnormalities promptly, minimizing the risk of operational delays, equipment failure, or costly repairs. In this challenge, the goal is to accurately detect the casing and TIE elements.
Well Casing: Well casing is a steel or cement lining installed inside a drilled well to stabilize the borehole, prevent contamination, and ensure safe extraction of fluids like oil, gas, or water. It reinforces the well structure and isolates different underground formations.
Flexural waves: Flexural waves are mechanical waves that travel along thin structures such as pipes, causing them to bend or flex. These waves play a key role in ultrasonic testing for material characterization.
Drilling Process: Drilling is the process of creating a borehole in the ground using a rotating drill bit to access resources like oil, gas, or groundwater.
The goal of the hackathon is to produce a model that gives the highest possible score for the casing and TIE segmentation.
This competition is evaluated on the intersection over union (IoU). The IoU can be used to compare the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by:
IoU(X, Y) = \frac{|X \cap Y|}{|X \cup Y|}
where X is the predicted set of pixels and Y is the ground truth. The IoU is defined to be 1 when both X and Y are empty. The leaderboard score is the mean of the IoU coefficients for each image in the test set.
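The metric above can be sketched as a small NumPy function. The function name `iou` and the binary-mask inputs are our own choices for illustration; the empty-mask convention follows the definition given above.

```python
import numpy as np

def iou(pred, truth):
    """Pixel-wise intersection over union between two binary masks.

    Returns 1.0 when both masks are empty, matching the challenge metric.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0
    return np.logical_and(pred, truth).sum() / union
```

For a multi-class mask, the same function can be applied per class (e.g. `iou(pred == c, truth == c)`) and the results averaged.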
As mentioned in the description of the problem, the images are ultrasonic images of wells. The dataset consists of 11 wells and their respective labels: a multi-class segmentation mask of the same shape as the input image.
For each well, we have multiple acquisitions, each corresponding to one cross-section for an azimuth. We call these “sections” in this challenge. Each well is divided into a certain number of sections (36 or 18). For the 36 sections, each acquisition is separated from the next by 10 degrees, and for the 18 sections, each acquisition is separated by 20 degrees.
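The relation between section index and azimuth described above can be written down directly. This helper is a sketch of our own, not provided challenge code; it only encodes the stated spacing (36 sections at 10 degrees, 18 sections at 20 degrees).

```python
def section_azimuth(section_index, n_sections):
    """Azimuth in degrees of a given section.

    With 36 sections the spacing is 10 degrees; with 18 it is 20 degrees.
    """
    return section_index * 360.0 / n_sections
```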
The wells consist of long images. To simplify the task, we divide each well into patches, whose size varies by well: you will find patches of size 160x160 and 160x272. For standardization, we use a size of 160x160 to ensure square inputs. However, different sizes could be used to capture more vertical information, which could help extract additional detail from the well data in the vertical dimension.
To calculate the number of patches, we divide the total size of the well by 160 and multiply this by the number of sections each well has. In the following, we have divided the wells into training and testing sets and summarized the information for each well accordingly.
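The patch count described above is simple arithmetic; a minimal sketch (the function name `n_patches` and the example well height are ours, assuming the well height is given in pixel rows):

```python
def n_patches(well_height, n_sections, patch_height=160):
    """Number of patches in a well.

    Each section image is cut vertically into patches of `patch_height` rows,
    and every section contributes the same number of patches.
    """
    return (well_height // patch_height) * n_sections
```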
1. Train and validation data:
2. Test data:
The different sets of data were created to have a similar distribution while avoiding data leakage: no well appears in more than one set. We also provide unlabeled data in the unlabeled_data folder, in case participants want to experiment with self-supervised learning techniques.
The masks are images with dimensions of either 160x272 or 160x160, stored in a CSV file. Since the images in the training partition have varying sizes, shorter masks are padded with a value of -1 in the CSV file. To load the data, read the CSV file, drop the padding, and reshape each mask into the appropriate format.
import numpy as np
import pandas as pd
from pathlib import Path

y_train = pd.read_csv(Path('y_train.csv'), index_col=0)  # Table with index being the name of the patch

# Example: recover a single mask by dropping the -1 padding and reshaping to 160 rows
np.array([v for v in y_train.loc['well_2_section_22_patch_1'] if v != -1]).reshape(160, -1)

# Rebuild every mask into a dictionary of 2-D arrays keyed by patch name
labels_patches = {}
for index_, values in y_train.iterrows():
    labels_patches[index_] = np.array([v for v in values if v != -1]).reshape(160, -1)
The output format is a CSV file where each row corresponds to a flattened patch.
Below is an example of how to create a CSV file from saved predictions:
import numpy as np
import pandas as pd
from pathlib import Path

size_labels = 272
predictions = {'test': {}}
for phase in predictions.keys():
    img_save_dir = Path(path_data / phase / 'predictions')  # path_data: root directory of the data
    for img_path in img_save_dir.glob('*.npy'):
        name = img_path.stem
        if name in predictions[phase]:
            continue
        prediction = np.load(img_path)
        if prediction.shape[1] != size_labels:
            prediction_aux = -1 + np.zeros(160 * size_labels)  # Pad with -1 so all masks have the same flattened length
            prediction_aux[0:160 * 160] = prediction.flatten()
        else:
            prediction_aux = prediction.flatten()
        predictions[phase].update({name: prediction_aux})
pd.DataFrame(predictions['test'], dtype='int').T.to_csv(Path('../data/y_test_csv_file.csv'))
The benchmark is based on an encoder-decoder architecture trained on 160x160 patches of the well.
Pre-processing:
Training:
Post-processing:
from torchvision.transforms import Resize, InterpolationMode

# Resize a predicted mask back to its original size; nearest-neighbour interpolation keeps class labels discrete
original_size = (160, 272)
image = Resize(original_size, interpolation=InterpolationMode.NEAREST)(image)
Files are accessible once you are logged in and registered for the challenge.