Challenge Data

Sinusoid segmentation in subsurface images
by Schlumberger



Competitive challenge
10MB to 1GB
Advanced level


Started on Jan. 4, 2021

Challenge context

In the interpretation of wellbore data (subsurface geological borehole data), a crucial piece of information that experts need to capture accurately is the boundary between geological formations (also called dips). Ultrasonic or electromagnetic tools capture a variety of subsurface measurements from which dips can be extracted.

Challenge goals

In the input 2D wellbore data, the formation boundaries are represented by sinusoids capturing the azimuth* and amplitude* of the dip. Detecting and segmenting the sinusoids manually is a tedious, time-consuming task that can take experts up to several hours for one well. We therefore aim to use machine learning and deep learning to detect and segment those dips (sinusoids) automatically, both to save time and to improve segmentation performance. In this project, we propose developing a deep learning approach that segments the dips given an input electromagnetic image map. The size of the model is important, so it will be an aspect to consider in this data challenge.

* amplitude: the magnitude of the sinusoid.
* azimuth: the horizontal position of a given point, counted from the left side of the image block. It is generally expressed as an angle (the entire width of the image represents 360° around the wellbore).
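To make the azimuth and amplitude definitions concrete, here is a minimal sketch that converts a column index to an azimuth angle and draws one synthetic sinusoid as a binary mask. It assumes the 56-column image width given in the data description. The function names, the `center_row` and `phase_deg` parameters, and the one-pixel-per-column drawing are illustrative assumptions, not the challenge's labeling procedure.

```python
import numpy as np

N_AZIMUTH = 56  # columns per image row, as stated in the data description


def column_to_azimuth_deg(col, n_cols=N_AZIMUTH):
    """Angle in degrees of a column counted from the left edge (full width = 360)."""
    return 360.0 * np.asarray(col) / n_cols


def sinusoid_mask(n_rows, center_row, amplitude, phase_deg, n_cols=N_AZIMUTH):
    """Binary mask holding one single-period sinusoid across the full width.

    center_row, amplitude (in rows) and phase_deg are hypothetical
    parameters chosen for illustration only.
    """
    cols = np.arange(n_cols)
    theta = np.deg2rad(column_to_azimuth_deg(cols, n_cols) - phase_deg)
    rows = np.rint(center_row + amplitude * np.sin(theta)).astype(int)
    mask = np.zeros((n_rows, n_cols), dtype=np.uint8)
    valid = (rows >= 0) & (rows < n_rows)  # drop points falling outside the block
    mask[rows[valid], cols[valid]] = 1
    return mask


mask = sinusoid_mask(n_rows=64, center_row=32, amplitude=10, phase_deg=90.0)
```

With these settings the sinusoid marks one pixel in each of the 56 columns, and column 14 corresponds to an azimuth of 90°.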

Data description

The database consists of electromagnetic data and the corresponding labels. As inputs, 2 numpy arrays are available for training and 2 for testing, each array representing the data for one well.

Outputs are in csv format and represent the segmentation mask labels of dips in the input data: for each input pixel, the output label is 1 if the pixel belongs to a dip and 0 otherwise. The output csv files are thus structured like the inputs: each row is identified by its unique 'ID' and contains, beyond the 'ID' column, 56 columns matching the columns of the corresponding input image row (56 azimuthal samples per row).

Because there are two input wells for both training and testing, they are stacked one after the other in the output training and test csv files. For example, for the training output, the label_data_C4andC3_train.csv file contains 396321 rows with IDs ranging from 0 to 396320, corresponding first to the 195821 lines of well C3 and then to the 200500 lines of well C4.

For the test outputs, the submitted prediction file shall contain in its first line the same header as the training output file (i.e. columns 'ID', and then '0' to '55'), followed by 57072 segmentation prediction lines whose first column contains IDs ranging from 0 to 57071, stacking the segmentation masks of first the 5851 lines of test well C2 and then the 51221 lines of test well C4. An example submission file containing random predictions is provided for illustration and formatting purposes.

No transformation or normalization was performed on the data (histogram equalization is strongly recommended). Note that the electromagnetic data consists of physical measurements of rock properties, which can then be viewed and normalized as images. We suggest transforming the wells into image patches (you can choose the patch shape), a format better suited to training the model.
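The bookkeeping described above (stacked wells, histogram equalization, patching, and the submission layout) can be sketched as follows. This is only an illustration under stated assumptions: the global CDF-based equalization, the full-width patch shape, and all function names are hypothetical choices, and the demo runs on small synthetic arrays because the names of the input numpy files are not given here.

```python
import io

import numpy as np

N_COLS = 56  # azimuthal samples per row (from the description)
TRAIN_ROWS = {"C3": 195821, "C4": 200500}  # stacking order and counts from the description


def split_stacked(labels, rows_per_well):
    """Split a stacked (n_rows, 56) array back into one array per well."""
    out, start = {}, 0
    for name, n_rows in rows_per_well.items():
        out[name] = labels[start:start + n_rows]
        start += n_rows
    if start != len(labels):
        raise ValueError("row counts do not add up to the stacked length")
    return out


def equalize_histogram(well, n_bins=256):
    """Global histogram equalization: map values through the empirical CDF to [0, 1].

    One possible normalization; the description only recommends
    histogram equalization without specifying a variant.
    """
    flat = well.ravel()
    hist, edges = np.histogram(flat[np.isfinite(flat)], bins=n_bins)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.interp(flat, centers, cdf).reshape(well.shape)


def extract_patches(well, patch_rows=56, stride=56):
    """Cut a (n_rows, 56) well into full-width patches of patch_rows rows.

    Trailing rows that do not fill a whole patch are dropped in this sketch.
    """
    starts = range(0, well.shape[0] - patch_rows + 1, stride)
    return np.stack([well[s:s + patch_rows] for s in starts])


def write_submission(file, masks):
    """Write per-well masks, given in the required order, as ID,0,...,55 rows."""
    stacked = np.concatenate(masks, axis=0).astype(int)
    table = np.column_stack([np.arange(len(stacked)), stacked])
    header = ",".join(["ID"] + [str(c) for c in range(stacked.shape[1])])
    np.savetxt(file, table, fmt="%d", delimiter=",", header=header, comments="")


# Demo on synthetic data standing in for one well.
rng = np.random.default_rng(0)
well = rng.lognormal(size=(200, N_COLS))
equalized = equalize_histogram(well)
patches = extract_patches(equalized, patch_rows=56, stride=28)

parts = split_stacked(np.zeros((5, N_COLS)), {"A": 3, "B": 2})

buffer = io.StringIO()
write_submission(buffer, [np.zeros((3, N_COLS)), np.ones((2, N_COLS))])
lines = buffer.getvalue().splitlines()
```

For the real test set, the list passed to `write_submission` would hold the C2 mask followed by the C4 mask, so that the IDs run from 0 to 57071 in the stacking order given above. A stride smaller than the patch height, as in the demo, yields overlapping patches, which is one common way to increase the number of training samples.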

Benchmark description

The problem was first addressed with CNN models.


Files are accessible when logged in and registered to the challenge

The challenge provider


Schlumberger is the world's largest oilfield services company. It was founded in France in 1927 by Conrad and Marcel Schlumberger to provide cabling services to the oil industry. Since 1929, strong investments in research to develop new logging tools, together with acquisitions, have positioned it as one of the best companies in its field. Schlumberger is leading a digital transformation of the energy industry, applying its operational track record and domain expertise, particularly in subsurface measurement, to every facet of the Exploration and Production life cycle.