The challenge data come from the largest set of audio recordings from the Antilles. These recordings were made during the CARI'MAM European project (2017-2021), at 17 passive acoustic stations.
You can display a map of all recording station locations here.
In total, the data represent more than 25 TB from the beginning of the project until today, and recording is still ongoing.
Our team (LIS DYNI, Toulon University) leads the hardware and AI parts of the CARI'MAM project.
For this challenge, we have extracted the relevant parts of the recordings.
The AI processing of humpback whale songs in this data set is quite mature and shows clues of cultural effects in these cetaceans [1,2].
However, due to transient noises (reef, fish, or shrimp...), the analysis of odontocete clicks (dolphins, sperm whales...) cannot be processed automatically yet.
Odontocetes emit click sequences for echolocation and communication.
In the supplementary files, you can find two example short-time Fourier transforms (STFT): one positive (dolphin clicks) and one transient noise (reef noise).
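Spectrograms like the supplementary examples can be reproduced with scipy.signal.stft. The sketch below uses a synthetic clip at the challenge's 256 kHz rate and 200 ms duration as a stand-in for a real wavfile; the simulated click and the STFT parameters (nperseg=512) are assumptions, not the challenge's settings.

```python
import numpy as np
from scipy import signal

# Synthetic stand-in for a 200 ms clip at 256 kHz (sr and duration match
# the challenge files; the centered "click" itself is simulated here).
sr = 256_000
x = 0.01 * np.random.default_rng(0).standard_normal(int(0.2 * sr))
x[x.size // 2] += 1.0  # a single impulsive click in the middle of the clip

# Short-time Fourier transform, then a dB-scaled magnitude spectrogram
f, t, Z = signal.stft(x, fs=sr, nperseg=512)
spectrogram_db = 20 * np.log10(np.abs(Z) + 1e-12)
print(spectrogram_db.shape)  # (frequency bins, time frames)
```

A broadband click shows up as a vertical stripe spanning most frequency bins in a single time frame, which is what distinguishes it visually from narrowband tonal sounds.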
Succeeding in this task may be useful for biodiversity monitoring on large-scale data: this detection step is the bottleneck in the representation and ecological exploration of the 25 TB of the CARI'MAM corpus alone.
The challenge task can be summarized as: does this wav file contain a biosonar click or not?
The challenge is supported by the Interreg CARIMAM project (OFB, AGOA), the ANR ULP-COCHLEA project, and the national AI program ADSIL of H. Glotin, co-financed by AID, DGA, and ANR.
[1] Hervé Glotin, Maxence Ferrari, Paul Best, Marion Poupard, Nicolas Thellier, Audrey Monsimer, Pascale Giraudet. CARIMAM Report 1: Bioacoustic Data Processing. Research report, DYNI LIS, 2021. hal-03629286, https://hal.archives-ouvertes.fr/hal-03629286/document
[2] Stéphane Chavin. Automatic Classification of Humpback Whale (Megaptera novaeangliae) Vocalizations in the Caribbean. Master thesis, 2022. http://sabiod.lis-lab.fr/pub/Chavin_S_MasterThesis2022.pdf
The challenge goal is to determine whether an audio file contains biosonar clicks (dolphin clicks) or transient noises (reef, shrimp noises...).
The challenge dataset is composed of 8 recording sessions for the train set and 2 other recording sessions for the test set. No session is shared between the train and test sets, in order to encourage participants to build session-independent models.
Each 200-millisecond wavfile contains either biosonar clicks or background noise with transients. The potential click/transient is centered in the signal; there may be a single one or several.
Each wavfile has a 16-bit dynamic range and a 256 kHz sampling rate. Train audio files are in the X_train.zip folder, which contains about 23,000 files. The test set has fewer than 1,000 files.
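A quick sanity check of the format can be done with scipy.io.wavfile. Since no real challenge file is available here, the sketch writes a synthetic 16-bit, 256 kHz, 200 ms clip to a placeholder name ("example.wav" is not a challenge filename) and reads it back the way a train file would be loaded.

```python
import numpy as np
from scipy.io import wavfile

# Write a synthetic 16-bit, 256 kHz, 200 ms clip ("example.wav" is a
# placeholder name, not one of the challenge's wavfiles).
sr = 256_000
samples = np.random.default_rng(0).integers(-2**15, 2**15,
                                            int(0.2 * sr), dtype=np.int16)
wavfile.write("example.wav", sr, samples)

# Load it back and verify the expected format
rate, data = wavfile.read("example.wav")
print(rate, data.dtype, data.shape[0] / rate)  # 256000 int16 0.2
```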
During the evaluation phase, the recording session name is "TEST" for both sessions. Example: 25061-TEST.wav.
- pos_label = 1 : "X contains a detected biosonar (dolphin, Delphinidae)"
- pos_label = 0 : "X does not contain a detected biosonar (noise)"
The metric used in this challenge to rank participants is the ROC AUC score.
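The ROC AUC score rewards a correct ranking of files by detection confidence, so submissions should be continuous scores rather than hard 0/1 decisions. A minimal illustration with scikit-learn (the labels and scores below are made up for the example):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical ground-truth labels and predicted click probabilities
# for five clips (illustration only, not challenge data).
y_true = np.array([1, 0, 1, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.6])

# 5 of the 6 positive/negative pairs are ranked correctly -> 5/6
print(roc_auc_score(y_true, y_score))  # 0.8333...
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of clicks from transients, which puts the 0.75 benchmark below in context.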
The challenge baseline, called the benchmark, is obtained by a classifier trained on features extracted from the signal.
The feature extraction can be reproduced as follows:
- Load one 200 ms audio signal.
- Apply a band-pass filter between 5 kHz and 100 kHz.
- For each frame of 2048 samples, extract the features.
- Over all frames, compute the mean, standard deviation, minimum, and maximum of each feature.
Finally, the 16 resulting features are used as the classifier input. We used XGBClassifier from the XGBoost library.
The band-pass filter coefficients come from the scipy.signal.butter function, and we used scipy.signal.sosfiltfilt to apply them. The features are extracted with the librosa.feature module.
An example of feature extraction is included in the Supplementary files.
The benchmark obtained a ROC AUC score of 0.75.
Files are accessible once you are logged in and registered to the challenge.