The goal of this year's challenge is to predict the volume (total value of stock exchanged) available for auction, for 900 stocks over about 350 days.
The goal of this challenge is to build a model to automatically detect sleep apnea events from PSG data.
You will have access to samples from 44 nights recorded with polysomnography and scored for apnea events by a consensus of human experts. For each of the 44 nights, 200 non-overlapping windows are sampled together with the associated labels (binary segmentation masks). Each window contains 90 seconds of signal from 8 physiological signals sampled at 100 Hz:
The segmentation mask is sampled at 1Hz and contains 90 labels (0 = No event, 1 = Apnea event). Both examples can be reproduced using visualization.py provided in the supplementary files.
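As a rough illustration of the data layout (assuming the windows and masks are stored as numpy arrays; the names and shapes below are inferred from the description above, not taken from the challenge files):

```python
import numpy as np

# One 90 s window: 8 physiological signals sampled at 100 Hz -> 9000 samples each.
window = np.zeros((8, 90 * 100))

# Its segmentation mask, sampled at 1 Hz: 90 binary labels (0 = no event, 1 = apnea).
mask = np.zeros(90, dtype=int)
```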
The 8 PSG signals with the associated segmentation mask. The apnea event is visible as the abdominal belt, thoracic belt and airflow amplitudes drop noticeably below baseline. The SpO2 drops after the event.
The 8 PSG signals with the associated segmentation mask. Two short apnea events are visible with the associated breathing disruption. The SpO2 drop during the second event is likely a consequence of the first event.
We want to assess whether the events detected by the algorithm are in agreement with those detected by the sleep experts.
As we seek to evaluate event-wise agreement between the model and the scorers, the metric cannot be computed directly on the segmentation mask. First, events are extracted from the binary mask with the following rule:
An apnea event is a maximal run of consecutive 1s in the binary mask.
For each apnea event in a window, we extract the start and end indices to produce a list of events. This list may be empty if no events are found. The same processing is applied to the ground-truth masks to extract the ground-truth events.
In order to assess the agreement between the ground-truth and estimated events, the F1-score is computed. Two events match if their IoU (intersection over union, or Jaccard index) is above 0.3.
Hence a detected event is a True Positive if it matches a ground-truth event, and a False Positive otherwise. Conversely, a ground-truth event without a matching detected event is a False Negative. TP, FP and FN are summed over all the windows to compute the F1-score.
The detailed implementation can be found in the metrics file.
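For intuition, here is a minimal sketch of the evaluation pipeline, assuming numpy arrays; the matching step uses a simple greedy one-to-one assignment, which may differ in detail from the official metrics file:

```python
import numpy as np

def extract_events(mask):
    """Return (start, end) index pairs, one per run of consecutive 1s."""
    padded = np.concatenate(([0], mask, [0]))
    diff = np.diff(padded)
    starts = np.where(diff == 1)[0]
    ends = np.where(diff == -1)[0]  # end index is exclusive
    return list(zip(starts, ends))

def iou(a, b):
    """Intersection over union of two (start, end) intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def count_matches(pred_events, true_events, threshold=0.3):
    """Greedy one-to-one matching; returns TP, FP, FN for one window."""
    matched, tp = set(), 0
    for p in pred_events:
        for i, t in enumerate(true_events):
            if i not in matched and iou(p, t) > threshold:
                matched.add(i)
                tp += 1
                break
    return tp, len(pred_events) - tp, len(true_events) - tp

def f1(tp, fp, fn):
    """F1-score from counts summed over all windows."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0
```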
The goal of this data challenge is to predict the "colour" of a product, given its image, title, and description. A product can be of multiple colours, making it a multi-label classification problem.
For example, in the Rakuten Ichiba catalog, a product with the Japanese title タイトリスト プレーヤーズ ローラートラベルカバー (Titleist Players Roller Travel Cover) is associated with an image and sometimes with an additional description. The colour of this product is annotated as Red and Black. Other products have different titles, images, possible descriptions, and associated colour attribute tags. Given this information on the products, as in the example above, this challenge proposes to build a multi-label classifier that assigns each product its corresponding colour attributes.
The metric used in this challenge to rank the participants is the weighted-F1 score.
The Scikit-Learn package has an F1-score implementation (link) that can be used for this challenge with its average parameter set to "weighted".
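A minimal usage sketch with toy labels (the arrays below are illustrative, not challenge data):

```python
from sklearn.metrics import f1_score

# Binary indicator matrix: rows are products, columns are colour tags.
y_true = [[1, 0, 1],   # e.g. a product annotated Red and Black
          [0, 1, 0]]
y_pred = [[1, 0, 0],
          [0, 1, 0]]

print(f1_score(y_true, y_pred, average="weighted"))
```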
This data challenge aims at introducing a new statistical model to predict and analyze air quality in big buildings using observations stored in the Oze-Energies database. Physics-based simulation tools built to reproduce complex building behaviors are widespread in the most complex situations. The main drawbacks of such software for simulating the behavior of transient systems are:
In order to analyze and predict future air quality, so as to alert and correct building management systems and ensure comfort and satisfactory sanitary conditions, this challenge aims at solving issue ii), i.e. at designing models which take into account the uncertainty in the exogenous data describing external weather conditions and the occupation of the building. This makes it possible to provide confidence intervals on the air quality predictions, here on the humidity of the air inside the building.
The goal of the challenge is to classify traders into three categories: HFT, non-HFT and MIX.
According to the AMF's in-house expert-based classification, which draws on the AMF's knowledge of the market players, market players are divided into three categories: HFT, MIX and non-HFT.
From a set of behavioural variables based on order and transaction data, the challenger is invited to predict the category to which a given participant belongs.
The proposed classification algorithm will then be applied to other data sources for which market players are currently not well known by the AMF.
The objective of this challenge is to design a model capable of predicting the usage of some EV charging stations in Paris, more specifically the times when they are available, actively charging a car, plugged in but not charging, offline, or down.
The goal of this challenge is to confirm the presence of defects on parts, based on pictures taken during the production of power modules in the Valeo plant in Sablé-sur-Sarthe.
During module assembly, an "automatic optical inspection" (AOI) is done after the wire bonding process to check the conformity and quality of the parts. This inspection is based on pictures taken by a camera, and basic algorithms are used to measure some specific parameters on the parts. The AOI machine is efficient at measuring dimensions on the parts (the width of a bonding wire, for example) but much less so for "aspect" defects. The difficulty of properly analyzing this type of defect leads to a large number of parts that must be confirmed manually by operators. In certain conditions, the rate of "false defects" (parts considered KO by the machine but OK by an operator) can reach 10 or 20% of the production.
The target of this challenge is to define a model that provides a better result than the AOI at discriminating between good and bad parts for aspect defects. For this analysis, we focus only on bonding with thin wire (200 µm).
To estimate the validity of the predictions we propose to use two different measures: the coefficient of determination (R²), which measures the skill of the mean prediction, and the reliability, which measures the accuracy of the spread of the prediction.
The mathematical details are available in the associated file.
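For the R² part, a minimal sketch with scikit-learn (toy humidity values, purely illustrative; the reliability measure is defined in the associated file and is not reproduced here):

```python
from sklearn.metrics import r2_score

# Hypothetical indoor humidity observations (%) and mean predictions.
y_true = [55.0, 60.2, 58.1, 62.4]
y_pred = [54.3, 61.0, 57.8, 63.1]

print(r2_score(y_true, y_pred))  # skill of the mean prediction
```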
More precisely, our thesaurus comprises a few hundred tags (e.g. blues-rock, electric-guitar, happy), grouped into classes (Genres, Instruments or Moods) and partitioned into categories (genres-orchestral, instruments-brass, mood-dramatic, etc.). Each audio track in our database may be tagged with one or more labels of each class, so the auto-tagging process is a multi-label classification problem; we can train neural networks to learn from audio features and generate numerical predictions that minimise the binary cross-entropy with respect to the one-hot encoded labelling of the dataset.
On the other hand, to display the tagging on our front-end, we require a discrete, tag-wise labelling, so a further interpretation step is needed to convert the predictions into decisions, and we can use more suitable metrics to evaluate the quality of the tagging. We want the participants of the challenge to optimise this decision problem, leveraging all the information available from the ground truth and the global predictions to design a selection algorithm producing the most consistent labelling. In other words, build a multi-label classifier that receives as input the predictions generated by our neural networks for all tags and their categories.
Our suggested benchmark is a column-wise thresholding (see details below), so this strategy uses neither the categorical predictions nor the possible correlations between tags. For example, a more row-oriented approach (for each track, select a tag based on its prediction value relative to the predictions for the other tags) or a hierarchical strategy (decide on categories first, then choose tags among the selected categories) may improve the final classifications.
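As a sketch, the column-wise thresholding benchmark amounts to applying one threshold per tag to the prediction matrix (the values and thresholds below are illustrative; in practice the per-tag thresholds would be tuned on the ground truth):

```python
import numpy as np

def columnwise_threshold(predictions, thresholds):
    # predictions: (n_tracks, n_tags) network outputs; thresholds: (n_tags,).
    # A tag is selected for a track when its prediction reaches the tag's threshold.
    return (predictions >= thresholds).astype(int)

# 3 tracks, 4 tags, one threshold per tag (column).
preds = np.array([[0.9, 0.2, 0.4, 0.7],
                  [0.1, 0.8, 0.6, 0.3],
                  [0.5, 0.5, 0.5, 0.5]])
thresholds = np.array([0.5, 0.6, 0.5, 0.4])
print(columnwise_threshold(preds, thresholds))
```

A row-oriented or hierarchical strategy would replace this per-column rule with decisions that also look across a track's predictions or across categories.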
If we find an illiquid asset to be untradeable, then the signal of this asset should not result in a trading position. To counteract this difficulty, an alternative would be to project the signals from illiquid assets onto liquid ones.
To do so, the proposed challenge aims at determining the link, at a given time $t$, between the returns of illiquid and liquid assets. The one-day return of a stock $j$ at time $t$ with price $P^j_t$ (adjusted for dividends and stock splits) is defined as:
$$R^j_t = \frac{P^j_t}{P^j_{t-1}} - 1$$
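Equivalently, with a pandas price series (illustrative values):

```python
import pandas as pd

prices = pd.Series([100.0, 101.0, 99.0])  # adjusted prices P_t
returns = prices / prices.shift(1) - 1    # R_t = P_t / P_{t-1} - 1
print(returns)
```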
Let $Y_t = (R^1_t, \dots, R^L_t)$ be the returns of $L$ liquid assets and $X_t = (R^1_t, \dots, R^N_t)$ be the returns of $N$ illiquid assets at a given time $t$. The objective of this challenge is to determine a mapping function $\eta : \mathbb{R}^N \to \mathbb{R}^L$ that would link the returns of the $N$ illiquid assets to the returns of the $L$ liquid assets such that $Y_t = \eta(X_t)$.
Since predictive signals can be seen as estimated returns, the signals generated by QRT on the $N$ illiquid assets, denoted $\hat{X}_t$, can be mapped to projected signals $\hat{Y}_t$ on the $L$ liquid instruments such that $\hat{Y}_t = \eta(\hat{X}_t)$. However, since $\eta$ is purely theoretical, the mapping must rely on approximations. Therefore, the idea is to estimate a model $\hat{\eta}$ that predicts the returns of $L = 100$ liquid instruments from the returns of $N = 100$ illiquid instruments, given historical data.
The model $\hat{\eta}$ can then be seen as a multi-output prediction of $L$ returns, or as the combination of $L$ models $\hat{\eta}^j$, for $j = 1, \dots, L$, each individually predicting the return of liquid instrument $j$.
For simplicity and practical reasons, we chose to turn this challenge into a classification problem. In practice, we are more interested in being right about the trend than about the exact value. Thus, instead of predicting the returns of the liquid assets, the estimated model $\hat{\eta}$ predicts the signs of the liquid assets' returns.
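As an illustration of this framing, a minimal sketch assuming a scikit-learn style workflow (the random feature and target arrays are hypothetical stand-ins for the challenge data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))                         # returns of the N = 100 illiquid assets
y = np.where(rng.normal(size=(500, 100)) >= 0, 1, -1)   # signs of the L = 100 liquid returns

# One classifier per liquid instrument, i.e. the "combination of L models" view.
model = MultiOutputClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y)
signs = model.predict(X)                                # predicted signs in {-1, 1}
```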
The metric used to judge the quality of the predictions is a custom weighted accuracy defined by:
$$f(y, \hat{y}) = \frac{1}{\|y\|_1} \sum_{i=1}^{n} |y_i| \times \mathbb{1}_{\hat{y}_i = \mathrm{sign}(y_i)}$$
where $\mathbb{1}_{\hat{y}_i = \mathrm{sign}(y_i)}$ equals 1 if the $i$-th prediction $\hat{y}_i \in \{-1, 1\}$ has the same sign as the $i$-th true value $y_i$. This metric gives more importance to the correct classification of large returns. Indeed, it can be more important to be right about a 7% move than about a 0.5% move.
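A minimal implementation of this metric, assuming numpy arrays of true returns and sign predictions:

```python
import numpy as np

def weighted_accuracy(y_true, y_pred):
    # y_true: true returns; y_pred: predicted signs in {-1, 1}.
    # Each correct sign prediction is weighted by |y_i|, normalised by ||y||_1.
    y_true = np.asarray(y_true, dtype=float)
    correct = (np.asarray(y_pred) == np.sign(y_true)).astype(float)
    return float(np.sum(np.abs(y_true) * correct) / np.sum(np.abs(y_true)))

# Being right on a 7% move outweighs being wrong on a 0.5% move:
print(weighted_accuracy([0.07, -0.005], [1, 1]))  # ~0.933
```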