[Announcement: Call for new challenges]

We are looking for new challenges for the next season, which will begin in January 2020. The official call for projects can be found here. The deadline to apply is October 21st 2019. An application template as well as a technical appendix are available to help you. The current season will remain open until January 1st 2020.

The Challenge Data team

Optimizing well-being at work
This challenge proposes developing machine-learning approaches to model and predict individuals' comfort from several time series of environmental data collected by sensors in a large building. The objective is to learn a classifier that takes these time series as inputs and predicts the associated comfort class, computed as the average of the comfort classes of all individuals in the building, who are assumed to experience the same environmental conditions.
Prediction of daily stock movements on the US market
The goal of this challenge is to predict the sign of the returns (= price change over some time interval) at the end of about 700 days for about 700 stocks.
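As a rough illustration of the target described above, the sketch below computes simple returns from a price series and takes their sign. The function names and the toy prices are hypothetical, and the actual time interval and data layout used by the challenge are not specified here.

```python
def simple_returns(prices):
    """Relative price change between consecutive observations."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def return_signs(returns):
    """Map each return to +1 (up), -1 (down) or 0 (unchanged)."""
    return [(r > 0) - (r < 0) for r in returns]


# Toy prices for one hypothetical stock (not challenge data).
prices = [100.0, 102.0, 101.0, 101.0]
returns = simple_returns(prices)
signs = return_signs(returns)
```

Here `signs` holds one label per interval, which is the kind of quantity a classifier would be trained to predict.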
Dynamic Profile Forecasting
We would like to forecast 7 dynamic profile time series, modelling the consumption shape of several mass-market customer groups (residential and small businesses with subscribed power up to 36 kVA), using meteorological and calendar data as well as any other real-time dataset potentially correlated with consumption patterns. These profiles are unitless coefficients, one for each half-hour in the dataset. The dataset size depends on the profile (data collected from Oct 13th, 2013 onwards for residential profiles and from Nov 1st, 2016 for commercial profiles). This challenge is about forecasting dynamic profile values from their past values and all the components of Enedis’ Half hourly Electrical Balancing. The testing period will be in the past, from July 1st, 2017 to June 30th, 2018. There are many possible explanatory variables, since consumption patterns are linked to consumers’ behavior and economic activity. Weather conditions (cold spell / heat wave) and business holidays will impact energy consumption, but other factors may also contribute to modifying it.
Volatility prediction in financial markets
Use past volatilities and price changes of financial instruments to predict future volatility and control the risk of financial portfolios. A community forum for sharing ideas and making faster progress is available at: http://datachallenge.cfm.fr/ Additional information can also be found on this forum and after registering on the Challenge Data website. A video presentation of the challenge at Collège de France is available at: https://www.college-de-france.fr/site/stephane-mallat/Prediction-de-volatilite-de-marches-financiers-par-CFM.htm (in French).
Predicting response times of the Paris Fire Brigade vehicles
Your task will be to predict the delay between the selection of a rescue vehicle (the time when a rescue team is warned) and the time when it arrives at the scene of the rescue request (manual reporting via portable radio).
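To make the target concrete, here is a minimal sketch that computes such a delay in seconds from two timestamps. The timestamp format, the function name and the example values are assumptions for illustration; the challenge's actual column names and formats are not given here.

```python
from datetime import datetime

# Assumed timestamp layout; the real dataset may use a different one.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def response_delay_seconds(selection_time, arrival_time):
    """Seconds elapsed between vehicle selection and arrival on scene."""
    selected = datetime.strptime(selection_time, TIME_FORMAT)
    arrived = datetime.strptime(arrival_time, TIME_FORMAT)
    return (arrived - selected).total_seconds()


# Hypothetical intervention: selected at 19:04:43, on scene at 19:11:13.
delay = response_delay_seconds("2018-07-08 19:04:43", "2018-07-08 19:11:13")
```

A regression model would then be trained to predict this delay from the information available at selection time.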
Solve 2x2x2 Rubik's cube
The goal is to design an automatic Rubik's cube analyzer: given a new, unseen configuration of the 2x2x2 Rubik's Cube, predict the length of the shortest path to the solution. Algebraic manipulations of this type could be used in other contexts to solve complex problems.
Drug-related questions classification
The goal of the Posos challenge is to predict the intent associated with each question.
Exotic pricing with multidimensional non-linear interpolation
The purpose of the challenge is to use a training set of 1 million prices to learn how to price a specific type of instrument, described by 23 parameters, by nonlinear interpolation on these prices. The benefit would be to greatly reduce computation time while retaining good pricing precision. The exotic option to price is the one contained in a callable debt instrument whose final redemption amount, coupon payments and callability are conditional on the performance of a basket of three stocks or equity indices relative to certain barriers. All parameters have been normalized to lie between 0 and 1. The given price has also been normalized between 0 and 1. Because the zero price lies at the center of the set of prices, most of the normalized prices are around 0.5.
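One very simple form of nonlinear interpolation is inverse-distance weighting over the nearest training points. The sketch below is only an illustration of the idea, not the organizers' method: the function name is hypothetical, the toy data uses 2 normalized parameters instead of 23, and a scan over all training points like this would be far too slow for 1 million prices.

```python
import math


def idw_interpolate(train_x, train_y, query, k=3, eps=1e-12):
    """Inverse-distance-weighted average of the k nearest training prices."""
    # Pair each training price with its Euclidean distance to the query.
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    # An exact match returns the stored price directly (avoids dividing by 0).
    if nearest[0][0] < eps:
        return nearest[0][1]
    weights = [1.0 / d for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Toy training set: parameters and prices both normalized to [0, 1].
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_y = [0.4, 0.5, 0.5, 0.6]

price = idw_interpolate(train_x, train_y, (0.5, 0.5), k=4)
```

In practice a learned model (for example a neural network or gradient-boosted trees) plays the role of the interpolator; the sketch only shows what "pricing by interpolation on known prices" means.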
Historical consumption regression for electricity supply pricing
The goal of the challenge is to predict the electricity consumption of two given sites for a test year, by analyzing the correlation between consumption and weather in one year of training data. In operational conditions, the new consumption profiles would be integrated into electricity supply pricing analysis.
Building Claim Prediction
The goal of the challenge is to predict whether a building will have an insurance claim during a certain period. You will have to predict the probability of at least one claim over the insured period of a building. The model will be based on the building characteristics. The target variable is:
- 1 if the building has at least one claim over the insured period.
- 0 if the building doesn’t have a claim over the insured period.
During this challenge, you are encouraged to use external data, for instance: number of shops by INSEE code (geographical code), unemployment rate by INSEE code, weather…
Some data can be found on the following website: data.gouv.fr (https://www.data.gouv.fr/fr/)
Screening and Diagnosis of esophageal cancer from in-vivo microscopy images
The goal of this challenge is to build an image classifier to assist physicians in the screening and diagnosis of esophageal cancer. Such a tool would have a massive impact on patient management and patient lives.
Detecting breast cancer metastases
The goal of this challenge is to develop new algorithms to detect metastases in images of patients diagnosed with breast cancer.