Data Scientist
Started on Jan. 4, 2021
Reducing our overall energy consumption is no longer a choice, and being able to predict when and how we use energy is crucial to that end. It lets energy suppliers anticipate demand and adapt their strategy.
Electric vehicles are becoming increasingly common, and EV charging stations are being deployed accordingly, creating a significant demand for electricity. Predicting the usage of these stations and the behavior of drivers is therefore necessary to optimize the energy network. As an energy supplier, Planète OUI has a strong interest in this problem, which falls within the scope of its Consumption Prediction team.
Planète OUI is a French energy supplier offering individuals and professionals a 100% renewable energy mix, produced in France by wind turbines, solar panels and hydraulic turbines. Established in 2007 and based in Lyon and Lille, it is today a 120-person company supplying energy to over 70,000 electricity meters. The team operates at all levels of the renewable electricity market: aggregation, production asset management, energy trading, capacity certification, and green electricity supply to final consumers. Planète OUI is one of the first French green electricity suppliers and promotes a green energy transition for all.
The objective of this challenge is to design a model capable of predicting the usage of some EV charging stations in Paris, more specifically the times when they are available, actively charging a car, plugged, offline or down.
Each terminal is identified by an id constructed as follows: S{sid}-T{tid}, with {sid} the id of the station and {tid} the terminal number in the station. All terminals belonging to the same station therefore share the same S{sid} prefix.
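As an illustration of the id convention, here is a minimal sketch that splits a terminal id into its station and terminal parts and groups terminals by station. It assumes that both {sid} and {tid} are integers, which the description does not state; the function names and the example ids are hypothetical.

```python
import re
from collections import defaultdict

# Pattern for ids of the form S{sid}-T{tid}.
# Assumption: both parts are plain integers.
TERMINAL_ID = re.compile(r"^S(\d+)-T(\d+)$")


def parse_terminal_id(terminal_id):
    """Split 'S{sid}-T{tid}' into (station id, terminal number)."""
    match = TERMINAL_ID.match(terminal_id)
    if match is None:
        raise ValueError(f"unexpected terminal id: {terminal_id!r}")
    return int(match.group(1)), int(match.group(2))


def group_by_station(terminal_ids):
    """Map each station id to the terminal ids sharing its S{sid} prefix."""
    stations = defaultdict(list)
    for terminal_id in terminal_ids:
        sid, _ = parse_terminal_id(terminal_id)
        stations[sid].append(terminal_id)
    return dict(stations)


# Hypothetical ids, for illustration only.
example_ids = ["S1-T1", "S1-T2", "S2-T1"]
stations = group_by_station(example_ids)
```

Grouping by station can be useful if terminals of the same station are expected to behave similarly, although nothing in the description guarantees that.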
This is a multiclass time series problem whose objective is to predict the state of each terminal over a period of time. The time series data is given in Y_train, and candidates may choose to base their model on Y_train only. Some optional contextual information is given in X_train, X_test and Static_Info.
Y_train is a dataframe with one column per terminal and one row per timestamp, giving the evolution of the status of each terminal over the training period.
X_train and X_test give additional information:
Static_Info gives some information about each terminal:
The data has been prepared based on several public datasets:
The metric used is a Weighted F1 score applied to a multiclass problem.
For each terminal and state:
For each state:
Overall:
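As a rough illustration of what a weighted multiclass F1 score computes, the sketch below calculates one F1 per state and averages them with weights proportional to how often each state occurs in the true labels. This support-based weighting is an assumption and may differ from the weights used by the challenge; the state labels and function names are hypothetical.

```python
from collections import Counter


def f1_for_state(y_true, y_pred, state):
    """One-vs-rest F1 score for a single state."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == state and p == state)
    fp = sum(1 for t, p in pairs if t != state and p == state)
    fn = sum(1 for t, p in pairs if t == state and p != state)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Average the per-state F1 scores, weighted by true-label frequency.

    Assumption: weights are the share of each state in y_true.
    """
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        f1_for_state(y_true, y_pred, state) * count / total
        for state, count in support.items()
    )


# Hypothetical labels for one terminal over four timestamps.
y_true = ["available", "available", "charging", "charging"]
y_pred = ["available", "charging", "charging", "charging"]
score = weighted_f1(y_true, y_pred)
```

In this example the F1 is 2/3 for "available" and 4/5 for "charging"; with equal support the weighted score is their mean, 11/15.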
The reference score is computed by identifying the average behaviour on weekdays and on weekend days (without considering additional factors) and applying those behaviours over the test period.
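A baseline of this kind could be sketched as follows. The description does not say how the "average behaviour" of a categorical state is defined, so this sketch assumes it is the most frequent state observed at each time of day, computed separately for weekdays and weekend days. The function names, the default state and the example history are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def day_type(ts):
    """Return 'weekend' for Saturday and Sunday, 'weekday' otherwise."""
    return "weekend" if ts.weekday() >= 5 else "weekday"


def fit_profile(history):
    """Build a (day type, time of day) -> most frequent state profile.

    history: list of (datetime, state) pairs for a single terminal.
    """
    counts = defaultdict(Counter)
    for ts, state in history:
        counts[(day_type(ts), ts.time())][state] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def predict(profile, timestamps, default="available"):
    """Apply the profile to new timestamps; unseen slots get `default`."""
    return [profile.get((day_type(ts), ts.time()), default) for ts in timestamps]


# Hypothetical history for one terminal, always sampled at 08:00.
history = [
    (datetime(2021, 1, 4, 8, 0), "charging"),   # Monday
    (datetime(2021, 1, 5, 8, 0), "charging"),   # Tuesday
    (datetime(2021, 1, 6, 8, 0), "available"),  # Wednesday
    (datetime(2021, 1, 9, 8, 0), "available"),  # Saturday
]
profile = fit_profile(history)
forecast = predict(
    profile,
    [datetime(2021, 1, 11, 8, 0), datetime(2021, 1, 16, 8, 0)],  # Monday, Saturday
)
```

Because the profile ignores everything except the day type and the time of day, it gives a simple point of comparison for models that use the contextual data.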