Frequently Asked Questions about Challenge Data.
The Challenge Data initiative was started to provide easy access to supervised machine learning datasets for educational purposes. The datasets are small enough to be studied on standard machines, yet still retain the interesting features of real data.
Registration to Challenge Data is free and the platform is open to everyone, although it was primarily designed for machine learning students and professors.
Participant accounts are intended for the actual competitors in the challenges. Students who participate in a challenge as part of a course project must choose this kind of account. Professor accounts allow professors to create course projects in which their students can enroll. Through this account, they can track their students’ activity and rankings, and easily download their scientific reports. It is also possible to take part in challenges as an ordinary competitor with these accounts. Challenge provider accounts are intended for our partners who provide the datasets. They allow providers to monitor the activity of their challenge through a private dashboard and to access live public and private rankings as well as the reports shared by participants. Please contact us if you would like to propose a challenge before creating such an account.
The test set is split equally between a public part, on which the public ranking is established, and a private part, on which the final ranking and intermediary rankings (including the academic ranking) are computed. The scores communicated after each submission are public scores, computed on the public part of the test set. The best public score of each participant is used to establish the public ranking. To avoid overfitting, the final ranking, the academic ranking, and any intermediary ranking are established using the private score, computed on the private part of the test set, of a submission chosen by the participant. Thus, depending on the submission chosen, the private score used to establish a participant’s academic or final ranking can differ from their best private score. By default, the submission used to establish the final and academic rankings is the one associated with the participant’s best public score. It can be changed on the “Submissions” tab of a challenge page or directly in the participant’s MySpace.
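The selection rule described above can be illustrated with a minimal sketch. It is not the platform's code: the `Submission` class, the function names, and the assumption that a higher score is better are all hypothetical, and the actual metric and its direction depend on each challenge.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Submission:
    # Hypothetical record: one submitted prediction file and its two scores.
    name: str
    public_score: float   # computed on the public part of the test set
    private_score: float  # computed on the private part of the test set


def best_public(submissions: List[Submission]) -> Submission:
    # Assumption: higher is better. The public ranking uses this submission.
    return max(submissions, key=lambda s: s.public_score)


def final_private_score(submissions: List[Submission],
                        chosen: Optional[str] = None) -> float:
    # By default the chosen submission is the one with the best public score;
    # a participant may instead pick another submission by name.
    if chosen is None:
        selected = best_public(submissions)
    else:
        selected = next(s for s in submissions if s.name == chosen)
    return selected.private_score


subs = [
    Submission("a", public_score=0.80, private_score=0.70),
    Submission("b", public_score=0.75, private_score=0.78),
]
default_score = final_private_score(subs)       # private score of "a"
manual_score = final_private_score(subs, "b")   # private score of "b"
```

In this made-up example the default choice, submission "a", does not carry the best private score, which is why the private score used for the final ranking can differ from a participant's best private score.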
Go to the sign-up page and select the account type that fits your needs. Use a password with letters, numbers and special characters. Check your mailbox (including your spam folder) for an email from Challenge Data, and click on the link contained in this email to activate your account. If you created a Professor or Challenge provider account, validation by the administrators is required to activate all of your account’s functionalities.
Go to the Challenges page and select the challenge you want to participate in. If you participate as part of a specific course in an affiliated academic program, you must select the appropriate course in the list. Note that you cannot assign multiple challenges to a given course, so choose your challenge carefully.
Yes, participation in a challenge is possible for teams of up to 5 participants. Team members can be added when any member of the team registers, or at any time after registration directly on the challenge page, provided no member of the team has already submitted a solution file.
Once you are registered for a challenge, you can submit a prediction file from the "The challenge" tab of the challenge. The prediction file must be a .csv file with the correct number of lines and columns and the correct IDs in the ‘ID’ column; otherwise the evaluation script will raise an error. Along with the prediction file, you can enter the name of the algorithm you used and a description of its parameters. This information remains invisible to other participants, so do not hesitate to fill it in precisely. It is stored in your personal space, allowing you to track your progress on this challenge.
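Before uploading, it can help to check the prediction file locally against the requirements above. The following is a minimal sketch, not the platform's evaluation script: the function name, the `target` column, and the example IDs are hypothetical, and only the ‘ID’ column name comes from the description above.

```python
import csv
import io


def check_prediction_csv(text, expected_ids, expected_columns):
    """Return a list of problems found in a prediction CSV (empty if none).

    `expected_columns` must include "ID"; the other column names depend on
    the challenge and are an assumption here.
    """
    reader = csv.DictReader(io.StringIO(text))
    problems = []
    # The header must match the expected columns exactly, in order.
    if reader.fieldnames != expected_columns:
        problems.append(
            f"columns {reader.fieldnames} differ from expected {expected_columns}"
        )
        return problems
    ids = [row["ID"] for row in reader]
    # The file must contain one row per expected ID.
    if len(ids) != len(expected_ids):
        problems.append(f"{len(ids)} rows found, {len(expected_ids)} expected")
    # Compare sorted lists so that duplicated or missing IDs are both caught.
    if sorted(ids) != sorted(expected_ids):
        problems.append("IDs in the 'ID' column do not match the expected IDs")
    return problems


# Hypothetical two-row prediction file with an assumed 'target' column.
good = "ID,target\n1,0.5\n2,0.7\n"
issues = check_prediction_csv(good, ["1", "2"], ["ID", "target"])
```

To check a file on disk, read it with `open(path, newline="").read()` and pass the resulting text to the function. An empty list of problems only means the file passes these basic checks, not that the platform will necessarily accept it.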
To avoid overfitting the test set, submissions are limited to 2 every 24 hours, whether for a solo participant or a team.
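One way to picture this limit is as a rolling 24-hour window, sketched below. This is an assumption: the platform may instead count submissions per calendar day or by some other rule, and the names `can_submit`, `MAX_SUBMISSIONS` and `WINDOW` are hypothetical.

```python
from datetime import datetime, timedelta

MAX_SUBMISSIONS = 2           # limit stated above
WINDOW = timedelta(hours=24)  # assumed to be a rolling window


def can_submit(previous, now):
    """Return True if a participant or team may submit at time `now`.

    `previous` is the list of datetimes of earlier submissions made by the
    solo participant or by any member of the team.
    """
    recent = [t for t in previous if now - t < WINDOW]
    return len(recent) < MAX_SUBMISSIONS


t0 = datetime(2024, 1, 1, 9, 0)
history = [t0, t0 + timedelta(hours=1)]
blocked = can_submit(history, t0 + timedelta(hours=2))    # two recent submissions
allowed = can_submit(history, t0 + timedelta(hours=24))   # first one has expired
```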
In addition to the potential rewards offered by challenge providers on each challenge during private award ceremonies, the Challenge Data organization will reward the best participant (or team of participants) of each challenge at the end of a season with a prize during a public closing and award ceremony, the details of which will be published in due time on the Home page.
In keeping with the free-exchange spirit of our initiative, participants in a challenge are invited to write a report in the form of a scientific article, analyzing and explaining the performance of their algorithms and citing the published articles they used, and to upload it to the platform at the end of the challenge. This report can be made public or available only to challenge providers. In addition, students registered in a course can submit a report at any time during the course; it will be available to their professors, and they can also choose to make it public or available to challenge providers, in line with the free-exchange principle of this initiative.
As a professor, once your account has been validated, you can create a course from your personal space “MySpace” by clicking on the link "Create a course"; you are then asked to enter the name of the course. After creating the course, you need to select which challenges you want to link to it. To access the list of challenges, go to the Challenges page. For each challenge you want to link, click on the challenge. Once you are on the challenge page, click on "Link a course" and select the relevant course. Only the challenges linked to your course will be available to your students. All the challenges linked to your course are listed in your personal space, under the tab My Courses. If you ask students to upload reports on their work, you will be able to download all reports in a single click in this section, with the link "Get all reports".
Once a challenge is linked to a course, two new tabs appear on the page of this challenge: "Current ranking" and "Participants information". The tab "Current ranking" gives you access to the ranking of the competitors obtained on the public half of the test set, along with their scores. It is visible to all participants. The tab "Participants information" provides you with more specific information: how many times your students submitted new solutions, their scores, a contact link and a link to download individual reports. To download all reports in one click, go to My Courses and click on the link "Get all reports".