Automotive
The true labels (y_test file) for this challenge were accidentally leaked on February 10, 2021. Because a few people had access to this file, you can still participate, but the winners at the end of 2021 will not be rewarded for this challenge. The Challenge Data team.
Started on Jan. 1, 2021
Valeo is a global automotive supplier headquartered in France and listed on the Paris Stock Exchange (CAC-40 Index). It supplies a wide range of products to automakers and the aftermarket. The Group employs 113,600 people in 33 countries worldwide. It has 186 production plants, 59 R&D centers and 15 distribution platforms. Its strategy is focused on innovation and development in high-growth potential regions and emerging countries. Valeo ranked as France's leading patent filer from 2016 to 2018.
The goal of this challenge is to confirm the presence of defects on parts, based on pictures taken during the production of power modules at the Valeo plant in Sablé-sur-Sarthe.
During module assembly, an “automatic optical inspection” (AOI) is performed after the wire bonding process to check the conformity and quality of the parts. This inspection relies on pictures taken by a camera and on basic algorithms that measure specific parameters on the parts. The AOI machine measures dimensions well (the width of a bonding wire, for example) but is much less reliable for “aspect” defects. Because this type of defect is hard to analyze automatically, a large number of parts must be confirmed manually by operators. Under certain conditions, the rate of “false defects” (parts considered KO by the machine but OK by the operator) can reach 10 to 20% of production.
The target of this challenge is to define a model that discriminates between good and bad parts for aspect defects better than the AOI does. For this analysis, we would like to focus only on bonding with thin wire (200 µm).
The dataset is composed of images captured by the AOI and details about the inspection process.
The inspection details are available in each image caption.
Example of inspected part reference:
Ref-ID_Date_XX_Die_IML_Type => AE00354_115340_00_1_2_2001
XX: two digits used to manage duplicates.
Associated image:
Ref-ID_Date_XX_Die_IML_Type.jpg => AE00354_115340_00_1_2_2001.jpg
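The example reference above has six underscore-separated fields, so the inspection details can be recovered from a file name by splitting on underscores. The following is a minimal sketch under that assumption. The names `PartReference` and `parse_reference` are hypothetical helpers introduced for illustration, and every field is kept as a string because the exact encoding of each field is not specified here.

```python
from typing import NamedTuple


class PartReference(NamedTuple):
    """Fields of a reference, in the order shown in the example (names are illustrative)."""
    ref_id: str      # e.g. "AE00354"
    date: str        # e.g. "115340" (kept as text; its encoding is not specified)
    duplicate: str   # the two-digit XX field used to manage duplicates
    die: str
    iml: str
    part_type: str


def parse_reference(name: str) -> PartReference:
    """Split 'Ref-ID_Date_XX_Die_IML_Type' (with or without '.jpg') into six fields."""
    # Drop the image extension if present.
    stem = name[:-4] if name.lower().endswith(".jpg") else name
    fields = stem.split("_")
    if len(fields) != 6:
        raise ValueError(
            f"expected 6 underscore-separated fields, got {len(fields)}: {name!r}"
        )
    return PartReference(*fields)


ref = parse_reference("AE00354_115340_00_1_2_2001.jpg")
print(ref.ref_id, ref.duplicate, ref.part_type)  # AE00354 00 2001
```

Raising an error on an unexpected field count is a deliberate choice: it surfaces any file whose name does not follow the documented pattern instead of silently mislabeling it.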
The output is the result of inspection after confirmation by operator.
The target is to find the best prediction function f such that Outputs = f(Inputs).
Erratum: the test image 'AE00072_145326_00_1_2_2001.jpg' is missing. Please ignore it; your submission for this image will automatically be replaced by the correct one.
For our binary classification problem, we will use a metric that reflects the industrial stakes of the application. As illustrated in the confusion matrix below, the target is to avoid both “Scrap” (FN: losing money by rejecting good parts) and “Quality issues” (FP: accepting bad parts), which are critical for customers.
| | Predicted Negative | Predicted Positive |
|---|---|---|
| Real Negative (bad part) | TN | FP |
| Real Positive (good part) | FN | TP |
Therefore, the best model is the one that minimizes the “Scrap” and the “Critical Quality Issues”. The evaluation metric (score C) is the following:
C = (FN + lambda * FP)/N, where lambda = 100 is chosen to reflect the industrial impact of quality issues.
More details about the metric are available in the supplementary files.
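As a rough illustration of the score C, here is a minimal sketch in plain Python. It rests on two assumptions that the supplementary files should confirm: labels are encoded as 1 for a good part (positive) and 0 for a bad part (negative), and N is the total number of evaluated parts. The function name `score_c` is hypothetical.

```python
from typing import Sequence


def score_c(y_true: Sequence[int], y_pred: Sequence[int], lam: float = 100.0) -> float:
    """Compute C = (FN + lam * FP) / N.

    Assumed encoding: 1 = good part (positive), 0 = bad part (negative).
    Assumed N: the total number of evaluated parts.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    # FN: good part predicted as bad ("Scrap").
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # FP: bad part predicted as good ("Quality issue").
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (fn + lam * fp) / len(y_true)


# Three good parts and one bad part: one good part is scrapped (FN = 1)
# and the bad part is accepted (FP = 1), so C = (1 + 100 * 1) / 4.
print(score_c([1, 1, 1, 0], [1, 0, 1, 1]))  # 25.25
```

With lambda = 100, a single accepted bad part costs as much as 100 scrapped good parts. Under the same assumptions, a model that rejects every part has FP = 0 and scores exactly the fraction of good parts in the set.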
The submitted models will be compared with CNN-based approaches to evaluate their performance. Initial tests obtained a score of 0.502.
The split between the public and the private test sets may cause a large gap between scores. Therefore, we added the indexes of the public test images as a supplementary file (public_test_indices.csv).
Enjoy the challenge :)