Challenge Data





Started on Jan. 1, 2021

Challenge context

Valeo is a global automotive supplier headquartered in France and listed on the Paris Stock Exchange (CAC-40 Index). It supplies a wide range of products to automakers and the aftermarket. The Group employs 113,600 people in 33 countries worldwide. It has 186 production plants, 59 R&D centers and 15 distribution platforms. Its strategy is focused on innovation and development in high-growth potential regions and emerging countries. Valeo ranked as France's leading patent filer from 2016 to 2018.

Challenge goals

The goal of this challenge is to confirm the presence of defects on parts, based on pictures taken during the production of Power Modules at the Valeo plant in Sablé-sur-Sarthe.

During module assembly, an “automatic optical inspection” (AOI) is performed after the wire bonding process to check the conformity and quality of the parts. This inspection relies on pictures taken by a camera and on basic algorithms that measure specific parameters on the parts. The AOI machine measures dimensions efficiently (the width of a bonding wire, for example) but is much less effective for “aspect” defects. Because this type of defect is difficult to analyze automatically, a large number of parts must be confirmed manually by operators. Under certain conditions, the rate of “false defects” (parts considered KO by the machine but OK by the operator) can reach 10% or 20% of production.

The target of this challenge is to define a model that discriminates between good and bad parts for aspect defects better than the AOI does. For this analysis, we focus only on bonding with thin wire (200 µm).

Data description

The dataset is composed of images captured by the AOI and details about the inspection process.


  • Inspection images: jpeg format
  • Inspection details:
    • Ref-ID: reference of the part
    • Date: inspection date
    • Die: die location
    • IML (Insulated Molded Layer): position of the leadframe on the carrier
    • Type: inspection type (2001)

The inspection details are encoded in each image caption.

Example of inspected part reference:

Ref-ID_Date_XX_Die_IML_Type => AE00354_115340_00_1_2_2001

XX: two digits used to distinguish duplicates.

Associated image:

Ref-ID_Date_XX_Die_IML_Type.jpg => AE00354_115340_00_1_2_2001.jpg
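Since the inspection details are carried by the reference itself, they can be recovered by splitting it on underscores. The following is a minimal sketch, not part of the challenge material: the names `InspectionRef` and `parse_reference` are hypothetical, and it assumes that a reference always has the six fields shown in the example (Ref-ID, Date, XX, Die, IML, Type) and that no field contains an underscore. All fields are kept as strings because the description does not specify their formats.

```python
from typing import NamedTuple


class InspectionRef(NamedTuple):
    """Fields of an inspected part reference (hypothetical helper type)."""
    ref_id: str     # reference of the part
    date: str       # inspection date, kept as a raw string
    dup: str        # the "XX" two-digit duplicate counter
    die: str        # die location
    iml: str        # position of the leadframe on the carrier
    insp_type: str  # inspection type, e.g. "2001"


def parse_reference(name: str) -> InspectionRef:
    """Split 'Ref-ID_Date_XX_Die_IML_Type' (with or without '.jpg') into fields."""
    # Drop the image extension if the file name was passed instead of the reference.
    stem = name[:-4] if name.lower().endswith(".jpg") else name
    parts = stem.split("_")
    if len(parts) != 6:
        raise ValueError(
            f"expected 6 underscore-separated fields, got {len(parts)}: {name!r}"
        )
    return InspectionRef(*parts)


ref = parse_reference("AE00354_115340_00_1_2_2001.jpg")
print(ref.ref_id, ref.die, ref.iml, ref.insp_type)
```

Raising an error on an unexpected number of fields makes malformed file names visible early instead of silently shifting the fields.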


The output is the result of the inspection after confirmation by the operator.

  • 0: defect confirmed by operator (bad part)
  • 1: defect not confirmed by operator (good part)

The target is to find the best prediction Outputs = f(Inputs)

Benchmark description

For this binary classification problem, we use a metric that reflects the industrial stakes of the application. As illustrated in the confusion matrix below, the aim is to avoid both “Scrap” (FN: losing money by rejecting good parts) and “Quality issues” (FP: accepting bad parts), which are critical for customers.

                            Predicted Negative   Predicted Positive
Real Negative (bad part)    TN                   FP
Real Positive (good part)   FN                   TP

Therefore, the best model is the one that minimizes the “Scrap” and the “Critical Quality Issues”. The evaluation metric (score C) is the following:

C = (FN + lambda * FP) / N, where lambda = 100 is chosen to reflect the industrial impact of quality issues.

More details about the metric are available in the supplementary files.
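The score can be computed directly from the true and predicted labels. The sketch below is an illustration rather than the official scoring code: the function name `score_c` is hypothetical, and it assumes that N is the total number of evaluated parts, which the description does not state explicitly (the supplementary files remain the reference). Labels follow the convention given earlier: 0 for a bad part, 1 for a good part.

```python
from typing import Sequence

LAMBDA = 100  # weight of a false positive, as stated in the metric definition


def score_c(y_true: Sequence[int], y_pred: Sequence[int], lam: float = LAMBDA) -> float:
    """Return C = (FN + lam * FP) / N, assuming N = number of evaluated parts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    if n == 0:
        raise ValueError("at least one sample is required")
    pairs = list(zip(y_true, y_pred))
    # FN: good part (1) predicted as bad (0), i.e. scrap.
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # FP: bad part (0) predicted as good (1), i.e. a quality issue.
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return (fn + lam * fp) / n


y_true = [1, 1, 1, 0, 0]
print(score_c(y_true, [1, 0, 1, 0, 1]))  # one FN and one FP
print(score_c(y_true, [0, 0, 0, 0, 0]))  # every part rejected
```

In this toy example the first call gives (1 + 100) / 5 = 20.2 and the second gives 3 / 5 = 0.6. Under the stated assumption on N, a single accepted bad part costs as much as 100 rejected good parts, and rejecting every part yields a score equal to the fraction of good parts in the set.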

The suggested models will be compared to CNN-based approaches to evaluate their performance. Initial tests obtained a score of 0.502.

The split between the public and the private test sets may cause a large gap between scores. We therefore added the indices of the public test images as a supplementary file (public_test_indices.csv).

Enjoy the challenge :)



