We are the team behind PhotoRoom, an iPhone image editor that lets users work with physical objects instead of pixels. Our segmentation models automatically remove the background from images and suggest edits to users.
As part of our experiments with synthetic data, we generated a synthetic dataset of 20,000 images (plus 1,200 for the test set). The images are generated:
• From very high-quality 3D models
• Using high-quality textures
• Under various lighting conditions
• Using a physical renderer
We challenge students to tackle the task of segmenting objects from images.
Provided with an image of an object, the goal is to segment the salient (main) object in the scene. For each pixel of the input image, the model must categorize it as foreground or background.
Since the masks are binary (each pixel is either 0 for background or 1 for foreground), we will score submissions using the Dice coefficient, which measures the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by:

Dice = 2 |X ∩ Y| / (|X| + |Y|)

where:
• X is the predicted set of foreground pixels
• Y is the ground-truth set of foreground pixels
The leaderboard score is the mean of the Dice coefficients for each image in the test set.
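The metric above can be sketched in a few lines of NumPy. This is a minimal illustration, not the official scoring script; the `eps` smoothing term is an assumption we add to handle the edge case where both masks are empty.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks (pixel values 0 or 1)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    # eps guards against division by zero when both masks are empty
    return float((2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps))

def leaderboard_score(preds, truths):
    """Mean Dice over all (prediction, ground truth) pairs in the test set."""
    return float(np.mean([dice_coefficient(p, t) for p, t in zip(preds, truths)]))
```

A prediction identical to its ground truth scores 1.0; a prediction with no overlapping foreground pixels scores 0.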
The test set is made of 1,200 images. Some images show objects that also appear in the train set, while others show objects unavailable in the train set. The same goes for the textures used for the "floor" of the images.
The goal is to create a model that can generalize to new shapes and textures.
• Input: RGB image of size 1280 x 720 pixels in JPEG format
• Target: binary mask of 1280 x 720 pixels, where each pixel has a value of 1 for foreground and 0 for background, in PNG format
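When loading these files into arrays, two details are easy to get wrong: a 1280 x 720 (width x height) image becomes a height-first array of shape (720, 1280), and binary PNG masks are sometimes stored with values 0/255 rather than 0/1. The sketch below, under those assumptions, normalizes a loaded mask to strict 0/1 values:

```python
import numpy as np

# Expected array shapes, assuming the height-first (NumPy) convention
IMAGE_SHAPE = (720, 1280, 3)   # RGB image
MASK_SHAPE = (720, 1280)       # binary mask

def binarize_mask(mask: np.ndarray) -> np.ndarray:
    """Map any nonzero pixel to 1, so masks stored as 0/255 also work."""
    return (mask > 0).astype(np.uint8)
```

Normalizing masks this way keeps the Dice computation correct regardless of how the PNG encodes foreground pixels.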