PhotoRoom
Started on Jan. 6, 2020
We are the makers of PhotoRoom, an iPhone image editor that lets users work with physical objects instead of pixels. Our segmentation models automatically remove the background from images and suggest edits to users.
As part of our experiments with synthetic data, we generated a synthetic dataset of 20,000 training images, plus 1,200 images for the test set. The images are generated:
We challenge students to tackle the task of segmenting objects from images.
Provided with an image of an object, the goal is to segment the salient (main) object in the scene. For each pixel of the input image, the model must categorize it as foreground or background.
Since the masks are binary (each pixel is either 0 for background or 1 for foreground), we will score submissions using the Dice coefficient, which measures the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by:
Dice(X, Y) = 2 |X ∩ Y| / (|X| + |Y|)
where X is the set of pixels predicted as foreground and Y is the set of foreground pixels in the ground truth.
The leaderboard score is the mean of the Dice coefficients for each image in the test set.
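As an illustration of the scoring described above, here is a minimal sketch in Python with NumPy. The function names `dice_coefficient` and `mean_dice` are hypothetical and this is not the official scoring code. The sketch also assumes that two empty masks count as a perfect match, which the text above does not specify.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice = 2|X ∩ Y| / (|X| + |Y|) for two binary masks of equal shape."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumption (not stated in the challenge text):
        # two empty masks are treated as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, truth).sum()
    return float(2.0 * intersection / total)


def mean_dice(preds, truths):
    """Mean of the per-image Dice coefficients, as used for the leaderboard."""
    return float(np.mean([dice_coefficient(p, t) for p, t in zip(preds, truths)]))


# Tiny example: one overlapping pixel, 2 predicted and 1 true foreground pixel.
example_pred = np.array([[1, 1], [0, 0]])
example_truth = np.array([[1, 0], [0, 0]])
example_score = dice_coefficient(example_pred, example_truth)  # 2 * 1 / (2 + 1)
```

Because the score is averaged per image, a single badly segmented image costs the same regardless of how many foreground pixels it contains.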
The train set and images from the test set are available here
The test set is made of 1,200 images. Some images show objects that also appear in the train set, while others show objects that are not present in the train set. The same goes for the textures used for the "floor" of the images.
The goal is to create a model that can generalize to new shapes and textures.
Input: RGB image of size 1280 x 720 pixels, in JPEG format.
Output: binary mask of size 1280 x 720 pixels, in PNG format, where each pixel has a value of 1 for foreground and 0 for background.
To reduce the size of the submission, the output will be encoded using run-length encoding (RLE).
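The exact RLE convention is defined by the challenge's sample submission and repository, not by this page. The sketch below is therefore only an assumption: it uses a convention common in segmentation competitions, where the mask is flattened in column-major order, pixels are numbered from 1, and each run of foreground pixels is written as a space-separated "start length" pair. The names `rle_encode` and `rle_decode` are hypothetical.

```python
import numpy as np


def rle_encode(mask):
    """Encode a binary mask as 'start length' pairs.

    Assumed convention: column-major order, 1-indexed starts.
    """
    pixels = np.asarray(mask, dtype=np.uint8).flatten(order="F")
    # Pad with zeros so that runs touching either end are detected.
    padded = np.concatenate([[0], pixels, [0]])
    # 1-indexed positions where the value changes (run starts and run ends).
    changes = np.flatnonzero(padded[1:] != padded[:-1]) + 1
    starts, ends = changes[0::2], changes[1::2]
    return " ".join(f"{s} {e - s}" for s, e in zip(starts, ends))


def rle_decode(rle, shape):
    """Inverse of rle_encode for a mask of the given (height, width)."""
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    tokens = [int(t) for t in rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        flat[start - 1 : start - 1 + length] = 1
    return flat.reshape(shape, order="F")


# Small 3 x 2 mask; flattened column by column it reads 0 1 0 1 1 0.
example_mask = np.array([[0, 1], [1, 1], [0, 0]], dtype=np.uint8)
example_rle = rle_encode(example_mask)
```

A round trip through `rle_decode(rle_encode(mask), mask.shape)` is a cheap sanity check before submitting, whichever convention the challenge actually uses.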
The expected output is a CSV file with a header and two named columns:
img: the id of the image. Example: 1103_1103
rle_mask: the RLE encoding of the mask
A sample submission is provided as well as a repository containing:
Files are accessible when logged in and registered to the challenge
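A submission file with the `img` and `rle_mask` columns described above could be written with Python's standard `csv` module, as in the sketch below. The function name `write_submission` and the example RLE string are hypothetical; the provided sample submission remains the reference for the exact layout.

```python
import csv
import io


def write_submission(encoded_masks, out_file):
    """Write a header row followed by one (img, rle_mask) row per image.

    encoded_masks maps an image id to its RLE-encoded mask string.
    """
    writer = csv.writer(out_file)
    writer.writerow(["img", "rle_mask"])
    for img_id, rle in encoded_masks.items():
        writer.writerow([img_id, rle])


# Example with an in-memory buffer; a real run would pass a file opened with
# open(path, "w", newline="") instead.
buffer = io.StringIO()
write_submission({"1103_1103": "2 1 4 2"}, buffer)
submission_lines = buffer.getvalue().splitlines()
```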