We release a dataset of 101 concepts, with 3-15 images per concept, for evaluating model customization methods. Target real images of each concept in the dataset are shown below.
We introduce both single-concept and multi-concept settings, with evaluation text prompts for each case. Below we show random samples from our method, DreamBooth, and Textual Inversion for each concept. Scroll horizontally to see all samples with different test prompts.
Dataset and prompt creation: we collected images from Unsplash or captured them ourselves, for concepts across a variety of categories, namely toys, plushies, wearables, scenes, transport vehicles, furniture, home decor items, luggage, human faces, musical instruments, rare flowers, food items, and pet animals. To create evaluation prompts, we first used ChatGPT to generate 40 image captions for each concept, with instructions to (1) change the background while keeping the main subject, (2) insert a new object or living thing in the scene along with the main subject, (3) vary the style of the main subject, or (4) change a property or material of the main subject. The generated text prompts were manually filtered or modified to get the final 20 prompts for each concept. A similar strategy was applied for multiple concepts. Some of the prompts are also inspired by other concurrent works, e.g. Perfusion, DreamBooth, SuTI, and BLIP-Diffusion.
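The prompt-creation procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `EDIT_INSTRUCTIONS`, `build_caption_request`, and `select_final_prompts`, the exact wording of the request, and the `keep` callback that stands in for manual filtering are all hypothetical and are not taken from the released code. No ChatGPT call is made here; the sketch only shows how a request covering the four edit types could be assembled and how 40 candidates could be reduced to 20.

```python
from typing import Callable, Iterable, List

# The four edit types named in the text; the phrasing is an assumption.
EDIT_INSTRUCTIONS = {
    "background": "Change the background while keeping the main subject.",
    "insertion": "Insert a new object or living thing in the scene "
                 "along with the main subject.",
    "style": "Vary the style of the main subject.",
    "property": "Change a property or material of the main subject.",
}

CANDIDATES_PER_CONCEPT = 40  # captions requested per concept
FINAL_PER_CONCEPT = 20       # prompts kept after manual review


def build_caption_request(concept: str,
                          n_captions: int = CANDIDATES_PER_CONCEPT) -> str:
    """Assemble a hypothetical text request for a caption generator."""
    lines = [
        f"Write {n_captions} image captions featuring a {concept}.",
        "Each caption should do one of the following:",
    ]
    for i, instruction in enumerate(EDIT_INSTRUCTIONS.values(), start=1):
        lines.append(f"({i}) {instruction}")
    return "\n".join(lines)


def select_final_prompts(candidates: Iterable[str],
                         keep: Callable[[str], bool],
                         n_final: int = FINAL_PER_CONCEPT) -> List[str]:
    """Drop duplicates and rejected captions, keeping at most n_final.

    `keep` is a placeholder for the human reviewer's decision.
    """
    seen = set()
    kept: List[str] = []
    for caption in candidates:
        normalized = " ".join(caption.lower().split())
        if normalized in seen or not keep(caption):
            continue
        seen.add(normalized)
        kept.append(caption)
        if len(kept) == n_final:
            break
    return kept
```

In practice the filtering step was manual, so the `keep` callback here is only a stand-in for a reviewer's judgment, and prompts may also have been edited rather than simply accepted or rejected.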
License: Images taken from Unsplash are under the Unsplash License. Images collected by us are released under the CC BY-SA 4.0 license. Flower category images are downloaded from Wikimedia/Flickr/Pixabay, and the link to the original images can also be found here.
Please refer to our code for details regarding dataset download, text prompts, and evaluation code for single-concept and multi-concept customization.
We are grateful to Sheng-Yu Wang, Songwei Ge, Daohan Lu, Ruihan Gao, Roni Shechtman, Avani Sethi, Yijia Wang, Shagun Uppal, and Zhizhuo Zhou for helping with the dataset collection, and Nick Kolkin for the feedback.