K-Hairstyle: A Large-scale Korean hairstyle dataset for virtual hair editing and hairstyle classification

ICIP 2021

Taewoo Kim*
KAIST

Chaeyeon Chung*
KAIST

Sunghyun Park*
KAIST

Gyojung Gu
Nestyle

Keonmin Nam
Nestyle

Wonzo Choe
Smilegate AI

Jaesung Lee
Aiinplanet

Jaegul Choo
KAIST

Abstract

The hair and beauty industry is growing fast. This has led to the development of various applications, such as virtual hair dyeing and hairstyle translation, to satisfy customer needs. Although several hairstyle datasets are available for these applications, they often consist of a relatively small number of low-resolution images, which limits their performance on high-quality hair editing. In response, we introduce a novel large-scale Korean hairstyle dataset, K-hairstyle, containing 500,000 high-resolution images. In addition, K-hairstyle includes various hair attributes annotated by Korean expert hairstylists as well as hair segmentation masks. We validate the effectiveness of our dataset via several applications, such as hair dyeing, hairstyle translation, and hairstyle classification.

Overview

K-hairstyle provides 500,000 high-resolution images with a rich set of annotations, such as hairstyle classes, hair segmentation masks, and various attributes.

High-resolution image. The images are collected using high-end cameras, with a maximum resolution of 4032×3024.
Large-scale dataset. We provide 500,000 images, more than any existing hairstyle dataset.
Multi-view image. The dataset contains multi-view images that are captured from various camera angles for each person. The angles include 2 different vertical camera angles and about 10 to 60 different horizontal angles.
Segmentation mask. The hair regions of images are manually labeled in the form of a polygon. The blurred face regions are also labeled in the same way.
Hairstyle attributes. Various hairstyle-related attributes are annotated by Korean expert hairstylists. In detail, different hairstyles are categorized into 31 types, and 63 additional attributes, such as hair color, length, and curl, are also labeled.
Blurred face. To protect privacy, the facial regions are blurred.

Annotations

K-hairstyle includes 11 hairstyle-related annotations and 12 additional annotations, including segmentation masks. The 11 hairstyle-related annotations are as follows:

Basestyle: type of hairstyles, categorical data.
Basestyle_type: type of hair length, categorical data.
Length: detailed type of hair length, categorical data.
Curl: type of curl, categorical data.
Bang: type of bangs, categorical data.
Loss: degree of hair loss, categorical data.
Side: type of side hair, categorical data.
Color: type of hair color, categorical data.
Exceptional: type of exceptional hairstyle, categorical data.
Rgb: mean RGB value of the hair region, computed from its hair segmentation mask, 3-dim array of floats.
Before-after: whether a photo is taken before styling or after styling, categorical data.
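
As an illustration of how an Rgb-style value relates to a polygon hair mask, the sketch below averages the pixels whose centers fall inside a polygon. It is a minimal example under stated assumptions, not the procedure used to build the dataset: the function names, the nested-list image, and the already-parsed list of (x, y) vertices are all hypothetical, and the actual string format of the mask annotation is not specified here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Pixel = Tuple[int, int, int]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Even-odd ray casting test for a single point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mean_rgb_in_polygon(
    image: List[List[Pixel]], polygon: Sequence[Point]
) -> Tuple[float, float, float]:
    """Average the RGB values of pixels whose centers lie inside the polygon."""
    totals = [0.0, 0.0, 0.0]
    count = 0
    for row, line in enumerate(image):
        for col, pixel in enumerate(line):
            # Test the pixel center at (col + 0.5, row + 0.5)
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                for c in range(3):
                    totals[c] += pixel[c]
                count += 1
    if count == 0:
        raise ValueError("polygon covers no pixel centers")
    return (totals[0] / count, totals[1] / count, totals[2] / count)


# Toy 4x4 image (hypothetical data): a 2x2 "hair" patch in the middle
hair = (100, 50, 30)
skin = (220, 180, 160)
image = [[skin] * 4 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2):
        image[r][c] = hair

polygon = [(1, 1), (3, 1), (3, 3), (1, 3)]  # assumed (x, y) vertex order
mean_rgb = mean_rgb_in_polygon(image, polygon)
```

For full-size images, a vectorized rasterizer from an image library would be far faster than this per-pixel loop; the pure-Python version is only meant to make the idea concrete.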

The 12 additional annotations, including segmentation masks, are as follows:

Id: unique id for each image, categorical data.
Path: path of an image, string data.
Source: identity associated with an image (not unique per image, since each identity has multiple images), categorical data.
Age: age of a person in an image, int data.
Gender: gender of a person in an image, categorical data.
Height: image height, int data.
Width: image width, int data.
Front: whether the image is taken from the front, boolean data.
Horizontal: horizontal camera angle, ranging from 0 (the front) to 360, int data.
Vertical: vertical camera angle, categorical data.
Hair mask: polygon coordinates of hair segmentation mask, string data.
Face mask: polygon coordinates of blurred face segmentation mask, string data.
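
Because Source is shared across the multi-view images of one identity and Horizontal gives the camera angle, the two fields can be combined to collect all views of a person. The following is a rough sketch, assuming the annotations have been loaded into a list of dictionaries with lowercase keys `id`, `source`, and `horizontal`; these key names, the sample records, and the helper functions are hypothetical and may not match the real annotation files.

```python
from collections import defaultdict
from typing import Any, Dict, List

Record = Dict[str, Any]


def group_views_by_source(records: List[Record]) -> Dict[str, List[Record]]:
    """Group records by identity and sort each group by horizontal angle."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        groups[rec["source"]].append(rec)
    for views in groups.values():
        views.sort(key=lambda r: r["horizontal"])
    return dict(groups)


def angle_from_front(horizontal: int) -> int:
    """Angular distance to the front (0 degrees), treating 0 and 360 as equal."""
    a = horizontal % 360
    return min(a, 360 - a)


def most_frontal(views: List[Record]) -> Record:
    """Pick the view whose horizontal angle is closest to the front."""
    return min(views, key=lambda r: angle_from_front(r["horizontal"]))


# Hypothetical records, for illustration only
records = [
    {"id": "a", "source": "S1", "horizontal": 90},
    {"id": "b", "source": "S1", "horizontal": 350},
    {"id": "c", "source": "S1", "horizontal": 30},
    {"id": "d", "source": "S2", "horizontal": 180},
]

groups = group_views_by_source(records)
front_view = most_frontal(groups["S1"])
```

Measuring the distance to the front modulo 360 means that a view at 350 degrees counts as closer to the front than one at 30 degrees.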

Figure: examples of multi-view images.

Downloads (220618 updated)

We provide mqset, hqset, and rawset, which contain 512×512, 1024×1024, and 4032×3024 images, respectively. In mqset and hqset, each image is cropped so that it includes the entire hair region. Note that the provided annotation files are for mqset. The annotations for hqset and rawset can be obtained by matching the file name in the path attribute. For instance, if hqset or rawset contains 'ABC1234.jpg', the data instance whose path contains 'ABC1234.jpg' corresponds to that image. Both the image files and the annotation files can be downloaded via the links below.
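
The file-name matching described above could be sketched as follows. This is an illustrative example only: it assumes the annotations are available as a list of dictionaries with a `path` key, and the directory prefixes, the `id` values, and the helper name `index_by_filename` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, List

Record = Dict[str, Any]


def index_by_filename(annotations: List[Record]) -> Dict[str, Record]:
    """Map the file name at the end of each record's path to the record."""
    index: Dict[str, Record] = {}
    for rec in annotations:
        # Keep only the final path component, e.g. 'ABC1234.jpg'
        name = PurePosixPath(rec["path"]).name
        index[name] = rec
    return index


# Hypothetical mqset annotation records
annotations = [
    {"id": "0001", "path": "mqset/train/ABC1234.jpg"},
    {"id": "0002", "path": "mqset/train/XYZ5678.jpg"},
]

index = index_by_filename(annotations)

# Look up the annotation for an hqset or rawset image by its file name
hq_image = "hqset/train/ABC1234.jpg"
matched = index.get(PurePosixPath(hq_image).name)
```

Building the index once keeps each lookup constant time, which matters when matching hundreds of thousands of images; `index.get` returns `None` for a file name with no annotation.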

Training: [mqset] [hqset] [rawset]
Validation: [mqset] [hqset] [rawset]

Paper and Supplementary Material

[Paper]

ICIP 2021.
Taewoo Kim*, Chaeyeon Chung*, Sunghyun Park*, Gyojung Gu, Keonmin Nam, Wonzo Choe, Jaesung Lee, Jaegul Choo.
"K-Hairstyle: A Large-scale Korean hairstyle dataset for virtual hair editing and hairstyle classification"