SYSU-Clothes Dataset

Description

The SYSU-Clothes dataset is a clothing database of elaborately annotated clothing items.

  • 2,098 high-resolution street fashion photos with 59 tags in total
  • Wide range of styles, accessories, garments, and poses
  • All images have image-level annotations
  • 1,000+ images have pixel-level annotations (see the sketch below)
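
As a rough illustration of how the two annotation levels might be organized per image, here is a minimal Python sketch; the directory names (photos/, tags/, masks/) and file formats are assumptions for illustration, not the actual download layout:

    import os

    # Hypothetical layout: photos/<id>.jpg, tags/<id>.txt (one tag per line),
    # masks/<id>.png; only the 1,000+ pixel-annotated images have a mask.
    def load_record(root, image_id):
        record = {"image": os.path.join(root, "photos", image_id + ".jpg")}
        with open(os.path.join(root, "tags", image_id + ".txt")) as f:
            record["tags"] = [line.strip() for line in f if line.strip()]
        mask_path = os.path.join(root, "masks", image_id + ".png")
        record["mask"] = mask_path if os.path.exists(mask_path) else None
        return record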

Citation

Xiaodan Liang, Liang Lin*, Wei Yang, Ping Luo, Junshi Huang, and Shuicheng Yan, “Clothes Co-Parsing via Joint Image Segmentation and Labeling with Application to Clothing Retrieval”, IEEE Transactions on Multimedia (T-MM), 18(6): 1175-1186, 2016. (A shorter previous version was published in CVPR 2014.)

Downloads


Kinect2 Human Pose Dataset

Description

The Kinect2 Human Gesture Dataset (K2HGD) includes about 100K depth images with various human poses captured under challenging scenarios. It covers 19 body joints of 30 subjects in ten different challenging scenes. Each subject was asked to perform both normal daily poses and unusual poses. The annotated body joints include: Head, Neck, MiddleSpine, RightShoulder, RightElbow, RightHand, LeftShoulder, LeftElbow, LeftHand, RightHip, RightKnee, RightFoot, LeftHip, LeftKnee, LeftFoot. The ground-truth body joints are first estimated via the Kinect SDK and then further refined by human annotators.
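
For reference, the joint names listed above can be kept as an ordered constant. A minimal Python sketch; note that only 15 names are given in the description, so the list below covers those names only:

    # The 15 joint names given in the description (the text states 19 joints
    # in total; the remaining names are not listed here).
    K2HGD_JOINTS = [
        "Head", "Neck", "MiddleSpine",
        "RightShoulder", "RightElbow", "RightHand",
        "LeftShoulder", "LeftElbow", "LeftHand",
        "RightHip", "RightKnee", "RightFoot",
        "LeftHip", "LeftKnee", "LeftFoot",
    ]

    def joints_to_dict(coords):
        # Map a per-image list of (x, y) coordinates onto the named joints.
        assert len(coords) == len(K2HGD_JOINTS)
        return dict(zip(K2HGD_JOINTS, coords))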

Citation

Keze Wang, Shengfu Zhai, Hui Cheng, Xiaodan Liang, and Liang Lin. Human Pose Estimation from Depth Images via Inference Embedded Multi-task Learning. In Proceedings of the ACM International Conference on Multimedia (ACM MM), 2016.

Downloads


Object Extraction Dataset

Description

This newly collected Object Extraction dataset contains 10,183 images with ground-truth segmentation masks. We selected images from the PASCAL [1], iCoseg [2], and Internet [3] datasets, as well as other data (mostly of people and clothes) from the web. We randomly split the dataset into 8,230 images for training and 1,953 for testing.
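
The split sizes can be reproduced with a simple random shuffle. A sketch only: the official train/test lists distributed with the data should be used for comparable results, since the actual split membership is fixed rather than re-randomized:

    import random

    def split_dataset(image_ids, n_train=8230, n_test=1953, seed=0):
        # Shuffle and cut into 8,230 training and 1,953 testing images.
        assert len(image_ids) == n_train + n_test == 10183
        ids = list(image_ids)
        random.Random(seed).shuffle(ids)
        return ids[:n_train], ids[n_train:]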

Citation

Xiaolong Wang, Liliang Zhang, Liang Lin*, Zhujin Liang, Wangmeng Zuo, “Deep Joint Task Learning for Generic Object Extraction”, NIPS 2014.

Downloads


SYSU-Shape Dataset

Description

The SYSU-Shapes dataset is a shape database of elaborately annotated shape contours. Compared with existing shape databases, it includes more realistic challenges in shape detection and localization, e.g., cluttered backgrounds, large intra-class variations, and different poses/views; some of the instances were originally used for appearance-based object detection.

There are five categories, i.e., airplanes, boats, cars, motorbikes, and bicycles; each category contains 200 to 500 images.

Citation

Liang Lin, Xiaolong Wang, Wei Yang, and JianHuang Lai, “Discriminatively Trained And-Or Graph Models for Object Shape Detection”, IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), DOI: 10.1109/TPAMI.2014.2359888, 2014.

Downloads


Taobao Commodity Dataset (TCD)

Description

TCD contains 800 commodity images (dresses, jeans, T-shirts, shoes, and hats) from shops on the Taobao website. The ground-truth masks were obtained by inviting Taobao sellers to annotate their own commodities, i.e., to mask the salient objects they want to showcase in their listings. The images cover all kinds of commodities, photographed with and without human models, and thus have complex backgrounds and scenes with highly complex foregrounds. Pixel-accurate ground-truth masks are provided.

Citation

Keze Wang, Liang Lin, Jiangbo Lu, Chenglong Li, Keyang Shi, “PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Edge-Preserving Coherence”, IEEE Transactions on Image Processing, vol. 24, no. 10, pp. 3019-3033, 2015. (A shorter previous version was published in CVPR 2013.)

Downloads


SYSU-FLL-CEUS Dataset

Description

The dataset consists of contrast-enhanced ultrasound (CEUS) data of focal liver lesions (FLLs) of three types: 186 HCC (hepatocellular carcinoma), 109 HEM (hemangioma), and 58 FNH (focal nodular hyperplasia) instances, i.e., 186 malignant and 167 benign instances. The imaging equipment was an Aplio SSA-770A (Toshiba Medical System), and all videos in the dataset were collected from pre-operative scans.
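
The three lesion types group into the binary malignant/benign labels as a quick sanity check of the counts above (HCC is malignant; HEM and FNH are benign):

    # Instance counts as reported above.
    COUNTS = {"HCC": 186, "HEM": 109, "FNH": 58}
    MALIGNANT = {"HCC"}  # HEM and FNH are benign

    malignant = sum(n for t, n in COUNTS.items() if t in MALIGNANT)
    benign = sum(n for t, n in COUNTS.items() if t not in MALIGNANT)
    assert (malignant, benign) == (186, 167)  # matches the description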

Citation

Xiaodan Liang, Liang Lin, Qingxing Cao, Rui Huang, Yongtian Wang, “Recognizing Focal Liver Lesions in CEUS with Dynamically Trained Latent Structured Models”, IEEE Transactions on Medical Imaging (T-MI), 2015.

Downloads


HumanParsing-Dataset

Description

This human parsing dataset includes detailed pixel-wise annotations for fashion images. It was introduced in our T-PAMI paper “Deep Human Parsing with Active Template Regression” and our ICCV 2015 paper “Human Parsing with Contextualized Convolutional Neural Network”. The dataset contains 7,700 images: 6,000 for training, 1,000 for testing, and 700 for validation.
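
The fixed split sizes can be encoded directly. A minimal sketch, assuming an ordered id list as input; the official split files, where provided, take precedence:

    # Split sizes as stated above; 6,000 + 1,000 + 700 = 7,700.
    SPLITS = {"train": 6000, "test": 1000, "val": 700}
    assert sum(SPLITS.values()) == 7700

    def partition(image_ids):
        # Cut an ordered list of 7,700 ids into train/test/val slices.
        assert len(image_ids) == 7700
        out, start = {}, 0
        for name, size in SPLITS.items():
            out[name] = image_ids[start:start + size]
            start += size
        return out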

Citation

Xiaodan Liang, Si Liu, Xiaohui Shen, Jianchao Yang, Luoqi Liu, Jian Dong, Liang Lin, Shuicheng Yan, “Deep Human Parsing with Active Template Regression”, IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2015.

Downloads


CUHK-SYSU

Description

The dataset is a large-scale benchmark for person search, containing 18,184 images and 8,432 identities.
It can be divided into two parts by image source: street snapshots and movies. The street snapshots were collected with hand-held cameras across hundreds of scenes, aiming to include as much variation in viewpoint, lighting, resolution, occlusion, and background as possible. We chose movies and TV dramas as the other source because they provide more diversified scenes and more challenging viewpoints.

Citation

Tong Xiao*, Shuang Li*, Bochao Wang, Liang Lin, Xiaogang Wang. Joint Detection and Identification Feature Learning for Person Search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Spotlight, 2017.

Downloads


SYSU-CT and SYSU-US Datasets

Description

The SYSU-CT and SYSU-US datasets are both provided by the First Affiliated Hospital of Sun Yat-sen University. The SYSU-CT dataset consists of seven volumetric CT images of liver tumors from different patients; all patients were scanned using a 64-detector-row CT machine (Aquilion 64, Toshiba Medical System). The SYSU-US dataset consists of 20 abdominal ultrasound image sequences with liver tumors.

Citation

Liang Lin, Wei Yang, Chenglong Li, Jin Tang, and Xiaochun Cao. Inference with Collaborative Model for Interactive Tumor Segmentation in Medical Image Sequences. IEEE Transactions on Cybernetics (T-Cybernetics), 2015.

Downloads


CAMPUS-Human Dataset

Description

This database includes general and realistic challenges for person re-identification in surveillance. We recorded videos using three cameras from different views and extracted individuals as well as video shots from the videos. We annotated the body-part configuration of each query instance, and the ID and location (bounding box) of each target in every video shot. In total, there are 370 reference images (normalized to 175 pixels in height) of 74 individuals, with IDs and locations provided. We extracted 214 shots (640 x 360) containing 1,519 target individuals. Note that the targets often appear in diverse poses/views or are occluded by other people.
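
Each annotation described above ties a person ID and a bounding box to a shot. A minimal sketch of such a record; the field names are illustrative and may differ from the released annotation format:

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class TargetAnnotation:
        # Field names chosen for illustration only.
        person_id: int                    # one of the 74 individuals
        shot_id: int                      # one of the 214 shots (640 x 360)
        bbox: Tuple[int, int, int, int]   # (x, y, width, height) in pixels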

Citation

Yuanlu Xu, Liang Lin*, Wei-Shi Zheng, and Xiaobai Liu, “Human Re-identification by Matching Compositional Template with Cluster Sampling”, Proc. of IEEE International Conference on Computer Vision (ICCV), 2013

Downloads


Office Activity (OA) Dataset

Description

The Office Activity (OA) dataset, collected by us, is a complex activity dataset covering common daily activities in offices. It is a large dataset with 1,180 RGB-D activity sequences. To capture human activities from multiple views, we set up three RGB-D sensors at different viewpoints, and each subject was asked to perform each activity twice. To increase the variability of the activities, we recorded them in two different scenes, i.e., two different offices. More importantly, we consider not only single-person activities but also activities involving more than one person.

Citation

Liang Lin, Keze Wang, Wangmeng Zuo, Meng Wang, Jiebo Luo, and Lei Zhang, “A Deep Structured Model with Radius-Margin Bound for 3D Human Activity Recognition”, International Journal of Computer Vision (IJCV), 118(2): 256-273, 2016.

Downloads


Grayscale-Thermal Foreground Detection Dataset

Description

Multi-modal moving object detection needs to be studied because single-modality videos are often inadequate, yet there are almost no complete, high-quality multi-modal datasets available. We therefore built a multi-modal moving object detection dataset; the details are as follows. The dataset mainly covers seven challenges, i.e., intermittent motion, low illumination, bad weather, intense shadow, dynamic scene, background clutter, and thermal crossover.

The following main aspects are taken into account in creating the grayscale-thermal video:

1. Scene category: laboratory rooms, campus roads, playgrounds, water pools, etc.

2. Object category: rigid and non-rigid objects, such as vehicles, pedestrians, and animals.

3. Intermittent motion.

4. Shadow effect.

5. Illumination condition.

6. Background factor.

Citation

Chenglong Li, Xiao Wang, Lei Zhang, Jin Tang, Hejun Wu, Liang Lin*, “WELD: Weighted Low-rank Decomposition for Robust Grayscale-Thermal Foreground Detection”, IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), DOI: 10.1109/TCSVT.2016.2556586, 2016.

Downloads


Grayscale-Thermal Object Tracking (GTOT) Benchmark

Description

We collected 50 grayscale-thermal video clips under different scenarios and conditions, e.g., office areas, public roads, and water pools. Each grayscale video is paired with one thermal video. We manually annotated the videos with ground-truth bounding boxes; all annotations were done by a single full-time annotator to guarantee consistency.
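
Since each grayscale sequence is paired with a thermal one, frames are naturally consumed in pairs. A minimal sketch, assuming a hypothetical per-sequence layout with identically named frames in two subfolders; the real layout of the release may differ:

    import os

    def paired_frames(seq_root, gray_dir="gray", thermal_dir="thermal"):
        # Yield (grayscale_path, thermal_path) pairs for one sequence.
        # Subfolder names here are assumptions, not the released layout.
        g = os.path.join(seq_root, gray_dir)
        t = os.path.join(seq_root, thermal_dir)
        for name in sorted(os.listdir(g)):
            yield os.path.join(g, name), os.path.join(t, name)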

Citation

Chenglong Li, Hui Cheng, Shiyi Hu, Xiaobai Liu, Jin Tang, and Liang Lin, “Learning Collaborative Sparse Representation for Grayscale-Thermal Tracking”, IEEE Transactions on Image Processing, 2016.

Downloads