Studying RGBD Datasets - 空飛ぶロボットのつくりかた

Reference: List of RGBD datasets

INDOOR
OUTDOOR
- KITTI
- CITYSCAPES

Ground Truth: data used to check accuracy and consistency, i.e. the true category of each region.
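As a rough illustration of how ground truth is used for checking, here is a minimal sketch that compares a predicted label map against a ground-truth label map using pixel accuracy and per-class IoU. The tiny label maps, the class ids, and the function names `pixel_accuracy` and `class_iou` are all made up for this example and do not come from any of the datasets listed here.

```python
# Hypothetical 3x4 label maps. Class ids are made up for illustration:
# 0 = floor, 1 = wall, 2 = chair.
ground_truth = [
    [1, 1, 1, 1],
    [0, 2, 2, 1],
    [0, 0, 2, 0],
]
prediction = [
    [1, 1, 1, 1],
    [0, 2, 0, 1],
    [0, 0, 2, 2],
]


def _pixel_pairs(gt, pred):
    """Flatten two equally sized label maps into (gt_label, pred_label) pairs."""
    return [(g, p) for gt_row, pred_row in zip(gt, pred)
            for g, p in zip(gt_row, pred_row)]


def pixel_accuracy(gt, pred):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    pairs = _pixel_pairs(gt, pred)
    return sum(g == p for g, p in pairs) / len(pairs)


def class_iou(gt, pred, class_id):
    """Intersection over union for one class; None if the class is in neither map."""
    pairs = _pixel_pairs(gt, pred)
    intersection = sum(g == class_id and p == class_id for g, p in pairs)
    union = sum(g == class_id or p == class_id for g, p in pairs)
    return intersection / union if union else None


print("pixel accuracy:", pixel_accuracy(ground_truth, prediction))
for cid in (0, 1, 2):
    print("IoU for class", cid, ":", class_iou(ground_truth, prediction, cid))
```

Here 10 of the 12 pixels match, and the chair class (id 2) gets an IoU of 0.5 because only two of the four pixels labelled chair in either map overlap.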

【Impressions】

NYU Dataset
SUN family
ScanNet family

look like good datasets for semantic segmentation x indoor.

http://www.cs.toronto.edu/~urtasun/courses/CSC2541/08_instance.pdf

As a 360-degree dataset,

Stanford 2D-3D-Semantics Dataset

was impressive.

Below, the datasets worth checking are marked with ☆.

INDOOR

NYU Dataset v1 ☆

Around 51,000 RGBD frames from indoor scenes such as bedrooms and living rooms.


NYU Depth V1 « Nathan Silberman

NYU Dataset v2 ☆

~408,000 RGBD images from 464 indoor scenes, of a somewhat larger diversity than NYU v1. Per-frame accelerometer data.

NYU Depth V2 « Nathan Silberman

SUN 3D ☆

Labelling: Polygons of semantic class and instance labels on frames propagated through video.

Instances are distinguished by color.

SUN3D Database

SUN RGB-D ☆

Introduced: CVPR 2015
Device: Kinect v1, Kinect v2, Intel RealSense and Asus Xtion Live Pro
Description: New images, plus images taken from NYUv2, B3DO and SUN3D. All of indoor scenes.
Labelling: 10,335 images with polygon annotation, and 3D bounding boxes around objects
The dataset contains RGB-D images from NYU depth v2 [1], Berkeley B3DO [2], and SUN3D [3]. Besides this paper, you are required to also cite the following papers if you use this dataset.

SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite

ViDRILO: The Visual and Depth Robot Indoor Localization with Objects information dataset ☆

Introduced: IJRR 2015
Device: Kinect v1
Description: Five sequences (total 22454 frames) captured from a robot moving through an office environment
Labelling: Scene type of each frame, plus presence/absence of each of a set of 15 objects.

ViDRILO

SceneNN: A Scene Meshes Dataset with aNNotations ☆

We introduce an RGB-D scene dataset consisting of more than 100 indoor scenes. Our scenes are captured at various places, e.g., offices, dormitory, classrooms, pantry, etc., from University of Massachusetts Boston and Singapore University of Technology and Design.

SceneNN: A Scene Meshes Dataset with aNNotations


Stanford 2D-3D-Semantics Dataset ☆

This one is amazing…

Device: Matterport Camera (360 degree rotation RGBD sensor)
Description: 360 degree RGBD images captured from 6 large areas in municipal buildings, together with mesh and point cloud reconstructions.
Labelling: Semantic labelling on the mesh (13 classes, plus instance labels), and 3D volumetric reconstruction labels


Large Scale Parsing

ScanNet ☆

Description: 2.5 million frames from 1513 scenes
Labelling: Automatically computed (and human verified) camera poses and surface reconstructions. Instance and semantic segmentations provided on reconstructed mesh. 3D CAD models + alignment also provided for each scene.

ScanNet

ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes (CVPR 2017 Spotlight) - YouTube

SceneNet RGB-D ☆

Description: 5 million images rendered of 16,895 indoor scenes. Room configuration randomly generated with physics simulator.
Labelling: Camera pose, plus per-pixel instance, class labelling and optical flow.

SceneNet RGB-D: Photorealistic Rendering of 5M Images with Perfect Ground Truth

SUNCG ☆

Description: 45,622 scenes with manually created room and furniture layouts. Images can be rendered from the geometry, but are not provided by default.
Labelling: Object semantic class and instance labelling.


SUNCG dataset

‘Object Detection and Classification from Large-Scale Cluttered Indoor Scans’

List of RGBD datasets

Cornell-RGBD-Dataset

Scene Understanding for Personal Robots

Active Vision Dataset (AVD)

Description: Dense sampling of images in home and office scenes, captured from a robot. Dataset designed for simulation of motion and instance detection.
Labelling: Per-frame camera pose, object instance bounding boxes, movement pointers between images.

Active Vision Dataset
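The "movement pointers between images" idea can be sketched as a lookup table: each image records which image the robot would see after each action, so motion is simulated by following pointers instead of rendering. The image ids, the action names, and the `follow` helper below are hypothetical and chosen only for illustration; the dataset's actual annotation format may differ.

```python
# Hypothetical movement pointers: image id -> {action name -> next image id}.
# The ids and action names are assumptions for this sketch, not the real format.
moves = {
    "img_000": {"forward": "img_001", "rotate_cw": "img_010"},
    "img_001": {"backward": "img_000"},
    "img_010": {"rotate_ccw": "img_000", "forward": "img_011"},
    "img_011": {"backward": "img_010"},
}


def follow(moves, start, actions):
    """Return the image ids visited when applying the actions in order.

    An action with no pointer from the current image (e.g. a blocked move)
    leaves the current view unchanged.
    """
    current = start
    visited = [current]
    for action in actions:
        current = moves[current].get(action, current)
        visited.append(current)
    return visited


path = follow(moves, "img_000", ["rotate_cw", "forward", "forward", "backward"])
print(path)
```

In this toy table the second "forward" has no pointer from `img_011`, so the simulated robot stays on the same image before moving backward.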