2017-06-19

Studying image loading and display

OpenCV ROS

For example, the ROS Kinect depth image is published as follows:

Data published on /camera/depth/image_raw is the depth in millimeters as a 16 bit unsigned integer.

[PARTLY UNSOLVED] Raw Kinect Depth Data - ROS Answers: Open Source Q&A Forum
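Since the raw values are millimeters stored as 16-bit unsigned integers, a conversion to meters is often the first step. The following is a minimal NumPy sketch; the function name `depth_mm_to_m` is hypothetical, and treating 0 as "no measurement" is an assumption that is not stated in the quote above.

```python
import numpy as np

def depth_mm_to_m(depth_raw):
    """Convert a uint16 depth image in millimeters to float32 meters.

    Assumption: a raw value of 0 means "no measurement" and becomes NaN.
    """
    depth_raw = np.asarray(depth_raw)
    if depth_raw.dtype != np.uint16:
        raise TypeError("expected a 16-bit unsigned depth image")
    depth_m = depth_raw.astype(np.float32) / 1000.0
    depth_m[depth_raw == 0] = np.nan
    return depth_m

# Tiny synthetic "image" standing in for /camera/depth/image_raw data
sample = np.array([[0, 500], [1000, 65535]], dtype=np.uint16)
depth_m = depth_mm_to_m(sample)
```

Keeping the result as float32 with NaN for invalid pixels makes later averaging easier, for example with np.nanmean.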

Loading a 16-bit grayscale image

python - OpenCV - Reading a 16 bit grayscale image - Stack Overflow

OpenCV

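#! /usr/bin/env python

import sys
import numpy
import cv2

filename = sys.argv[1]
# flags = 2 is cv2.IMREAD_ANYDEPTH: read as-is, keeping the 16-bit depth
im = cv2.imread(filename, flags = 2)
# flags = -1 is cv2.IMREAD_UNCHANGED, which also reads the file as-is
#im = cv2.imread(filename, flags = -1)

imgArray = numpy.asarray(im)

print(imgArray)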

画像とビデオの読み込みと書き込み — opencv v2.1 documentation

Pillow

Converting a 16-bit image to 8-bit

#! /usr/bin/env python

from PIL import Image
import sys
import numpy


filename = sys.argv[1]

im = Image.open(filename)
# Lookup table mapping every 16-bit value to 8 bits (integer division by 256)
table = [i // 256 for i in range(65536)]

im2 = im.point(table, 'L')

imgArray1 = numpy.asarray(im)
imgArray2 = numpy.asarray(im2)

print(imgArray1)
print(imgArray2)

[SOLVED] PIL convert 16bit grayscale to 8 bit
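The same 16-bit to 8-bit mapping can also be done directly on a NumPy array, without a lookup table. This is only a sketch of an alternative; the helper name `to_8bit` is hypothetical. Shifting right by 8 bits is equivalent to the integer division by 256 used in the table above.

```python
import numpy as np

def to_8bit(img16):
    """Keep the upper 8 bits of each 16-bit pixel (same as value // 256)."""
    img16 = np.asarray(img16, dtype=np.uint16)
    return (img16 >> 8).astype(np.uint8)

sample16 = np.array([0, 255, 256, 32768, 65535], dtype=np.uint16)
sample8 = to_8bit(sample16)
```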

Display

OpenCV's imshow sometimes fails to display the image. See the reference below.

Reference: ROS×Python勉強会： cv_bridge | demura.net
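Two possible causes (not confirmed by the reference above) are worth checking. First, cv2.imshow only draws the window once cv2.waitKey is called afterwards. Second, imshow divides 16-bit unsigned pixels by 256, so a depth image holding a few thousand millimeters looks almost black. A rough sketch of stretching the actual value range to 8 bits before display follows; `normalize_for_display` is a hypothetical helper name.

```python
import numpy as np

def normalize_for_display(img16):
    """Stretch the image's own min..max range to 0..255 for viewing only."""
    img = np.asarray(img16, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Flat image: nothing to stretch
        return np.zeros(img.shape, dtype=np.uint8)
    scaled = (img - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Synthetic depth values in millimeters
depth = np.array([[500, 1000], [1500, 2500]], dtype=np.uint16)
view = normalize_for_display(depth)
```

The result is for visualization only, because the stretch discards the absolute millimeter values.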

2017-06-18

Studying machine learning (ChainerCV)

Machine Learning

ChainerCV:

Code

GitHub - chainer/chainercv: ChainerCV: a Library for Computer Vision in Deep Learning

Documentation

ChainerCV — ChainerCV 0.2.1 documentation

The following models are implemented:

Detection Models

Faster R-CNN
Single Shot Multibox Detector (SSD)

Semantic Segmentation

SegNet

I want to read the code to deepen my understanding.

2017-06-17

Studying machine learning (SVM, neural networks, CNN, FCN, YOLO, SegNet, etc.): a collection of references

Machine Learning

SVM
NN
CNN
AlexNet
VGG
FCN
YOLO
SSD
SegNet
3D-CNN
chainer sample
Fine-tuning
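Indexed color
Image segmentation

Keras 2 and Chainer look easy to use.

The PASCAL segmentation data is stored as indexed-color images (.png).

So if you load an image as shown below, a person, for example, can be extracted as the value 15.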

f:id:robonchu:20170618120932p:plain

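import numpy as np
from PIL import Image
import csv

path = '2007_000129.png'
img = Image.open(path)

# For an indexed-color PNG, each pixel value is the palette index (the class ID)
img_array = np.array(img, dtype=np.int32)
# 255 marks the white boundary; replace it with -1
mask = img_array == 255
img_array[mask] = -1

# Dump the label array to CSV for inspection
with open('file.csv', 'wt') as f:
    writer = csv.writer(f)
    writer.writerows(img_array)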

The top-left part of the array looks like this. Chainer ignores pixels labeled -1 as belonging to no class, so the white boundary is converted to -1.

f:id:robonchu:20170618121047p:plain
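To make the label handling concrete without the dataset file, here is a small sketch on a synthetic array of palette indices. The constant and function names are hypothetical; the values 15 (person) and 255 (boundary) come from the text above, and -1 matches the default ignore_label of Chainer's softmax_cross_entropy.

```python
import numpy as np

IGNORE_LABEL = -1      # label that Chainer skips when computing the loss
BOUNDARY_INDEX = 255   # palette index of the white boundary
PERSON_INDEX = 15      # palette index of the "person" class

def to_training_labels(index_array):
    """Copy palette indices to int32 and mark boundary pixels as ignored."""
    labels = np.array(index_array, dtype=np.int32)
    labels[labels == BOUNDARY_INDEX] = IGNORE_LABEL
    return labels

# Synthetic stand-in for np.asarray(Image.open(...)) on an indexed-color PNG
palette_indices = np.array([[0, 0, 255, 15],
                            [0, 255, 15, 15],
                            [255, 15, 15, 15]], dtype=np.uint8)
labels = to_training_labels(palette_indices)
person_mask = labels == PERSON_INDEX
```

The conversion to int32 has to happen before writing -1, because the uint8 values read from the PNG cannot hold negative numbers.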

ImageMagick で PNG の形式を変換 - awm-Tech

インデックスカラー - Wikipedia

「画像変換101」#2: ダイレクトカラー画像とインデックスカラー画像 | OPTPiX Labs Blog

chainerに復帰したくてFCN実装した - MATHGRAM

Image segmentation

K-Means クラスタリングを使った色ベースのセグメンテーション - MATLAB & Simulink Example - MathWorks 日本

kmeans を使った画像のセグメンテーション - Qiita

References:

【機械学習】ディープラーニングフレームワークChainerを試しながら解説してみる。 - Qiita

機械学習によるデータ分析まわりのお話

http://www.vision.cs.chubu.ac.jp/flabresearcharchive/bachelor/B13/Paper/fukui.pdf

https://www.morikita.co.jp/data/mkj/084921mkj.pdf

サルでもわかるディープラーニング入門 (2017年) (In Japanese)

Chainerによる畳み込みニューラルネットワークの実装 - 人工知能に関する断創録

chainerの畳み込みニューラルネットワークで10種類の画像を識別（CIFAR-10） - AI-Programming

chainer初心者が畳み込みニューラルネット試してみた - 技術系メモ

http://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/42237.pdf

【初めて使う人向け】Chainerでニューラルネットを学習する手順を整理してみた | 自調自考の旅

Chainer 1.11.0 で畳み込みニューラルネットワークを試してみる - Gunosyデータ分析ブログ

Convolutional Neural Networkを実装する - Qiita

numpyだけでCNN実装 - Qiita

chainerのサンプルコードを集めてみた(チュートリアルも追加) - studylog/北の雲

CNNの学習に最高の性能を示す最適化手法はどれか - 俺とプログラミング

Chainerを使って畳み込みを実装する | JProgramer

怪我をしても歩ける6足歩行ロボットの学習 | Preferred Research

深層強化学習ライブラリChainerRL | Preferred Research

Convolutional Neural Networkとは何なのか - Qiita

定番のConvolutional Neural Networkをゼロから理解する - DeepAge

http://www.nlab.ci.i.u-tokyo.ac.jp/pdf/20150717SP.pdf

http://www.nlab.ci.i.u-tokyo.ac.jp/pdf/CNN_survey.pdf

【深層学習】畳み込みニューラルネットで画像分類 [DW 4日目] - Qiita

Chainerのサンプルコードを集めてみた（メモ） - あおのたすのブログ

2017-06-11

Studying RGB-D datasets

Machine Learning Programming

Reference: List of RGBD datasets

INDOOR
OUTDOOR
- KITTI
- CITYSCAPES

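Ground truth: data used to check accuracy and consistency; the true category of each part.

[Impressions]

NYU Dataset
the SUN family
the ScanNet family

look like good datasets for semantic segmentation of indoor scenes.

http://www.cs.toronto.edu/~urtasun/courses/CSC2541/08_instance.pdf

As a 360-degree dataset, the Stanford 2D-3D-Semantics Dataset was impressive.

Below, datasets worth checking are marked with ☆.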

INDOOR

NYU Dataset v1 ☆

Around 51,000 RGBD frames from indoor scenes such as bedrooms and living rooms.

f:id:robonchu:20170611152124p:plain

NYU Depth V1 « Nathan Silberman

NYU Dataset v2 ☆

~408,000 RGBD images from 464 indoor scenes, of a somewhat larger diversity than NYU v1. Per-frame accelerometer data.

NYU Depth V2 « Nathan Silberman

SUN 3D ☆

Labelling: Polygons of semantic class and instance labels on frames propagated through video.

Instances are distinguished by color.

SUN3D Database

SUN RGB-D ☆

Introduced: CVPR 2015
Device: Kinect v1, Kinect v2, Intel RealSense and Asus Xtion Live Pro
Description: New images, plus images taken from NYUv2, B3DO and SUN3D. All of indoor scenes.
Labelling: 10,335 images with polygon annotation, and 3D bounding boxes around objects
The dataset contains RGB-D images from NYU depth v2 [1], Berkeley B3DO [2], and SUN3D [3]. Besides this paper, you are required to also cite the following papers if you use this dataset.

SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite

ViDRILO: The Visual and Depth Robot Indoor Localization with Objects information dataset ☆

Introduced: IJRR 2015
Device: Kinect v1
Description: Five sequences (total 22454 frames) captured from a robot moving through an office environment
Labelling: Scene type of each frame, plus presence/absence of each of a set of 15 objects.

ViDRILO

SceneNN: A Scene Meshes Dataset with aNNotations ☆

We introduce an RGB-D scene dataset consisting of more than 100 indoor scenes. Our scenes are captured at various places, e.g., offices, dormitory, classrooms, pantry, etc., from University of Massachusetts Boston and Singapore University of Technology and Design.

SceneNN: A Scene Meshes Dataset with aNNotations

f:id:robonchu:20170611155625p:plain

Stanford 2D-3D-Semantics Dataset ☆

This one is amazing...

Device: Matterport Camera (360 degree rotation RGBD sensor)
Description: 360 degree RGBD images captured from 6 large areas in municipal buildings, together with mesh and point cloud reconstructions.
Labelling: Semantic labelling on the mesh (13 classes, plus instance labels), and 3D volumetric reconstruction labels

f:id:robonchu:20170611160112p:plain

Large Scale Parsing

ScanNet ☆

Description: 2.5 million frames from 1513 scenes
Labelling: Automatically computed (and human verified) camera poses and surface reconstructions. Instance and semantic segmentations provided on reconstructed mesh. 3D CAD models + alignment also provided for each scene.

ScanNet

ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes (CVPR 2017 Spotlight) - YouTube

SceneNet RGB-D ☆

Description: 5 million images rendered of 16,895 indoor scenes. Room configuration randomly generated with physics simulator.
Labelling: Camera pose, plus per-pixel instance, class labelling and optical flow.

SceneNet RGB-D: Photorealistic Rendering of 5M Images with Perfect Ground Truth

SUNCG ☆

Description: 45,622 scenes with manually created room and furniture layouts. Images can be rendered from the geometry, but are not provided by default.
Labelling: Object semantic class and instance labelling.

f:id:robonchu:20170611161827p:plain

SUNCG dataset

‘Object Detection and Classification from Large-Scale Cluttered Indoor Scans’

List of RGBD datasets

Cornell-RGBD-Dataset

Scene Understanding for Personal Robots

Active Vision Dataset (AVD)

Description: Dense sampling of images in home and office scenes, captured from a robot. Dataset designed for simulation of motion and instance detection.
Labelling: Per-frame camera pose, object instance bounding boxes, movement pointers between images.

Active Vision Dataset

RGB-D Semantic Segmentation Dataset

.ply: the 3D mesh; can be viewed with, e.g., MeshLab.

RGBD Scenes dataset v2

Description: A second set of real indoor scenes featuring objects from the RGBD object dataset.

Object Disappearance for Object Discovery

Papers/IROS2012_Mason_Marthi_Parr - ROS Wiki

OUTDOOR

KITTI

The KITTI Vision Benchmark Suite

CITYSCAPES

Volume

5 000 annotated images with fine annotations
20 000 annotated images with coarse annotations

f:id:robonchu:20170611164535p:plain

Cityscapes Dataset

2017-06-11

Studying ROS message_filters

ROS Programming Python C++

Used, for example, when you want to synchronize multiple topics by time.

Time Synchronizer
ApproximateTime Policy

Time Synchronizer

The example here synchronizes image and camera_info.

The TimeSynchronizer filter synchronizes incoming channels by the timestamps contained in their headers, and outputs them in the form of a single callback that takes the same number of channels. The C++ implementation can synchronize up to 9 channels.

python

import rospy
import message_filters
from sensor_msgs.msg import Image, CameraInfo

def callback(image, camera_info):
  # Solve all of perception here...
  pass

# A node must be initialized before subscribing
rospy.init_node('vision_node')

image_sub = message_filters.Subscriber('image', Image)
info_sub = message_filters.Subscriber('camera_info', CameraInfo)

ts = message_filters.TimeSynchronizer([image_sub, info_sub], 10)
ts.registerCallback(callback)
rospy.spin()

c++

#include <message_filters/subscriber.h>
#include <message_filters/time_synchronizer.h>
#include <sensor_msgs/Image.h>
#include <sensor_msgs/CameraInfo.h>

using namespace sensor_msgs;
using namespace message_filters;

void callback(const ImageConstPtr& image, const CameraInfoConstPtr& cam_info)
{
  // Solve all of perception here...
}

int main(int argc, char** argv)
{
  ros::init(argc, argv, "vision_node");

  ros::NodeHandle nh;

  message_filters::Subscriber<Image> image_sub(nh, "image", 1);
  message_filters::Subscriber<CameraInfo> info_sub(nh, "camera_info", 1);
  TimeSynchronizer<Image, CameraInfo> sync(image_sub, info_sub, 10);
  sync.registerCallback(boost::bind(&callback, _1, _2));

  ros::spin();

  return 0;
}
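To picture what matching by identical header timestamps means, here is a plain-Python sketch that needs no ROS installation. It is a simplified illustration, not the message_filters implementation; the class name `ExactTimeMatcher` and the string messages are hypothetical.

```python
from collections import OrderedDict

class ExactTimeMatcher(object):
    """Fire a callback when every channel has a message with the same stamp."""

    def __init__(self, num_channels, queue_size, callback):
        self.num_channels = num_channels
        self.queue_size = queue_size
        self.callback = callback
        self.pending = OrderedDict()  # stamp -> {channel index: message}

    def add(self, channel, stamp, message):
        slot = self.pending.setdefault(stamp, {})
        slot[channel] = message
        if len(slot) == self.num_channels:
            # All channels arrived for this stamp: emit one combined callback
            del self.pending[stamp]
            self.callback(*[slot[i] for i in range(self.num_channels)])
        while len(self.pending) > self.queue_size:
            # Drop the incomplete set that has been waiting longest
            self.pending.popitem(last=False)

matched = []
matcher = ExactTimeMatcher(2, 10, lambda image, info: matched.append((image, info)))
matcher.add(0, 1.0, "image@1.0")
matcher.add(0, 2.0, "image@2.0")
matcher.add(1, 2.0, "info@2.0")   # same stamp as image@2.0, so they pair up
matcher.add(1, 1.5, "info@1.5")   # no image with stamp 1.5, so it keeps waiting
```

Only the pair stamped 2.0 reaches the callback, which shows why exact matching suits topics published from the same source, such as image and camera_info.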

ApproximateTime Policy

The message_filters::sync_policies::ApproximateTime policy uses an adaptive algorithm to match messages based on their timestamp.

python

import rospy
import message_filters
from std_msgs.msg import Int32, Float32

def callback(mode, penalty):
  # The callback processing the pairs of numbers that arrived at approximately the same time
  pass

# A node must be initialized before subscribing
rospy.init_node('vision_node')

mode_sub = message_filters.Subscriber('mode', Int32)
penalty_sub = message_filters.Subscriber('penalty', Float32)

ts = message_filters.ApproximateTimeSynchronizer([mode_sub, penalty_sub], 10, 0.1, allow_headerless=True)
ts.registerCallback(callback)
rospy.spin()

c++

#include <message_filters/subscriber.h>
#include <message_filters/synchronizer.h>
#include <message_filters/sync_policies/approximate_time.h>
#include <sensor_msgs/Image.h>

using namespace sensor_msgs;
using namespace message_filters;

void callback(const ImageConstPtr& image1, const ImageConstPtr& image2)
{
  // Solve all of perception here...
}

int main(int argc, char** argv)
{
  ros::init(argc, argv, "vision_node");

  ros::NodeHandle nh;
  message_filters::Subscriber<Image> image1_sub(nh, "image1", 1);
  message_filters::Subscriber<Image> image2_sub(nh, "image2", 1);

  typedef sync_policies::ApproximateTime<Image, Image> MySyncPolicy;
  // ApproximateTime takes a queue size as its constructor argument, hence MySyncPolicy(10)
  Synchronizer<MySyncPolicy> sync(MySyncPolicy(10), image1_sub, image2_sub);
  sync.registerCallback(boost::bind(&callback, _1, _2));

  ros::spin();

  return 0;
}
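The idea of pairing messages whose timestamps are merely close can be sketched in plain Python as well. Note that this is a deliberately naive nearest-neighbour pairing for two channels with a fixed tolerance, not the adaptive algorithm that the ApproximateTime policy actually uses; `ApproxPairMatcher` and its parameters are hypothetical names.

```python
class ApproxPairMatcher(object):
    """Pair messages from two channels when their stamps differ by <= slop."""

    def __init__(self, queue_size, slop, callback):
        self.queue_size = queue_size
        self.slop = slop
        self.callback = callback
        self.queues = ([], [])  # one list of (stamp, message) per channel

    def add(self, channel, stamp, message):
        other = self.queues[1 - channel]
        # Closest waiting message on the other channel, if any
        best = min(other, key=lambda item: abs(item[0] - stamp)) if other else None
        if best is not None and abs(best[0] - stamp) <= self.slop:
            other.remove(best)
            pair = [None, None]
            pair[channel] = message
            pair[1 - channel] = best[1]
            self.callback(*pair)
            return
        # No partner yet: wait in this channel's queue, bounded by queue_size
        own = self.queues[channel]
        own.append((stamp, message))
        if len(own) > self.queue_size:
            own.pop(0)

pairs = []
m = ApproxPairMatcher(queue_size=10, slop=0.1,
                      callback=lambda mode, penalty: pairs.append((mode, penalty)))
m.add(0, 1.00, "mode@1.00")
m.add(1, 1.05, "penalty@1.05")  # 0.05 s apart: paired
m.add(0, 2.00, "mode@2.00")
m.add(1, 2.50, "penalty@2.50")  # 0.5 s apart: not paired
```

The tolerance plays the same role as the 0.1 passed to ApproximateTimeSynchronizer in the Python example above.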

f:id:robonchu:20170620220212p:plain

References:

http://wiki.ros.org/message_filters

message_filtersでタイムスタンプがおおよそ一致した際にコールバックさせる方法 - ゼロから始めるロボットプログラミング入門講座

2017-06-10

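Studying filters

Python Signal Processing

Finite Impulse Response filter (moving average)
Infinite Impulse Response filter
Biquad filter
Inverse Fourier transform & low-pass
Kalman filter
A very clear resource

Finite Impulse Response filter (moving average)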

y[n] = 1/2 * (x[n] + x[n-1])
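The two-point moving average above translates almost directly into code. This is a minimal sketch; the function name is hypothetical, and taking the sample before the first one as 0 is an assumption about the initial condition.

```python
def moving_average_2(x):
    """y[n] = 1/2 * (x[n] + x[n-1]), with x[-1] assumed to be 0."""
    y = []
    prev = 0.0
    for sample in x:
        y.append(0.5 * (sample + prev))
        prev = sample
    return y

fir_out = moving_average_2([2.0, 4.0, 6.0, 6.0])
```

The output depends only on a finite number of past inputs, which is what makes the filter a finite impulse response filter.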

Infinite Impulse Response filter

Example: low-pass filter

y[n] = r*x[n] + (1-r)*y[n-1]

y is the output, x is the input, and r is a coefficient.

Reference: ディジタル制御の基礎
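The low-pass recurrence above can be sketched as follows. The function name is hypothetical, and the output before the first sample is assumed to be 0.

```python
def iir_lowpass(x, r):
    """y[n] = r*x[n] + (1-r)*y[n-1], with y[-1] assumed to be 0."""
    y = []
    prev_y = 0.0
    for sample in x:
        prev_y = r * sample + (1.0 - r) * prev_y
        y.append(prev_y)
    return y

# Response to a unit step: the output approaches 1 gradually
step_response = iir_lowpass([1.0] * 5, r=0.5)
```

A smaller r smooths more strongly but reacts more slowly, because each output leans more on the previous output than on the new input.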

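Biquad filter

Just by adjusting the coefficients in the equation below, you can build various filters such as high-pass and low-pass, which is convenient.

y[n] = (b0/a0)*x[n] + (b1/a0)*x[n-1] + (b2/a0)*x[n-2]
                        - (a1/a0)*y[n-1] - (a2/a0)*y[n-2]

y is the output, x is the input, and a and b are coefficients.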

The referenced article explains this.
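Here is a sketch of the biquad difference equation in code. The low-pass coefficient formulas are not from this post; they follow the widely used Audio EQ Cookbook by Robert Bristow-Johnson. The function names and the cutoff, sample rate and Q values are assumptions chosen for illustration.

```python
import math

def lowpass_coefficients(cutoff_hz, sample_rate_hz, q=1.0 / math.sqrt(2.0)):
    """Low-pass biquad coefficients (Audio EQ Cookbook formulas)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    b = ((1.0 - cos_w0) / 2.0, 1.0 - cos_w0, (1.0 - cos_w0) / 2.0)
    a = (1.0 + alpha, -2.0 * cos_w0, 1.0 - alpha)
    return b, a

def biquad(x, b, a):
    """Apply the biquad difference equation, starting from zero state."""
    b0, b1, b2 = b
    a0, a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in x:
        y0 = ((b0 / a0) * x0 + (b1 / a0) * x1 + (b2 / a0) * x2
              - (a1 / a0) * y1 - (a2 / a0) * y2)
        out.append(y0)
        # Shift the delay lines by one sample
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

b, a = lowpass_coefficients(cutoff_hz=100.0, sample_rate_hz=1000.0)
settled = biquad([1.0] * 200, b, a)
```

A constant input passes through a low-pass filter unchanged once the transient has died out, so the last output values settle at 1. Swapping in another coefficient set turns the same biquad function into a high-pass or band-pass filter.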

Inverse Fourier transform & low-pass

【NumPy】高速逆フーリエ変換とローパスフィルタでノイズ除去
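The general idea (transform, zero the bins above a cutoff, transform back) can be sketched with NumPy as below. This is my own minimal version, not the code from the linked article; the function name and the test signal are assumptions.

```python
import numpy as np

def fft_lowpass(signal, sample_rate_hz, cutoff_hz):
    """Zero every frequency bin above cutoff_hz and transform back."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

fs = 1000.0                        # sample rate in Hz
t = np.arange(1000) / fs           # one second of samples
clean = np.sin(2 * np.pi * 5 * t)  # 5 Hz signal to keep
noisy = clean + 0.5 * np.sin(2 * np.pi * 50 * t)  # plus a 50 Hz disturbance
filtered = fft_lowpass(noisy, fs, cutoff_hz=20.0)
```

Because both tones fall exactly on frequency bins here, the 50 Hz component is removed almost perfectly. Real signals usually leak across bins, so the result is less clean in practice.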

Kalman filter

シンプルなモデルとイラストでカルマンフィルタを直感的に理解してみる - Qiita
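As a companion to the intuitive explanation, here is a minimal one-dimensional Kalman filter sketch. It assumes the simplest possible model, a constant scalar state observed directly with noise, which may differ from the model in the linked article. The function name, noise variances and readings are made up for illustration.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter for a state modeled as constant."""
    estimate, var = initial_estimate, initial_var
    estimates = []
    for z in measurements:
        # Predict: the state stays the same, only its uncertainty grows
        var = var + process_var
        # Update: blend prediction and measurement using the Kalman gain
        gain = var / (var + measurement_var)
        estimate = estimate + gain * (z - estimate)
        var = (1.0 - gain) * var
        estimates.append(estimate)
    return estimates

readings = [10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]
smoothed = kalman_1d(readings, process_var=1e-4, measurement_var=0.1,
                     initial_estimate=0.0, initial_var=100.0)
```

With a large initial variance the first estimate jumps almost entirely to the first reading, and later estimates behave much like a running average of the readings.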

A very clear resource

FIRフィルタ - 人工知能に関する断創録

References:

プログラムでデジタルフィルタ

Python NumPy SciPy : デジタルフィルタ(ローパスフィルタ)による波形整形 | org-技術

【NumPy】高速逆フーリエ変換とローパスフィルタでノイズ除去

簡単なデジタルフィルタの実装 | C++でVST作り

http://android.ohwada.jp/archives/334

http://www.mech.tohoku-gakuin.ac.jp/rde/contents/sendai/mechatro/archive/RMSeminar_No07_s8.pdf

伝達関数ってなに？ - 制御工学（制御理論）の基礎

http://www12.plala.or.jp/mz80k2/control/control_4.pdf

ディジタル制御の基礎

2017-06-04

Deep learning paper roundup sites & how to catch up

Machine Learning

Paper roundups

aonotas.hateblo.jp

2016年のディープラーニング論文100選 - Qiita

DeepLearning研究 2016年のまとめ - Qiita

Deep Learningの理論的論文リスト - Obey Your MATHEMATICS.

Easy-to-understand blogs

Twitter社が発表した超解像ネットワークをchainerで再実装 - 人工言語処理入門
