Understanding the tf-pose-estimation code - 空飛ぶロボットのつくりかた

Goal

I want to build something interesting on top of tf-pose-estimation, so first I need to understand how its code works.

GitHub - ildoonet/tf-pose-estimation: Openpose from CMU implemented using Tensorflow with Custom Architecture for fast inference.

It picked up ROS support at some point without my noticing. Excellent.

Running it with ROS

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install ros-kinetic-video-stream-opencv
sudo apt-get install ros-kinetic-image-view
git clone https://github.com/ildoonet/ros-video-recorder.git
git clone https://github.com/ildoonet/tf-pose-estimation.git
pip install -U setuptools
pip install -r tf-pose-estimation/requirements.txt
catkin_make
roslaunch tfpose_ros demo_video.launch

Tips

Changing the model in the launch file to mobilenet makes it run, if only barely, even on a CPU-only machine!

msg

BodyPartElm.msg

int32 part_id
float32 x
float32 y
float32 confidence

Person.msg

BodyPartElm[] body_part

Persons.msg

Person[] persons
uint32 image_w
uint32 image_h
Header header
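
To make the nesting of the three messages concrete, here is a minimal sketch that mirrors them with plain Python dataclasses and converts each body part to pixel coordinates. The dataclasses are stand-ins, not the classes ROS generates, and Header is left out. The sketch assumes x and y are normalized to the 0-1 range (which would explain why Persons carries image_w and image_h); to_pixels is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BodyPartElm:
    part_id: int
    x: float
    y: float
    confidence: float


@dataclass
class Person:
    body_part: List[BodyPartElm] = field(default_factory=list)


@dataclass
class Persons:
    persons: List[Person] = field(default_factory=list)
    image_w: int = 0
    image_h: int = 0


def to_pixels(msg: Persons) -> List[List[Tuple[int, int, int]]]:
    """Return, per person, a list of (part_id, px, py) tuples.

    Assumes x and y are normalized to 0-1 (an assumption, see above).
    """
    result = []
    for person in msg.persons:
        parts = []
        for bp in person.body_part:
            px = int(bp.x * msg.image_w + 0.5)
            py = int(bp.y * msg.image_h + 0.5)
            parts.append((bp.part_id, px, py))
        result.append(parts)
    return result


msg = Persons(
    persons=[Person(body_part=[BodyPartElm(0, 0.5, 0.25, 0.9)])],
    image_w=640,
    image_h=480,
)
print(to_pixels(msg))
```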

broadcaster_ros.py

This is the TfPoseEstimatorROS node.

Conversion to the Persons msg type:

def humans_to_msg(humans):
    persons = Persons()

    for human in humans:
        person = Person()

        for k in human.body_parts:
            body_part = human.body_parts[k]

            body_part_msg = BodyPartElm()
            body_part_msg.part_id = body_part.part_idx
            body_part_msg.x = body_part.x
            body_part_msg.y = body_part.y
            body_part_msg.confidence = body_part.score
            person.body_part.append(body_part_msg)
        persons.persons.append(person)

    return persons

The path to the model graph can be obtained as follows:

        w, h = model_wh(model)
        graph_path = get_graph_path(model)

        rospack = rospkg.RosPack()
        graph_path = os.path.join(rospack.get_path('tfpose_ros'), graph_path)
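
The relative paths used for the graphs start with './', so the joined result contains a '/./' segment. That still resolves fine, but here is a small sketch of what the join produces and how os.path.normpath tidies it. The package root below is a hypothetical stand-in for whatever rospack.get_path('tfpose_ros') returns on your machine.

```python
import os

# Hypothetical stand-in for rospack.get_path('tfpose_ros').
package_root = os.path.join(os.sep, 'home', 'user', 'catkin_ws', 'src', 'tf-pose-estimation')

# Relative path in the same form as the graph paths ('./models/...').
relative_graph_path = './models/graph/mobilenet_thin_432x368/graph_opt.pb'

joined = os.path.join(package_root, relative_graph_path)
normalized = os.path.normpath(joined)  # drops the redundant './' segment

print(joined)
print(normalized)
```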

estimator.py

from estimator import TfPoseEstimator

This script runs inference on an image. Each detected body part is represented by the following class:

class BodyPart:
    """
    part_idx : part index(eg. 0 for nose)
    x, y: coordinate of body part
    score : confidence score
    """
    __slots__ = ('uidx', 'part_idx', 'x', 'y', 'score')

    def __init__(self, uidx, part_idx, x, y, score):
        self.uidx = uidx
        self.part_idx = part_idx
        self.x, self.y = x, y
        self.score = score

    def get_part_name(self):
        return CocoPart(self.part_idx)

    def __str__(self):
        return 'BodyPart:%d-(%.2f, %.2f) score=%.2f' % (self.part_idx, self.x, self.y, self.score)

The estimator returns human information composed of these BodyPart objects.

Python features that were new to me

__slots__: "Saving memory with __slots__" - Qiita

"Using __slots__ in Python" - StoryEdit 開発日誌

@staticmethod: "How is Python's @staticmethod useful?" - モジログ

def __str__(self): the __str__() method is also called when you run print(x).
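
Here is a small sketch that exercises all three features at once. Point is a made-up class for illustration and is not part of tf-pose-estimation.

```python
class Point:
    # Only these attribute names may exist on instances.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x, self.y = x, y

    @staticmethod
    def origin():
        # Takes neither self nor cls, so it can be called on the class itself.
        return Point(0.0, 0.0)

    def __str__(self):
        return 'Point(%.2f, %.2f)' % (self.x, self.y)


p = Point.origin()
print(p)  # print() calls str(p), which in turn calls p.__str__()

# With __slots__, instances have no per-instance __dict__.
has_dict = hasattr(p, '__dict__')

# Assigning an attribute that is not listed in __slots__ is rejected.
try:
    p.z = 1.0
    rejected = False
except AttributeError:
    rejected = True

print(has_dict, rejected)
```

Dropping the per-instance __dict__ is where the memory saving comes from, which matters when one object is created per detected body part on every frame.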

networks.py

from networks import get_graph_path, model_wh

def get_graph_path(model_name):
    return {
        'cmu_640x480': './models/graph/cmu_640x480/graph_opt.pb',
        'cmuq_640x480': './models/graph/cmu_640x480/graph_q.pb',

        'cmu_640x360': './models/graph/cmu_640x360/graph_opt.pb',
        'cmuq_640x360': './models/graph/cmu_640x360/graph_q.pb',

        'mobilenet_thin_432x368': './models/graph/mobilenet_thin_432x368/graph_opt.pb',
    }[model_name]


def model_wh(model_name):
    width, height = model_name.split('_')[-1].split('x')
    return int(width), int(height)
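
A quick check of how these two helpers behave. The mapping is a shortened copy for illustration. Note that model_wh reads the size straight out of the model name, so it relies on the name ending in _WIDTHxHEIGHT, and get_graph_path raises KeyError for a name that is not in the dictionary.

```python
def get_graph_path(model_name):
    # Shortened copy of the mapping, for illustration only.
    return {
        'cmu_640x480': './models/graph/cmu_640x480/graph_opt.pb',
        'mobilenet_thin_432x368': './models/graph/mobilenet_thin_432x368/graph_opt.pb',
    }[model_name]


def model_wh(model_name):
    # 'mobilenet_thin_432x368' -> '432x368' -> ('432', '368')
    width, height = model_name.split('_')[-1].split('x')
    return int(width), int(height)


print(model_wh('mobilenet_thin_432x368'))
print(get_graph_path('cmu_640x480'))

# A name that is missing from the dictionary raises KeyError.
try:
    get_graph_path('unknown_100x100')
    unknown_raises = False
except KeyError:
    unknown_raises = True
print(unknown_raises)
```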

run_webcam.py

tf-pose-estimation/run_webcam.py at master · ildoonet/tf-pose-estimation · GitHub

Arguments

parser.add_argument('--camera', type=int, default=0): selects the camera index
parser.add_argument('--zoom', type=float, default=1.0): zoom factor
parser.add_argument('--model', type=str, default='mobilenet_thin_432x368', help='cmu_640x480 / cmu_640x360 / mobilenet_thin_432x368'): selects the model; mobilenet is the fast one.
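
To see what these three arguments parse to, here is a self-contained sketch. It passes explicit argument lists to parse_args instead of reading the real command line, which is my own choice for the example and not how run_webcam.py is invoked.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--camera', type=int, default=0)
parser.add_argument('--zoom', type=float, default=1.0)
parser.add_argument('--model', type=str, default='mobilenet_thin_432x368',
                    help='cmu_640x480 / cmu_640x360 / mobilenet_thin_432x368')

# An empty list yields the defaults.
defaults = parser.parse_args([])

# Equivalent to: --camera 1 --zoom 0.5 --model cmu_640x480
custom = parser.parse_args(['--camera', '1', '--zoom', '0.5', '--model', 'cmu_640x480'])

print(defaults)
print(custom)
```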

zoom

        if args.zoom < 1.0:
            canvas = np.zeros_like(image)
            img_scaled = cv2.resize(image, None, fx=args.zoom, fy=args.zoom, interpolation=cv2.INTER_LINEAR)
            dx = (canvas.shape[1] - img_scaled.shape[1]) // 2
            dy = (canvas.shape[0] - img_scaled.shape[0]) // 2
            canvas[dy:dy + img_scaled.shape[0], dx:dx + img_scaled.shape[1]] = img_scaled
            image = canvas
        elif args.zoom > 1.0:
            img_scaled = cv2.resize(image, None, fx=args.zoom, fy=args.zoom, interpolation=cv2.INTER_LINEAR)
            dx = (img_scaled.shape[1] - image.shape[1]) // 2
            dy = (img_scaled.shape[0] - image.shape[0]) // 2
            image = img_scaled[dy:image.shape[0], dx:image.shape[1]]
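
The offset arithmetic is easier to follow with concrete numbers, so here is a pure-Python sketch that reproduces only the geometry for a 640x480 frame (no OpenCV needed). zoom_geometry is a hypothetical helper, and rounding the scaled size with round() is my assumption about what cv2.resize does with fx and fy. One thing I noticed while working through it: for zoom > 1.0 the slice as quoted ends at image.shape[0], so the crop comes out smaller than the original frame, whereas a slice ending at dy + image.shape[0] would keep the original size. That is my own reading of the snippet, not something the repository states.

```python
def zoom_geometry(width, height, zoom):
    """Return (scaled_w, scaled_h, dx, dy) for the zoom step.

    Hypothetical helper: it reproduces the arithmetic only, not the resize.
    """
    scaled_w = int(round(width * zoom))
    scaled_h = int(round(height * zoom))
    if zoom < 1.0:
        # The smaller image is pasted into the middle of a black canvas.
        dx = (width - scaled_w) // 2
        dy = (height - scaled_h) // 2
    else:
        # The larger image is cropped; the offsets mark where the crop starts.
        dx = (scaled_w - width) // 2
        dy = (scaled_h - height) // 2
    return scaled_w, scaled_h, dx, dy


print(zoom_geometry(640, 480, 0.5))
print(zoom_geometry(640, 480, 2.0))

# Crop height at zoom 2.0 with the slice as quoted (dy:image.shape[0]) ...
_, _, dx, dy = zoom_geometry(640, 480, 2.0)
quoted_crop_h = 480 - dy
# ... versus a slice that ends at dy + image.shape[0].
centered_crop_h = (dy + 480) - dy
print(quoted_crop_h, centered_crop_h)
```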