Multi-task neural network with physical constraint for real-time multi-person 3D pose estimation from monocular camera

Dingli Luo, Songlin Du*, Takeshi Ikenaga

*Corresponding author for this work

Research output: Article, peer-reviewed

2 Citations (Scopus)

Abstract

3D human pose estimation has many important applications in human-computer interaction and human action recognition. Simultaneously achieving real-time speed, support for a varying number of people, and high accuracy from a single RGB image is a challenging problem. To this end, this paper proposes a multi-task, multi-level neural network structure with a physical constraint. This network structure estimates 3D human poses from a single RGB image in an end-to-end way and achieves both high accuracy and high speed. Experimental results show that the proposed system achieves 21 fps on an RTX 2080 GPU with only 33 mm of accuracy loss compared with conventional works. The mechanism of the network is also analyzed through network visualization. This work shows the possibility of estimating 3D human pose from a single monocular RGB camera at real-time speed.

Original language: English
Pages (from-to): 27223-27244
Number of pages: 22
Journal: Multimedia Tools and Applications
Volume: 80
Issue number: 18
DOI
Publication status: Published - Jul 2021

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
