TY - GEN
T1 - Hetero complementary networks with hard-wired condensing binarization for high frame rate and ultra-low delay dual-hand tracking
AU - Zhang, Peiqi
AU - Luo, Dingli
AU - Du, Songlin
AU - Ikenaga, Takeshi
N1 - Funding Information:
This work was supported by Waseda University Grant for Special Research Projects (2019C-581).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/6
Y1 - 2020/6
N2 - A hand tracking system that combines high frame rate and ultra-low delay with accuracy provides a seamless and intuitive interface for Human Computer Interaction (HCI). Tracking the dual hands of multiple persons from a monocular RGB camera is challenging because the image features of hands vary. Although many CNN-based trackers have been proposed for general hardware, they cannot address this challenge at ultra-high speed. This paper proposes: (A) Hetero complementary networks for ultra-high speed dual-hand tracking, where the quick primary result from an FPGA network is intermittently combined with the delayed but accurate result from a GPU network. (B) Hard-wired condensing binarization for ultra-high speed network implementation on FPGA. The network can be mapped directly onto hardware resources because complex computation is condensed into binary layers. The proposed method achieves 69.8% accuracy on test sequences, which is only 4.7% lower than the general method. Meanwhile, the estimated FPGA resource utilization is substantially reduced to 54.7% on the target platform. This work shows the potential to track the dual hands of multiple persons at millisecond-level speed.
AB - A hand tracking system that combines high frame rate and ultra-low delay with accuracy provides a seamless and intuitive interface for Human Computer Interaction (HCI). Tracking the dual hands of multiple persons from a monocular RGB camera is challenging because the image features of hands vary. Although many CNN-based trackers have been proposed for general hardware, they cannot address this challenge at ultra-high speed. This paper proposes: (A) Hetero complementary networks for ultra-high speed dual-hand tracking, where the quick primary result from an FPGA network is intermittently combined with the delayed but accurate result from a GPU network. (B) Hard-wired condensing binarization for ultra-high speed network implementation on FPGA. The network can be mapped directly onto hardware resources because complex computation is condensed into binary layers. The proposed method achieves 69.8% accuracy on test sequences, which is only 4.7% lower than the general method. Meanwhile, the estimated FPGA resource utilization is substantially reduced to 54.7% on the target platform. This work shows the potential to track the dual hands of multiple persons at millisecond-level speed.
KW - Binary neural network
KW - Hetero architecture
KW - High frame rate ultra-low delay
KW - Multi-person dual-hand tracking
UR - http://www.scopus.com/inward/record.url?scp=85091399571&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091399571&partnerID=8YFLogxK
U2 - 10.1109/HSI49210.2020.9142660
DO - 10.1109/HSI49210.2020.9142660
M3 - Conference contribution
AN - SCOPUS:85091399571
T3 - International Conference on Human System Interaction, HSI
SP - 82
EP - 87
BT - Proceedings - 2020 13th International Conference on Human System Interaction, HSI 2020
PB - IEEE Computer Society
T2 - 13th International Conference on Human System Interaction, HSI 2020
Y2 - 6 June 2020 through 8 June 2020
ER -