TY - CONF
T1 - Multimodal integration learning of object manipulation behaviors using deep neural networks
AU - Noda, Kuniaki
AU - Arie, Hiroaki
AU - Suga, Yuki
AU - Ogata, Tetsuya
PY - 2013/12/1
N2 - This paper presents a novel computational approach for modeling and generating multiple object manipulation behaviors by a humanoid robot. The contribution of this paper is that deep learning methods are applied not only for multimodal sensor fusion but also for sensory-motor coordination. More specifically, a time-delay deep neural network is applied for modeling multiple behavior patterns represented with multi-dimensional visuomotor temporal sequences. By using the efficient training performance of Hessian-free optimization, the proposed mechanism successfully models six different object manipulation behaviors in a single network. The generalization capability of the learning mechanism enables the acquired model to perform the functions of cross-modal memory retrieval and temporal sequence prediction. The experimental results show that the motion patterns for object manipulation behaviors are successfully generated from the corresponding image sequence, and vice versa. Moreover, the temporal sequence prediction enables the robot to interactively switch multiple behaviors in accordance with changes in the displayed objects.
UR - http://www.scopus.com/inward/record.url?scp=84893733631&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893733631&partnerID=8YFLogxK
DO - 10.1109/IROS.2013.6696582
M3 - Conference contribution
AN - SCOPUS:84893733631
SN - 9781467363587
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 1728
EP - 1733
BT - IROS 2013
T2 - 2013 26th IEEE/RSJ International Conference on Intelligent Robots and Systems: New Horizon, IROS 2013
Y2 - 3 November 2013 through 8 November 2013
ER -