TY - GEN
T1 - Achieving Human–Robot Collaboration with Dynamic Goal Inference by Gradient Descent
AU - Murata, Shingo
AU - Masuda, Wataru
AU - Chen, Jiayi
AU - Arie, Hiroaki
AU - Ogata, Tetsuya
AU - Sugano, Shigeki
N1 - Funding Information:
This work was supported in part by JST CREST (JPMJCR15E3), JSPS KAKENHI (JP16H05878), and the Research Institute for Science and Engineering, Waseda University.
Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - Collaboration with a human partner is a challenging task expected of intelligent robots. To realize this, robots need the ability to share a particular goal with a human and dynamically infer whether the goal state is changed by the human. In this paper, we propose a neural network-based computational framework with a gradient-based optimization of the goal state that enables robots to achieve this ability. The proposed framework consists of convolutional variational autoencoders (ConvVAEs) and a recurrent neural network (RNN) with a long short-term memory (LSTM) architecture that learns to map a given goal image for collaboration to visuomotor predictions. More specifically, visual and goal feature states are first extracted by the encoders of the respective ConvVAEs. Visual feature and motor predictions are then generated by the LSTM based on their current state and are conditioned according to the extracted goal feature state. During collaboration after the learning process, the goal feature state is optimized by gradient descent to minimize errors between the predicted and actual visual feature states. This enables the robot to dynamically infer situational (goal) changes of the human partner from visual observations alone. The proposed framework is evaluated by conducting experiments on a human–robot collaboration task involving object assembly. Experimental results demonstrate that a robot equipped with the proposed framework can collaborate with a human partner through dynamic goal inference even when the situation is ambiguous.
KW - Deep learning
KW - Human–robot collaboration
KW - Prediction error minimization
KW - Predictive coding
KW - Robot learning
UR - http://www.scopus.com/inward/record.url?scp=85076913647&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076913647&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-36711-4_49
DO - 10.1007/978-3-030-36711-4_49
M3 - Conference contribution
AN - SCOPUS:85076913647
SN - 9783030367107
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 579
EP - 590
BT - Neural Information Processing - 26th International Conference, ICONIP 2019, Proceedings
A2 - Gedeon, Tom
A2 - Wong, Kok Wai
A2 - Lee, Minho
PB - Springer
T2 - 26th International Conference on Neural Information Processing, ICONIP 2019
Y2 - 12 December 2019 through 15 December 2019
ER -