TY - JOUR
T1 - Tool-Use Model to Reproduce the Goal Situations Considering Relationship Among Tools, Objects, Actions and Effects Using Multimodal Deep Neural Networks
AU - Saito, Namiko
AU - Ogata, Tetsuya
AU - Mori, Hiroki
AU - Murata, Shingo
AU - Sugano, Shigeki
N1 - Funding Information:
This research was partially supported by JST Moonshot R&D No. JPMJMS2031, by JSPS Grant-in-Aid for Scientific Research (A) No. 19H01130, and by the Research Institute for Science and Engineering of Waseda University.
Publisher Copyright:
Copyright © 2021 Saito, Ogata, Mori, Murata and Sugano.
PY - 2021/9/28
Y1 - 2021/9/28
N2 - We propose a tool-use model that enables a robot to act toward a provided goal. It is important to consider the features of four factors (tools, objects, actions, and effects) at the same time because they are related to each other and one factor can influence the others. The tool-use model is constructed with deep neural networks (DNNs) using multimodal sensorimotor data: image, force, and joint angle information. To allow the robot to learn tool use, we collect training data by controlling the robot to perform various object operations using several tools with multiple actions that lead to different effects. The tool-use model is then trained on these data; it learns sensorimotor coordination and acquires the relationships among tools, objects, actions, and effects in its latent space. We can give the robot a task goal by providing an image showing the target placement and orientation of the object. Using the goal image with the tool-use model, the robot detects the features of tools and objects and automatically determines how to act to reproduce the target effects. The robot then generates actions that adjust to the real-time situation, even when the tools and objects are unknown and more complicated than the trained ones.
AB - We propose a tool-use model that enables a robot to act toward a provided goal. It is important to consider the features of four factors (tools, objects, actions, and effects) at the same time because they are related to each other and one factor can influence the others. The tool-use model is constructed with deep neural networks (DNNs) using multimodal sensorimotor data: image, force, and joint angle information. To allow the robot to learn tool use, we collect training data by controlling the robot to perform various object operations using several tools with multiple actions that lead to different effects. The tool-use model is then trained on these data; it learns sensorimotor coordination and acquires the relationships among tools, objects, actions, and effects in its latent space. We can give the robot a task goal by providing an image showing the target placement and orientation of the object. Using the goal image with the tool-use model, the robot detects the features of tools and objects and automatically determines how to act to reproduce the target effects. The robot then generates actions that adjust to the real-time situation, even when the tools and objects are unknown and more complicated than the trained ones.
KW - deep neural networks
KW - manipulation
KW - multimodal learning
KW - recurrent neural networks
KW - tool-use
UR - http://www.scopus.com/inward/record.url?scp=85117100309&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85117100309&partnerID=8YFLogxK
U2 - 10.3389/frobt.2021.748716
DO - 10.3389/frobt.2021.748716
M3 - Article
AN - SCOPUS:85117100309
SN - 2296-9144
VL - 8
JO - Frontiers in Robotics and AI
JF - Frontiers in Robotics and AI
M1 - 748716
ER -