Tool-Use Model to Reproduce the Goal Situations Considering Relationship Among Tools, Objects, Actions and Effects Using Multimodal Deep Neural Networks

Namiko Saito*, Tetsuya Ogata, Hiroki Mori, Shingo Murata, Shigeki Sugano

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

We propose a tool-use model that enables a robot to act toward a provided goal. It is important to consider the features of four factors, namely tools, objects, actions, and effects, at the same time, because they are related to each other and one factor can influence the others. The tool-use model is constructed with deep neural networks (DNNs) using multimodal sensorimotor data: image, force, and joint angle information. To allow the robot to learn tool-use, we collect training data by controlling the robot to perform various object operations using several tools with multiple actions that lead to different effects. The tool-use model is then trained on these data; it learns sensorimotor coordination and acquires the relationships among tools, objects, actions, and effects in its latent space. We can give the robot a task goal by providing an image showing the target placement and orientation of the object. Using the goal image with the tool-use model, the robot detects the features of the tools and objects and automatically determines how to act to reproduce the target effects. The robot then generates actions that adjust to the real-time situation, even when the tools and objects are unknown and more complicated than those used in training.
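
To make the data flow concrete, below is a minimal sketch of how such a goal-conditioned multimodal recurrent network might be structured in PyTorch. All layer sizes, the shared image encoder, the LSTM core, and the input dimensions (6-axis force, 7 joint angles, 64x64 RGB images) are illustrative assumptions; the paper's actual architecture may differ.

```python
import torch
import torch.nn as nn

class ToolUseModel(nn.Module):
    """Hypothetical multimodal recurrent model: fuses current image, force,
    and joint-angle inputs with a goal image, and predicts the next joint
    angles. Layer sizes and structure are illustrative assumptions only."""

    def __init__(self, img_feat=64, force_dim=6, joint_dim=7, hidden=128):
        super().__init__()
        # Shared convolutional encoder for current and goal images
        # (64x64 RGB assumed).
        self.img_enc = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, img_feat),
        )
        # Recurrent core: its latent state is where the relationships among
        # tools, objects, actions, and effects would be coordinated.
        self.rnn = nn.LSTM(2 * img_feat + force_dim + joint_dim, hidden,
                           batch_first=True)
        self.joint_head = nn.Linear(hidden, joint_dim)  # next joint angles

    def forward(self, images, goal_image, forces, joints):
        # images: (B, T, 3, 64, 64); goal_image: (B, 3, 64, 64)
        # forces: (B, T, force_dim);  joints: (B, T, joint_dim)
        B, T = images.shape[:2]
        img_z = self.img_enc(images.reshape(B * T, *images.shape[2:]))
        img_z = img_z.reshape(B, T, -1)
        # Encode the goal image once and repeat it across all time steps.
        goal_z = self.img_enc(goal_image).unsqueeze(1).expand(-1, T, -1)
        x = torch.cat([img_z, goal_z, forces, joints], dim=-1)
        h, _ = self.rnn(x)
        return self.joint_head(h)  # predicted joint angles per time step

if __name__ == "__main__":
    model = ToolUseModel()
    pred = model(torch.randn(2, 10, 3, 64, 64), torch.randn(2, 3, 64, 64),
                 torch.randn(2, 10, 6), torch.randn(2, 10, 7))
    print(pred.shape)  # torch.Size([2, 10, 7])
```

A trained model of this form could be rolled out step by step: at each control step, feed the current camera image, force reading, and joint angles together with the fixed goal image, and command the predicted joint angles, which lets the action adapt online to the tool and object at hand.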

Original language: English
Article number: 748716
Journal: Frontiers in Robotics and AI
Volume: 8
DOIs
Publication status: Published - 2021 Sept 28

Keywords

  • deep neural networks
  • manipulation
  • multimodal learning
  • recurrent neural networks
  • tool-use

ASJC Scopus subject areas

  • Computer Science Applications
  • Artificial Intelligence
