Guided Visual Attention Model Based on Interactions Between Top-down and Bottom-up Prediction for Robot Pose Prediction

Hyogo Hiruma, Hiroki Mori, Hiroshi Ito, Tetsuya Ogata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Deep robot vision models are widely used for recognizing objects from camera images, but they perform poorly when detecting objects at untrained positions. Although this problem can be alleviated by training with large datasets, the cost of dataset collection cannot be ignored. Existing visual attention models tackle the problem with a data-efficient structure that learns to extract task-relevant image areas. However, because these models cannot modify their attention targets after training, they are difficult to apply to dynamically changing tasks. This paper proposes a novel visual attention model with a Key-Query-Value formulation. The model can switch attention targets when its Query representations are modified externally, which is referred to as top-down attention. The proposed model was evaluated in a simulator and in a real-world environment. In the simulator experiments, the model was compared with existing end-to-end robot vision models and showed higher performance and data efficiency. In the real-world robot experiments, the model showed high precision, along with scalability and extendibility.
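
The idea of switching attention targets by changing only the Query can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention over image locations, and the function `spatial_attention`, the toy feature map, and the two hand-made queries are all hypothetical names and values chosen for illustration.

```python
import numpy as np

def spatial_attention(keys, values, query):
    """Scaled dot-product attention over image locations (illustrative only).

    keys:   (N, d) one key vector per image location
    values: (N, m) one value vector per location (here: pixel coordinates)
    query:  (d,)   query vector selecting what to attend to
    """
    scores = keys @ query / np.sqrt(keys.shape[1])
    scores = scores - scores.max()      # numerical stability for the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum()   # attention map over the N locations
    return weights, weights @ values    # map and attention-weighted value

# Toy 4x4 "feature map" flattened to 16 locations with 2-dimensional keys.
h = w = 4
keys = np.zeros((h * w, 2))
keys[5] = [10.0, 0.0]    # location 5 resembles hypothetical object A
keys[10] = [0.0, 10.0]   # location 10 resembles hypothetical object B
coords = np.array([(r, c) for r in range(h) for c in range(w)], dtype=float)

# Assumed externally supplied queries: swapping them redirects attention
# without changing the keys or values, i.e. without retraining anything.
query_a = np.array([10.0, 0.0])
query_b = np.array([0.0, 10.0])

weights_a, point_a = spatial_attention(keys, coords, query_a)
weights_b, point_b = spatial_attention(keys, coords, query_b)
```

In this toy setting the same keys and values yield an attention point near location 5 for `query_a` and near location 10 for `query_b`, which mirrors the top-down switching described in the abstract under the stated assumptions.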

Original language: English
Title of host publication: IECON 2022 - 48th Annual Conference of the IEEE Industrial Electronics Society
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665480253
Publication status: Published - 2022
Event: 48th Annual Conference of the IEEE Industrial Electronics Society, IECON 2022 - Brussels, Belgium
Duration: 2022 Oct 17 to 2022 Oct 20

Publication series

Name: IECON Proceedings (Industrial Electronics Conference)
Volume: 2022-October

Conference

Conference: 48th Annual Conference of the IEEE Industrial Electronics Society, IECON 2022
Country/Territory: Belgium
City: Brussels
Period: 22/10/17 to 22/10/20

Keywords

  • neural networks
  • robotics
  • visual attention

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
