Policy Advisory Module for Exploration Hindrance Problem in Multi-agent Deep Reinforcement Learning

Jiahao Peng*, Toshiharu Sugawara

*Corresponding author for this work

Research output: Conference contribution

Abstract

This paper proposes a method to improve policies trained with multi-agent deep learning by adding a policy advisory module (PAM) in the testing phase to relax the exploration hindrance problem. Cooperation and coordination are central issues in the study of multi-agent systems, but agents’ policies learned in slightly different contexts may lead to ineffective behavior that reduces the quality of cooperation. For example, in a disaster rescue scenario, agents with different functions must work cooperatively while also avoiding collisions. In the early stages, all agents work effectively. As time passes and only a few tasks remain, however, agents are likely to focus more on avoiding the negative rewards incurred by collisions, and this avoidance behavior may hinder cooperative actions. To address this problem, we propose a PAM that navigates agents in the testing phase to improve performance. Using a disaster rescue example problem, we investigated whether the PAM could improve overall performance by comparing cases with and without it. Our experimental results show that the PAM could break the exploration hindrance problem and improve overall performance by navigating the trained agents.

Original language: English
Title of host publication: PRIMA 2020
Subtitle of host publication: Principles and Practice of Multi-Agent Systems - 23rd International Conference, 2020, Proceedings
Editors: Takahiro Uchiya, Quan Bai, Iván Marsá Maestre
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 133-149
Number of pages: 17
ISBN (Print): 9783030693213
DOI
Publication status: Published - 2021
Event: 23rd International Conference on Principles and Practice of Multi-Agent Systems, PRIMA 2020 - Virtual, Online
Duration: 18 Nov 2020 to 20 Nov 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12568 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd International Conference on Principles and Practice of Multi-Agent Systems, PRIMA 2020
City: Virtual, Online
Period: 2020/11/18 to 2020/11/20

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
