TY - JOUR
T1 - Coordinated behavior of cooperative agents using deep reinforcement learning
AU - Diallo, Elhadji Amadou Oury
AU - Sugiyama, Ayumi
AU - Sugawara, Toshiharu
N1 - Funding Information:
This work is partly supported by JSPS KAKENHI grant number 17KT0044.
Publisher Copyright:
© 2019 Elsevier B.V.
PY - 2020/7/5
Y1 - 2020/7/5
AB - In this work, we focus on an environment in which multiple agents with complementary capabilities cooperate to generate non-conflicting joint actions that achieve a specific target. The central problem is how several agents can collectively learn to coordinate their actions so that they complete a given task together without conflict. Sequential decision-making under uncertainty, however, remains one of the most challenging issues for intelligent cooperative systems. To address this, we propose a multi-agent concurrent learning framework in which agents learn coordinated behaviors in order to divide their areas of responsibility. The framework extends recent deep reinforcement learning algorithms, including DQN, double DQN, and dueling network architectures. We then investigate how the learned behaviors change with the dynamics of the environment, the reward scheme, and the network structure. Next, we show how agents behave and choose their actions so that the resulting joint actions are optimal. Finally, we show that our method leads to stable solutions in our specific environment.
KW - Cooperation
KW - Coordination
KW - Deep reinforcement learning
KW - Multi-agent systems
UR - http://www.scopus.com/inward/record.url?scp=85065027331&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065027331&partnerID=8YFLogxK
DO - 10.1016/j.neucom.2018.08.094
M3 - Article
AN - SCOPUS:85065027331
SN - 0925-2312
VL - 396
SP - 230
EP - 240
JO - Neurocomputing
JF - Neurocomputing
ER -
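
The abstract above describes a per-agent concurrent learner built on DQN, double DQN, and dueling architectures. The following is a minimal illustrative sketch of such a learner in PyTorch, not the authors' implementation: the names (DuelingQNet, td_update, GAMMA) and the layer sizes are assumptions for illustration only.

# Illustrative sketch only: a per-agent dueling double DQN update in PyTorch,
# in the spirit of the concurrent-learning framework described in the abstract.
# All names and hyperparameters here are hypothetical, not from the paper.
import torch
import torch.nn as nn

GAMMA = 0.99  # assumed discount factor

class DuelingQNet(nn.Module):
    """Dueling architecture: shared trunk, then value and advantage streams."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.value = nn.Linear(128, 1)        # state value V(s)
        self.adv = nn.Linear(128, n_actions)  # advantages A(s, a)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.trunk(obs)
        v, a = self.value(h), self.adv(h)
        # Standard dueling combination: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)
        return v + a - a.mean(dim=1, keepdim=True)

def td_update(online: DuelingQNet, target: DuelingQNet,
              opt: torch.optim.Optimizer, batch) -> float:
    """One double-DQN step: the online net selects the next action,
    the target net evaluates it, decoupling selection from evaluation."""
    obs, act, rew, next_obs, done = batch
    with torch.no_grad():
        next_act = online(next_obs).argmax(dim=1, keepdim=True)   # selection
        next_q = target(next_obs).gather(1, next_act).squeeze(1)  # evaluation
        y = rew + GAMMA * (1.0 - done) * next_q
    q = online(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    loss = nn.functional.smooth_l1_loss(q, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

In a concurrent multi-agent setup of this kind, each agent would own its own online/target network pair and update from its local observations, with coordination emerging through the shared environment and reward scheme.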