TY - GEN
T1 - Learning Efficient Coordination Strategy for Multi-step Tasks in Multi-agent Systems using Deep Reinforcement Learning
AU - Zhu, Zean
AU - Oury Diallo, Elhadji Amadou
AU - Sugawara, Toshiharu
N1 - Funding Information:
This work is partly supported by JSPS KAKENHI, Grant number 17KT0044.
Publisher Copyright:
Copyright © 2020 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
PY - 2020
Y1 - 2020
AB - We investigated whether a group of agents can learn strategic policies from inputs of different sizes using deep Q-learning in a simulated takeout platform environment. Agents are often required to cooperate and/or coordinate with each other to achieve their goals, but making appropriate sequential decisions for coordinated behavior in dynamic and complex states is one of the challenging issues in the study of multi-agent systems. Although previous work has shown that intelligent agents can learn coordinated strategies using deep Q-learning to efficiently execute simple one-step tasks, agents are also expected to develop coordination regimes for more complex tasks, such as multi-step coordinated tasks, in dynamic environments. To address this problem, we introduced a deep reinforcement learning framework with two distributions of the neural networks: centralized and decentralized deep Q-networks (DQNs). We examined and compared the performance of these two DQN distributions with various sizes of the agents’ views. The experimental results showed that both networks could learn coordinated policies for managing agents from local-view inputs and thereby improve overall performance. However, we also found that the agents’ behaviors differed considerably depending on the network distribution.
KW - Cooperation
KW - Coordination
KW - Deep Reinforcement Learning
KW - Multi-agent System
UR - http://www.scopus.com/inward/record.url?scp=85083252576&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083252576&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85083252576
T3 - ICAART 2020 - Proceedings of the 12th International Conference on Agents and Artificial Intelligence
SP - 287
EP - 294
BT - ICAART 2020 - Proceedings of the 12th International Conference on Agents and Artificial Intelligence
A2 - Rocha, Ana
A2 - Steels, Luc
A2 - van den Herik, Jaap
PB - SciTePress
T2 - 12th International Conference on Agents and Artificial Intelligence, ICAART 2020
Y2 - 22 February 2020 through 24 February 2020
ER -