TY - JOUR
T1 - A graph-based evolutionary algorithm
T2 - Genetic Network Programming (GNP) and its extension using reinforcement learning
AU - Mabu, Shingo
AU - Hirasawa, Kotaro
AU - Hu, Jinglu
PY - 2007/9
Y1 - 2007/9
N2 - This paper proposes a graph-based evolutionary algorithm called Genetic Network Programming (GNP). Our goal is to develop GNP, which can deal with dynamic environments efficiently and effectively, based on the distinguished expression ability of the graph (network) structure. The characteristics of GNP are as follows. 1) GNP programs are composed of a number of nodes which execute simple judgment/processing, and these nodes are connected by directed links to each other. 2) The graph structure enables GNP to re-use nodes, thus the structure can be very compact. 3) The node transition of GNP is executed according to its node connections without any terminal nodes, thus the past history of the node transition affects the current node to be used and this characteristic works as an implicit memory function. These structural characteristics are useful for dealing with dynamic environments. Furthermore, we propose an extended algorithm, "GNP with Reinforcement Learning (GNP-RL)" which combines evolution and reinforcement learning in order to create effective graph structures and obtain better results in dynamic environments. In this paper, we applied GNP to the problem of determining agents' behavior to evaluate its effectiveness. Tileworld was used as the simulation environment. The results show some advantages for GNP over conventional methods.
AB - This paper proposes a graph-based evolutionary algorithm called Genetic Network Programming (GNP). Our goal is to develop GNP, which can deal with dynamic environments efficiently and effectively, based on the distinguished expression ability of the graph (network) structure. The characteristics of GNP are as follows. 1) GNP programs are composed of a number of nodes which execute simple judgment/processing, and these nodes are connected by directed links to each other. 2) The graph structure enables GNP to re-use nodes, thus the structure can be very compact. 3) The node transition of GNP is executed according to its node connections without any terminal nodes, thus the past history of the node transition affects the current node to be used and this characteristic works as an implicit memory function. These structural characteristics are useful for dealing with dynamic environments. Furthermore, we propose an extended algorithm, "GNP with Reinforcement Learning (GNP-RL)" which combines evolution and reinforcement learning in order to create effective graph structures and obtain better results in dynamic environments. In this paper, we applied GNP to the problem of determining agents' behavior to evaluate its effectiveness. Tileworld was used as the simulation environment. The results show some advantages for GNP over conventional methods.
KW - Agent
KW - Evolutionary computation
KW - Graph structure
KW - Reinforcement learning
KW - Tileworld
UR - http://www.scopus.com/inward/record.url?scp=35748930908&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35748930908&partnerID=8YFLogxK
U2 - 10.1162/evco.2007.15.3.369
DO - 10.1162/evco.2007.15.3.369
M3 - Article
C2 - 17705783
AN - SCOPUS:35748930908
SN - 1063-6560
VL - 15
SP - 369
EP - 398
JO - Evolutionary Computation
JF - Evolutionary Computation
IS - 3
ER -