TY - GEN
T1 - SvgAI - Training Methods Analysis of Artificial Intelligent Agent to use SVG Editor
AU - Dang, Anh H.
AU - Kameyama, Wataru
PY - 2019/4/29
Y1 - 2019/4/29
N2 - Deep reinforcement learning has been successfully used to train artificial intelligence (AI) agents that outperform humans in many tasks. The objective of this research is to train an AI agent to draw SVG images by using a scalable vector graphics (SVG) editor with deep reinforcement learning, where the AI agent is to draw SVG images that are as similar as possible to given target raster images. In this paper, we propose a framework to train the AI agent by value-function-based Q-learning and policy-gradient-based learning methods. With the Q-learning-based method, we find that it is crucial to divide the action space into two sets and to apply a different exploration policy to each set during the training process. Evaluations show that our proposed dual ϵ-greedy exploration policy greatly stabilizes the training process and increases the accuracy of the AI agent. On the other hand, policy-gradient-based training does not depend on an external reward function. However, it is hard to implement, especially in an environment with a large action space. To overcome this difficulty, we propose a strategy similar to dynamic programming that allows the agent to generate training samples by itself. In our evaluation, the highest score is achieved by the agent trained with this proposed method. SVG images produced by the proposed AI agent also have superior quality compared to popular raster-to-SVG conversion software.
KW - Exploration Policy
KW - Q-learning
KW - Reinforcement Learning
KW - SVG
UR - http://www.scopus.com/inward/record.url?scp=85065642084&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065642084&partnerID=8YFLogxK
U2 - 10.23919/ICACT.2019.8702041
DO - 10.23919/ICACT.2019.8702041
M3 - Conference contribution
AN - SCOPUS:85065642084
T3 - International Conference on Advanced Communication Technology, ICACT
SP - 1159
EP - 1166
BT - 21st International Conference on Advanced Communication Technology
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 21st International Conference on Advanced Communication Technology, ICACT 2019
Y2 - 17 February 2019 through 20 February 2019
ER -