CAJ | Academic paper

In multi-robot systems, the learning space of reinforcement learning for cooperative environment exploration grows exponentially with the number of robots, and this enormous learning space makes convergence extremely slow. To address this problem, a reinforcement learning method based on action prediction, together with an action selection strategy, is applied to multi-robot cooperation. Predicting the probability of the actions that other robots may execute accelerates the convergence of the learning algorithm. Experimental results show that the action-prediction-based reinforcement learning method obtains a multi-robot cooperation strategy faster than the original algorithm.
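The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a simple frequency-count predictor in the style of joint-action learners: a robot counts the actions it has seen another robot take in each state, turns the counts into probabilities, and weights its Q-values by them when choosing its own action. The action set, the class `ActionPredictor`, and the helper names are all hypothetical.

```python
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]  # hypothetical action set


def joint_action_count(n_actions: int, n_robots: int) -> int:
    """Number of joint actions when each robot picks one of n_actions."""
    return n_actions ** n_robots


class ActionPredictor:
    """Estimates P(other robot's action | state) from observed counts (assumed scheme)."""

    def __init__(self, actions):
        self.actions = list(actions)
        # Every count starts at 1 so that unseen actions keep a nonzero probability.
        self.counts = defaultdict(lambda: {a: 1 for a in self.actions})

    def observe(self, state, other_action):
        """Record one observed action of the other robot in the given state."""
        self.counts[state][other_action] += 1

    def probabilities(self, state):
        """Return the normalized action frequencies for the given state."""
        c = self.counts[state]
        total = sum(c.values())
        return {a: c[a] / total for a in self.actions}


def expected_value(q, predictor, state, own_action):
    """Q-value of own_action averaged over the predicted actions of the other robot."""
    probs = predictor.probabilities(state)
    return sum(p * q[(state, own_action, other)] for other, p in probs.items())


def greedy_action(q, predictor, state):
    """Pick the own action with the highest predicted expected value."""
    return max(predictor.actions, key=lambda a: expected_value(q, predictor, state, a))
```

With 4 actions and 3 robots, `joint_action_count(4, 3)` gives 64 joint actions, which illustrates the exponential growth the abstract describes. In this sketch `q` is expected to be a `defaultdict(float)` keyed by `(state, own_action, other_action)`.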