Indexed by:
Abstract:
Reinforcement learning behavioral control (RLBC) is limited to an individual agent without any swarm mission, because it models the behavior priority learning as a Markov decision process. In this paper, a novel multi-agent reinforcement learning behavioral control (MARLBC) method is proposed to overcome such limitations by implementing joint learning. Specifically, a multi-agent reinforcement learning mission supervisor (MARLMS) is designed for a group of nonlinear second-order systems to assign the behavior priorities at the decision layer. Through modeling behavior priority switching as a cooperative Markov game, the MARLMS learns an optimal joint behavior priority to reduce dependence on human intelligence and high-performance computing hardware. At the control layer, a group of second-order reinforcement learning controllers are designed to learn the optimal control policies to track position and velocity signals simultaneously. In particular, input saturation constraints are strictly implemented via designing a group of adaptive compensators. Numerical simulation results show that the proposed MARLBC has a lower switching frequency and control cost than finite-time and fixed-time behavioral control and RLBC methods.
Keyword:
Reprint 's Address:
Email:
Version:
Source :
FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING
ISSN: 2095-9184
Year: 2024
Issue: 6
Volume: 25
Page: 869-886
2 . 7 0 0
JCR@2023
Cited Count:
SCOPUS Cited Count:
ESI Highly Cited Papers on the List: 0 Unfold All
WanFang Cited Count:
Chinese Cited Count:
30 Days PV: 4
Affiliated Colleges: