Optimal bipartite consensus control for heterogeneous unknown multi-agent systems via reinforcement learning
Applied Mathematics and Computation ( IF 4 ) Pub Date : 2024-05-06 , DOI: 10.1016/j.amc.2024.128785
Hao Meng , Denghao Pang , Jinde Cao , Yechen Guo , Azmat Ullah Khan Niazi

This study addresses optimal bipartite consensus control (OBCC) problems in heterogeneous multi-agent systems (MASs) without relying on knowledge of the agents' dynamics. Motivated by the need for model-free, optimal consensus control in complex MASs, a novel distributed scheme based on reinforcement learning (RL) is proposed. The MAS network is randomly partitioned into sub-networks. Agents within each subgroup collaborate to attain tracking control and to ensure that their positions and speeds converge to a common value, whereas agents from distinct subgroups compete to achieve different tracking objectives. The heterogeneous MASs considered have unknown first- and second-order dynamics, which adds to the complexity of the problem. To address the OBCC problem, the policy iteration (PI) algorithm is used to solve the discrete-time Hamilton-Jacobi-Bellman (HJB) equations, implemented through a data-driven actor-critic neural network (ACNN) framework. Numerical simulations confirm the accuracy of the proposed approach.
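The cooperate-within, compete-across behaviour described in the abstract can be illustrated with a small sketch. It is not the paper's method: there is no reinforcement learning, no cost optimisation, and no leader to track. It assumes known first-order agents, a hand-picked signed graph with two subgroups, and the standard signed-Laplacian update; the names `simulate_bipartite_consensus`, `ADJ`, `X0`, and the step size are all hypothetical choices made for illustration.

```python
def simulate_bipartite_consensus(adjacency, x0, step=0.2, iterations=200):
    """Iterate x_i <- x_i + step * sum_j |a_ij| * (sign(a_ij) * x_j - x_i).

    Positive weights model cooperation, negative weights model competition.
    This is an illustrative, model-based update, not the paper's algorithm.
    """
    n = len(x0)
    x = list(x0)
    for _ in range(iterations):
        nxt = []
        for i in range(n):
            correction = 0.0
            for j in range(n):
                a = adjacency[i][j]
                if a != 0:
                    sign = 1.0 if a > 0 else -1.0
                    # Friends pull x_i toward x_j, rivals pull it toward -x_j.
                    correction += abs(a) * (sign * x[j] - x[i])
            nxt.append(x[i] + step * correction)
        x = nxt
    return x


# Assumed signed graph: agents 0 and 1 form one subgroup, agents 2 and 3
# the other. Edges inside a subgroup are +1, edges across subgroups are -1.
ADJ = [
    [0, 1, 0, -1],
    [1, 0, -1, 0],
    [0, -1, 0, 1],
    [-1, 0, 1, 0],
]
X0 = [1.0, 3.0, 2.0, -4.0]  # arbitrary initial states

final_state = simulate_bipartite_consensus(ADJ, X0)
print([round(v, 4) for v in final_state])
```

With this balanced graph the two subgroups settle on values of equal magnitude and opposite sign (here 1.5 and -1.5), which is the bipartite consensus pattern that the paper's RL scheme aims to reach optimally and without a model.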

Updated: 2024-05-06