[Keywords]
[Abstract]
[Objective] As an emerging intelligent control technology, deep reinforcement learning (DRL) has demonstrated remarkable potential in motor drive system control. This paper therefore proposes an advanced DRL-based drive control architecture for permanent magnet synchronous motors (PMSMs), aiming to achieve high-precision, model-free robust control without relying on accurate identification of the motor's physical parameters. [Methods] The proposed approach combined a deep Q-network (DQN) with finite-control-set torque control, enabling the agent to determine the optimal inverter switching states directly through continuous online learning and interaction with the motor environment. Firstly, a comprehensive multi-level reward function was designed to reflect the complex characteristics of the PMSM, simultaneously accommodating multiple optimization objectives: high-fidelity torque tracking, stator current amplitude minimization, and overall energy efficiency maximization. Secondly, to bridge the gap between theoretical exploration and practical safety requirements, a novel current-constraint-based safety protection and evaluation mechanism was established, ensuring that the inherently random exploration of DRL does not cause system overcurrent or hardware damage. Finally, the convergence and control performance of the algorithm were effectively improved by refining the Q-learning structure and applying automated hyperparameter optimization. [Results] Simulation results showed that the average reward stabilized at approximately 1 after 400 training episodes, verifying the excellent convergence of the proposed algorithm. The algorithm accurately tracked torque commands and maintained fast response with minimal steady-state error under various speed and load step conditions. With properly configured weight coefficients, the system effectively balanced torque precision and operational efficiency. Furthermore, the safety protection mechanism truncated the expected future returns of high-risk states in real time via the done signal, ensuring that the stator current remained strictly confined within the safety threshold and validating the robustness of the model even in small-sample scenarios. [Conclusion] The proposed scheme achieves high-performance, model-free torque control. Its integrated safety assessment mechanism provides a scientific foundation and new insights for preventive operation and maintenance in the application of reinforcement learning to power electronics, and it opens a new research direction for intelligent motor control.
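The finite control set from which the DQN agent selects can be made concrete with a generic sketch: for a standard two-level three-phase inverter there are 2^3 = 8 switching-state combinations, so the agent's discrete action is simply an index into this set. The enumeration below is a generic illustration of this convention, not the paper's implementation:

```python
from itertools import product

# All switching states (Sa, Sb, Sc) of a two-level three-phase inverter:
# each leg connects its output to the DC-link positive rail (1) or the
# negative rail (0), giving 2**3 = 8 discrete actions for the DQN agent.
SWITCHING_STATES = list(product((0, 1), repeat=3))

def action_to_switches(action_index):
    """Map a discrete DQN action index (0..7) to inverter gate signals."""
    return SWITCHING_STATES[action_index]
```

The discrete, finite nature of this action set is what makes value-based methods such as DQN a natural fit for finite-control-set control.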
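The multi-objective reward described above can be illustrated with a minimal sketch; the weights, normalizations, and signal names below are illustrative assumptions, not the paper's actual formulation:

```python
import math

def pmsm_reward(torque_ref, torque, i_d, i_q, i_max,
                w_torque=1.0, w_current=0.1, w_eff=0.05):
    """Sketch of a multi-objective reward for DQN-based
    finite-control-set torque control of a PMSM.

    Combines torque-tracking accuracy, stator current amplitude
    minimization, and a copper-loss proxy for efficiency; all weights
    and normalizations here are illustrative assumptions.
    """
    # Primary objective: torque tracking, mapped so that perfect
    # tracking yields a reward component of 1.
    r_torque = math.exp(-abs(torque_ref - torque))

    # Secondary objective: penalize stator current amplitude.
    i_s = math.hypot(i_d, i_q)
    r_current = -i_s / i_max

    # Efficiency term: copper losses scale with the square of current.
    r_eff = -(i_s / i_max) ** 2

    return w_torque * r_torque + w_current * r_current + w_eff * r_eff
```

With this shaping, the best attainable reward is approximately 1, which is consistent with an average reward that converges toward 1 as tracking improves; the relative weights trade torque precision against operating efficiency.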
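The role of the done signal in the safety mechanism can be sketched via the standard one-step Q-learning target: when an overcurrent terminates the episode, the bootstrapped future return is zeroed, so a high-risk state cannot accumulate optimistic expected reward. The function and variable names, and the penalty value, are illustrative assumptions:

```python
def q_target(reward, gamma, max_q_next, done):
    """Standard one-step Q-learning target.

    When done is True (e.g., the stator current breached its safety
    threshold and the episode was terminated), the bootstrapped term
    gamma * max_a' Q(s', a') is truncated to zero, collapsing the
    expected return of the high-risk state to the immediate reward.
    """
    return reward + gamma * max_q_next * (1.0 - float(done))

def safety_check(i_d, i_q, i_max, penalty=-1.0):
    """Illustrative current-constraint guard: returns a penalty reward
    and a termination flag when the stator current amplitude exceeds
    i_max; otherwise no penalty and no termination."""
    over = (i_d ** 2 + i_q ** 2) ** 0.5 > i_max
    return (penalty, True) if over else (0.0, False)
```

A state flagged as unsafe therefore contributes only its immediate (penalty) reward to the learning target rather than any discounted future return, which is what discourages the agent from revisiting overcurrent regions during random exploration.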
[CLC Number]
[Funding]