中国电机工程学报
ZHONGGUO DIANJI GONGCHENG XUEBAO
2014
Issue 4
pp. 620-627
8 pages
multi-agent%automatic generation control (AGC)%control performance standard (CPS)%correlated Q learning%correlated equilibrium%reinforcement learning%stochastic optimal control%eligibility trace
This paper proposes a decentralized multi-agent equilibrium algorithm, decentralized correlated equilibrium Q(λ) (DCEQ(λ)), for automatic generation control (AGC) of interconnected power grids in the strongly stochastic environment created by the integration of renewable energy sources. The algorithm balances exploitation and exploration through a correlated-equilibrium probabilistic selection mechanism; it is a model-free, trial-and-error method that requires no knowledge of the system model. After examining the applicability of decentralized multi-agent equilibrium algorithms to AGC system design, the paper improves the multi-agent reward function and designs an equilibrium selection function in which the fairness factor is assigned from the real-time absolute value of the area control error (ACE). Three commonly used eligibility-trace algorithms are analyzed, and the SARSA(λ) eligibility trace is adopted to solve the temporal credit assignment problem caused by long time-delay elements such as thermal generating units. Simulations on the IEEE standard two-area load frequency control (LFC) model and on a China Southern Power Grid model show that the DCEQ(λ) controller achieves better control performance than a single-agent Q(λ) controller: it effectively removes real-time spikes in ACE and in the control performance standard (CPS), and markedly improves the stability and robustness of the interconnected power system.
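The temporal credit assignment idea behind the SARSA(λ) eligibility trace mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the paper's DCEQ(λ) controller: the function name `sarsa_lambda_step`, the toy table sizes, and the values of `alpha`, `gamma`, and `lam` are assumptions chosen for illustration, and the correlated-equilibrium action selection and the AGC reward function are omitted.

```python
def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular SARSA(lambda) update with accumulating traces (in place).

    Q and E are lists of lists indexed as [state][action].
    All hyperparameter defaults are illustrative assumptions.
    """
    # On-policy TD error for the transition that was actually taken.
    delta = r + gamma * Q[s_next][a_next] - Q[s][a]
    # Mark the visited state-action pair as eligible for credit.
    E[s][a] += 1.0
    for i in range(len(Q)):
        for j in range(len(Q[i])):
            # Every pair is corrected in proportion to its eligibility ...
            Q[i][j] += alpha * delta * E[i][j]
            # ... and its eligibility then decays geometrically.
            E[i][j] *= gamma * lam
    return delta


# Toy run (hypothetical): 3 states, 2 actions, reward arrives one step late.
Q = [[0.0, 0.0] for _ in range(3)]
E = [[0.0, 0.0] for _ in range(3)]
sarsa_lambda_step(Q, E, s=0, a=1, r=0.0, s_next=1, a_next=0)
sarsa_lambda_step(Q, E, s=1, a=0, r=1.0, s_next=2, a_next=0)
```

In the toy run the delayed reward updates both `Q[1][0]` (the pair that immediately preceded it) and, with a smaller weight, `Q[0][1]` (the earlier pair whose trace is still non-zero). This is the mechanism by which a trace can pass credit back through a slow-responding element such as a thermal unit.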