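Machine learning control

Machine learning control (MLC) is a subfield of machine learning, intelligent control, and control theory that solves optimal control problems with machine learning methods. Its main applications are complex nonlinear systems for which conventional (for instance, linear) control-theory methods are not applicable.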

Types of problems and tasks

Machine learning control is commonly applied to the following four types of problems.

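Control parameter identification: if the structure of the control law is known but its parameters are not, machine learning control reduces to parameter identification^[1]. Examples are the optimization of PID controller parameters with a genetic algorithm^[2] and applications to discrete-time optimal control^[3].
Control design as a regression problem of the first kind: if the sensor signals and the optimal actuation command are known for every state, machine learning control can approximate a general nonlinear mapping from sensor signals to actuation commands. One example is computing a sensor feedback law from a known full-state feedback law. Neural networks are commonly used for this task^[4].
Control design as a regression problem of the second kind: machine learning control can also identify an arbitrary nonlinear control law that minimizes the cost function of the plant. In this case neither a model, nor the structure of the control law, nor the optimal actuation command needs to be known. The optimization relies only on the control performance measured in the plant. Genetic programming is a powerful regression technique for this purpose^[5].
Reinforcement learning control: the control law can be updated continually with reinforcement learning, based on measured changes in performance (rewards)^[6].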
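The first problem type above, control parameter identification, can be illustrated with a small sketch in which a genetic algorithm searches for PID gains. Everything specific here is an assumption made for illustration and is not taken from the cited works: the toy plant dx/dt = -x + u, the saturation limit `u_max`, the search box `BOUNDS`, the function names `closed_loop_cost` and `tune_pid`, and the particular GA operators (elitism, tournament selection, blend crossover, Gaussian mutation).

```python
import random

# Search box for (Kp, Ki, Kd); these limits are illustrative assumptions.
BOUNDS = [(0.0, 10.0), (0.0, 10.0), (0.0, 1.0)]


def closed_loop_cost(gains, dt=0.05, steps=200, setpoint=1.0, u_max=10.0):
    """Integrated squared tracking error of a PID loop on the toy plant dx/dt = -x + u."""
    kp, ki, kd = gains
    x = 0.0
    integral = 0.0
    prev_error = setpoint - x
    cost = 0.0
    for _ in range(steps):
        error = setpoint - x
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative
        u = max(-u_max, min(u_max, u))  # actuator saturation keeps the simulation bounded
        x += dt * (-x + u)              # explicit Euler step of the assumed plant
        prev_error = error
        cost += error * error * dt
    return cost


def tune_pid(pop_size=20, generations=30, seed=0):
    """Minimal genetic algorithm over PID gains; returns (best gains, best cost, history)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []  # best cost found at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=closed_loop_cost)
        history.append(closed_loop_cost(ranked[0]))
        children = [list(g) for g in ranked[:2]]  # elitism: the two best survive unchanged
        while len(children) < pop_size:
            # Tournament selection: best of three randomly drawn individuals.
            parent_a = min(rng.sample(ranked, 3), key=closed_loop_cost)
            parent_b = min(rng.sample(ranked, 3), key=closed_loop_cost)
            w = rng.random()
            child = []
            for ga, gb, (lo, hi) in zip(parent_a, parent_b, BOUNDS):
                gene = w * ga + (1.0 - w) * gb            # blend crossover
                gene += rng.gauss(0.0, 0.05 * (hi - lo))  # Gaussian mutation
                child.append(max(lo, min(hi, gene)))      # stay inside the search box
            children.append(child)
        population = children
    best = min(population, key=closed_loop_cost)
    return best, closed_loop_cost(best), history


best_gains, best_cost, history = tune_pid()
print("best (Kp, Ki, Kd):", [round(g, 3) for g in best_gains])
print("cost: first generation %.4f -> final %.4f" % (history[0], best_cost))
```

Because the two best individuals are copied unchanged into each new generation, the best cost in this sketch cannot get worse from one generation to the next. In an experiment, `closed_loop_cost` would be replaced by a performance measurement on the real plant.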

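Machine learning control includes neural network control, genetic algorithm based control, genetic programming control, and reinforcement learning control, among others. It overlaps methodologically with other data-driven control approaches, such as artificial intelligence and robot control.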
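As a rough illustration of neural network control in the sense of the regression problem of the first kind, the sketch below fits a small one-hidden-layer network to pairs of sensor readings and known optimal actuation commands. The target law `optimal_command`, the network size, the learning rate, and the helper name `train_mlp` are all hypothetical choices made for this example. Plain full-batch gradient descent is used only to keep the code dependency-free.

```python
import math
import random


def optimal_command(s):
    """Hypothetical 'known' optimal actuation for sensor reading s (a saturating feedback)."""
    return -math.tanh(2.0 * s)


def train_mlp(samples, hidden=8, epochs=2000, lr=0.05, seed=0):
    """Fit u = f(s) with a one-hidden-layer tanh network by full-batch gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # input-to-hidden weights
    b1 = [0.0] * hidden                                   # hidden biases
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]  # hidden-to-output weights
    b2 = [0.0]                                            # output bias (list so closures see updates)

    def forward(s):
        h = [math.tanh(w1[j] * s + b1[j]) for j in range(hidden)]
        y = sum(w2[j] * h[j] for j in range(hidden)) + b2[0]
        return y, h

    def mse():
        return sum((forward(s)[0] - u) ** 2 for s, u in samples) / len(samples)

    initial_loss = mse()
    n = len(samples)
    for _ in range(epochs):
        g_w1 = [0.0] * hidden
        g_b1 = [0.0] * hidden
        g_w2 = [0.0] * hidden
        g_b2 = 0.0
        for s, u in samples:
            y, h = forward(s)
            dy = 2.0 * (y - u) / n  # derivative of the mean squared error w.r.t. y
            g_b2 += dy
            for j in range(hidden):
                g_w2[j] += dy * h[j]
                dz = dy * w2[j] * (1.0 - h[j] * h[j])  # back-propagate through tanh
                g_w1[j] += dz * s
                g_b1[j] += dz
        for j in range(hidden):
            w1[j] -= lr * g_w1[j]
            b1[j] -= lr * g_b1[j]
            w2[j] -= lr * g_w2[j]
        b2[0] -= lr * g_b2
    return (lambda s: forward(s)[0]), initial_loss, mse()


# Training data: sensor readings on [-1, 1] paired with the assumed optimal commands.
samples = [(i / 20.0, optimal_command(i / 20.0)) for i in range(-20, 21)]
controller, initial_loss, final_loss = train_mlp(samples)
print("mean squared error: %.4f -> %.4f" % (initial_loss, final_loss))
print("learned command at s = 0.5:", round(controller(0.5), 3))
```

The returned `controller` is the learned sensor-to-actuator mapping. In practice the training pairs would come from a known full-state feedback law or from recorded optimal actuation rather than from a closed-form function.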

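Applications

Machine learning control has been applied to many nonlinear control problems, exploring actuation mechanisms that were previously unknown and often unexpected. Some example applications are: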

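Satellite attitude control^[7].
Building thermal control^[8].
Feedback turbulence control^[2]^[9].
Control of remotely operated underwater vehicles^[10].
The 2002 review article by P. J. Fleming and R. C. Purshouse gives many further examples of machine learning control in engineering applications^[11].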

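Like other general nonlinear methods, machine learning control comes with no guarantee of convergence, optimality, or robustness across a range of operating conditions.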

References

[1] Thomas Bäck & Hans-Paul Schwefel (Spring 1993) "An overview of evolutionary algorithms for parameter optimization", Journal of Evolutionary Computation (MIT Press), vol. 1, no. 1, pp. 1-23.
[2] N. Benard, J. Pons-Prats, J. Periaux, G. Bugeda, J.-P. Bonnet & E. Moreau (2015) "Multi-Input Genetic Algorithm for Experimental Optimization of the Reattachment Downstream of a Backward-Facing Step with Surface Plasma Actuator", Paper AIAA 2015-2957 at 46th AIAA Plasmadynamics and Lasers Conference, Dallas, TX, USA, pp. 1-23.
[3] Zbigniew Michalewicz, Cezary Z. Janikow & Jacek B. Krawczyk (July 1992) "A modified genetic algorithm for optimal control problems", Computers & Mathematics with Applications, vol. 23, no. 12, pp. 83-94.
[4] C. Lee, J. Kim, D. Babcock & R. Goodman (1997) "Application of neural networks to turbulence control for drag reduction", Physics of Fluids, vol. 9, no. 6, pp. 1740-1747.
[5] D. C. Dracopoulos & S. Kent (December 1997) "Genetic programming for prediction and control", Neural Computing & Applications (Springer), vol. 6, no. 4, pp. 214-228.
[6] Andrew G. Barto (December 1994) "Reinforcement learning control", Current Opinion in Neurobiology, vol. 4, no. 6, pp. 888-893.
[7] Dimitris C. Dracopoulos & Antonia J. Jones (1994) "Neuro-genetic adaptive attitude control", Neural Computing & Applications (Springer), vol. 2, no. 4, pp. 183-204.
[8] Jonathan A. Wright, Heather A. Loosemore & Raziyeh Farmani (2002) "Optimization of building thermal design and control by multi-criterion genetic algorithm", Energy and Buildings, vol. 34, no. 9, pp. 959-972.
[9] Steven L. Brunton & Bernd R. Noack (2015) "Closed-loop turbulence control: Progress and challenges", Applied Mechanics Reviews, vol. 67, no. 5, article 050801, pp. 1-48.
[10] J. Javadi-Moghaddam & A. Bagheri (2010) "An adaptive neuro-fuzzy sliding mode based genetic algorithm control system for under water remotely operated vehicle", Expert Systems with Applications, vol. 37, no. 1, pp. 647-660.
[11] Peter J. Fleming & R. C. Purshouse (2002) "Evolutionary algorithms in control systems engineering: a survey", Control Engineering Practice, vol. 10, no. 11, pp. 1223-1241.

Further reading

Dimitris C. Dracopoulos (August 1997) "Evolutionary Learning Algorithms for Neural Adaptive Control", Springer. ISBN 978-3-540-76161-7.
Thomas Duriez, Steven L. Brunton & Bernd R. Noack (November 2016) "Machine Learning Control - Taming Nonlinear Dynamics and Turbulence", Springer. ISBN 978-3-319-40624-4.
