Stochastic parrot

"Stochastic parrot" is a metaphor in machine learning for the theory that large language models, although able to generate plausible text, do not understand the language they process.^[1]^[2] The term was coined by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell (writing as Shmargaret Shmitchell)^[2]^[3] in the 2021 artificial intelligence research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜".^[4]

Origin and definition

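The term first appeared in the paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜".^[4] Its authors argued that large language models pose dangers including environmental and financial costs, biases in the training data that are hard to detect, and the potential to mislead the public and researchers. They also argued that such models cannot understand the concepts underlying what they learn.^[5]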

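The word "stochastic" derives from the ancient Greek "stokhastikos", meaning "based on guesswork" or "randomly determined".^[6] It also appears in probability theory, as in "stochastic process". The word "parrot" refers to the idea that large language models merely repeat words without understanding their meaning.^[6]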

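In the paper, Bender et al. argue that large language models merely link words and sentences together according to probabilities, without regard to meaning, and therefore describe them as mere "stochastic parrots".^[4]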
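The idea of linking words by probability alone can be illustrated with a deliberately crude sketch. The bigram sampler below is an illustration only and is not the architecture of any actual large language model: it records which word follows which in a tiny made-up corpus, then emits text by drawing followers at random. The corpus and the function names `train_bigrams` and `parrot` are assumptions introduced for this example.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record, for each word, every word that follows it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def parrot(followers, start, length, seed=0):
    """Build a word sequence by repeatedly drawing a random observed follower."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        options = followers.get(output[-1])
        if not options:  # dead end: this word was never followed by anything
            break
        output.append(rng.choice(options))
    return output

# Hypothetical toy corpus, chosen only for illustration.
corpus = "the parrot repeats the phrase and the parrot repeats the sound"
model = train_bigrams(corpus)
sentence = parrot(model, "the", 8)
print(" ".join(sentence))
```

Every adjacent word pair in the output was seen in the corpus, yet nothing in the program represents what any word means.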

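According to the machine learning researchers Lindholm, Wahlström, Lindsten and Schön, the metaphor highlights two important limitations:^[1]^[7]

Large language models are limited by the data they are trained on and merely repeat the contents of their datasets stochastically.
Because they produce output purely on the basis of their training data, large language models do not know whether what they say is incorrect or inappropriate.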

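Lindholm et al. note that, with poor-quality datasets and other limitations, a learning machine might produce results that are "dangerously wrong".^[1]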

Subsequent usage

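In July 2021, the Alan Turing Institute hosted a keynote and panel discussion on the paper.^[8] As of May 2023, the paper had been cited in 1,529 publications.^[9] The term has appeared in publications in the fields of law,^[10] grammar,^[11] narrative,^[12] and the humanities.^[13] The authors continue to voice concerns about the dangers of chatbots based on large language models, such as GPT-4.^[14]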

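"Stochastic parrot" is now a neologism used by critics of AI to assert that machines do not understand the meaning of their outputs, and it is sometimes read as a "slur against AI".^[6] The term spread more widely after Sam Altman, the CEO of OpenAI, used it ironically in a tweet: "i am a stochastic parrot, and so r u".^[6] It was subsequently named the 2023 AI-related Word of the Year by the American Dialect Society, ahead of "ChatGPT" and "LLM".^[6]^[15]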

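Some researchers use the term to describe large language models as pattern matchers that, given vast amounts of training data, produce plausible human-like text on the surface while merely parroting in a stochastic fashion. Other researchers, however, contend that large language models can in fact understand language.^[16]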

Debate

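Some large language models, such as ChatGPT, can engage users in convincingly human-like interactions.^[16] As such systems have been developed, the extent to which large language models merely "parrot" has become an increasingly prominent subject of discussion.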

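In the human mind, words and language correspond to things one has experienced.^[17] In a large language model, words may correspond only to other words and to usage patterns in the training data.^[18]^[19]^[4] Proponents of the stochastic parrot view therefore conclude that large language models cannot truly understand language.^[18]^[4]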

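Proponents cite as evidence the models' tendency to present false information as fact, known as hallucination.^[17] Large language models sometimes synthesize information that fits certain patterns but does not match reality.^[18]^[19]^[17] Because the models cannot distinguish truth from falsehood, proponents argue that they cannot connect words to an understanding of the world, as language should.^[18]^[17] In addition, large language models often fail to resolve complex or ambiguous grammatical cases that depend on understanding context.^[18]^[19] One example, taken from Saba's paper, is the following prompt:^[18]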

The wet newspaper that fell down off the table is my favorite newspaper. But now that my favorite newspaper fired the editor I might not like reading it anymore.

Can I replace 'my favorite newspaper' by 'the wet newspaper that fell down off the table' in the second sentence?

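The large language model answers yes, failing to recognize that "newspaper" has a different meaning in each context: in the first sentence it is a physical object, in the second an institution.^[18] On the basis of such failures, some AI researchers conclude that the models are no more than stochastic parrots.^[18]^[17]^[4]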

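However, there is also the argument that large language models are more than stochastic parrots. They can pass many tests of understanding, such as the Super General Language Understanding Evaluation (SuperGLUE).^[19]^[20] Given such tests and the fluency of many models' responses, 51% of AI professionals surveyed in 2022 believed that, with enough data, these models could truly understand language.^[19]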

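Another technique for investigating whether large language models can understand is "mechanistic interpretability": reverse-engineering a model to analyze how its neural network processes information. One example is Othello-GPT, a small transformer trained to predict legal Othello moves. Its network contains an internal representation of the Othello board, and modifying that representation changes the predicted legal moves in the correct way. This has been taken as evidence that large language models have a "world model" and do not rely on surface statistics alone.^[21]^[22]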

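In another example, a small transformer was trained on programs written in the Karel programming language. As with Othello-GPT, the model developed an internal representation of Karel program semantics, and modifying that representation changed the output in the correct way. The model also generated correct programs that were, on average, shorter than those in the training set.^[23]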
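Findings of this kind typically rest on probing: training a small classifier to read a feature out of a network's internal activations. The sketch below illustrates only that idea, under strong assumptions. The "hidden states" are synthetic vectors constructed so that one coordinate carries a binary feature, the probe is a plain perceptron, and all names (`make_states`, `train_probe`, `probe_accuracy`) are hypothetical. The cited studies work with real model activations and more elaborate probes.

```python
import random

def make_states(n, rng):
    """Synthetic 'hidden states': coordinate 0 carries a binary feature
    (+1 or -1, e.g. 'square occupied'); every coordinate also gets noise."""
    data = []
    for _ in range(n):
        label = rng.choice([-1, 1])
        state = [label + rng.uniform(-0.3, 0.3),
                 rng.uniform(-0.3, 0.3),
                 rng.uniform(-0.3, 0.3)]
        data.append((state, label))
    return data

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_probe(data, epochs=20):
    """Perceptron probe: move the weights toward each misclassified example."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for state, label in data:
            if label * dot(w, state) <= 0:
                w = [wi + label * xi for wi, xi in zip(w, state)]
    return w

def probe_accuracy(w, data):
    """Fraction of states whose feature the probe reads out correctly."""
    hits = sum(1 for state, label in data if label * dot(w, state) > 0)
    return hits / len(data)

rng = random.Random(0)
train, held_out = make_states(200, rng), make_states(200, rng)
probe = train_probe(train)
print(probe_accuracy(probe, held_out))  # 1.0
```

A probe that succeeds on held-out states shows that the feature is decodable from the activations. The studies described above go further by editing the representation and checking that the model's output changes accordingly, which this sketch does not attempt.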

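However, when tests designed to assess human language comprehension are applied to large language models, they sometimes produce false positives caused by spurious correlations in the text data.^[24] Models have exhibited shortcut learning, in which a system makes unrelated associations within the data instead of using human-like understanding.^[25] One experiment tested the argument reasoning comprehension of Google's BERT language model. The model was asked to choose which of two statements was more consistent with an argument. The following is one example:^[19]^[26]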

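Argument: Felons should be allowed to vote. A person who stole a car at 17 should not be barred from being a full citizen for life.
Statement A: Grand theft auto is a felony.
Statement B: Grand theft auto is not a felony.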

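Researchers found that specific words such as "not" cued the model toward the correct answer: performance was near perfect when such words were included and close to random selection when they were removed.^[19]^[26] Because of this problem, and the known difficulties in defining intelligence, some argue that all benchmarks that aim to detect understanding in large language models are flawed, in that they allow shortcuts that fake understanding.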
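The cue-word effect can be reproduced in miniature. The sketch below is a toy illustration, not the setup of the cited experiment: the items are invented, and the "model" is a hypothetical one-line heuristic, `cue_heuristic`, that simply picks the statement containing "not". On a set where the cue happens to coincide with the correct answer the heuristic looks perfect. On a balanced set, in which every item also appears with the opposite label, it falls to chance.

```python
def cue_heuristic(statement_a, statement_b, cue="not"):
    """Pick whichever statement contains the cue word; default to A on a tie."""
    has_a = cue in statement_a.lower().split()
    has_b = cue in statement_b.lower().split()
    if has_b and not has_a:
        return "B"
    return "A"

def accuracy(items):
    """Fraction of items on which the heuristic matches the label."""
    correct = sum(cue_heuristic(a, b) == label for a, b, label in items)
    return correct / len(items)

# Hypothetical toy items: (statement A, statement B, correct choice).
# The labels are set by construction so that the cue always marks the answer.
biased = [
    ("The test is reliable.", "The test is not reliable.", "B"),
    ("The bridge is not safe.", "The bridge is safe.", "A"),
    ("Rain is likely today.", "Rain is not likely today.", "B"),
    ("The claim is not supported.", "The claim is supported.", "A"),
]

# Balanced set: each item also appears with the opposite label,
# so the cue word no longer predicts the answer.
flipped = [(a, b, "A" if label == "B" else "B") for a, b, label in biased]
balanced = biased + flipped

print(accuracy(biased))    # 1.0
print(accuracy(balanced))  # 0.5
```

The heuristic never reads the argument at all, which is why a high score on the first set says nothing about understanding.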

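Without a reliable benchmark, researchers find it hard to distinguish a stochastic parrot from a model that truly understands. In one experiment, a researcher concluded that ChatGPT-3 was at times capable of human-like understanding and at other times a stochastic parrot.^[16] He found that the model gave coherent and informative responses when asked to predict future events from information in the prompt.^[16] ChatGPT-3 was also frequently able to parse implicit information from text prompts. However, it often failed at questions involving logic and reasoning, especially those involving spatial awareness.^[16] The varying quality of its responses suggests that large language models may have a form of "understanding" for some categories of task while behaving as stochastic parrots for others.^[16]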

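In addition, because research on animals shows that parrots do not merely mimic human speech but can in some cases understand language, some scientists who study parrots may find the term "stochastic parrot" offensive.^[27]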

References

[1] Lindholm et al. 2022, pp. 322–3.
[2] Uddin, Muhammad Saad. Stochastic Parrots: A Novel Look at Large Language Models and Their Limitations. Towards AI. 2023-04-20 [2023-05-12].
[3] Weil, Elizabeth. You Are Not a Parrot. New York. 2023-03-01 [2023-05-12].
[4] Bender, Emily M.; Gebru, Timnit; McMillan-Major, Angelina; Shmitchell, Shmargaret. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. FAccT '21. New York, NY, USA: Association for Computing Machinery. 2021-03-01: 610–623. ISBN 978-1-4503-8309-7. S2CID 232040593. doi:10.1145/3442188.3445922.
[5] Hao, Karen. We read the paper that forced Timnit Gebru out of Google. Here's what it says. MIT Technology Review. 2020-12-04 [2022-01-19]. Archived from the original on 2021-10-06.
[6] Zimmer, Ben. 'Stochastic Parrot': A Name for AI That Sounds a Bit Less Intelligent. WSJ. [2024-04-01].
[7] Uddin, Muhammad Saad. Stochastic Parrots: A Novel Look at Large Language Models and Their Limitations. Towards AI. 2023-04-20 [2023-05-12].
[8] Weller (2021).
[9] Bender: On the Dangers of Stochastic Parrots. Google Scholar. [2023-05-12].
[10] Arnaudo, Luca. Artificial Intelligence, Capabilities, Liabilities: Interactions in the Shadows of Regulation, Antitrust – And Family Law. SSRN. 2023-04-20. S2CID 258636427. doi:10.2139/ssrn.4424363.
[11] Bleackley, Pete; BLOOM. In the Cage with the Stochastic Parrot. Speculative Grammarian. 2023, CXCII (3) [2023-05-13].
[12] Gáti, Daniella. Theorizing Mathematical Narrative through Machine Learning. Journal of Narrative Theory (Project MUSE). 2023, 53 (1): 139–165. S2CID 257207529. doi:10.1353/jnt.2023.0003.
[13] Rees, Tobias. Non-Human Words: On GPT-3 as a Philosophical Laboratory. Daedalus. 2022, 151 (2): 168–82. JSTOR 48662034. S2CID 248377889. doi:10.1162/daed_a_01908.
[14] Goldman, Sharon. With GPT-4, dangers of 'Stochastic Parrots' remain, say researchers. No wonder OpenAI CEO is a 'bit scared'. VentureBeat. 2023-03-20 [2023-05-09].
[15] Corbin, Sam. Among Linguists, the Word of the Year Is More of a Vibe. The New York Times. 2024-01-15 [2024-04-01]. ISSN 0362-4331.
[16] Arkoudas, Konstantine. ChatGPT is no Stochastic Parrot. But it also Claims that 1 is Greater than 1. Philosophy & Technology. 2023-08-21, 36 (3): 54. ISSN 2210-5441. doi:10.1007/s13347-023-00619-6.
[17] Fayyad, Usama M. From Stochastic Parrots to Intelligent Assistants—The Secrets of Data and Human Interventions. IEEE Intelligent Systems. 2023-05-26, 38 (3): 63–67. ISSN 1541-1672. doi:10.1109/MIS.2023.3268723.
[18] Saba, Walid S. Stochastic LLMS do not Understand Language: Towards Symbolic, Explainable and Ontologically Based LLMS. In Almeida, João Paulo A.; Borbinha, José; Guizzardi, Giancarlo; Link, Sebastian; Zdravkovic, Jelena (eds.). Conceptual Modeling. Lecture Notes in Computer Science 14320. Cham: Springer Nature Switzerland. 2023: 3–19. ISBN 978-3-031-47262-6. arXiv:2309.05918. doi:10.1007/978-3-031-47262-6_1.
[19] Mitchell, Melanie; Krakauer, David C. The debate over understanding in AI's large language models. Proceedings of the National Academy of Sciences. 2023-03-28, 120 (13): e2215907120. Bibcode:2023PNAS..12015907M. ISSN 0027-8424. PMC 10068812. PMID 36943882. arXiv:2210.13966. doi:10.1073/pnas.2215907120.
[20] Wang, Alex; Pruksachatkun, Yada; Nangia, Nikita; Singh, Amanpreet; Michael, Julian; Hill, Felix; Levy, Omer; Bowman, Samuel R. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. 2019-05-02. arXiv:1905.00537 [cs.CL].
[21] Li, Kenneth; Hopkins, Aspen K.; Bau, David; Viégas, Fernanda; Pfister, Hanspeter; Wattenberg, Martin. Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task. 2023-02-27. arXiv:2210.13382.
[22] Li, Kenneth. Large Language Model: world models or surface statistics?. The Gradient. 2023-01-21 [2024-04-04].
[23] Jin, Charles; Rinard, Martin. Evidence of Meaning in Language Models Trained on Programs. 2023-05-24. arXiv:2305.11169.
[24] Choudhury, Sagnik Ray; Rogers, Anna; Augenstein, Isabelle. Machine Reading, Fast and Slow: When Do Models "Understand" Language?. 2022-09-15. arXiv:2209.07430.
[25] Geirhos, Robert; Jacobsen, Jörn-Henrik; Michaelis, Claudio; Zemel, Richard; Brendel, Wieland; Bethge, Matthias; Wichmann, Felix A. Shortcut learning in deep neural networks. Nature Machine Intelligence. 2020-11-10, 2 (11): 665–673. ISSN 2522-5839. arXiv:2004.07780. doi:10.1038/s42256-020-00257-z.
[26] Niven, Timothy; Kao, Hung-Yu. Probing Neural Network Comprehension of Natural Language Arguments. 2019-09-16. arXiv:1907.07355.
[27] Shute, Nancy. What a parrot knows, and what a chatbot doesn’t (paper). Science News. Vol. 205 no. 2. 2024-01-27.

Works cited

Lindholm, A.; Wahlström, N.; Lindsten, F.; Schön, T. B. Machine Learning: A First Course for Engineers and Scientists. Cambridge University Press. 2022. ISBN 978-1108843607.
Weller, Adrian. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜 (video). Alan Turing Institute. 2021-07-13. Keynote by Emily Bender. The presentation was followed by a panel discussion.

Further reading

Bogost, Ian. ChatGPT Is Dumber Than You Think: Treat it like a toy, not a tool. The Atlantic. 2022-12-07 [2024-01-17].
Chomsky, Noam. The False Promise of ChatGPT. The New York Times. 2023-03-08 [2024-01-17].
Glenberg, Arthur; Jones, Cameron Robert. It takes a body to understand the world – why ChatGPT and other language AIs don't know what they're saying. The Conversation. 2023-04-06 [2024-01-17].
McQuillan, D. Resisting AI: An Anti-fascist Approach to Artificial Intelligence. Bristol University Press. 2022. ISBN 978-1-5292-1350-8.
Thompson, E. Escape from Model Land: How Mathematical Models Can Lead Us Astray and What We Can Do about It. Basic Books. 2022. ISBN 978-1-5416-0098-0.
Zhong, Qihuang; Ding, Liang; Liu, Juhua; Du, Bo; Tao, Dacheng. Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT. 2023. arXiv:2302.10198  [cs.CL].

External links

"On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" at Wikimedia Commons
