Privileged Self-Access Matters for Introspection in AI

2508.14802v1

Chinese title (English translation)#

Privileged Self-Access Matters for Introspection in AI

English title#

Privileged Self-Access Matters for Introspection in AI

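Chinese abstract (English translation)#

Whether AI models can introspect is an increasingly important practical question, yet there is no consensus on how introspection should be defined. Starting from a recently proposed "lightweight" definition, we argue that a thicker definition should be adopted. Under our proposal, introspection in AI is any process that obtains information about internal states more reliably than a process of equal or lower computational cost available to a third party. In experiments where large language models reason about their internal temperature parameter, we find that they appear to exhibit lightweight introspection but, under our proposed definition, do not achieve meaningful introspection.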

English abstract#

Whether AI models can introspect is an increasingly important practical question. But there is no consensus on how introspection is to be defined. Beginning from a recently proposed "lightweight" definition, we argue instead for a thicker one. According to our proposal, introspection in AI is any process which yields information about internal states through a process more reliable than one with equal or lower computational cost available to a third party. Using experiments where LLMs reason about their internal temperature parameters, we show they can appear to have lightweight introspection while failing to meaningfully introspect per our proposed definition.
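The proposed definition can be read as a comparison between a self-access process and the processes available to a third party. The sketch below is one possible reading, not the authors' formalization: the `Process` type, the `is_introspection` function, the numeric reliability and cost fields, and all numbers in the usage lines are hypothetical. It also assumes that the self-access process must beat every third-party process of equal or lower cost, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Process:
    """A way of obtaining information about a model's internal state (hypothetical type)."""
    name: str
    reliability: float  # e.g. accuracy in [0, 1]; how to measure it is an assumption
    cost: float         # computational cost, in arbitrary but consistent units


def is_introspection(self_process: Process, third_party: Iterable[Process]) -> bool:
    """Return True if self_process is strictly more reliable than every
    third-party process whose cost is equal to or lower than its own.

    Assumption: with no comparable third-party process the check passes vacuously.
    """
    rivals = [p for p in third_party if p.cost <= self_process.cost]
    return all(self_process.reliability > p.reliability for p in rivals)


# Made-up numbers, for illustration only.
self_report = Process("self-report of temperature", reliability=0.70, cost=1.0)
observer = Process("third party reading the same outputs", reliability=0.70, cost=1.0)

# The self-report is no more reliable than an equally cheap outside estimate,
# so under this reading it does not count as introspection.
print(is_introspection(self_report, [observer]))  # False
```

Under this reading, a self-report that merely matches what an outside observer could infer at the same cost fails the test, which is the sense in which the definition demands privileged self-access.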

