Response and Prompt Evaluation to Prevent Parasocial Relationships with Chatbots

2508.15748v1

Chinese Title

Response and Prompt Evaluation to Prevent Parasocial Relationships with Chatbots

English Title

Response and Prompt Evaluation to Prevent Parasocial Relationships with Chatbots

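Chinese Abstract

Parasocial relationships formed with AI agents have severe effects on human well-being, sometimes even leading to tragic outcomes. Yet preventing this dynamic is challenging: parasocial cues usually emerge gradually in private conversations, and not all forms of emotional engagement are harmful. We address this challenge by introducing a simple response evaluation framework that, by repurposing a state-of-the-art language model, evaluates ongoing conversations for parasocial cues in real time. To test the feasibility of this approach, we constructed a small synthetic dataset of thirty dialogues covering parasocial, sycophantic, and neutral conversations. Iterative evaluation with five-stage testing successfully identified all parasocial conversations under a tolerant unanimity rule while avoiding false positives, with detection typically occurring within the first few exchanges. These findings provide preliminary evidence that evaluation agents can offer a viable solution for preventing parasocial relationships.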

English Abstract

The development of parasocial relationships with AI agents has severe, and in some cases tragic, effects on human well-being. Yet preventing such dynamics is challenging: parasocial cues often emerge gradually in private conversations, and not all forms of emotional engagement are inherently harmful. We address this challenge by introducing a simple response evaluation framework, created by repurposing a state-of-the-art language model, that evaluates ongoing conversations for parasocial cues in real time. To test the feasibility of this approach, we constructed a small synthetic dataset of thirty dialogues spanning parasocial, sycophantic, and neutral conversations. Iterative evaluation with five-stage testing successfully identified all parasocial conversations while avoiding false positives under a tolerant unanimity rule, with detection typically occurring within the first few exchanges. These findings provide preliminary evidence that evaluation agents can offer a viable solution for the prevention of parasocial relations.


