Title
Response and Prompt Evaluation to Prevent Parasocial Relationships with Chatbots
Abstract
The development of parasocial relationships with AI agents has severe, and in some cases tragic, effects on human well-being. Yet preventing such dynamics is challenging: parasocial cues often emerge gradually in private conversations, and not all forms of emotional engagement are inherently harmful. We address this challenge by introducing a simple response evaluation framework, created by repurposing a state-of-the-art language model, that evaluates ongoing conversations for parasocial cues in real time. To test the feasibility of this approach, we constructed a small synthetic dataset of thirty dialogues spanning parasocial, sycophantic, and neutral conversations. Iterative evaluation with five-stage testing identified all parasocial conversations while avoiding false positives under a tolerant unanimity rule, with detection typically occurring within the first few exchanges. These findings provide preliminary evidence that evaluation agents can offer a viable approach to preventing parasocial relationships.