Title#
Response and Prompt Evaluation to Prevent Parasocial Relationships with Chatbots
Abstract#
The development of parasocial relationships with AI agents has severe, and in some cases tragic, effects on human well-being. Yet preventing such dynamics is challenging: parasocial cues often emerge gradually in private conversations, and not all forms of emotional engagement are inherently harmful. We address this challenge by introducing a simple response evaluation framework, created by repurposing a state-of-the-art language model, that evaluates ongoing conversations for parasocial cues in real time. To test the feasibility of this approach, we constructed a small synthetic dataset of thirty dialogues spanning parasocial, sycophantic, and neutral conversations. Iterative evaluation with five-stage testing successfully identified all parasocial conversations while avoiding false positives under a tolerant unanimity rule, with detection typically occurring within the first few exchanges. These findings offer preliminary evidence that evaluation agents can provide a viable solution for the prevention of parasocial relationships.
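The abstract does not define the "tolerant unanimity rule", so the following is only a minimal sketch of one plausible reading: each conversation turn is scored by several independent evaluator passes (five here, matching the five-stage testing), and the turn is flagged when all passes agree except for at most a small number of dissenters. The names `is_flagged`, `first_flagged_turn`, and the `tolerance` parameter are hypothetical and not taken from the paper; the evaluator itself is left out, and its verdicts are supplied as plain booleans.

```python
from typing import Optional, Sequence


def is_flagged(votes: Sequence[bool], tolerance: int = 1) -> bool:
    """Return True when all but at most `tolerance` passes voted 'parasocial'.

    ASSUMPTION: this is one possible reading of a "tolerant unanimity rule";
    the source does not specify the rule or the tolerance value.
    """
    if not votes:
        return False
    if tolerance < 0 or tolerance >= len(votes):
        raise ValueError("tolerance must be in [0, len(votes))")
    # sum() counts the True votes, since bool is a subclass of int.
    return sum(votes) >= len(votes) - tolerance


def first_flagged_turn(
    votes_per_turn: Sequence[Sequence[bool]], tolerance: int = 1
) -> Optional[int]:
    """Return the 1-based index of the first flagged turn, or None."""
    for turn, votes in enumerate(votes_per_turn, start=1):
        if is_flagged(votes, tolerance):
            return turn
    return None


# Hypothetical verdicts for a three-turn conversation, five passes per turn.
example_votes = [
    [False, False, False, False, False],
    [True, True, False, False, False],
    [True, True, True, True, False],
]
detected_at = first_flagged_turn(example_votes)
```

Setting `tolerance=0` reduces this to strict unanimity, which makes the trade-off explicit: a larger tolerance flags earlier but risks more false positives on merely sycophantic or neutral dialogues.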