Abstract
While advanced Large Language Models (LLMs) can simulate human-like prosocial behaviors, the degree to which they align with human prosocial values and the underlying affective mechanisms remain unclear. This study addressed these gaps using the third-party punishment (TPP) paradigm, comparing LLM agents (GPT and DeepSeek series) with human participants (n = 100). The LLM agents (n = 500, 100 agents per model) were constructed one-to-one from the demographic and psychological features of the human participants. Prompt engineering was used to initiate TPP games and to record the agents' punitive decisions and affective responses. Results revealed that: (1) GPT-4o, DeepSeek-V3, and DeepSeek-R1 demonstrated stronger fairness value alignment, choosing punitive options more frequently than humans in TPP games; (2) all LLMs replicated the human pathway from unfairness through negative affective response to punitive decisions, with stronger mediation effects of negative emotions in DeepSeek models than in GPT models; (3) only DeepSeek-R1 exhibited the human-like positive feedback loop from previous punitive decisions to positive affective feedback and subsequent punitive choices; (4) most LLMs (excluding GPT-3.5) showed significant representational similarity to human affect-decision patterns; (5) notably, all LLMs displayed rigid affective dynamics, characterized by lower affective variability and higher affective inertia than the flexible, context-sensitive fluctuations observed in humans. These findings highlight notable advances in prosocial value alignment but underscore the need to enhance LLMs' affective dynamics to foster robust, adaptive prosocial LLMs. Such advancements could not only accelerate LLMs' alignment with human values but also provide empirical support for the broader applicability of prosocial theories to LLM agents.
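To make finding (5) concrete, the sketch below computes the two affective-dynamics indices under common operationalizations: variability as the within-series standard deviation of affect ratings, and inertia as the lag-1 autocorrelation. The abstract does not state how the study computed these indices, so both definitions are assumptions. The function names and the two rating series are hypothetical illustrations, not study data.

```python
from statistics import mean, stdev


def affective_variability(ratings):
    """Sample standard deviation of one agent's affect ratings (assumed definition)."""
    return stdev(ratings)


def affective_inertia(ratings):
    """Lag-1 autocorrelation of the ratings (assumed definition of inertia).

    Returns 0.0 for a constant series, where the autocorrelation is undefined.
    """
    m = mean(ratings)
    # Sum of products of consecutive deviations from the series mean.
    num = sum((a - m) * (b - m) for a, b in zip(ratings, ratings[1:]))
    # Total sum of squared deviations.
    den = sum((x - m) ** 2 for x in ratings)
    return num / den if den else 0.0


# Made-up rating series across eight rounds (illustration only).
flexible = [2, 6, 1, 7, 3, 5, 2, 6]   # swings with context
rigid = [4, 4, 4, 5, 5, 5, 5, 4]      # changes slowly, stays near its level

for name, series in (("flexible", flexible), ("rigid", rigid)):
    print(name, affective_variability(series), affective_inertia(series))
```

Under these definitions, the rigid series shows the pattern the abstract attributes to LLMs: a smaller standard deviation and a larger lag-1 autocorrelation than the flexible series.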
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 32271110, 62441614) and the Tsinghua University Initiative Scientific Research Program (Grant No. 20235080047).
Author information
Corresponding author: Zhen WU, email: zhen-wu@tsinghua.edu.cn