Efficiency and fairness trade-offs in two player bargaining games
European Journal for Philosophy of Science ( IF 1.5 ) Pub Date : 2023-10-24 , DOI: 10.1007/s13194-023-00553-6
David Freeborn

Recent work on the evolution of social contracts and conventions has often used models of bargaining games with reinforcement learning. A recent innovation is the requirement that every strategy must be invented through either learning or reinforcement. However, agents frequently get stuck in highly reinforced “traps” that prevent them from arriving at outcomes that are efficient or fair to both players. Agents face a trade-off between exploration and exploitation, i.e. between continuing to invent new strategies and reinforcing strategies that have already yielded high rewards. In this paper I systematically study the relationship between rates of invention and the efficiency and fairness of outcomes in two-player, repeated bargaining games. I use a basic reinforcement learning model with invention, and five variations of this model, designed to introduce various forms of forgetting, to prioritize more recent reinforcement, or to maintain a higher rate of invention. I use computer simulations to investigate the outcomes of each model. Each model shows qualitative similarities in the relationship between the efficiency and fairness of outcomes, and the relative amount of exploration or exploitation that takes place. Surprisingly, there are often trade-offs between the efficiency and the fairness of the outcomes.
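
The kind of model the abstract describes can be illustrated with a small simulation. The sketch below is not the paper's model: it assumes a divide-the-pie demand game with a pie of 10 units, a Roth-Erev style learner whose chance of inventing a new demand is governed by a fixed `invention_weight`, and two simple outcome measures (share of the pie obtained, and one minus the normalized payoff gap). All names, the demand grid, and both measures are hypothetical choices made for illustration.

```python
import random

PIE = 10  # assumed pie size; demands are whole units from 1 to 9


class Agent:
    """Roth-Erev style learner that starts with no strategies and must invent them."""

    def __init__(self, invention_weight, rng):
        if invention_weight <= 0:
            raise ValueError("invention_weight must be positive")
        self.invention_weight = invention_weight  # assumed fixed exploration weight
        self.weights = {}  # demand -> accumulated reinforcement
        self.rng = rng

    def choose(self):
        """Invent a new demand or reuse a known one, in proportion to the weights."""
        total = self.invention_weight + sum(self.weights.values())
        if self.rng.random() * total < self.invention_weight:
            # Explore: invent a demand drawn uniformly from the grid.
            return self.rng.randint(1, PIE - 1)
        # Exploit: pick a known demand in proportion to its reinforcement.
        demands = list(self.weights)
        return self.rng.choices(demands, weights=[self.weights[d] for d in demands])[0]

    def reinforce(self, demand, payoff):
        """Add the payoff to the demand's weight; unrewarded demands gain nothing."""
        if payoff > 0:
            self.weights[demand] = self.weights.get(demand, 0) + payoff


def play_round(a, b):
    """One bargaining round: compatible demands are paid, otherwise both get 0."""
    da, db = a.choose(), b.choose()
    pa, pb = (da, db) if da + db <= PIE else (0, 0)
    a.reinforce(da, pa)
    b.reinforce(db, pb)
    return pa, pb


def simulate(invention_weight, rounds=5000, seed=0):
    """Return (efficiency, fairness) averaged over the final 10% of rounds."""
    rng = random.Random(seed)
    a, b = Agent(invention_weight, rng), Agent(invention_weight, rng)
    tail = rounds // 10
    efficiency = fairness = 0.0
    for t in range(rounds):
        pa, pb = play_round(a, b)
        if t >= rounds - tail:
            efficiency += (pa + pb) / PIE        # share of the pie actually obtained
            fairness += 1 - abs(pa - pb) / PIE   # note: a failed bargain counts as equal
    return efficiency / tail, fairness / tail
```

Calling `simulate` with several values of `invention_weight` (for example 0.1, 1 and 10) and several seeds gives a rough way to compare how the balance between exploration and exploitation affects the two measures under these assumptions.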


Updated: 2023-10-24