TY - GEN
T1 - On the Pitfalls of Learning to Cooperate with Self Play Agents Checkpointed to Capture Humans of Diverse Skill Levels
AU - Biswas, Upasana
AU - Guan, Lin
AU - Kambhampati, Subbarao
N1 - Publisher Copyright: © 2024 Copyright held by the owner/author(s)
PY - 2024/3/11
Y1 - 2024/3/11
N2 - When engaging in collaborative tasks with unknown team members, humans demonstrate the ability to predict the behavior of their partners and adapt to it. Autonomous agents do not exhibit such adaptability, often struggling to integrate with new partners in multi-agent cooperative scenarios. Past work towards tackling this problem includes sampling from a population of diverse training partners. This population consists of self-play agents at various skill levels, generated by checkpointing at various points throughout their training. In this work, we show that such a set of agents isn't representative of human skill levels by evaluating their qualitative and quantitative performance on the Overcooked domain. Our results demonstrate that self-play agents exhibit distinct learning patterns in contrast to humans, and that a partially trained self-play agent demonstrates behaviors that diverge significantly from those of a lower-skilled human counterpart.
AB - When engaging in collaborative tasks with unknown team members, humans demonstrate the ability to predict the behavior of their partners and adapt to it. Autonomous agents do not exhibit such adaptability, often struggling to integrate with new partners in multi-agent cooperative scenarios. Past work towards tackling this problem includes sampling from a population of diverse training partners. This population consists of self-play agents at various skill levels, generated by checkpointing at various points throughout their training. In this work, we show that such a set of agents isn't representative of human skill levels by evaluating their qualitative and quantitative performance on the Overcooked domain. Our results demonstrate that self-play agents exhibit distinct learning patterns in contrast to humans, and that a partially trained self-play agent demonstrates behaviors that diverge significantly from those of a lower-skilled human counterpart.
KW - Ad Hoc Teaming
KW - Human Agent Collaboration
KW - Mutual Adaptation
KW - Zero-Shot Coordination
UR - http://www.scopus.com/inward/record.url?scp=85188074066&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85188074066&partnerID=8YFLogxK
U2 - 10.1145/3610978.3640692
DO - 10.1145/3610978.3640692
M3 - Conference contribution
AN - SCOPUS:85188074066
T3 - ACM/IEEE International Conference on Human-Robot Interaction
SP - 252
EP - 256
BT - HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction
PB - IEEE Computer Society
T2 - 19th Annual ACM/IEEE International Conference on Human-Robot Interaction, HRI 2024
Y2 - 11 March 2024 through 15 March 2024
ER -