Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning

Yantian Zha, Lin Guan, Subbarao Kambhampati

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Our work aims to efficiently leverage ambiguous demonstrations for training a reinforcement learning (RL) agent. An ambiguous demonstration can usually be interpreted in multiple ways, which severely hinders the RL agent from learning stably and efficiently. Since even an optimal demonstration can be ambiguous, previous works that combine RL and learning from demonstration (RLfD works) may not work well. Inspired by how humans handle such situations, we propose to use self-explanation (in which an agent generates explanations for itself) to recognize valuable high-level relational features as an interpretation of why a successful trajectory is successful. The agent can then use the relations identified as important to guide its RL training. Our main contribution is the Self-Explanation for RL from Demonstrations (SERLfD) framework, which can overcome the limitations of existing RLfD works. Our experimental results show that the SERLfD framework can improve an RLfD model in terms of training stability and performance. To foster further research in self-explanation-guided robot learning, we have made our demonstrations and code publicly accessible at https://github.com/YantianZha/SERLfD. For a deeper understanding of our work, interested readers can refer to our arXiv version, which includes an accompanying appendix, at https://arxiv.org/pdf/2110.05286.pdf.

Original language: English (US)
Title of host publication: Technical Tracks 14
Editors: Michael Wooldridge, Jennifer Dy, Sriraam Natarajan
Publisher: Association for the Advancement of Artificial Intelligence
Pages: 10395-10403
Number of pages: 9
Edition: 9
ISBN (Electronic): 1577358872, 9781577358879
State: Published - Mar 25 2024
Event: 38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: Feb 20 2024 → Feb 27 2024

Publication series

Name: Proceedings of the AAAI Conference on Artificial Intelligence
Number: 9
Volume: 38
ISSN (Print): 2159-5399
ISSN (Electronic): 2374-3468

Conference

Conference: 38th AAAI Conference on Artificial Intelligence, AAAI 2024
Country/Territory: Canada
City: Vancouver
Period: 2/20/24 → 2/27/24

ASJC Scopus subject areas

  • Artificial Intelligence
