Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation

Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Simon Stepputtis, Heni Ben Amor

Research output: Contribution to journal, Conference article, peer-review

1 Scopus citation

Abstract

Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment with regard to time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance, or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real-world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across four different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process. Code is available at https://github.com/ir-lab/ModAttn.

Original language: English (US)
Pages (from-to): 1684-1695
Number of pages: 12
Journal: Proceedings of Machine Learning Research
Volume: 205
State: Published - 2023
Event: 6th Conference on Robot Learning, CoRL 2022 - Auckland, New Zealand
Duration: Dec 14 2022 - Dec 18 2022

Keywords

  • Attention
  • Imitation
  • Language-Conditioned Learning
  • Modularity

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
