Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation

Yifan Zhou; Shubham Sonawani; Mariano Phielipp; Simon Stepputtis; Heni Ben Amor

Ira A. Fulton Schools of Engineering (IAFSE)

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment with regard to time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real-world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across four different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process. Code is available at https://github.com/ir-lab/ModAttn.

Original language	English (US)
Pages (from-to)	1684-1695
Number of pages	12
Journal	Proceedings of Machine Learning Research
Volume	205
State	Published - 2023
Event	6th Conference on Robot Learning, CoRL 2022 - Auckland, New Zealand
Duration	Dec 14 2022 → Dec 18 2022

Keywords

Attention
Imitation
Language-Conditioned Learning
Modularity

ASJC Scopus subject areas

Artificial Intelligence
Software
Control and Systems Engineering
Statistics and Probability

Cite this

@article{b65fdd1a53ae4a2c8e7f624e839c6e81,

title = "Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation",

abstract = "Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment with regards to time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across 4 different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process. Code is available at https://github.com/ir-lab/ModAttn.",

keywords = "Attention, Imitation, Language-Conditioned Learning, Modularity",

author = "Yifan Zhou and Shubham Sonawani and Mariano Phielipp and Simon Stepputtis and Heni {Ben Amor}",

note = "Funding Information: This research was partially funded by grants NSF CNS 1932068 and IIS 1749783. Publisher Copyright: {\textcopyright} 2023 Proceedings of Machine Learning Research. All rights reserved.; 6th Conference on Robot Learning, CoRL 2022 ; Conference date: 14-12-2022 Through 18-12-2022",

year = "2023",

language = "English (US)",

volume = "205",

pages = "1684--1695",

journal = "Proceedings of Machine Learning Research",

issn = "2640-3498",

}

TY - JOUR

T1 - Modularity through Attention

T2 - 6th Conference on Robot Learning, CoRL 2022

AU - Zhou, Yifan

AU - Sonawani, Shubham

AU - Phielipp, Mariano

AU - Stepputtis, Simon

AU - Ben Amor, Heni

PY - 2023

Y1 - 2023

AB - Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment with regards to time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across 4 different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process. Code is available at https://github.com/ir-lab/ModAttn.

KW - Attention

KW - Imitation

KW - Language-Conditioned Learning

KW - Modularity

UR - http://www.scopus.com/inward/record.url?scp=85164950868&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85164950868&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85164950868

SN - 2640-3498

VL - 205

SP - 1684

EP - 1695

JO - Proceedings of Machine Learning Research

JF - Proceedings of Machine Learning Research

Y2 - 14 December 2022 through 18 December 2022

ER -
