TY - GEN
T1 - Sparse latent space policy search
AU - Luck, Kevin Sebastian
AU - Pajarinen, Joni
AU - Berger, Erik
AU - Kyrki, Ville
AU - Ben Amor, Hani
N1 - Publisher Copyright:
© Copyright 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2016
Y1 - 2016
N2 - Computational agents often need to learn policies that involve many control variables, e.g., a robot needs to control several joints simultaneously. Learning a policy with a high number of parameters, however, usually requires a large number of training samples. We introduce a reinforcement learning method for sampleefficient policy search that exploits correlations between control variables. Such correlations are particularly frequent in motor skill learning tasks. The introduced method uses Variational Inference to estimate policy parameters, while at the same time uncovering a lowdimensional latent space of controls. Prior knowledge about the task and the structure of the learning agent can be provided by specifying groups of potentially correlated parameters. This information is then used to impose sparsity constraints on the mapping between the high-dimensional space of controls and a lowerdimensional latent space. In experiments with a simulated bi-manual manipulator, the new approach effectively identifies synergies between joints, performs efficient low-dimensional policy search, and outperforms state-of-the-art policy search methods.
AB - Computational agents often need to learn policies that involve many control variables, e.g., a robot needs to control several joints simultaneously. Learning a policy with a high number of parameters, however, usually requires a large number of training samples. We introduce a reinforcement learning method for sampleefficient policy search that exploits correlations between control variables. Such correlations are particularly frequent in motor skill learning tasks. The introduced method uses Variational Inference to estimate policy parameters, while at the same time uncovering a lowdimensional latent space of controls. Prior knowledge about the task and the structure of the learning agent can be provided by specifying groups of potentially correlated parameters. This information is then used to impose sparsity constraints on the mapping between the high-dimensional space of controls and a lowerdimensional latent space. In experiments with a simulated bi-manual manipulator, the new approach effectively identifies synergies between joints, performs efficient low-dimensional policy search, and outperforms state-of-the-art policy search methods.
UR - http://www.scopus.com/inward/record.url?scp=85007210846&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85007210846&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85007210846
T3 - 30th AAAI Conference on Artificial Intelligence, AAAI 2016
SP - 1911
EP - 1918
BT - 30th AAAI Conference on Artificial Intelligence, AAAI 2016
PB - AAAI press
T2 - 30th AAAI Conference on Artificial Intelligence, AAAI 2016
Y2 - 12 February 2016 through 17 February 2016
ER -