Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems

Rushang Karia; Rashmeet Kaur Nayyar; Siddharth Srivastava

Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems

Rushang Karia, Rashmeet Kaur Nayyar, Siddharth Srivastava

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Several goal-oriented problems in the real-world can be naturally expressed as Stochastic Shortest Path problems (SSPs). However, the computational complexity of solving SSPs makes finding solutions to even moderately sized problems intractable. State-of-the-art SSP solvers are unable to learn generalized solutions or policies that would solve multiple problem instances with different object names and/or quantities. This paper presents an approach for learning Generalized Policy Automata (GPAs): non-deterministic partial policies that can be used to catalyze the solution process. GPAs are learned using relational, feature-based abstractions, which makes them applicable on broad classes of related problems with different object names and quantities. Theoretical analysis of this approach shows that it guarantees completeness and hierarchical optimality. Empirical analysis shows that this approach effectively learns broadly applicable policy knowledge in a few-shot fashion and significantly outperforms state-of-the-art SSP solvers on test problems whose object counts are far greater than those used during training.

Original language	English (US)
Title of host publication	Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors	S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher	Neural information processing systems foundation
ISBN (Electronic)	9781713871088
State	Published - 2022
Externally published	Yes
Event	36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States Duration: Nov 28 2022 → Dec 9 2022

Publication series

Name	Advances in Neural Information Processing Systems
Volume	35
ISSN (Print)	1049-5258

Conference

Conference	36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory	United States
City	New Orleans
Period	11/28/22 → 12/9/22

ASJC Scopus subject areas

Computer Networks and Communications
Information Systems
Signal Processing

Cite this

Karia, R., Nayyar, R. K., & Srivastava, S. (2022). Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.), Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022 (Advances in Neural Information Processing Systems; Vol. 35). Neural information processing systems foundation.

Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems. / Karia, Rushang; Nayyar, Rashmeet Kaur; Srivastava, Siddharth.
Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022. ed. / S. Koyejo; S. Mohamed; A. Agarwal; D. Belgrave; K. Cho; A. Oh. Neural information processing systems foundation, 2022. (Advances in Neural Information Processing Systems; Vol. 35).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Karia, R, Nayyar, RK & Srivastava, S 2022, Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems. in S Koyejo, S Mohamed, A Agarwal, D Belgrave, K Cho & A Oh (eds), Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022. Advances in Neural Information Processing Systems, vol. 35, Neural information processing systems foundation, 36th Conference on Neural Information Processing Systems, NeurIPS 2022, New Orleans, United States, 11/28/22.

Karia R, Nayyar RK, Srivastava S. Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems. In Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, editors, Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022. Neural information processing systems foundation. 2022. (Advances in Neural Information Processing Systems).

Karia, Rushang ; Nayyar, Rashmeet Kaur ; Srivastava, Siddharth. / Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems. Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022. editor / S. Koyejo ; S. Mohamed ; A. Agarwal ; D. Belgrave ; K. Cho ; A. Oh. Neural information processing systems foundation, 2022. (Advances in Neural Information Processing Systems).

@inproceedings{a35a48f1d2a54d2db87e22c738e2ecfa,

title = "Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems",

abstract = "Several goal-oriented problems in the real-world can be naturally expressed as Stochastic Shortest Path problems (SSPs). However, the computational complexity of solving SSPs makes finding solutions to even moderately sized problems intractable. State-of-the-art SSP solvers are unable to learn generalized solutions or policies that would solve multiple problem instances with different object names and/or quantities. This paper presents an approach for learning Generalized Policy Automata (GPAs): non-deterministic partial policies that can be used to catalyze the solution process. GPAs are learned using relational, feature-based abstractions, which makes them applicable on broad classes of related problems with different object names and quantities. Theoretical analysis of this approach shows that it guarantees completeness and hierarchical optimality. Empirical analysis shows that this approach effectively learns broadly applicable policy knowledge in a few-shot fashion and significantly outperforms state-of-the-art SSP solvers on test problems whose object counts are far greater than those used during training.",

author = "Rushang Karia and Nayyar, {Rashmeet Kaur} and Siddharth Srivastava",

note = "Funding Information: We would like to thank Deepak Kala Vasuvadnefor help with a prototype implementation of the sourcecode.WewouldliketothanktheResearchComputingGroupatArizonaStateUvnersityi for providing compute hours for our experiments. This work was supported in part by the NSF under grants IIS 1909370 and IIS 1942856. Publisher Copyright: {\textcopyright} 2022 Neural information processing systems foundation. All rights reserved.; 36th Conference on Neural Information Processing Systems, NeurIPS 2022 ; Conference date: 28-11-2022 Through 09-12-2022",

year = "2022",

language = "English (US)",

series = "Advances in Neural Information Processing Systems",

publisher = "Neural information processing systems foundation",

editor = "S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh",

booktitle = "Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022",

}

TY - GEN

T1 - Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems

AU - Karia, Rushang

AU - Nayyar, Rashmeet Kaur

AU - Srivastava, Siddharth

N1 - Funding Information: We would like to thank Deepak Kala Vasuvadnefor help with a prototype implementation of the sourcecode.WewouldliketothanktheResearchComputingGroupatArizonaStateUvnersityi for providing compute hours for our experiments. This work was supported in part by the NSF under grants IIS 1909370 and IIS 1942856. Publisher Copyright: © 2022 Neural information processing systems foundation. All rights reserved.

PY - 2022

Y1 - 2022

N2 - Several goal-oriented problems in the real-world can be naturally expressed as Stochastic Shortest Path problems (SSPs). However, the computational complexity of solving SSPs makes finding solutions to even moderately sized problems intractable. State-of-the-art SSP solvers are unable to learn generalized solutions or policies that would solve multiple problem instances with different object names and/or quantities. This paper presents an approach for learning Generalized Policy Automata (GPAs): non-deterministic partial policies that can be used to catalyze the solution process. GPAs are learned using relational, feature-based abstractions, which makes them applicable on broad classes of related problems with different object names and quantities. Theoretical analysis of this approach shows that it guarantees completeness and hierarchical optimality. Empirical analysis shows that this approach effectively learns broadly applicable policy knowledge in a few-shot fashion and significantly outperforms state-of-the-art SSP solvers on test problems whose object counts are far greater than those used during training.

AB - Several goal-oriented problems in the real-world can be naturally expressed as Stochastic Shortest Path problems (SSPs). However, the computational complexity of solving SSPs makes finding solutions to even moderately sized problems intractable. State-of-the-art SSP solvers are unable to learn generalized solutions or policies that would solve multiple problem instances with different object names and/or quantities. This paper presents an approach for learning Generalized Policy Automata (GPAs): non-deterministic partial policies that can be used to catalyze the solution process. GPAs are learned using relational, feature-based abstractions, which makes them applicable on broad classes of related problems with different object names and quantities. Theoretical analysis of this approach shows that it guarantees completeness and hierarchical optimality. Empirical analysis shows that this approach effectively learns broadly applicable policy knowledge in a few-shot fashion and significantly outperforms state-of-the-art SSP solvers on test problems whose object counts are far greater than those used during training.

UR - http://www.scopus.com/inward/record.url?scp=85163169359&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85163169359&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85163169359

T3 - Advances in Neural Information Processing Systems

BT - Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022

A2 - Koyejo, S.

A2 - Mohamed, S.

A2 - Agarwal, A.

A2 - Belgrave, D.

A2 - Cho, K.

A2 - Oh, A.

PB - Neural information processing systems foundation

T2 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022

Y2 - 28 November 2022 through 9 December 2022

ER -

Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems

Abstract

Publication series

Conference

ASJC Scopus subject areas

Other files and links

Fingerprint

Cite this