Transfer of temporal logic formulas in reinforcement learning

Zhe Xu; Ufuk Topcu

doi:10.24963/ijcai.2019/557

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

27 Scopus citations

Abstract

Transferring high-level knowledge from a source task to a target task is an effective way to expedite reinforcement learning (RL). For example, propositional logic and first-order logic have been used as representations of such knowledge. We study the transfer of knowledge between tasks in which the timing of the events matters. We call such tasks temporal tasks. We concretize similarity between temporal tasks through a notion of logical transferability, and develop a transfer learning approach between different yet similar temporal tasks. We first propose an inference technique to extract metric interval temporal logic (MITL) formulas in sequential disjunctive normal form from labeled trajectories collected in RL of the two tasks. If logical transferability is identified through this inference, we construct a timed automaton for each sequential conjunctive subformula of the inferred MITL formulas from both tasks. We perform RL on the extended state which includes the locations and clock valuations of the timed automata for the source task. We then establish mappings between the corresponding components (clocks, locations, etc.) of the timed automata from the two tasks, and transfer the extended Q-functions based on the established mappings. Finally, we perform RL on the extended state for the target task, starting with the transferred extended Q-functions. Our implementation results show, depending on how similar the source task and the target task are, that the sampling efficiency for the target task can be improved by up to one order of magnitude by performing RL in the extended state space, and further improved by up to another order of magnitude using the transferred extended Q-functions.
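The two mechanisms named in the abstract, RL on an extended state and transfer of extended Q-functions through component mappings, can be illustrated with a minimal sketch. This is not the authors' implementation. The toy automaton (one location sequence for "observe a label within a time window"), the tabular Q-learning update, and the names `TimedAutomaton`, `q_update`, `transfer_q`, `loc_map`, and `clock_map` are all assumptions made for illustration; the paper's MITL inference step and its actual mapping construction are not reproduced here.

```python
from collections import defaultdict


class TimedAutomaton:
    """Hypothetical toy automaton for 'observe `label` within [lo, hi] steps'."""

    def __init__(self, label, lo, hi):
        self.label, self.lo, self.hi = label, lo, hi

    def initial(self):
        # Extended-state component: (location, clock valuation).
        return ("wait", 0)

    def step(self, ext, observed_label):
        loc, clock = ext
        if loc != "wait":
            return ext  # "done" and "fail" are absorbing locations
        clock += 1
        if observed_label == self.label and self.lo <= clock <= self.hi:
            return ("done", clock)
        if clock > self.hi:
            return ("fail", clock)
        return ("wait", clock)


def q_update(Q, s, ext, a, r, s2, ext2, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update keyed on (state, automaton state, action)."""
    best_next = max(Q[(s2, ext2, b)] for b in actions)
    Q[(s, ext, a)] += alpha * (r + gamma * best_next - Q[(s, ext, a)])


def transfer_q(Q_src, loc_map, clock_map):
    """Copy source Q-values to target keys via assumed location/clock mappings."""
    Q_tgt = defaultdict(float)
    for (s, (loc, clock), a), value in Q_src.items():
        if loc in loc_map:  # skip source locations with no target counterpart
            Q_tgt[(s, (loc_map[loc], clock_map(clock)), a)] = value
    return Q_tgt


# Usage: one learning step on the source task, then transfer to the target.
actions = ["left", "right"]
source = TimedAutomaton("a", lo=2, hi=4)
Q_src = defaultdict(float)
ext = source.initial()
ext_next = source.step(ext, None)
q_update(Q_src, 0, ext, "right", 1.0, 1, ext_next, actions)

# Assumed mappings: identical locations, target clock shifted by one step.
Q_tgt = transfer_q(
    Q_src,
    {"wait": "wait", "done": "done", "fail": "fail"},
    lambda c: c + 1,
)
```

Keying the Q-table on the automaton's location and clock valuation lets the learner distinguish states that look identical in the environment but differ in how much of the temporal task has been completed. The transferred table then serves only as an initialization for further learning on the target task.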

Original language	English (US)
Title of host publication	Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
Editors	Sarit Kraus
Publisher	International Joint Conferences on Artificial Intelligence
Pages	4010-4018
Number of pages	9
ISBN (Electronic)	9780999241141
DOIs	https://doi.org/10.24963/ijcai.2019/557
State	Published - 2019
Externally published	Yes
Event	28th International Joint Conference on Artificial Intelligence, IJCAI 2019 - Macao, China
Duration	Aug 10 2019 → Aug 16 2019

Publication series

Name	IJCAI International Joint Conference on Artificial Intelligence
Volume	2019-August
ISSN (Print)	1045-0823

Conference

Conference	28th International Joint Conference on Artificial Intelligence, IJCAI 2019
Country/Territory	China
City	Macao
Period	8/10/19 → 8/16/19

ASJC Scopus subject areas

Artificial Intelligence

Cite this

Xu, Z., & Topcu, U. (2019). Transfer of temporal logic formulas in reinforcement learning. In S. Kraus (Ed.), Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019 (pp. 4010-4018). (IJCAI International Joint Conference on Artificial Intelligence; Vol. 2019-August). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/557

@inproceedings{86ffa27af49c45a3ba1cf6f0919333b2,

title = "Transfer of temporal logic formulas in reinforcement learning",

author = "Zhe Xu and Ufuk Topcu",

note = "Funding Information: This research was partially supported by AFOSR FA9550-19-1-0005, DARPA D19AP00004, NSF 1652113 and ONR N00014-18-1-2829. Publisher Copyright: {\textcopyright} 2019 International Joint Conferences on Artificial Intelligence. All rights reserved.; 28th International Joint Conference on Artificial Intelligence, IJCAI 2019 ; Conference date: 10-08-2019 Through 16-08-2019",

year = "2019",

doi = "10.24963/ijcai.2019/557",

language = "English (US)",

series = "IJCAI International Joint Conference on Artificial Intelligence",

publisher = "International Joint Conferences on Artificial Intelligence",

pages = "4010--4018",

editor = "Sarit Kraus",

booktitle = "Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019",

}

TY - GEN

T1 - Transfer of temporal logic formulas in reinforcement learning

AU - Xu, Zhe

AU - Topcu, Ufuk

N1 - Funding Information: This research was partially supported by AFOSR FA9550-19-1-0005, DARPA D19AP00004, NSF 1652113 and ONR N00014-18-1-2829. Publisher Copyright: © 2019 International Joint Conferences on Artificial Intelligence. All rights reserved.

PY - 2019

Y1 - 2019

UR - http://www.scopus.com/inward/record.url?scp=85074929873&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074929873&partnerID=8YFLogxK

U2 - 10.24963/ijcai.2019/557

DO - 10.24963/ijcai.2019/557

M3 - Conference contribution

AN - SCOPUS:85074929873

T3 - IJCAI International Joint Conference on Artificial Intelligence

SP - 4010

EP - 4018

BT - Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019

A2 - Kraus, Sarit

PB - International Joint Conferences on Artificial Intelligence

T2 - 28th International Joint Conference on Artificial Intelligence, IJCAI 2019

Y2 - 10 August 2019 through 16 August 2019

ER -
