TY - GEN
T1 - Temporal-Logic-Based Intermittent, Optimal, and Safe Continuous-Time Learning for Trajectory Tracking
AU - Kanellopoulos, Aris
AU - Fotiadis, Filippos
AU - Sun, Chuangchuang
AU - Xu, Zhe
AU - Vamvoudakis, Kyriakos G.
AU - Topcu, Ufuk
AU - Dixon, Warren E.
N1 - Funding Information:
Warren E. Dixon is with the Department of Mechanical and Aerospace Engineering, University of Florida, Gainesville, FL 32611-6250, USA, e-mail: wdixon@ufl.edu. This work was supported in part by ARO under grant No. W911NF-19-1-0270, by ONR Minerva under grant No. N00014-18-1-2160, by NSF under grant Nos. CAREER CPS-1851588 and S&AS 1849198, and by the Onassis Foundation-Scholarship ID: F ZQ 064-1/2020-2021.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
AB - In this paper, we develop safe reinforcement-learning-based controllers for systems tasked with accomplishing complex missions that can be expressed as linear temporal logic specifications, similar to those required by search-and-rescue missions. We decompose the original mission into a sequence of tracking sub-problems under safety constraints. We impose the safety conditions by utilizing barrier functions to map the constrained optimal tracking problem in the physical space to an unconstrained one in the transformed space. Furthermore, we develop policies that intermittently update the control signal to solve the tracking sub-problems with reduced burden in the communication and computation resources. Subsequently, an actor-critic algorithm is utilized to solve the underlying Hamilton-Jacobi-Bellman equations. Finally, we support our proposed framework with stability proofs and showcase its efficacy via simulation results.
UR - http://www.scopus.com/inward/record.url?scp=85126053881&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126053881&partnerID=8YFLogxK
U2 - 10.1109/CDC45484.2021.9683309
DO - 10.1109/CDC45484.2021.9683309
M3 - Conference contribution
AN - SCOPUS:85126053881
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 1263
EP - 1268
BT - 60th IEEE Conference on Decision and Control, CDC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 60th IEEE Conference on Decision and Control, CDC 2021
Y2 - 13 December 2021 through 17 December 2021
ER -