Random-Sampling Monte-Carlo Tree Search Methods for Cost Approximation in Long-Horizon Optimal Control

Shankarachary Ragi; Hans D. Mittelmann

doi:10.1109/LCSYS.2020.3043991

Mathematical and Statistical Sciences, School of (SoMSS)

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

We develop Monte-Carlo based heuristic approaches to approximate the objective function in long horizon optimal control problems. In these approaches, to approximate the expectation operator in the objective function, we evolve the system state over multiple trajectories into the future while sampling the noise disturbances at each time-step, and find the average (or weighted average) of the costs along all the trajectories. We call these methods random sampling - multipath hypothesis propagation or RS-MHP. These methods (or variants) exist in the literature; however, the literature lacks results on how well these approximation strategies converge. This letter fills this knowledge gap to a certain extent. We derive stochastic convergence results for the cost approximation error from the RS-MHP methods and discuss their convergence (in probability) as the sample size increases. We consider two case studies to demonstrate the effectiveness of our methods - a) linear quadratic control problem; b) unmanned aerial vehicle path optimization problem.
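The sampling idea in the abstract (propagate the state along several noise-sampled trajectories and average the trajectory costs) can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar linear system, the quadratic stage and terminal costs, the fixed open-loop control sequence, and every function and parameter name below are assumptions chosen for illustration, and only the plain (unweighted) average is shown.

```python
import random


def rollout_cost(x0, controls, a, b, q, r, noise_std, rng):
    """Cost along one sampled trajectory of x[k+1] = a*x[k] + b*u[k] + w[k]."""
    x, cost = x0, 0.0
    for u in controls:
        cost += q * x * x + r * u * u        # stage cost (assumed quadratic)
        w = rng.gauss(0.0, noise_std)        # sample the noise disturbance
        x = a * x + b * u + w                # evolve the state one time-step
    return cost + q * x * x                  # terminal cost (assumed)


def mc_cost_estimate(x0, controls, num_paths, a=1.0, b=1.0, q=1.0, r=0.1,
                     noise_std=0.5, seed=0):
    """Average the costs of num_paths independently sampled trajectories."""
    rng = random.Random(seed)                # seeded for reproducibility
    total = 0.0
    for _ in range(num_paths):
        total += rollout_cost(x0, controls, a, b, q, r, noise_std, rng)
    return total / num_paths


if __name__ == "__main__":
    horizon = 10
    controls = [0.0] * horizon               # hypothetical open-loop controls
    for n in (100, 1000, 10000):
        print(n, mc_cost_estimate(0.0, controls, n))
```

Under these assumptions, with x0 = 0, zero controls, and a = 1, the exact expected cost is q·σ²·H(H+1)/2, which equals 13.75 for H = 10 and σ = 0.5, so the printed estimates can be compared against that value as the number of sampled paths grows.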

Original language	English (US)
Article number	9289842
Pages (from-to)	1759-1764
Number of pages	6
Journal	IEEE Control Systems Letters
Volume	5
Issue number	5
DOIs	https://doi.org/10.1109/LCSYS.2020.3043991
State	Published - Nov 2021

Keywords

Markov processes
Optimal control
discrete event systems
optimization

ASJC Scopus subject areas

Control and Systems Engineering
Control and Optimization

Access to Document

10.1109/LCSYS.2020.3043991

Cite this

@article{b1299c33f3eb43e3a3532d46cf4540af,

title = "Random-Sampling Monte-Carlo Tree Search Methods for Cost Approximation in Long-Horizon Optimal Control",

abstract = "We develop Monte-Carlo based heuristic approaches to approximate the objective function in long horizon optimal control problems. In these approaches, to approximate the expectation operator in the objective function, we evolve the system state over multiple trajectories into the future while sampling the noise disturbances at each time-step, and find the average (or weighted average) of the costs along all the trajectories. We call these methods random sampling - multipath hypothesis propagation or RS-MHP. These methods (or variants) exist in the literature; however, the literature lacks results on how well these approximation strategies converge. This letter fills this knowledge gap to a certain extent. We derive stochastic convergence results for the cost approximation error from the RS-MHP methods and discuss their convergence (in probability) as the sample size increases. We consider two case studies to demonstrate the effectiveness of our methods - a) linear quadratic control problem; b) unmanned aerial vehicle path optimization problem.",

keywords = "Markov processes, Optimal control, discrete event systems, optimization",

author = "Ragi, Shankarachary and Mittelmann, {Hans D.}",

note = "Publisher Copyright: {\textcopyright} 2017 IEEE.",

year = "2021",

month = nov,

doi = "10.1109/LCSYS.2020.3043991",

language = "English (US)",

volume = "5",

pages = "1759--1764",

journal = "IEEE Control Systems Letters",

issn = "2475-1456",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "5",

}

TY - JOUR

T1 - Random-Sampling Monte-Carlo Tree Search Methods for Cost Approximation in Long-Horizon Optimal Control

AU - Ragi, Shankarachary

AU - Mittelmann, Hans D.

PY - 2021/11

Y1 - 2021/11

N2 - We develop Monte-Carlo based heuristic approaches to approximate the objective function in long horizon optimal control problems. In these approaches, to approximate the expectation operator in the objective function, we evolve the system state over multiple trajectories into the future while sampling the noise disturbances at each time-step, and find the average (or weighted average) of the costs along all the trajectories. We call these methods random sampling - multipath hypothesis propagation or RS-MHP. These methods (or variants) exist in the literature; however, the literature lacks results on how well these approximation strategies converge. This letter fills this knowledge gap to a certain extent. We derive stochastic convergence results for the cost approximation error from the RS-MHP methods and discuss their convergence (in probability) as the sample size increases. We consider two case studies to demonstrate the effectiveness of our methods - a) linear quadratic control problem; b) unmanned aerial vehicle path optimization problem.

KW - Markov processes

KW - Optimal control

KW - discrete event systems

KW - optimization

UR - http://www.scopus.com/inward/record.url?scp=85097950920&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85097950920&partnerID=8YFLogxK

U2 - 10.1109/LCSYS.2020.3043991

DO - 10.1109/LCSYS.2020.3043991

M3 - Article

AN - SCOPUS:85097950920

SN - 2475-1456

VL - 5

SP - 1759

EP - 1764

JO - IEEE Control Systems Letters

JF - IEEE Control Systems Letters

IS - 5

M1 - 9289842

ER -
