On-line learning control by association and reinforcement

Jennie Si, Yu Tsung Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations


This paper focuses on a systematic treatment for developing a generic on-line learning control system based on the fundamental principle of reinforcement learning or, more specifically, neuro-dynamic programming. This real-time learning system improves its performance over time in two respects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process so that, in the future, similar states will be more positively associated with a control action leading to positive reinforcement. Two successful candidate on-line learning control designs will be introduced. Real-time learning algorithms will be derived for the individual components of the learning system. Some analytical insight will be provided to give guidelines on the entire on-line learning control system. The performance of the on-line learning controller is measured by its learning speed, its success rate of learning, and the degree to which it meets the learning control objective. The overall learning control system performance will be tested on a single cart-pole balancing problem and on a more complex problem of balancing a triple-link inverted pendulum.
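
The two mechanisms described in the abstract (learning from an external reinforcement signal, and associating states with actions that lead to positive reinforcement) can be illustrated with a generic tabular actor-critic sketch. This is not the paper's algorithm: the paper uses neural networks and tests on cart-pole and triple-link pendulum problems, whereas the sketch below assumes a hypothetical five-position chain task in which falling off the left end yields reinforcement -1 and reaching the right end yields +1. All names (`train`, `step`, `policy`, `greedy_actions`) and all constants are assumptions chosen for illustration.

```python
import math
import random

# Hypothetical toy task and step sizes (assumptions, not taken from the paper).
N_STATES = 5          # chain positions 0..4; reaching position 4 is success
GAMMA = 0.9           # discount factor
ALPHA_CRITIC = 0.1    # step size for the value (critic) table
ALPHA_ACTOR = 0.1     # step size for the action-preference (actor) table


def policy(prefs, s):
    """Softmax over the two action preferences (0 = left, 1 = right)."""
    m = max(prefs[s])
    exps = [math.exp(p - m) for p in prefs[s]]
    total = sum(exps)
    return [e / total for e in exps]


def step(s, a):
    """Toy environment: returns (next_state, reinforcement, done)."""
    nxt = s + (1 if a == 1 else -1)
    if nxt < 0:
        return s, -1.0, True       # fell off the left end: failure
    if nxt == N_STATES - 1:
        return nxt, 1.0, True      # reached the right end: success
    return nxt, 0.0, False


def train(episodes=5000, seed=0):
    """Run on-line actor-critic updates; returns (value table, preferences)."""
    rng = random.Random(seed)
    value = [0.0] * N_STATES
    prefs = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = rng.randrange(N_STATES - 1)        # random non-terminal start
        for _ in range(100):                   # cap on episode length
            probs = policy(prefs, s)
            a = 0 if rng.random() < probs[0] else 1
            nxt, r, done = step(s, a)
            # Critic: temporal-difference error from the reinforcement signal.
            target = r if done else r + GAMMA * value[nxt]
            delta = target - value[s]
            value[s] += ALPHA_CRITIC * delta
            # Actor: strengthen the state-action association when delta > 0,
            # weaken it when delta < 0 (softmax policy-gradient form).
            for b in (0, 1):
                grad = (1.0 if b == a else 0.0) - probs[b]
                prefs[s][b] += ALPHA_ACTOR * delta * grad
            if done:
                break
            s = nxt
    return value, prefs


def greedy_actions(prefs):
    """Preferred action for each non-terminal state."""
    return [max((0, 1), key=lambda a: prefs[s][a]) for s in range(N_STATES - 1)]
```

Calling `train()` and then `greedy_actions(prefs)` on the returned preferences shows which action each non-terminal state has come to be associated with. Learning speed and success rate, the performance measures named in the abstract, could be estimated by varying `episodes` and `seed`.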

Original language: English (US)
Title of host publication: Proceedings of the International Joint Conference on Neural Networks
Place of Publication: Piscataway, NJ, United States
Number of pages: 6
State: Published - 2000
Event: International Joint Conference on Neural Networks (IJCNN'2000) - Como, Italy
Duration: Jul 24 2000 → Jul 27 2000


Other: International Joint Conference on Neural Networks (IJCNN'2000)
City: Como, Italy

ASJC Scopus subject areas

  • Software

