Improved temporal difference methods with linear function approximation

Research output: Chapter in Book/Report/Conference proceeding › Chapter

37 Scopus citations


This chapter considers temporal difference algorithms within the context of infinite-horizon finite-state dynamic programming problems with discounted cost and linear cost function approximation. This problem arises as a subproblem in the policy iteration method of dynamic programming. Additional discussions of such problems can be found in Chapters 6 and 12. The method presented here is the first iterative temporal difference method that converges without requiring a diminishing step size. The chapter discusses the connections with Sutton’s TD(λ) and with various versions of least-squares methods that are based on value iteration. It is shown using both analysis and experiments that the proposed method is substantially faster, simpler, and more reliable than TD(λ). Comparisons are also made with the LSTD method of Boyan, and Bradtke and Barto.
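For orientation, the LSTD baseline mentioned in the abstract can be sketched in a few lines. This is only a minimal illustration of LSTD with λ = 0 for a fixed policy, not the method proposed in the chapter. The function names (`solve_linear`, `lstd0`), the two-state example chain, its costs, and the discount factor are all assumptions made for the illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def lstd0(transitions, features, discount):
    """LSTD(0) sketch: fit weights r so that J(i) is approximated by features[i] . r.

    transitions: list of (state, cost, next_state) samples under a fixed policy.
    features:    features[i] is the feature vector of state i (hypothetical layout).
    """
    k = len(features[0])
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for i, cost, j in transitions:
        phi_i, phi_j = features[i], features[j]
        for p in range(k):
            b[p] += phi_i[p] * cost
            for q in range(k):
                # Accumulate phi(i) * (phi(i) - discount * phi(j))^T
                A[p][q] += phi_i[p] * (phi_i[q] - discount * phi_j[q])
    return solve_linear(A, b)


# Assumed toy problem: deterministic cycle 0 -> 1 -> 0 with costs 1 and 0.
features = [[1.0, 0.0], [0.0, 1.0]]        # one indicator feature per state
transitions = [(0, 1.0, 1), (1, 0.0, 0)]
weights = lstd0(transitions, features, discount=0.5)
print(weights)
```

With one indicator feature per state, the fitted weights are the cost-to-go values themselves, so the result can be checked by hand against the Bellman equations J(0) = 1 + 0.5 J(1) and J(1) = 0.5 J(0). Note that the sketch solves a linear system once from accumulated statistics and so involves no step size at all, whereas TD(λ) updates the weights incrementally with a step size.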

Original language: English (US)
Title of host publication: Handbook of Learning and Approximate Dynamic Programming
Publisher: John Wiley and Sons Inc.
Number of pages: 25
ISBN (Electronic): 9780470544785
ISBN (Print): 047166054X, 9780471660545
State: Published - Jan 1 2004
Externally published: Yes


  • Argon
  • Convergence
  • Eigenvalues and eigenfunctions
  • Function approximation
  • Markov processes
  • Trajectory
  • Vectors

ASJC Scopus subject areas

  • General Computer Science

