Stochastic Optimization Methods for Policy Evaluation in Reinforcement Learning

by Shaocong Ma and Yi Zhou

No Customer Reviews

This monograph introduces various value-based approaches for solving the policy evaluation problem in the online reinforcement learning (RL) scenario, which aims to learn the value function associated with a specific policy under a single Markov decision process (MDP). Approaches vary depending on whether they are implemented in an on-policy or off-policy manner. In on-policy settings, where the evaluation of the policy is conducted using data generated from the same policy that is being assessed, classical techniques such as TD(0), TD(λ), and their extensions with function approximation or variance reduction are employed in this setting. For off-policy evaluation, where samples are collected under a different behavior policy, this monograph introduces gradient-based two-timescale algorithms like GTD2, TDC, and variance-reduced TDC. These algorithms are designed to minimize the mean-squared projected Bellman error (MSPBE) as the objective function. This monograph also discusses their finite-sample convergence upper bounds and sample complexity.

Format:Paperback

Language:English

ISBN:1638283702

ISBN13:9781638283706

Release Date:August 2024

Publisher:Now Publishers

Length:60 Pages

Weight:0.21 lbs.

Dimensions:0.1" x 6.1" x 9.2"

Related Subjects

Engineering Math Mathematics Science & Math Technology

Customer Reviews

0 rating

Write a review

ThriftBooks sells millions of used books at the lowest everyday prices. We personally assess every book's quality and offer rare, out-of-print treasures. We deliver the joy of reading in recyclable packaging with free standard shipping on US orders over $15. ThriftBooks.com. Read more. Spend less.

Copyright © 2025 Thriftbooks.com Terms of Use | Privacy Policy | Do Not Sell/Share My Personal Information | Cookie Policy | Cookie Preferences | Accessibility Statement
ThriftBooks ^® and the ThriftBooks ^® logo are registered trademarks of Thrift Books Global, LLC

Stochastic Optimization Methods for Policy Evaluation in Reinforcement Learning

Recommended

Customer Reviews

Popular Categories

Website

My Account

Partnerships

Quick Help

About Us

Follow Us