I-CPIE Technical Research Virtual Seminar: Reinforcement Learning using Generative Models for Continuous State and Action Space Systems
Rahul Jain, PhD
Kenneth C. Dahlberg Early Career Chair in Electrical Engineering
University of Southern California
Lehigh University’s Institute for Cyber Physical Infrastructure and Energy (I-CPIE)
4:00 pm EST
September 9th, 2019
Reinforcement learning (RL) problems for continuous state and action space systems are among the most challenging in RL. Recently, deep reinforcement learning methods have been shown to be quite effective for certain RL problems with very large or continuous state and action spaces. But such methods require extensive hyper-parameter tuning, huge amounts of data, and come with no performance guarantees. We note that such methods are mostly trained 'offline' on experience replay buffers. In this talk, I will describe a series of simple reinforcement learning schemes for various settings. Our premise is that we have access to a generative model that can give us simulated samples of the next state. We will start with finite state and action space MDPs. An 'empirical value learning' (EVL) algorithm can be derived quite simply by replacing the expectation in the Bellman operator with an empirical estimate, and it has remarkably good numerical performance for practical purposes. We next extend this to continuous state spaces by considering randomized function approximation in a reproducing kernel Hilbert space (RKHS). Owing to its universal function approximation property, this allows arbitrarily good approximation with high probability for any problem. Finally, I will introduce the RANDPOL (randomized function approximation for policy iteration) algorithm, an actor-critic algorithm that uses randomized neural networks and can successfully solve a challenging robotics problem. We also provide theoretical performance guarantees for the algorithm, and I will touch upon the probabilistic contraction analysis framework for iterative stochastic algorithms that underpins the theoretical analysis.
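The EVL idea described in the abstract can be sketched in a few lines: at each iteration, the expectation over next states in the Bellman operator is replaced by a sample average drawn from the generative model. The toy MDP below (3 states, 2 actions, uniform transitions, illustrative rewards) is entirely hypothetical and is not from the talk; it only illustrates the sampling step.

```python
import random

random.seed(0)

# Hypothetical toy MDP: 3 states, 2 actions; rewards and transitions
# are illustrative placeholders, not from the talk.
GAMMA = 0.9
R = {(s, a): 0.1 * (s + 1) * (a + 1) for s in range(3) for a in range(2)}
P = {(s, a): [1 / 3.0] * 3 for s in range(3) for a in range(2)}  # P(s'|s,a)

def generative_model(s, a, n):
    """Draw n i.i.d. simulated next states from P(. | s, a)."""
    return random.choices(range(3), weights=P[(s, a)], k=n)

def evl_step(V, n_samples=64):
    """One EVL iteration: the expectation in the Bellman operator is
    replaced by a sample average over simulated next states."""
    V_new = []
    for s in range(3):
        q_values = []
        for a in range(2):
            samples = generative_model(s, a, n_samples)
            avg_next = sum(V[sp] for sp in samples) / n_samples
            q_values.append(R[(s, a)] + GAMMA * avg_next)
        V_new.append(max(q_values))
    return V_new

V = [0.0] * 3
for _ in range(200):
    V = evl_step(V)
```

With enough samples per state-action pair, the iterates track exact value iteration up to a sampling error that concentrates with high probability, which is the intuition behind the algorithm's guarantees.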
Rahul Jain is the K. C. Dahlberg Early Career Chair and Associate Professor of Electrical & Computer Engineering, Computer Science* and ISE* (*by courtesy) at the University of Southern California (USC). He received a B.Tech from IIT Kanpur, and an MA in Statistics and a PhD in EECS from the University of California, Berkeley. Prior to joining USC, he was at the IBM T. J. Watson Research Center, Yorktown Heights, NY. He has received numerous awards, including the NSF CAREER award, the ONR Young Investigator award, an IBM Faculty award, and the James H. Zumberge Faculty Research and Innovation Award, and is a US Fulbright Scholar. His interests span reinforcement learning, statistical learning, stochastic control, stochastic networks, and game theory, with applications in power systems and healthcare. This talk is based on work with a number of people, including Vivek Borkar (IIT Bombay), Peter Glynn (Stanford), Abhishek Gupta (Ohio State), William Haskell (Purdue), Dileep Kalathil (Texas A&M), and Hiteshi Sharma (USC).