2. Instructor Information
3. Course Objectives
This course covers the basic theory and practical algorithms of reinforcement learning (RL).
By the end of the course, students are expected to
define and explain key features of RL,
know how to use RL for a given application problem,
implement common RL algorithms,
understand theoretical and empirical approaches for evaluating the quality of an RL algorithm, and
hopefully, formulate and solve research problems in RL.
4. Prerequisites & Requirements
Mandatory: AI, machine learning, calculus, probability & statistics
Recommended: optimization, programming
5. Grading
The grade is based on
quizzes and class participation (30%),
assignments (30%),
paper presentation (10%), and
project (30%).
If you miss five classes, you will receive an F, no matter what.
6. Course Materials
| Title | Author | Publisher | Publication Year/Edition | ISBN |
7. Course References
There is no official textbook.
Some references:
- “Reinforcement Learning: An Introduction”, by R. S. Sutton and A. G. Barto, MIT Press, 2020. (http://incompleteideas.net/book/RLbook2020.pdf)
- “Bandit Algorithms”, by T. Lattimore and C. Szepesvári
8. Course Plan
1. Introduction
2. MDP
3. Model-Free Evaluation & Control
4. Policy Gradient, PPO
5. Imitation Learning
6. RLHF and DPO
7. Offline RL
8. Multi-Armed Bandit
9. Bayesian Bandit
10. Data Efficient RL
11. Monte-Carlo Tree Search
12. Case studies: AlphaGo, DDPG, GRPO, etc.
13. Research paper presentations
9. Course Operation
Offline (in person).
10. How to Teach & Remark
11. Supports for Students with a Disability
- Taking courses: interpreting services (for hearing impairment), mobility and preferential seating assistance (for developmental disability), note taking (for all kinds of disabilities), etc.
- Taking exams: extended exam period (for all kinds of disabilities, if needed), magnified exam papers (for visual impairment), etc.
- Please contact the Center for Students with Disabilities (279-2434) for additional assistance.