3. Course Objectives
This course covers the basic theory and practical algorithms of reinforcement learning (RL).
By the end of the course, students are expected to
define and explain key features of RL,
know how to use RL for a given application problem,
implement common RL algorithms,
understand theoretical and empirical approaches for evaluating the quality of an RL algorithm, and
hopefully, formulate and solve research problems in RL.
4. Prerequisites and Enrollment Requirements
Mandatory: AI, machine learning, calculus, probability & statistics
Recommended: optimization, programming
5. Grading
The grade is based on
quizzes and class participation (30%),
assignments (30%),
paper presentation (10%), and
project (30%).
If you miss five classes, you will receive an F, no matter what.
7. References and Materials
There is no official textbook.
Some references:
- “Reinforcement Learning: An Introduction”, by R. S. Sutton and A. G. Barto, MIT Press, 2020. (http://incompleteideas.net/book/RLbook2020.pdf)
- “Bandit Algorithms”, by T. Lattimore and C. Szepesvári, Cambridge University Press, 2020.
8. Course Schedule
1. Introduction
2. MDP
3. Model-Free Evaluation & Control
4. Policy Gradient, PPO
5. Imitation Learning
6. RLHF and DPO
7. Offline RL
8. Multi-Armed Bandit
9. Bayesian Bandit
10. Data Efficient RL
11. Monte-Carlo Tree Search
12. Case studies: AlphaGo, DDPG, GRPO, etc.
13. Research paper presentations
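11. Learning Support for Students with Disabilities
- Course-related: real-time captioning (hearing impairment), course assistance (developmental disabilities), note-taking (all types), etc.
- Exam-related: extended exam time (all types, as needed), enlarged exam papers (visual impairment), etc.
- For any additional requests, contact the Support Center for Students with Disabilities (279-2434).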