3. Course Objectives
This course covers the basic theory and practical algorithms of reinforcement learning (RL). By the end, students should be able to understand research papers on RL, apply RL techniques to problems in other fields, and, hopefully, formulate and solve research problems in RL.
4. Prerequisites
Mandatory: machine learning, calculus, probability & statistics
Recommended: optimization, programming
5. Grading
The grade is based on assignments (35%), a midterm exam (20%), a paper review (20%), a proposal (20%), and participation (5%). Attendance follows a five-strike policy: five absences result in an F, no matter what (email me in advance of any unavoidable absence).
6. Textbook
Title | Authors | Publisher | Year | ISBN
Reinforcement Learning: An Introduction | R. S. Sutton and A. G. Barto | MIT Press | 2018 |
7. References and Materials
There will be no official textbook. However, most of the content is based on the following books:
- “Bandit Algorithms” by T. Lattimore and C. Szepesvári
- “Reinforcement Learning: An Introduction” by R. S. Sutton and A. G. Barto, MIT Press, 2018 (link to draft)
8. Course Schedule
[Tentative Syllabus]
https://docs.google.com/spreadsheets/d/18p6JMZ76PCBQzcQnlnpOfe2_MuHv31ICtBGkBjMRynU/edit?usp=sharing
1. Introduction to RL
2. Multi-Armed Bandit
3. Regret Analysis in MAB
4. Sample Complexity in MAB
5. Markov Decision Process (MDP)
6. Dynamic Programming
7. Regret Minimization in MDP
8. Sampling Schemes
9. Temporal Difference Learning
10. n-step Bootstrapping
11. Function Approximation
12. Deep Q-Network
13. Eligibility Traces
14/15. Policy Gradient Methods (Buddha's Birthday / Children's Day: video lecture or scheduled make-up class)
16. Midterm Exam (9:30am~12:30pm)
17. Optimization techniques for RL
18. Scaling RL
19. Exploration via Intrinsic Motivation
20. Partially Observable RL
21. Bayesian and Meta RL
22. Multitask and Hierarchical RL
23. Multi-agent RL
24. Adversarial Search: AlphaGo
25. Imitation Learning and Inverse RL
26. Applications of RL
27-30. Final project presentations and guest lecture
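11. Learning Support for Students with Disabilities
- Course-related: real-time captioning (hearing impairment), course assistance (developmental disability), note-taking (all disability types), etc.
- Exam-related: extended exam time (all disability types, as needed), enlarged exam papers (visual impairment), etc.
- For any additional requests, contact the Support Center for Students with Disabilities (279-2434)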