3. Course Objectives
- Large-scale optimization and control problems, which involve complex systems and are often model-free, have become very important in recent years. Adaptive systems capable of sequential decision making are also sought for effective real-time optimization and control.
- This is a course on the theory and practice of optimal sequential decision-making over time, sometimes in the presence of uncertainties. The course will stress intuition, the mathematical foundations being for the most part elementary.
- This course introduces (1) the art of formulating recursive equations, (2) the theory of how and why dynamic programming can solve many optimization problems involving sequential decision making in both deterministic and stochastic environments, (3) applications (mainly in IME-related areas), and (4) computational aspects of dynamic programming. Students will also learn about approximate dynamic programming and reinforcement learning, which address the critical limitation of the conventional dynamic programming approach, the “curse of dimensionality.”
- Many large-scale industrial applications, such as airline pricing, logistics and transportation operations, and inventory control, use this sequential decision-making approach, which has created demand for this knowledge.
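As an illustration of what "formulating a recursive equation" means in item (1) above, the following minimal Python sketch solves a tiny shortest-path problem with the recursion f(i) = min_j [c(i, j) + f(j)]. The network, the node names, and the function names `cost_to_go` and `best_path` are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
from functools import lru_cache

# Hypothetical network (an assumption for illustration):
# ARCS[node] maps each successor node to the cost of the connecting arc.
ARCS = {
    "A": {"B": 2, "C": 5},
    "B": {"C": 1, "D": 4},
    "C": {"D": 1},
    "D": {},
}
TERMINAL = "D"


@lru_cache(maxsize=None)
def cost_to_go(node):
    """Minimum cost from node to TERMINAL: f(i) = min_j [c(i, j) + f(j)]."""
    if node == TERMINAL:
        return 0
    # Principle of optimality: the tail of an optimal path is itself optimal,
    # so each successor contributes its arc cost plus its own cost-to-go.
    return min(cost + cost_to_go(nxt) for nxt, cost in ARCS[node].items())


def best_path(node):
    """Recover one optimal path by following the minimizing successor."""
    path = [node]
    while node != TERMINAL:
        node = min(ARCS[node], key=lambda nxt: ARCS[node][nxt] + cost_to_go(nxt))
        path.append(node)
    return path


print(cost_to_go("A"), best_path("A"))
```

Here the state is a single node, so only four values are stored. When the state is a vector with many components, the number of values to store grows exponentially with its dimension, which is the curse of dimensionality mentioned above.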
4. Prerequisites
IMEN 261 OR1, IMEN 266 OR2
5. Grading
Two exams and homework assignments
7. References and Materials
Dynamic Programming: Models and Applications, written by Eric V. Denardo, Dover
Dynamic Programming, written by Richard Bellman, Dover
Markov Decision Processes: Discrete Stochastic Dynamic Programming, written by Martin L. Puterman, Wiley
Dynamic Programming and Optimal Control, written by Dimitri P. Bertsekas, Athena Scientific
Approximate Dynamic Programming, written by Warren B. Powell, Wiley
Reinforcement Learning: An Introduction, written by Richard S. Sutton and Andrew G. Barto, MIT Press
8. Course Schedule
1. Introduction (Sequential Decision Process, Principle of Optimality)
2. Deterministic DP : DP Algorithm
3. Deterministic DP : Shortest Path Problem and Other Applications
4. Deterministic DP : DP and Hamilton-Jacobi-Bellman (HJB) Eq.
5. Stochastic DP : Introduction to Markov Decision Process (MDP)
6. Stochastic DP : Short Review on Discrete Time Markov Chain (DTMC)
7. Stochastic DP : MDP Applications
8. Stochastic DP : Finite-Horizon MDP / Infinite-Horizon MDP
9. Stochastic DP : Discounted MDP
11. Approximate Dynamic Programming
12. Reinforcement Learning : Temporal Difference Learning, On-Policy Learning/Off-Policy Learning
13. Reinforcement Learning : Value Function Approximation, Basics of Deep Q Learning
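11. Learning Support for Students with Disabilities
- Course-related: real-time captioning (hearing impairment), course assistance (developmental disabilities), note-taking (all types), etc.
- Exam-related: extended exam time (all types, as needed), enlarged exam papers (visual impairment), etc.
- For any other requests, contact the Support Center for Students with Disabilities (279-2434).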