Dynamic Programming and Reinforcement Learning Applications (2023-092, 50152342

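1. Course Information

Course No.	IMEN764	Section	01	Credits	3.00
Course Category	Major Elective	Course Type	Classroom Course	Prerequisites
Postechian Core Competencies	Interpersonal Skills, Global Citizenship, Knowledge Inquiry, Digital Literacy, Self-Management, Creative Convergence
Class Hours	Tue, Thu / 11:00 ~ 12:15 / Engineering Building 4, Multimedia Lecture Room [Room 407]			Grading Type	G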

2. Instructor Information

Name	최동구	Department (Major)	Industrial and Management Engineering
Email	dgchoi@postech.ac.kr	Homepage	se.postech.ac.kr
Office		Phone	054-279-2375
Office Hours	Tuesday 13:30 to 15:00 or by appointment

3. Course Objectives

- Large-scale optimization and control, which involve complex systems and are often model-free, have become very important in recent years. Adaptive systems capable of sequential decision-making are also sought for effective real-time optimization and control.
- This is a course on the theory and practice of optimal sequential decision-making over time, sometimes in the presence of uncertainties. The course will stress intuition, the mathematical foundations being for the most part elementary.
- This course will introduce (1) the art of formulating recursive equations, (2) the theory of how and why dynamic programming can solve many optimization problems involving sequential decision-making in both deterministic and stochastic environments, (3) applications (mainly in IME-related areas), and (4) computational aspects of dynamic programming. Students will also learn approximate dynamic programming and reinforcement learning as ways to handle the “curse of dimensionality,” the critical limitation of the conventional dynamic programming approach.
- Many large-scale industrial applications, such as airline pricing, logistics and transportation operations, and inventory control, use this sequential decision-making approach, which has created demand for this knowledge.
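
As a hedged illustration of item (1) above, the recursive equation for a finite-horizon deterministic problem is commonly written in the following form. The notation is an assumption and does not come from this syllabus: $k$ is the stage, $x_k$ the state, $u_k \in U_k(x_k)$ the decision, $g_k$ the stage cost, $f_k$ the state transition, and $J_k$ the optimal cost-to-go.

```latex
\begin{align*}
J_N(x_N) &= g_N(x_N), \\
J_k(x_k) &= \min_{u_k \in U_k(x_k)}
  \bigl[\, g_k(x_k, u_k) + J_{k+1}\bigl(f_k(x_k, u_k)\bigr) \bigr],
  \qquad k = N-1, \dots, 0.
\end{align*}
```

Under the same assumed notation, the "curse of dimensionality" can be seen by counting table entries: if the state has $d$ components with $m$ levels each, a tabular $J_k$ needs $m^d$ entries per stage, for example $10^6$ entries when $m = 10$ and $d = 6$.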

4. Prerequisites / Course Requirements

IMEN 261 OR1, IMEN 266 OR2

5. Grading

Two exams and homework assignments

6. Textbooks

Title	Author	Publisher	Year	ISBN

7. References and Materials

Dynamic Programming: Models and Applications, written by Eric V. Denardo, Dover
Dynamic Programming, written by Richard Bellman, Dover
Markov Decision Processes: Discrete Stochastic Dynamic Programming, written by Martin L. Puterman, Wiley
Dynamic Programming and Optimal Control, written by Dimitri P. Bertsekas, Athena Scientific
Approximate Dynamic Programming, written by Warren B. Powell, Wiley
Reinforcement Learning: An Introduction, written by Richard S. Sutton and Andrew G. Barto, MIT Press

8. Course Schedule

1. Introduction (Sequential Decision Process, Principle of Optimality)
2. Deterministic DP : DP Algorithm
3. Deterministic DP : Shortest Path Problem and Other Applications
4. Deterministic DP : DP and Hamilton-Jacobi-Bellman (HJB) Eq.
5. Stochastic DP : Introduction to Markov Decision Process (MDP)
6. Stochastic DP : Short Review on Discrete Time Markov Chain (DTMC)
7. Stochastic DP : MDP Applications
8. Stochastic DP : Finite-Horizon MDP / Infinite-Horizon MDP
9. Stochastic DP : Discounted MDP
11. Approximate Dynamic Programming
12. Reinforcement Learning : Temporal Difference Learning, On-Policy Learning/Off-Policy Learning
13. Reinforcement Learning : Value Function Approximation, Basics of Deep Q Learning
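
To give a flavor of topics 8 and 9, the following is a minimal sketch of value iteration on a discounted MDP. It is not taken from the course materials: the function name `value_iteration`, the data layout of `P` and `R`, and the two-state toy MDP are all hypothetical choices made for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Value iteration for a finite discounted MDP (illustrative sketch).

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the immediate reward for taking action a in state s.
    Both layouts are assumptions made for this example.
    """
    n_states = len(P)

    def q_value(V, s, a):
        # One-step lookahead: immediate reward plus discounted expected value.
        return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])

    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup applied to every state.
        V_new = [
            max(q_value(V, s, a) for a in range(len(P[s])))
            for s in range(n_states)
        ]
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [
        max(range(len(P[s])), key=lambda a: q_value(V, s, a))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical toy MDP: two states, two actions (0 = stay, 1 = move).
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # from state 0
    [[(1.0, 1)], [(1.0, 0)]],  # from state 1
]
R = [
    [0.0, 1.0],  # staying in state 0 pays 0, moving to state 1 pays 1
    [2.0, 0.0],  # staying in state 1 pays 2, moving back pays 0
]

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this toy problem the best plan is to move to state 1 and stay there, so the values approach 2 / (1 - 0.9) = 20 for state 1 and 1 + 0.9 * 20 = 19 for state 0. The table `V` has one entry per state, which is exactly what becomes infeasible in high-dimensional problems and motivates the approximate methods in topics 11 to 13.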

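9. Course Format

Primarily theory lectures.

10. Study Methods and Other Notes

11. Learning Support for Students with Disabilities

- Coursework: real-time captioning (hearing impairment), course assistance (developmental disability), note-taking (all types), etc.

- Exams: extended exam time (all types, as needed), enlarged exam papers (visual impairment), etc.

- For any additional requests, contact the Support Center for Students with Disabilities (279-2434).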

Syllabus for Dynamic Programming and Reinforcement Learning Applications (IMEN764-01), 2nd Semester 2023