2024 Second Semester, Special Topics: Artificial Intelligence Systems (CSED703O-01) Syllabus

1. Course Information

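Course Number: CSED703O
Section: 01
Credits: 3.00
Course Classification: Major Elective
Course Type: Classroom Course
Prerequisites:
POSTECHIAN Core Competencies:
Class Time: Mon, Wed / 14:00 ~ 15:15 / Engineering Building 2, Room 109
Grading Type: G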

2. Instructor Information

Name: 전명재
Department (Major): Graduate School of Artificial Intelligence
Email: jmj119@postech.ac.kr
Homepage: https://sites.google.com/site/myeongjae/
Office Phone: 054-279-2264
Office Hours: By appointment (Mon, Wed)

3. Course Objectives

This class will explore key concepts in system support for machine learning, deep learning, and large language model workloads. The primary objectives of this class are:
- Understanding the key properties of these workloads.
- Learning about the state-of-the-art system mechanisms and policies implemented in contemporary training and inference frameworks.
- Examining how past research on traditional big data processing technologies has evolved to improve the performance, scalability, and programmability of training and inference workloads.

4. Prerequisites / Enrollment Requirements

Required Prerequisites: Operating Systems, Computer Architecture

5. Grading

Attendance (10%), Class Participation (15%), Presentation (25%), Midterm (25%), Final (25%)
Note that these weights are subject to change.

6. Textbooks

Title / Author / Publisher / Publication Year / ISBN

7. References and Materials

8. Course Schedule

[Week 1] Introduction
- 9/2: Course Introduction
- 9/4: No class

[Week 2] Systems Basics
- 9/9: Scheduling
- 9/11: Memory Systems

[Week 3] Chuseok (Korean Thanksgiving Day)
- 9/16: No class
- 9/18: No class

[Week 4] Data Preprocessing
- 9/23: Basics, MinIO (VLDB'21), Revamper (ATC'21)
- 9/25: FastFlow (VLDB'23), FusionFlow (VLDB'24)

[Week 5] Single-GPU Training
- 9/30: Basics
- 10/2: Zico (ATC'21), ZeRO-Offload (ATC'21)

[Week 6] Multi-GPU & Multi-node Training
- 10/7: Basics
- 10/9: No class (Hangul Day)

[Week 7] Multi-GPU & Multi-node Training
- 10/14: GPipe (NeurIPS'19), PipeDream (SOSP'19)
- 10/16: Megatron-LM (SC'21), ByteScheduler (SOSP'19)

[Week 8] Midterm Exam

[Week 9] Automation / Energy Efficiency
- 10/28: Alpa (OSDI'22), GEMINI (SOSP'23)
- 10/30: Zeus (NSDI'23), EnvPipe (ATC'23)

[Week 10] Memory Oversubscription
- 11/4: Basics
- 11/6: Capuchin (ASPLOS'20), HUVM (ATC'22)

[Week 11] Scheduler & Cluster Manager
- 11/11: Basics
- 11/13: Gandiva (OSDI'18), Pollux (OSDI'21)

[Week 12] Scheduler & Cluster Manager / (LLM) Serving Systems
- 11/18: MLaaS in the Wild (NSDI'22), Oobleck (SOSP'23)
- 11/20: MLPerf (ISCA'20), AlpaServe (OSDI'23)

[Week 13] (LLM) Serving Systems
- 11/25: DeepPlan (EuroSys'23), LLM Serving Basics
- 11/27: Orca (OSDI'22), PagedAttention (SOSP'23)

[Week 14] (LLM) Serving Systems
- 12/2: FlexGen (ICML'23), Sarathi-Serve (OSDI'24)
- 12/4: DistServe (OSDI'24), InfiniGen (OSDI'24)

[Week 15] Edge AI
- 12/9: Basics, CarM (DAC'22), Miro (MobiCom'23)
- 12/11: Sage (MobiSys'22), Ekya (NSDI'22)

[Week 16] Final Exam

9. Course Operation

This course is based on paper reading and research-oriented discussion. In each session, we will spend roughly 55 minutes on paper presentations (two or three papers, including a 5-minute break) and 20 minutes on follow-up discussion. On each exam, students are expected to solve system design questions.

10. Study Methods and Other Notes

11. Learning Support for Students with Disabilities

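- Coursework: real-time captioning (hearing impairments), course assistance (developmental disabilities), note-taking (all disability types), etc.

- Exams: extended exam time (all disability types, as needed), enlarged exam copies (visual impairments), etc.

- For any additional requests, contact the Support Center for Students with Disabilities (279-2434).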