Fall Semester 2024 (2nd Semester) Data Analysis and Artificial Intelligence: Steel Industry Applications (MSIP601-01) Syllabus

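1. Course Information

Course No.: MSIP601
Section: 01
Credits: 3.00
Course Category: Major Elective
Course Type: Online hybrid course
Prerequisites:
POSTECHIAN Core Competencies:
Class Time: Thu / 11:00 ~ 13:45 / Online Lecture Room [Room 004]
Grading Type: G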

2. Instructor Information

Name: Minseok Song
Department (Major): Industrial and Management Engineering
Email: mssong@postech.ac.kr
Homepage: https://minseoksong.github.io
Lab: HTTP://AIM.POSTECH.AC.KR
Phone: 054-279-2376
Office Hours

3. Course Objectives

The course discusses the principles and ideas underlying the current practice of business analytics and introduces a broad collection of useful data analytics tools (such as text mining, web analytics, and network analytics). It covers the process of formulating business objectives and of selecting, preparing, and partitioning data in order to design, build, evaluate, and implement predictive models for a variety of practical business applications (such as direct marketing, cross-selling, customer retention, delinquency and collection analytics, fraud detection, machine failure detection, and insurance underwriting). The course emphasizes hands-on learning with a focus on real business problems, and uses Python for hands-on experimentation with various analytics techniques. Some topics covered in this course are as follows:

Data preprocessing and visualization: Data preprocessing and visualization techniques enable analysts and data scientists to understand the data, identify patterns, detect anomalies, and communicate insights to stakeholders. They are crucial for exploratory data analysis, model development, and decision-making processes across various domains and industries. In this course, students will learn visual representation methods and techniques that increase their understanding of complex data and models. Emphasis is placed on the identification of patterns, trends, and differences from datasets across categories, space, and time.

Database and SQL: Companies collect and store large amounts of data during day-to-day transactions. To analyze long-term trends and patterns in the data and provide actionable intelligence to managers, this data needs to be consolidated in a data warehouse. A database is an organized collection of data stored and accessed electronically. SQL (Structured Query Language) is a language designed for managing relational databases. It is used to define database structure, store and retrieve data, and perform other data management operations. SQL is widely used in applications and platforms that require data management and storage, such as web applications, enterprise systems, and data analytics. It is the standard language for interacting with relational databases and is implemented, with some dialect differences, by popular database management systems such as MySQL, Oracle, Microsoft SQL Server, and PostgreSQL.
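
As a rough illustration of the operations named above (defining structure, storing data, retrieving data), the following minimal sketch uses Python's built-in sqlite3 module with an in-memory database. The table name `orders`, its columns, and the sample rows are hypothetical and chosen only for illustration; SQLite is used here because it ships with Python, not because the course prescribes it.

```python
import sqlite3

# In-memory database; table, columns, and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Define database structure.
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Store data, using placeholders instead of string formatting.
cur.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("A", 120.0), ("B", 80.0), ("A", 50.0)],
)
conn.commit()

# Retrieve data: total amount per customer, largest total first.
cur.execute(
    "SELECT customer, SUM(amount) AS total FROM orders "
    "GROUP BY customer ORDER BY total DESC"
)
totals = cur.fetchall()
print(totals)
conn.close()
```

The same CREATE / INSERT / SELECT statements would look very similar on the other database systems mentioned above, although data types and connection code differ.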

Machine Learning: Machine learning algorithms can be categorized into several types, including supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves training a model using labeled data to make predictions or classifications. Unsupervised learning aims to discover hidden patterns or structures in unlabeled data. Reinforcement learning involves training a model through interaction with an environment to maximize rewards or minimize penalties.
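
To make the difference between supervised and unsupervised learning concrete, here is a small sketch in plain Python. It assumes a toy one-dimensional data set invented for illustration; the helper names `predict_1nn` and `kmeans_1d` are hypothetical, and the two methods (1-nearest-neighbour and k-means) are simple stand-ins rather than the specific algorithms taught in the course. Reinforcement learning is not shown.

```python
# Supervised learning: 1-nearest-neighbour classification on labeled points.
def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Toy labeled data (value, label); purely illustrative.
labeled = [(1.0, "low"), (1.2, "low"), (4.8, "high"), (5.1, "high")]
prediction = predict_1nn(labeled, 4.5)

# Unsupervised learning: k-means (k=2) on the same values with labels removed.
def kmeans_1d(values, centers, iterations=10):
    """Run plain Lloyd iterations on 1-D data and return the final centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Keep the previous center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

unlabeled = [v for v, _ in labeled]
centers = kmeans_1d(unlabeled, centers=[0.0, 6.0])
print(prediction, centers)
```

The classifier needs the labels to make its prediction, while k-means recovers the two groups from the values alone.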

Process Mining: Process mining bridges the gap between traditional model-based process analysis (e.g., simulation and other business process management techniques) and data-centric analysis techniques such as machine learning and data mining. The course explains the key analysis techniques in process mining. Students will learn various process discovery algorithms, which can be used to automatically learn process models from raw event data. Various other process analysis techniques that use event data will also be presented. Moreover, the course will provide easy-to-use software, real-life data sets, and practical skills to directly apply the theory in a variety of application domains.
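
As a hedged illustration of how a process model can be derived from raw event data, the sketch below counts directly-follows relations, a basic building block that many process discovery algorithms start from. The event log, its activity names, and the function name `directly_follows` are hypothetical; this is not the specific algorithm or software used in the course.

```python
from collections import Counter

# Hypothetical event log: one time-ordered list of activities per case.
event_log = {
    "case1": ["register", "check", "approve"],
    "case2": ["register", "check", "reject"],
    "case3": ["register", "check", "approve"],
}

def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

dfg = directly_follows(event_log)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The resulting counts can be read as a weighted graph: nodes are activities and each edge weight says how often one activity directly followed another across all cases.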

4. Prerequisites / Enrollment Requirements

N/A

5. Grading

Grades will be based on a weighted average of the following activities:
- Assignment: 50%
- Final Examination or Term project: 40%
- Attendance/Participation: 20%

6. Textbooks

Title Author Publisher Year ISBN

7. References and Materials

References:
- Python for Data Analysis, Third Edition, by Wes McKinney, O’Reilly, https://wesmckinney.com/book/
- Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, Third Edition, by Sebastian Raschka and Vahid Mirjalili, Packt, 2019, https://github.com/rasbt/python-machine-learning-book-3rd-edition
- Learn Python 3 @ Codecademy, https://www.codecademy.com/learn/learn-python-3

8. Course Schedule

Introduction
Introduction to Python
NumPy Basics
Getting started with pandas
Data Wrangling
Pivoting and Visualization
Introduction to Database
SQL
Evaluating Performance
Multiple Linear Regression
ANN (Artificial Neural Network)
Logistic Regression
SVM (Support Vector Machine)
CART
Clustering
Association Rules/Collaborative Filtering
Text Mining
Image Recognition
Process Mining
Final Exam or Term Project

9. Course Operation

10. Study Methods and Other Notes

This course will be taught by three professors. The co-instructors will give lectures in the second and third weeks.
Instructor: Prof. Minseok Song (POSTECH), mssong@postech.ac.kr
Co-instructors: Prof. Young U. Ryu (UT Dallas), ryoung@utdallas.edu, Prof. Heeseung Andrew Lee (UT Dallas), andrewlee@utdallas.edu

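11. Learning Support for Students with Disabilities

- Course-related: real-time captioning (hearing impairment), course assistance (developmental disability), note-taking (all disability types), etc.

- Exam-related: extended exam time (all disability types, as needed), enlarged exam papers (visual impairment), etc.

- For any additional requests, contact the Support Center for Students with Disabilities (279-2434).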