Overview

This introductory course covers the foundational principles of data science, starting with the CRISP-DM methodology. Participants explore supervised and unsupervised learning, work through classification and regression problems, and study key metrics and concepts such as loss functions, overfitting, and underfitting. The curriculum also covers classical machine learning methods, including linear regression, decision trees, and clustering, along with an introduction to neural networks and scalable algorithms.

Materials

General course information, CRISP-DM methodology 📜
Introduction to Python 🤖
Introduction to Python: EDA 101 🤖
Supervised/unsupervised learning. Classification/regression problems. Evaluation metrics (precision, recall, ROC-AUC). Loss functions, overfitting/underfitting 📜
Introduction to ML with Python 🤖
Linear regression. Logistic regression. Support vector machine 📜
Linear Models: Linear Regression 🤖
Linear Models: AUROC 🤖
Decision trees. Random forests. Boosting 📜
Decision Tree Classification Algorithm 🤖
Random Forest 🤖
Gradient Boosting over Decision Trees (GBDT) 🤖
Dimensionality reduction: linear, non-linear methods 📜
Dimensionality reduction: PCA, t-SNE, UMAP 🤖
Clustering methods 📜
Clustering: k-means, DBSCAN, community detection 🤖
Basic neural networks 📜
Basics of Neural Networks 🤖
Scalable algorithms 📜
Simple Annotated MNIST Exercise 🤖
Quick introduction to hyperparameter tuning with Optuna 🤖
Fine-tuning a Large Language Model on song lyrics 🤖

Team

Ekaterina Muravleva

Instructor

Daniil Bershatskiy

TA

Daria Frolova

TA

Daniil Merkulov

TA

Simon Polyanskiy

TA

Vlad Trifonov

TA
