
Session 4 — Beyond Supervised Learning

Instructor: Stéphane Derrode, Centrale Lyon
Formation: Centrale Digital Lab @ Ecole Centrale Lyon
Back to course index


📦 Download all session files — notebook

⬇ session4.zip

Contents: session4_beyond_supervised.ipynb
Note: data files from Sessions 2 and 3 are required — download them from their respective pages.


Overview

Datasets: Spotify Tracks (K-Means) · Heart Disease UCI (Naive Bayes, MLP)
Duration: 3 hours
Format: Jupyter notebook + paper quiz (15 min)
Prerequisite notebooks: Sessions 2 and 3 — the data files from those sessions are reused

This final session broadens the picture beyond the classifiers seen in Session 3. You will explore unsupervised clustering, probabilistic classification, and neural networks — then compare all models seen across the course and get a conceptual overview of Deep Learning as the next horizon.


Learning objectives

By the end of this session, you will be able to:

  • Explain the difference between supervised, unsupervised, and probabilistic learning
  • Apply K-Means, choose k with the elbow method and silhouette score, and interpret clusters
  • State Bayes’ theorem and explain the “naive” independence assumption
  • Describe the forward pass of a Multi-Layer Perceptron with one hidden layer
  • Name the role of activation functions and of backpropagation
  • Compare all models on the same dataset and articulate when to use each
  • Name the three main Deep Learning architectures (CNN, RNN/LSTM, Transformer) and their use cases

Before the session — what you need to do

1. Verify your environment

All packages from previous sessions must be installed. No new installation required.

2. Download the session 4 notebook

⬇ session4.zip

3. Launch Jupyter and run the setup cell

jupyter notebook session4_beyond_supervised.ipynb

You should see: All imports OK.


Session content

The notebook is divided into 5 blocks:

| Block | Dataset | Topic | Key tools |
|-------|---------|-------|-----------|
| 1 | Spotify | K-Means clustering | KMeans, silhouette_score, PCA projection, cluster profiling |
| 2 | Heart Disease | Naive Bayes | GaussianNB, Bayes’ theorem, class priors |
| 3 | Heart Disease | Neural networks (MLP) | MLPClassifier, loss curve, early stopping |
| 4 | Heart Disease | Model comparison | Unified ROC plot, metric table, “when to use” guide |
| 5 |  | Introduction to Deep Learning | CNNs, RNNs/LSTMs, Transformers (markdown only) |

Key formulas to know

K-Means — assignment step: \(c^{(i)} = \arg\min_{j} \| \mathbf{x}^{(i)} - \boldsymbol{\mu}_j \|^2\)

Silhouette score: \(s = \frac{b - a}{\max(a, b)} \in [-1, 1]\), where \(a\) = mean intra-cluster distance and \(b\) = mean distance to the nearest other cluster.
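In practice, k is chosen by combining the elbow method (inertia) with the silhouette score. A minimal sketch with scikit-learn, using synthetic blobs as a stand-in for the Spotify features (the data and parameter choices here are illustrative, not the notebook's):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Spotify audio features
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
X = StandardScaler().fit_transform(X)

# For each candidate k, record inertia (elbow method) and silhouette score
results = {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    results[k] = (km.inertia_, silhouette_score(X, km.labels_))

# Inertia always decreases as k grows; the silhouette peaks at a well-separated k,
# which is why the two criteria are used together
best_k = max(results, key=lambda k: results[k][1])
```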

Bayes’ theorem: \(P(y \mid \mathbf{x}) = \frac{P(\mathbf{x} \mid y) \cdot P(y)}{P(\mathbf{x})}\)
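Gaussian Naive Bayes applies this theorem under the “naive” assumption that features are conditionally independent given the class, modelling each one with a per-class Gaussian. A short sketch, with synthetic data standing in for the Heart Disease table:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic binary problem standing in for the Heart Disease data
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_train, y_train)

# class_prior_ holds P(y); theta_ and var_ hold the per-class
# Gaussian mean and variance of each feature, i.e. the P(x_j | y) terms
print(nb.class_prior_)           # the priors sum to 1
print(nb.score(X_test, y_test))  # held-out accuracy
```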

MLP — forward pass (one hidden layer): \(\mathbf{a}^{(1)} = f(W^{(1)} \mathbf{x} + \mathbf{b}^{(1)}), \qquad \hat{y} = \sigma(W^{(2)} \mathbf{a}^{(1)} + \mathbf{b}^{(2)})\)
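This forward pass can be written out in a few lines of NumPy. The sizes below (4 inputs, 3 hidden units) and the random weights are arbitrary illustrations, with ReLU as \(f\) and a sigmoid output:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer: x (4 features) -> a1 (3 hidden units) -> scalar y_hat
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

a1 = relu(W1 @ x + b1)         # a^(1) = f(W^(1) x + b^(1))
y_hat = sigmoid(W2 @ a1 + b2)  # y-hat = sigma(W^(2) a^(1) + b^(2))
```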


Quiz

A 15-minute paper quiz (closed book, no devices) will be held at the end of the session.
It covers:

  • True/False on K-Means (supervised vs unsupervised), silhouette score, Naive Bayes independence assumption, MLP activation functions
  • Multiple choice: inertia definition, Gaussian Naive Bayes parameters, role of backpropagation
  • Short questions: reading an elbow plot, explaining the “naive” assumption with a concrete example, arguing against a deep network on small tabular data

💡 Tip: For the elbow plot question, practice computing cumulative inertia drops and identifying where the curve flattens. For Naive Bayes, think of pairs of features in the Heart Disease dataset that are clearly not independent.
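The inertia drops mentioned in the tip are just successive differences of the inertia curve; a small sketch on synthetic blobs (illustrative data, not the quiz material):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=1)

# Inertia for k = 1..6
ks = range(1, 7)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=1).fit(X).inertia_
            for k in ks]

# Successive drops: large before the elbow, small after it,
# so the curve flattens where the drops become negligible
drops = [inertias[i] - inertias[i + 1] for i in range(len(inertias) - 1)]
```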


Key concepts to remember

  • K-Means is unsupervised — it discovers structure without labels; you must interpret the clusters
  • Elbow + silhouette together — neither alone is sufficient to choose k
  • Naive Bayes is fast and surprisingly robust — a violated assumption does not necessarily mean poor predictions
  • Without activation functions, an MLP collapses to a linear model
  • On small tabular data, simpler models often win — do not reach for deep networks by default
  • Deep Learning = same math, different scale and architecture
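The “collapses to a linear model” point above can be checked numerically: composing two layers with no activation in between is exactly one linear map (a sketch with arbitrary random weights):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)

# Two "layers" with no activation function in between...
W1, b1 = rng.normal(size=(4, 5)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
two_layer = W2 @ (W1 @ x + b1) + b2

# ...are equivalent to a single linear map W x + b
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

assert np.allclose(two_layer, one_layer)
```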

Model comparison summary

| Model | Best when… | Main limitation |
|-------|------------|-----------------|
| Logistic Regression | Interpretability needed; linearly separable data | Cannot capture non-linear patterns |
| Random Forest | Good default for tabular data | Less interpretable |
| Naive Bayes | Small dataset; fast inference | Independence assumption often violated |
| MLP | Large dataset; complex patterns | Data-hungry; requires tuning |
| K-Means | No labels; want to find natural groups | Must specify k; assumes spherical clusters |
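A unified comparison in the spirit of Block 4 can be sketched with cross-validated ROC AUC. The hyperparameters below are illustrative defaults, and synthetic data stands in for the Heart Disease dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Heart Disease data
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(16,),
                                       max_iter=1000, random_state=0)),
}

# Same data, same metric: mean ROC AUC over 5 cross-validation folds
scores = {name: cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
          for name, m in models.items()}
```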

Back to course index