# Overfitting vs Underfitting
Every machine learning model walks a tightrope between two failure modes: overfitting (memorizing training data) and underfitting (failing to learn patterns). Understanding how to diagnose and fix each is fundamental to building models that generalize well to unseen data.
## Comparison
| Aspect | Overfitting | Underfitting |
|---|---|---|
| Symptom | High training accuracy, **low** validation/test accuracy | **Low** training accuracy, **low** validation/test accuracy |
| Cause | Model memorizes training data; too complex for the data | Model too simple to capture underlying patterns |
| Bias-Variance | Low bias, **high** variance | **High** bias, low variance |
## How to Recognize It
- Training: 98%, Test: 65% → Overfitting (large gap)
- Training: 55%, Test: 52% → Underfitting (both low)
- Training: 92%, Test: 89% → Good fit (small gap, both high)
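These rules of thumb can be captured in a tiny helper. This is a sketch: the function name `diagnose_fit` and the thresholds (a 10-point train/test gap, 70% as "low" accuracy) are illustrative assumptions, not standard values; tune them for your task.

```python
def diagnose_fit(train_acc, test_acc, gap_tol=0.10, low_tol=0.70):
    """Classify model behavior from train/test accuracy.

    gap_tol and low_tol are illustrative thresholds (assumptions),
    not universal constants.
    """
    if train_acc - test_acc > gap_tol:
        return "overfitting"   # large gap: memorized training data
    if train_acc < low_tol and test_acc < low_tol:
        return "underfitting"  # both low: failed to learn the pattern
    return "good fit"          # small gap, both reasonably high

print(diagnose_fit(0.98, 0.65))  # overfitting
print(diagnose_fit(0.55, 0.52))  # underfitting
print(diagnose_fit(0.92, 0.89))  # good fit
```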
## Remedies
### Fixing Overfitting (High Variance)
- More training data — the single most effective remedy
- Regularization — L1, L2, or Elastic Net
- Dropout — for neural networks (20-50% rate)
- Early stopping — stop when validation loss starts increasing
- Data augmentation — create synthetic training examples (images)
- Reduce model complexity — fewer layers/neurons, shallower trees
- Decrease `max_depth` and increase `min_child_weight` in XGBoost
- Feature selection — reduce the number of input features
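Early stopping is the easiest remedy to sketch without a framework. The loop below tracks the best validation loss seen so far and stops after `patience` consecutive epochs without improvement; the function name and the sample loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch with the best validation loss, stopping the
    scan once the loss fails to improve for `patience` epochs."""
    best = float("inf")
    best_epoch = 0
    bad = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, bad = loss, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                break  # validation loss has started increasing: stop
    return best_epoch

# Hypothetical validation losses: improving, then worsening from epoch 4.
print(early_stopping_epoch([0.9, 0.7, 0.6, 0.55, 0.56, 0.58, 0.60]))  # -> 3
```

In practice you would restore the model weights saved at `best_epoch` rather than the final ones.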
### Fixing Underfitting (High Bias)
- More features / feature engineering — give the model more signal
- Use a more complex model — switch from linear to tree-based or deep learning
- Train longer — more epochs, more boosting rounds
- Decrease regularization — you may be constraining the model too much
- Increase model capacity — more layers, deeper trees
- Increase `max_depth` in tree-based models
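The feature-engineering remedy can be shown with a minimal sketch, assuming a toy quadratic target: a straight line underfits `y = x²`, but adding `x²` as a feature lets the same least-squares fit succeed. The helper `fit_two_param` is a hypothetical name solving the 2×2 normal equations by hand.

```python
def fit_two_param(xs, ys, feat):
    """Least squares for y ≈ a + b*feat(x), via the 2x2 normal equations."""
    n = len(xs)
    f = [feat(x) for x in xs]
    sf, sff = sum(f), sum(v * v for v in f)
    sy, sfy = sum(ys), sum(v * y for v, y in zip(f, ys))
    b = (n * sfy - sf * sy) / (n * sff - sf * sf)
    a = (sy - sf * b) / n
    sse = sum((a + b * feat(x) - y) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]  # truly quadratic target

_, _, sse_linear = fit_two_param(xs, ys, lambda x: x)    # line: underfits
_, _, sse_quad = fit_two_param(xs, ys, lambda x: x * x)  # right feature: exact
print(sse_linear, sse_quad)  # large error vs. ~0
```

The model class did not change; giving it more signal (the `x²` feature) removed the bias.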
## The Bias-Variance Tradeoff
The bias-variance tradeoff is the central tension in machine learning.
| Concept | Meaning | Effect |
|---|---|---|
| High Bias | Model is too simple | Underfitting — "consistently wrong" in the same way |
| High Variance | Model is too sensitive to training data | Overfitting — "different results on different data" |
| Goal | Find the sweet spot | Low bias AND low variance |
Think of bias as systematic error (the model always misses in the same direction) and variance as sensitivity (the model gives wildly different predictions depending on which training data it sees). The best models balance both.
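The two failure modes can be seen side by side in a sketch with made-up noisy data: a mean predictor (high bias, ignores the input entirely) and a 1-nearest-neighbor memorizer (high variance, exact on training points but jumpy elsewhere).

```python
# Hypothetical noisy samples of y ≈ x^2 (values are illustrative).
train = [(0.0, 0.5), (1.0, 0.8), (2.0, 4.3), (3.0, 9.1)]
test = [(0.6, 0.36), (1.6, 2.56), (2.6, 6.76)]

def mse(pairs, predict):
    return sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs)

# High bias: always predict the training mean, regardless of x.
mean_y = sum(y for _, y in train) / len(train)
high_bias = lambda x: mean_y

# High variance: 1-NN memorizer, copies the nearest training label.
def high_variance(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

print(mse(train, high_bias), mse(test, high_bias))        # bad on both
print(mse(train, high_variance), mse(test, high_variance))  # 0 on train, gap on test
```

The memorizer shows the overfitting signature (zero training error, a large train/test gap), while the mean predictor is "consistently wrong" on both splits.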
## Flashcards
How do you identify overfitting from training and test metrics?
Large gap between training and test performance. Example: training accuracy 98%, test accuracy 65%. The model has memorized the training data but fails to generalize.
When diagnosing model performance, always look at both training and validation metrics together. Looking at only one of them can mislead you about the model's true behavior.