Algorithm Selection
Selecting the right algorithm is where theory meets practice. The best algorithm depends on your data type (tabular, text, images, time series), whether you have labels, and the specific problem you are solving. This guide covers the most important algorithms and when to reach for each one.
Quick Reference
Supervised — Tabular Data
| Algorithm | Best For | Strengths | Limitations |
|---|---|---|---|
| XGBoost | Tabular classification and regression (#1 default for structured data) | Handles missing values, built-in feature importance, regularization | Not for images, text sequences, or very high-dimensional sparse data |
| Random Forest | Classification and regression with less tuning | Robust, handles non-linear relationships, Gini importance | Slower than XGBoost, larger model size |
| Logistic Regression / Linear Learner | Binary/multi-class classification with linear decision boundary | Fast, interpretable coefficients | Cannot capture complex non-linear patterns |
| k-Nearest Neighbors (k-NN) | Classification + "find similar items" | Simple, no training phase | Slow at inference for large datasets, sensitive to dimensionality |
| Factorization Machines | Recommendation systems, click-through prediction, sparse data | Handles high-dimensional sparse data, captures feature interactions | Limited to pairwise feature interactions |
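To make the k-NN row concrete — "no training phase" means every prediction scans the full training set, which is exactly why inference slows down on large datasets — here is a toy pure-Python sketch (the function name and dataset are ours for illustration, not from any library):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbors.

    No model is fit up front: all work happens at inference time,
    scanning every training point to find the k closest.
    """
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D dataset: two well-separated clusters.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X, y, (0.15, 0.1)))  # near the first cluster → "a"
print(knn_predict(X, y, (5.05, 5.0)))  # near the second cluster → "b"
```

The same nearest-neighbor scan also answers "find similar items": return `dists[:k]` instead of the majority label.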
Supervised — Time Series
| Algorithm | Best For | Key Feature |
|---|---|---|
| DeepAR | Multiple related time series, cold-start | Learns patterns across related series, handles NaN, probabilistic forecasts |
| ARIMA / SARIMA | Single time series, statistical approach | Good for stationary data with clear trend/seasonality |
| CNN-QR | Forecasting with related time series + metadata | Supports related data, holidays, promotions |
| Exponential Smoothing (ETS) | Simple time series with trend/seasonality | Cannot use related time series or metadata |
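The ETS row can be illustrated with Holt's linear-trend method, a simple double-exponential-smoothing variant. This is a minimal sketch in plain Python (parameter values and function name are our choices); note it sees only the series itself — no related series or metadata, which per the table is where DeepAR and CNN-QR come in:

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=3):
    """Holt's linear-trend exponential smoothing.

    Maintains a smoothed level and a smoothed trend; each forecast
    step h extrapolates the final level plus h steps of trend.
    """
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

# A perfectly linear series: the forecast continues the line,
# giving approximately [20.0, 22.0, 24.0].
series = [10.0, 12.0, 14.0, 16.0, 18.0]
print(holt_forecast(series, horizon=3))
```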
Unsupervised
| Algorithm | Best For | Key Detail |
|---|---|---|
| K-Means | Clustering — group similar data points | Elbow method for optimal k. Often paired with PCA |
| PCA | Dimensionality reduction — compress features while preserving variance | Must scale data first. Unsupervised. Does NOT give feature importance for target |
| Random Cut Forest (RCF) | Anomaly detection — find outliers | Higher anomaly score = more anomalous |
| LDA / NTM | Topic modeling — discover topics in TEXT documents | For text only, not structured tabular data |
| t-SNE | Visualization of high-dimensional data in 2D/3D | For visualization ONLY, not for feature reduction in production |
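The K-Means row mentions the elbow method: plot inertia (the sum of squared distances from each point to its nearest center) against k and look for the bend. A naive pure-Python sketch makes both the algorithm and the elbow quantity visible (the helper names and toy data are ours, not a library API):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Naive k-means on 2-D points. Returns (centers, inertia), where
    inertia — the sum of squared distances to the nearest center — is
    the quantity the elbow method plots against k.
    """
    def sqdist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sqdist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[j]  # keep an empty cluster's old center
            for j, c in enumerate(clusters)
        ]
    inertia = sum(min(sqdist(p, c) for c in centers) for p in points)
    return centers, inertia

# Two well-separated blobs: inertia drops sharply from k=1 to k=2
# (the "elbow"), then flattens for larger k.
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
for k in (1, 2, 3):
    _, inertia = kmeans(pts, k)
    print(k, round(inertia, 3))
```

The same scaling caveat from the PCA row applies here: K-Means uses raw distances, so features on larger scales dominate unless you standardize first.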
Flashcards
What is the default go-to algorithm for structured/tabular data?
XGBoost — it handles missing values natively, provides built-in feature importance, includes regularization, and works well out-of-the-box for both classification and regression on tabular data.
How do you choose a time series algorithm?
Single series, simple = ARIMA/SARIMA. Multiple related series OR new products = DeepAR (cold-start capability). Need related features + promotions = CNN-QR. "Predict demand for a NEW product" = DeepAR.
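The time series flashcard's rules can be written down as a tiny decision function — a study mnemonic we sketched ourselves, not an official selection procedure, and the argument names are our invention:

```python
def pick_time_series_algorithm(n_series, cold_start=False, has_related_features=False):
    """Mnemonic for the flashcard's decision rules:
    related features/promotions -> CNN-QR; multiple related series or
    cold-start (new products) -> DeepAR; otherwise a single simple
    series -> ARIMA/SARIMA.
    """
    if has_related_features:
        return "CNN-QR"
    if n_series > 1 or cold_start:
        return "DeepAR"
    return "ARIMA/SARIMA"

print(pick_time_series_algorithm(1))                              # single simple series
print(pick_time_series_algorithm(50, cold_start=True))            # new-product demand
print(pick_time_series_algorithm(50, has_related_features=True))  # promos/holidays
```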