Time Series
Time series data has a temporal ordering that fundamentally changes how you approach modeling. You cannot randomly shuffle it, you must respect chronological order, and specialized algorithms exist to handle its unique characteristics like trend, seasonality, and autocorrelation.
Components of Time Series
A time series is commonly decomposed into the following components:
| Component | What It Is | Example |
|---|---|---|
| Trend | Long-term increase or decrease | Population growth, rising stock market over decades |
| Seasonality | Regular, repeating patterns at fixed intervals | Higher ice cream sales in summer, holiday shopping spikes |
| Cyclical | Long-term fluctuations without fixed period | Business cycles, economic booms and recessions |
| Residual (Noise) | Random, unexplained variation | Unexpected events, measurement error |
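# Decompose a time series into its components.
# `data` is assumed to be a pandas Series (or array) of monthly observations.
from statsmodels.tsa.seasonal import seasonal_decompose
result = seasonal_decompose(data, model='additive', period=12)  # period=12: yearly cycle in monthly data
result.plot()
# Shows four panels: observed, trend, seasonal, and residual.
# Any cyclical movement is absorbed into the trend panel rather than shown separately.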
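To make the additive model concrete, here is a minimal sketch in plain Python. It builds a synthetic monthly series as trend plus seasonality, then recovers the trend with a centered moving average, which is the idea behind classical decomposition. The helper `centered_moving_average` and the synthetic values are illustrative assumptions, not the statsmodels implementation.

```python
import math


def centered_moving_average(values, period):
    """Estimate the trend with a 2 x period centered moving average (even period).

    Positions too close to either end, where the window does not fit, stay None.
    """
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]  # period + 1 points
        # The two end points get half weight so the window covers exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[i] = total / period
    return trend


period = 12  # assumed: monthly data with a yearly cycle
n = 48       # four synthetic years

true_trend = [10.0 + 0.5 * t for t in range(n)]
seasonal = [3.0 * math.sin(2 * math.pi * t / period) for t in range(n)]
observed = [tr + s for tr, s in zip(true_trend, seasonal)]  # additive model, no noise

estimated_trend = centered_moving_average(observed, period)

# Subtracting the estimated trend leaves the seasonal part (plus noise, if any).
detrended = [
    o - e if e is not None else None
    for o, e in zip(observed, estimated_trend)
]
```

Because the window spans exactly one seasonal period, the seasonal swings average out and only the trend remains. With real, noisy data the recovery is approximate rather than exact.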
Algorithm Selection for Time Series
| Algorithm | Best For | Key Capability |
|---|---|---|
| ARIMA / SARIMA | Single time series, statistical approach | Handles trend through differencing; SARIMA adds seasonal terms. Cannot use related time series |
| Exponential Smoothing (ETS) | Simple time series with trend and/or seasonality | Lightweight and fast. Cannot use related time series or metadata |
| DeepAR | Multiple related time series, cold-start scenarios | Learns patterns across related series. Handles missing values (NaN). Probabilistic forecasts |
| CNN-QR | Forecasting with related data, promotions, holidays | Supports related time series, item metadata, and future known values |
Decision Guide
Single series, simple → ARIMA
Multiple related series OR new products → DeepAR (cold-start capability)
Need related features + promotions → CNN-QR
Simple trend/seasonality → Exponential Smoothing
"Predict demand for a NEW product with no history" = DeepAR. It can generate forecasts for new items by learning patterns from similar items in the training set (cold-start capability). ARIMA and Exponential Smoothing fit each series on its own history, so they have nothing to learn from when that history is missing.
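The decision guide above can be written as a small function. This is only a sketch: `choose_algorithm` and its parameter names are hypothetical, and the order in which the rules are checked (cold-start first, then related features, then number of series) is an assumption, since the guide does not say how to break ties.

```python
def choose_algorithm(num_series=1, has_new_items=False, needs_related_features=False):
    """Hypothetical helper mirroring the decision guide; not part of any library."""
    if has_new_items:
        # Cold-start: new items with no history of their own.
        return "DeepAR"
    if needs_related_features:
        # Related time series, promotions, holidays, item metadata.
        return "CNN-QR"
    if num_series > 1:
        # Many related series without extra features.
        return "DeepAR"
    # Single, simple series: either classical method is a reasonable start.
    return "ARIMA or Exponential Smoothing"


recommendation = choose_algorithm(num_series=200, has_new_items=True)
```

In practice the choice also depends on data volume and accuracy requirements, so treat the result as a starting point rather than a verdict.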
Critical Rules for Time Series
| Rule | Why It Matters |
|---|---|
| NEVER randomly split time series data | Random splitting leaks future information into training. Always split chronologically: train on past, test on future |
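| Check stationarity before ARIMA | The ARMA part of ARIMA assumes a stationary series (constant mean and variance); differencing is how the model gets there. Use the augmented Dickey-Fuller test to check for a unit root, and a plot or STL decomposition to spot trend and seasonality |
| DeepAR handles missing values as NaN | Missing target values can be passed as NaN, so they do not need to be imputed first |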
| DeepAR handles cold-start | Can predict for new items based on patterns learned from similar items |
The single most important rule for time series validation: always split chronologically. Train on the past, validate on the future. Random splitting creates data leakage that produces unrealistically optimistic results.
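A chronological split can be sketched in a few lines of plain Python. The helper name `chronological_split`, the `(timestamp, value)` record layout, and the 80/20 ratio are assumptions made for illustration.

```python
def chronological_split(records, train_fraction=0.8):
    """Split (timestamp, value) records into train and test sets without shuffling.

    Everything in the training set is strictly earlier than everything in the test set.
    """
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    ordered = sorted(records, key=lambda record: record[0])  # oldest first
    cutoff = int(len(ordered) * train_fraction)
    return ordered[:cutoff], ordered[cutoff:]


# Ten synthetic daily observations: (day index, value)
records = [(day, 100 + day) for day in range(10)]
train, test = chronological_split(records, train_fraction=0.8)

latest_train_time = max(timestamp for timestamp, _ in train)
earliest_test_time = min(timestamp for timestamp, _ in test)
```

Checking that `latest_train_time` is earlier than `earliest_test_time` is a cheap guard against accidental leakage when a pipeline grows more complicated.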
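Stationarity
A stationary time series has a constant mean and variance over time, and its autocorrelation depends only on the lag between observations, not on when they occur. Many statistical models, including the ARMA part of ARIMA, assume stationarity.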
How to check:
- Visual inspection of the time series plot
- Augmented Dickey-Fuller (ADF) test (statistical test for a unit root)
- STL decomposition to separate trend and seasonality
How to make data stationary:
- Differencing (subtract previous value)
- Log transformation (stabilize variance)
- Detrending (remove trend component)
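from statsmodels.tsa.stattools import adfuller
# Augmented Dickey-Fuller test; null hypothesis: the series has a unit root (non-stationary)
# `time_series` is assumed to be a 1-D array or pandas Series without missing values
result = adfuller(time_series)
p_value = result[1]  # second element of the returned tuple is the p-value
if p_value < 0.05:
    print("Stationary (reject null hypothesis)")
else:
    print("Unit root not rejected: treat as non-stationary and apply differencing")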
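The transformations listed under "How to make data stationary" can be illustrated on synthetic data. The sketch below uses plain Python; the helper `difference` and the example series are assumptions chosen so the effect is easy to see.

```python
import math


def difference(values, lag=1):
    """Return values[t] - values[t - lag]; the result is `lag` elements shorter."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


# A linear trend becomes a constant after first-order differencing.
linear = [5.0 + 2.0 * t for t in range(8)]
diffed = difference(linear)

# Exponential growth: taking logs first turns it into a linear trend,
# and differencing the logs then gives a constant (the log growth rate).
exponential = [100.0 * (1.1 ** t) for t in range(8)]
log_diffed = difference([math.log(v) for v in exponential])
```

Passing `lag=12` to the same helper gives seasonal differencing for monthly data, which removes a repeating yearly pattern instead of a trend.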
Time Series Validation
from sklearn.model_selection import TimeSeriesSplit
# Expanding window cross-validation
# X and y are assumed to be NumPy arrays sorted in chronological order
# (for pandas objects, index with .iloc[train_idx] instead)
tscv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in tscv.split(X):
    # train_idx always precedes test_idx chronologically
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
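To see what "expanding window" means, it helps to print the indices each fold uses. The generator below is a simplified, dependency-free illustration of the idea, not scikit-learn's implementation; the name `expanding_window_splits` and the fixed `test_size` argument are assumptions.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs for expanding-window validation.

    Each fold tests on the next `test_size` points and trains on everything before them.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for fold in range(n_splits):
        test_start = first_test_start + fold * test_size
        train_indices = list(range(test_start))
        test_indices = list(range(test_start, test_start + test_size))
        yield train_indices, test_indices


splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_indices, test_indices in splits:
    print("train:", train_indices, "test:", test_indices)
```

The training window grows with every fold while the test window slides forward, so no fold is ever evaluated on data older than what it was trained on.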
Flashcards
What are the four components of a time series?
1) Trend: long-term direction. 2) Seasonality: regular repeating patterns at fixed intervals. 3) Cyclical: long-term fluctuations without a fixed period. 4) Residual: random unexplained variation (noise).