ML Problem Types

Framing the problem correctly is the first and most consequential step in any ML project. Choosing the wrong problem type leads to the wrong algorithms, the wrong metrics, and wasted effort. Understanding the distinction between supervised and unsupervised learning, and between classification, regression, and clustering, guides every downstream decision.

Quick Reference

| Problem Type | Output | Examples | Algorithms |
|---|---|---|---|
| Binary Classification | Yes/No, 0/1 | Fraud or not, churn or not, spam or not | XGBoost, Logistic Regression, Linear Learner, Random Forest |
| Multi-class Classification | One of N categories | Image labels, document type, product category | XGBoost, Random Forest, CNN, BlazingText |
| Regression | Continuous numeric value | Price prediction, demand quantity, temperature | XGBoost, Linear Learner, Linear Regression |
| Forecasting | Future values over time | Sales forecast, demand planning, stock prices | DeepAR, ARIMA, CNN-QR, Exponential Smoothing |
| Clustering | Group assignments (no labels) | Customer segmentation, anomaly grouping | K-Means, DBSCAN |
| Anomaly Detection | Normal vs anomalous | Fraud detection, defect detection, network intrusion | Random Cut Forest, Isolation Forest, IP Insights |
| Recommendation | Ranked items for a user | Product recommendations, content suggestions | Factorization Machines, Collaborative Filtering |
| Topic Modeling | Topics within documents | Categorize news articles, discover themes | LDA, Neural Topic Model (NTM) |
| Object Detection | Bounding boxes + labels | Find cars in images, detect faces | SSD, YOLO, Faster R-CNN |
| Semantic Segmentation | Pixel-level labels | Autonomous driving, medical imaging | FCN, U-Net, DeepLab |
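
The "Output" column of the reference is often the quickest way to tell the first three problem types apart. The sketch below guesses a problem type from the values in a label column. The function name `infer_problem_type` and its rules are illustrative assumptions, not a standard API, and the heuristic is deliberately rough: integer-valued targets such as demand counts would need a manual decision.

```python
from typing import Optional, Sequence


def infer_problem_type(target: Optional[Sequence[object]]) -> str:
    """Guess a problem type from a label column (a heuristic, not a rule)."""
    if not target:
        # No label column at all: only unsupervised framings are available.
        return "unsupervised"

    distinct = set(target)

    # Any non-integer float suggests a continuous target, i.e. regression.
    if any(isinstance(v, float) and not v.is_integer() for v in distinct):
        return "regression"

    # Exactly two distinct labels (yes/no, 0/1) suggests binary classification.
    if len(distinct) == 2:
        return "binary classification"

    # Otherwise assume the labels are one of N categories.
    return "multi-class classification"


print(infer_problem_type(["fraud", "ok", "ok"]))      # binary classification
print(infer_problem_type(["invoice", "memo", "cv"]))  # multi-class classification
print(infer_problem_type([199.99, 249.5, 180.25]))    # regression
print(infer_problem_type(None))                       # unsupervised
```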

Decision Flow

  • Has labeled data? → Supervised (classification, regression, forecasting)
  • No labels? → Unsupervised (clustering, anomaly detection, topic modeling)
  • Predict a category? → Classification
  • Predict a number? → Regression
  • Predict future values? → Forecasting
  • Group similar items? → Clustering
  • Find unusual items? → Anomaly Detection
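
The decision flow above can be sketched as a small lookup. This is a minimal illustration only: the function name `suggest_problem_type` and the goal keywords (`"category"`, `"number"`, and so on) are hypothetical names chosen for this example.

```python
def suggest_problem_type(has_labels: bool, goal: str) -> str:
    """Map the decision-flow questions to a problem type.

    `goal` is one of the hypothetical keywords defined below.
    """
    # Labeled data: supervised problem types.
    supervised = {
        "category": "classification",
        "number": "regression",
        "future": "forecasting",
    }
    # No labels: unsupervised problem types.
    unsupervised = {
        "groups": "clustering",
        "unusual": "anomaly detection",
        "topics": "topic modeling",
    }

    options = supervised if has_labels else unsupervised
    if goal not in options:
        kind = "labeled" if has_labels else "unlabeled"
        raise ValueError(
            f"goal {goal!r} does not match {kind} data; expected one of {sorted(options)}"
        )
    return options[goal]


print(suggest_problem_type(has_labels=True, goal="category"))   # classification
print(suggest_problem_type(has_labels=False, goal="groups"))    # clustering
```

Raising an error for a mismatched combination (for example, asking to predict a category without labels) mirrors the point of the flow: the available data constrains which framings are possible.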

Flashcards

Question: What type of ML problem is "predict whether a transaction is fraudulent or not"?

Answer: Binary classification. The output is one of two categories (fraud or not fraud). Common algorithms: XGBoost, Logistic Regression, Linear Learner.

Common Misconception

"Identify groups of customers" = Clustering (K-Means), NOT Semantic Segmentation. "Customer segmentation" is a business term for clustering. Semantic Segmentation is a computer vision technique for pixel-level image labeling.
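
To make the clustering side of this distinction concrete, here is a minimal pure-Python sketch of K-Means (Lloyd's algorithm) applied to customer segmentation. The customer data, the two features (monthly spend and visits), and the choice of starting centroids are all made up for illustration; a real project would typically scale the features and use a library implementation.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def kmeans(points: Sequence[Point], centroids: Sequence[Point],
           iterations: int = 10) -> List[int]:
    """Plain Lloyd's algorithm; returns a cluster index for each point."""
    centers = list(centroids)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centers[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return assignments


# Hypothetical customers as (monthly spend, monthly visits). No labels are given.
customers = [(5.0, 1.0), (6.0, 2.0), (5.5, 1.5),
             (50.0, 12.0), (52.0, 11.0), (49.0, 13.0)]

# Start from two of the points as initial centroids (an arbitrary choice).
segments = kmeans(customers, centroids=[customers[0], customers[3]])
print(segments)  # [0, 0, 0, 1, 1, 1]
```

The output is a group assignment per customer, with no labels involved at any point, which is what makes this clustering rather than classification or image segmentation.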