Machine Learning Use Cases Per Algorithm

When you start learning about machine learning, it's easy to be overwhelmed by the sheer number of different algorithms out there. Here is a list of the types of algorithms you can use, and when you should use each one.

Clustering

K-means: this algorithm is for characterizing existing data behavior, and not predicting future behavior
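A minimal K-means sketch using scikit-learn (assumed available; the synthetic two-blob data is made up for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic "blobs" of points that the algorithm should separate
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:5])        # a cluster assignment for each existing point
print(km.cluster_centers_)   # one centroid per cluster
```

Note that `fit` only characterizes the data you already have; assigning new points to the nearest centroid is a separate step (`km.predict`).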

Regression

Ordinal regression: predicts values for data in rank-oriented categories
Poisson regression: predicts the count of a particular event
Fast forest quantile regression: predicts a distribution of values
Linear regression: predicts a single value by linear approximation
Bayesian linear regression: predicts a single value by linear approximation where your data points are statistically independent
Neural network regression: predicts a single value where linear approximation is not preferred, and explainable class boundaries are not preferred
Decision forest regression: predicts a single value where linear approximation is not preferred and explainable class boundaries are desired
Boosted decision tree regression: predicts a single value where linear approximation is not preferred and explainable class boundaries are desired and you have overlapping features
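To see the linear-vs-non-linear trade-off above, here is a hedged sketch comparing plain linear regression against a boosted-tree regressor on a deliberately non-linear target (scikit-learn assumed; the sine data is invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)  # non-linear target with noise

lin = LinearRegression().fit(X, y)               # linear approximation
gbr = GradientBoostingRegressor(random_state=0).fit(X, y)  # boosted trees

# R^2 scores: the boosted trees should track the curve much more closely
print(lin.score(X, y), gbr.score(X, y))
```

When the underlying relationship really is roughly linear, the simpler model is usually the better choice.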

Anomaly Detection (Two-Class Classification)

One-class SVM: use this if you want to predict 2 categories where one of the categories is rare, and you have <100K data points or >100 features
PCA-based anomaly detection: use this if you want to predict 2 categories where one of the categories is rare, and you have >100K data points or <100 features
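A hedged one-class SVM sketch (scikit-learn assumed; the training cloud and the `nu=0.05` rarity setting are illustrative choices, not from the original post):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(0, 1, (200, 2))   # the common, "normal" category
X_test = np.array([[0.0, 0.0],         # a typical point
                   [6.0, 6.0]])        # a rare point far from the cloud

# nu bounds the fraction of training points treated as outliers
oc = OneClassSVM(nu=0.05, gamma="scale").fit(X_train)
pred = oc.predict(X_test)  # +1 = normal category, -1 = rare/anomalous
print(pred)
```

Only the common class is needed for training, which is exactly why this family fits the "one category is rare" cases above.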

Two-Class Classification

Two-class decision forest: predicts 2 categories where neither are rare, and you have >100k data points and <100 features, you prefer accuracy over speed, and you prefer explainable class boundaries
Two-class decision jungle: predicts 2 categories where neither are rare, and you have >100k data points and <100 features, you prefer accuracy over speed, you prefer explainable class boundaries, and you want a smaller memory footprint than a decision forest
Two-class boosted decision tree: predicts 2 categories where neither are rare, and you have >100k data points and <100 features, you prefer accuracy over speed, and you prefer explainable class boundaries (and you have overlapping features)
Two-class logistic regression: predicts 2 categories where neither are rare, and you have >100k data points and <100 features, you prefer accuracy over speed, and you DO NOT prefer explainable class boundaries
Two-class neural network: predicts 2 categories where neither are rare, and you have >100k data points and <100 features, you prefer accuracy over speed, and you DO NOT prefer explainable class boundaries AND you prefer performance over training time and all features are numerical
Two-class SVM: predicts 2 categories where neither are rare, and you have <100k data points or >100 features
Locally Deep SVM: predicts 2 categories where neither are rare, and you have <100k data points or >100 features AND you want it to perform better than a standard two-class SVM (at the cost of longer training time)
Two-class averaged perceptron: predicts 2 categories where neither are rare, and you have >100k data points and <100 features, and you are trying to increase training speed/performance
Two-class Bayes point machine: predicts 2 categories where neither are rare, and you have >100k data points and <100 features, you prefer accuracy over speed, and you DO NOT prefer explainable class boundaries AND your data points are statistically independent
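To make the "explainable boundaries vs. raw accuracy" choice above concrete, here is a hedged sketch comparing two-class logistic regression against a tree ensemble on synthetic data (scikit-learn assumed; the dataset parameters are invented for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class problem where neither class is rare
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

logit = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)   # linear boundary
forest = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)  # tree-based

acc_logit = logit.score(X_te, y_te)
acc_forest = forest.score(X_te, y_te)
print(acc_logit, acc_forest)
```

In practice it is worth evaluating a couple of candidates like this on held-out data rather than committing to one family up front.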

Multi-Class Classification

One-vs-all multiclass: predict >2 categories where you prefer the classifier be built from more than 1 two-class classifier
Multiclass logistic regression: predict >2 categories where you want 1 two-class classifier and you do not prefer explainable class boundaries
Multiclass neural network: predict >2 categories where you want 1 two-class classifier and you do not prefer explainable class boundaries AND you prefer performance over training time
Multiclass decision forest: predict >2 categories where you want 1 two-class classifier and you prefer explainable class boundaries
Multiclass decision jungle: predict >2 categories where you want 1 two-class classifier and you prefer explainable class boundaries
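A hedged sketch of the one-vs-all approach from the first item above, using scikit-learn's `OneVsRestClassifier` on the three-class iris dataset (dataset and base-estimator choice are illustrative assumptions):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)  # 3 flower classes

# One-vs-all: internally trains one two-class classifier per class
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print(len(ovr.estimators_))  # one underlying binary classifier per class

# By contrast, a single multinomial model handles all classes at once
direct = LogisticRegression(max_iter=1000).fit(X, y)
print(ovr.score(X, y), direct.score(X, y))
```

The `estimators_` attribute shows the "more than 1 two-class classifier" structure explicitly, whereas the direct multiclass model is a single estimator.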
