Also referred to as model tuning, hyperparameter optimization is the experimental process of searching for the combination of hyperparameters for a given dataset and machine learning algorithm that delivers the best model performance, as measured on a validation set.
A hyperparameter is a configuration variable that determines the architecture of the model and the learning process, e.g., the number of trees in a tree-based model, the learning rate, etc. It has to be specified manually by the machine learning engineer / data scientist before training and does not change during a training run.
Contrast a hyperparameter with a parameter, which is a variable internal to the model whose value is learned from the dataset during training, e.g., coefficients in a linear regression model, node weights in a neural network, etc. Parameters are often saved as part of the final model.
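A minimal sketch of this distinction, assuming scikit-learn and a synthetic dataset (the model choice and the C value are illustrative):

```python
# Hyperparameter vs. parameter, assuming scikit-learn and synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=42)

# Hyperparameters: chosen by the practitioner *before* training
# and held fixed for the whole training run.
model = LogisticRegression(C=0.1, penalty="l2")

# Parameters: learned *from the data* during training.
model.fit(X, y)
print("Learned coefficients:", model.coef_)    # model parameters
print("Learned intercept:", model.intercept_)  # also a parameter
```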
Important Hyperparameters for Supervised Learning Algorithms #
| Algorithm | Hyperparameters |
| --- | --- |
| Regression / logistic regression | Regularization parameter – Lasso (L1) or Ridge (L2) |
| Decision tree | Minimum size of leaves; maximum size of leaves; maximum tree depth |
| k-nearest neighbors | Number of neighbors (k) |
| Support vector machines | Kernel type (dot, radial, neural, etc.); kernel parameters (gamma, sigma, degree, etc.); penalty (C) – higher values give harder margins and vice versa |
| Artificial neural networks | Number of neurons in each layer; number of hidden layers; number of training iterations (epochs); learning rate; initial weights |
| Random forest | All decision tree hyperparameters; number of trees; number of features to select at each split |
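The sketch below shows how some of the hyperparameters in the table are typically exposed, assuming the scikit-learn implementations of these algorithms; other libraries use different argument names for the same concepts, and the values shown are only placeholders.

```python
# How the hyperparameters above appear as constructor arguments, assuming scikit-learn.
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

tree = DecisionTreeClassifier(
    min_samples_leaf=5,    # minimum number of samples in a leaf
    max_leaf_nodes=50,     # maximum number of leaf nodes
    max_depth=10,          # maximum tree depth
)
knn = KNeighborsClassifier(n_neighbors=7)   # number of neighbors (k)
svm = SVC(kernel="rbf", gamma=0.1, C=1.0)   # kernel type, kernel parameter, penalty
mlp = MLPClassifier(
    hidden_layer_sizes=(64, 32),   # neurons per hidden layer / number of hidden layers
    max_iter=200,                  # training iterations (epochs)
    learning_rate_init=0.001,      # learning rate
)
forest = RandomForestClassifier(
    n_estimators=200,      # number of trees
    max_features="sqrt",   # features considered at each split
    max_depth=10,          # plus any decision tree hyperparameter
)
```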
Hyperparameter Optimization Methods #
Algorithms can have many hyperparameters, and finding the optimal combination can be treated as a search problem.
Manual tuning is a trial-and-error method. With experience, it is possible to “guess” hyperparameter values that deliver very good performance.
Grid search (or parameter sweeping) uses brute force to test every combination in a specified subset of the hyperparameter space, measures performance (typically with cross-validation), and picks the combination that performs best. It is computationally expensive.
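A minimal grid search sketch, assuming scikit-learn's GridSearchCV, a random forest, and an illustrative parameter grid:

```python
# Grid search sketch, assuming scikit-learn's GridSearchCV.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Every combination is tried: 3 x 3 x 2 = 18 candidates, each cross-validated.
param_grid = {
    "n_estimators": [100, 200, 400],
    "max_depth": [5, 10, None],
    "max_features": ["sqrt", "log2"],
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=5,                 # 5-fold cross-validation for each combination
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```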
Random search randomly samples hyperparameter combinations from specified statistical distributions and evaluates each one.
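A comparable random search sketch, assuming scikit-learn's RandomizedSearchCV with distributions from scipy.stats (the ranges and number of iterations are illustrative):

```python
# Random search sketch, assuming scikit-learn's RandomizedSearchCV;
# values are drawn from distributions rather than swept exhaustively.
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

param_distributions = {
    "n_estimators": randint(50, 500),    # discrete uniform distribution
    "max_depth": randint(3, 20),
    "max_features": uniform(0.1, 0.9),   # fraction of features, sampled uniformly
}

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=25,            # number of sampled combinations, not an exhaustive sweep
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```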
Bayesian optimization uses the performance of past trials to choose the hyperparameter values to evaluate next. It is typically less computationally expensive than grid search and requires less manual effort from the data scientist / machine learning engineer in deciding which values to try.
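One way to sketch this, assuming the Optuna library, whose default TPE sampler uses the results of past trials to propose new candidates (the search ranges and trial count are illustrative):

```python
# Bayesian-style optimization sketch, assuming the Optuna library.
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

def objective(trial):
    # Each trial's hyperparameters are suggested based on earlier trials' scores.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 20),
        "max_features": trial.suggest_float("max_features", 0.1, 1.0),
    }
    model = RandomForestClassifier(random_state=0, **params)
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params, study.best_value)
```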
Other methods: gradient-based optimization, evolutionary optimization, population-based optimization, etc.