Statistics - Generalized Linear Models (GLM) - Extensions of the Linear Model

About

The Generalized Linear Model (GLM) is an extension of the linear model that allows a wide range of non-linear relationships to be modeled and tested within a regression framework.

GLM is the mathematical framework used in many statistical analyses such as:

multiple regression,
analysis of variance (for categorical predictors),
moderation,
and mediation.

GLM is a supervised learning algorithm based on a classic statistical technique. It supports thousands of input variables, as well as text and transactional data, and is used for:

Classification
and/or Regression

GLM implements:

logistic regression for classification of binary targets
and linear regression for continuous targets.
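As a minimal sketch of how both models fit into one framework, the block below fits them with the same iteratively reweighted least squares (IRLS) loop, changing only the link function. The function name `fit_glm`, its parameters, and the toy data are assumptions made for illustration; they are not taken from any particular GLM product or library.

```python
import numpy as np

def fit_glm(x, y, family="gaussian", n_iter=25):
    """Fit a one-predictor GLM by iteratively reweighted least squares.

    family="gaussian" uses the identity link (linear regression);
    family="binomial" uses the logit link (logistic regression).
    Hypothetical helper written for this example only.
    """
    X = np.column_stack([np.ones(len(x)), x])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                          # linear predictor
        if family == "gaussian":
            mu = eta                            # identity link
            w = np.ones_like(eta)               # constant weights
        else:
            mu = 1.0 / (1.0 + np.exp(-eta))     # inverse logit
            w = np.clip(mu * (1.0 - mu), 1e-9, None)
        z = eta + (y - mu) / w                  # working response
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ z)  # weighted least squares step
    return beta

# Continuous target: noise-free line y = 2 + 3x (toy data).
x_lin = np.arange(10.0)
y_lin = 2.0 + 3.0 * x_lin
beta_lin = fit_glm(x_lin, y_lin, family="gaussian")

# Binary target: toy labels that are not perfectly separable.
x_log = np.arange(8.0)
y_log = np.array([0, 0, 1, 0, 1, 0, 1, 1], dtype=float)
beta_log = fit_glm(x_log, y_log, family="binomial")
```

With the Gaussian family the loop reduces to ordinary least squares after one step. With the binomial family each step is a weighted least squares fit to a working response, and the fitted coefficients are on the log-odds scale.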

Confidence bounds are supported for:

prediction probabilities in GLM classification,
and predictions in GLM regression.

Assumptions

The General Linear Model has two main characteristics:

Linear: the relationships between the predictors and the outcome measure are linear.
Additive: the effects of the predictors add together, so the effect of one predictor does not depend on the value of another.
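In symbols, a model that is both linear and additive in p predictors can be written as follows. The notation (beta for the coefficients, epsilon for the error term) is introduced here for illustration and does not come from the original text:

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```

Each predictor x_j contributes beta_j x_j to the outcome regardless of the values of the other predictors.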

That doesn't mean that the GLM can't handle non-additive or non-linear effects.

Relaxing these assumptions allows for:

interactions and
non-linearity

GLM can accommodate such non-additive or non-linear effects with:

Transformation of variables: to make their relationship with the outcome linear.
Adding interaction terms (moderation terms): to carry out a moderation analysis and test for non-additive effects.
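Both techniques can be sketched with ordinary least squares on synthetic, noise-free data. The helper `ols`, the variable names, and the data-generating formulas are assumptions chosen for this illustration.

```python
import numpy as np

def ols(columns, y):
    """Least squares fit with an intercept; returns coefficients and residuals."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

rng = np.random.default_rng(0)
x1 = rng.uniform(0, 1, 200)
x2 = rng.uniform(0, 1, 200)

# Interaction: the effect of x1 on y depends on x2 (non-additive).
y = 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2
beta_add, resid_add = ols([x1, x2], y)            # additive model only
beta_int, resid_int = ols([x1, x2, x1 * x2], y)   # adds the product term

# Transformation: y_exp is exponential in x1, but log(y_exp) is linear in x1.
y_exp = 5.0 * np.exp(0.5 * x1)
beta_logy, _ = ols([x1], np.log(y_exp))           # intercept log(5), slope 0.5
```

The additive fit leaves systematic residuals, while the fit with the product column `x1 * x2` recovers the generating coefficients. The model is still linear in its coefficients; only the set of columns has changed.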

Methods

Methods that expand the scope of linear models and how they are fit:

Classification problems: logistic regression, support vector machines
Non-linearity: kernel smoothing, splines and generalized additive models; nearest neighbour methods.
Interactions: tree-based methods, bagging, random forests and boosting (these also capture non-linearities)
Regularized fitting: ridge regression and lasso. These have become very popular, especially for so-called wide data sets, which have a very large number of variables. For such data even linear models are too flexible, so methods are needed to control the variability of the fit.
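As a minimal sketch of regularized fitting on wide data, the block below applies the closed-form ridge estimate to a synthetic data set with more variables than observations. The function name `ridge`, the penalty values, and the data are assumptions for illustration. The lasso has no closed form and is not shown.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimate: minimise ||y - X b||^2 + lam * ||b||^2 (no intercept)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
n, p = 20, 50                        # wide data: more variables than observations
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=n)

# X.T @ X is singular here, so plain least squares has no unique solution;
# adding lam * I makes the system solvable.
b_small = ridge(X, y, lam=0.1)
b_large = ridge(X, y, lam=100.0)     # heavier penalty shrinks coefficients more
```

A larger penalty shrinks the coefficient vector toward zero, which trades some bias for lower variance.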