# Generalized Linear Models

Summary

*Generalized Linear Models* are a class of models that extend the *ordinary least squares* framework. They consist of a set of parameters $\mathbf{w}$, a feature mapping $\varphi(\cdot)$, and a *link function* that determines how the linear predictor is related to the probability distribution of the output quantity.

*Generalized Linear Models* (GLM) are available for regression and binary classification. DynaML implements the members of the GLM family described below. The `GeneralizedLinearModel[T]` class is the base of the GLM hierarchy in DynaML; all linear models are extensions of it. Its companion object is used to create GLM instances as follows.

```scala
import breeze.linalg.DenseVector

// Training data: a stream of (features, target) pairs
val data: Stream[(DenseVector[Double], Double)] = ...
// The task variable is a string set to "regression" or "classification"
val task = ...
// The map variable defines a (possibly higher dimensional) function of the input,
// akin to a basis function representation of the original features
val map: DenseVector[Double] => DenseVector[Double] = ...
// For binary classification, modeltype selects the model: "logit" or "probit"
val modeltype = "logit"
val glm = GeneralizedLinearModel(data, task, map, modeltype)
```

## Normal GLM

The most common regression model, also known as *least squares linear regression*, is implemented as the class `RegularizedGLM`, which represents a regression model with the following prediction:

$$
\begin{equation}
\hat{y}(\mathbf{x}) = w^T \varphi(\mathbf{x}) + b
\end{equation}
$$

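Here $\varphi(\cdot)$ is an appropriately chosen set of *basis functions*. The inference problem is formulated as a regularized least squares problem over $N$ training pairs $(\mathbf{x}_k, y_k)$, with regularization parameter $\lambda \geq 0$:

$$
\begin{equation}
\min_{w, b} \ \frac{\lambda}{2} w^T w + \frac{1}{2} \sum_{k = 1}^{N} \left(y_k - w^T \varphi(\mathbf{x}_k) - b\right)^2
\end{equation}
$$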
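To make the regularized least squares idea concrete, here is a minimal sketch in plain Scala. It is not DynaML's implementation: the object `RidgeSketch`, the quadratic basis `phi`, the penalty weight `lambda`, and the gradient descent settings are all assumptions chosen for illustration, and plain `Vector[Double]` is used instead of Breeze's `DenseVector` to keep the example free of dependencies.

```scala
// Illustrative sketch only; names and settings are hypothetical, not DynaML's API.
object RidgeSketch {
  type Vec = Vector[Double]

  def dot(a: Vec, b: Vec): Double = a.zip(b).map { case (p, q) => p * q }.sum

  // Assumed basis expansion for a scalar input: phi(x) = (x, x^2)
  def phi(x: Double): Vec = Vector(x, x * x)

  // Prediction w^T phi(x) + b
  def predict(w: Vec, b: Double, x: Double): Double = dot(w, phi(x)) + b

  // Gradient descent on
  //   J(w, b) = lambda/2 * w^T w + 1/2 * sum_k (y_k - w^T phi(x_k) - b)^2
  // The gradient is divided by the sample count so the step size
  // does not depend on the size of the data set.
  def fit(data: Seq[(Double, Double)],
          lambda: Double,
          rate: Double,
          iterations: Int): (Vec, Double) = {
    val n = data.size.toDouble
    var w: Vec = Vector(0.0, 0.0)
    var b = 0.0
    for (_ <- 0 until iterations) {
      // Pair each feature vector with its residual (prediction - target)
      val feats = data.map { case (x, y) => (phi(x), predict(w, b, x) - y) }
      val gradW = w.indices.map { j =>
        lambda * w(j) + feats.map { case (f, r) => f(j) * r }.sum
      }.toVector
      val gradB = feats.map(_._2).sum
      w = w.indices.map(j => w(j) - rate * gradW(j) / n).toVector
      b = b - rate * gradB / n
    }
    (w, b)
  }
}
```

For example, fitting noiseless samples of `y = 2x + 1` on `[-1, 1]` with `lambda = 0.0` should recover predictions close to the generating line; a positive `lambda` shrinks the weights toward zero.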

## Logit GLM

In binary classification the most common GLM used is the *logistic regression* model which is given by
$$
\begin{equation}
P(y = 1 | \mathbf{x}) = \sigma(w^T \varphi(\mathbf{x}) + b)
\end{equation}
$$

where $\sigma(z) = \frac{1}{1 + \exp(-z)}$ is the logistic function, which maps the output of the linear function $w^T \varphi(\mathbf{x}) + b$ to a probability value.
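The logistic prediction can be sketched in a few lines of plain Scala. The object `LogitSketch` and its method names are hypothetical, and the `features` argument is assumed to already hold $\varphi(\mathbf{x})$; this illustrates the formula rather than DynaML's own classes.

```scala
// Illustrative sketch only; names are hypothetical, not DynaML's API.
object LogitSketch {
  // Logistic function sigma(z) = 1 / (1 + exp(-z))
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (p, q) => p * q }.sum

  // P(y = 1 | x) = sigma(w^T phi(x) + b), with features = phi(x)
  def probability(w: Vector[Double], b: Double, features: Vector[Double]): Double =
    sigmoid(dot(w, features) + b)
}
```

A linear score of zero maps to a probability of 0.5, so the set of inputs where $w^T \varphi(\mathbf{x}) + b = 0$ acts as the decision boundary.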

## Probit GLM

The *probit regression* model is an alternative to the *logit* model. It is represented as:
$$
\begin{equation}
P(y = 1 | \mathbf{x}) = \Phi(w^T \varphi(\mathbf{x}) + b)
\end{equation}
$$
where $\Phi(z)$ is the cumulative distribution function of the standard normal distribution.
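As a rough illustration of the probit prediction, the sketch below evaluates $\Phi$ in plain Scala. The standard library has no error function, so this example assumes the Abramowitz and Stegun polynomial approximation of `erf` (formula 7.1.26, absolute error around $10^{-7}$). The object `ProbitSketch` and its method names are hypothetical, and `features` is assumed to already hold $\varphi(\mathbf{x})$.

```scala
// Illustrative sketch only; names are hypothetical, not DynaML's API.
object ProbitSketch {
  // Approximate error function (Abramowitz and Stegun 7.1.26)
  def erf(x: Double): Double = {
    val sign = if (x < 0) -1.0 else 1.0
    val ax = math.abs(x)
    val t = 1.0 / (1.0 + 0.3275911 * ax)
    val poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
      t * (-1.453152027 + t * 1.061405429))))
    sign * (1.0 - poly * math.exp(-ax * ax))
  }

  // Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2
  def standardNormalCdf(z: Double): Double =
    0.5 * (1.0 + erf(z / math.sqrt(2.0)))

  def dot(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (p, q) => p * q }.sum

  // P(y = 1 | x) = Phi(w^T phi(x) + b), with features = phi(x)
  def probability(w: Vector[Double], b: Double, features: Vector[Double]): Double =
    standardNormalCdf(dot(w, features) + b)
}
```

Like the logistic function, $\Phi$ maps a linear score of zero to a probability of 0.5, but its tails approach 0 and 1 faster, so large scores produce more extreme probabilities under probit than under logit.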

## GLS

The *Generalized Least Squares* (GLS) model is a broader formulation of the *Ordinary Least Squares* (OLS) regression model.
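
For orientation, the textbook GLS estimator can be written as follows. This is the standard result, stated under assumed notation rather than taken from DynaML's implementation: $X$ is the design matrix whose rows are $\varphi(\mathbf{x}_k)^T$, $\mathbf{y}$ is the vector of targets, and $\Omega$ is the covariance matrix of the noise.

$$
\begin{equation}
\hat{w} = \arg\min_{w} \ (\mathbf{y} - Xw)^T \Omega^{-1} (\mathbf{y} - Xw) = (X^T \Omega^{-1} X)^{-1} X^T \Omega^{-1} \mathbf{y}
\end{equation}
$$

When $\Omega$ is proportional to the identity matrix, the weighting cancels and the expression reduces to the OLS solution $(X^T X)^{-1} X^T \mathbf{y}$.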