# Probability Distributions

Summary

The DynaML `dynaml.probability.distributions` package builds on and extends the `breeze.stats.distributions` package. The distributions it implements are listed below.

## Specifying Distributions

Every probability density function $\rho(x)$ defined over some domain $x \in \mathcal{X}$ can be represented as $\rho(x) = \frac{1}{Z} f(x)$, where $f(x)$ is the un-normalized probability weight and $Z$ is the normalization constant. The normalization constant ensures that the density function integrates to $1$ over the whole domain $\mathcal{X}$.
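
As an illustration of the decomposition into $f(x)$ and $Z$, the sketch below approximates $Z$ numerically for the un-normalized weight $f(x) = \exp(-x^{2}/2)$, whose exact normalizer is $\sqrt{2\pi}$. It uses only the Scala standard library; the object and method names are hypothetical and are not part of DynaML or breeze.

```scala
object NormalizationSketch {
  // Un-normalized weight f(x) = exp(-x^2 / 2); its exact normalizer is sqrt(2 * pi).
  def f(x: Double): Double = math.exp(-0.5 * x * x)

  // Composite trapezoidal rule for the integral of g over [lo, hi].
  def integrate(g: Double => Double, lo: Double, hi: Double, steps: Int = 20000): Double = {
    val h = (hi - lo) / steps
    val interior = (1 until steps).map(i => g(lo + i * h)).sum
    h * (0.5 * (g(lo) + g(hi)) + interior)
  }

  // Z approximated on a window wide enough that the tails are negligible.
  val z: Double = integrate(x => f(x), -10.0, 10.0)

  // Normalized density rho(x) = f(x) / Z.
  def rho(x: Double): Double = f(x) / z

  def main(args: Array[String]): Unit = {
    println(s"Z = $z, sqrt(2 pi) = ${math.sqrt(2 * math.Pi)}")
    println(s"integral of rho = ${integrate(x => rho(x), -10.0, 10.0)}")
  }
}
```

For most distributions in this package $Z$ is available in closed form, so numerical integration like this is mainly useful as a sanity check.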

### Describing Skewness

An important analytical way to create skewed distributions was described by Azzalini et al. It consists of four components.

- A symmetric probability density $\varphi(.)$
- An odd function $w(.)$
- A cumulative distribution function $G(.)$ of some symmetric density
- A cut-off parameter $\tau$

$\rho(x) = \frac{1}{G(\tau)} \times \varphi(x)\times G(w(x) + \tau)$
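
The construction above can be sketched with concrete choices for each component. The example below is only an illustration under stated assumptions: it takes a standard Gaussian for $\varphi$, the logistic CDF for $G$, a linear odd function $w(x) = \alpha x$, and fixes $\tau = 0$ so that $1/G(\tau) = 2$. All names are hypothetical and this is not the `SkewSymmDistribution[T]` implementation.

```scala
object SkewSymmetricSketch {
  // Symmetric base density: standard Gaussian.
  def phi(x: Double): Double = math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

  // CDF of a symmetric density: the standard logistic distribution.
  def G(t: Double): Double = 1.0 / (1.0 + math.exp(-t))

  // Odd function w(x) = alpha * x, with a hypothetical skewness factor alpha.
  def w(alpha: Double): Double => Double = x => alpha * x

  // rho(x) = phi(x) * G(w(x) + tau) / G(tau), evaluated at tau = 0.
  def rho(alpha: Double): Double => Double = { x =>
    val tau = 0.0
    phi(x) * G(w(alpha)(x) + tau) / G(tau)
  }

  // Composite trapezoidal rule for the integral of g over [lo, hi].
  def integrate(g: Double => Double, lo: Double, hi: Double, steps: Int = 20000): Double = {
    val h = (hi - lo) / steps
    val interior = (1 until steps).map(i => g(lo + i * h)).sum
    h * (0.5 * (g(lo) + g(hi)) + interior)
  }

  def main(args: Array[String]): Unit = {
    val density = rho(2.0)
    println(s"total mass = ${integrate(density, -12.0, 12.0)}")
    println(s"mean       = ${integrate(x => x * density(x), -12.0, 12.0)}")
  }
}
```

With a positive `alpha` the mass shifts to the right of zero, so the computed mean is positive while the total mass stays at one.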

## Distributions API

The `Density[T]` and `Rand[T]` traits form the API entry points for implementing probability distributions in breeze. In the `dynaml.probability.distributions` package, both traits are inherited by `GenericDistribution[T]`, which is in turn extended by the `AbstractContinuousDistr[T]` and `AbstractDiscreteDistr[T]` classes.

Distributions which can produce confidence intervals

The trait `HasErrorBars[T]` can be mixed into a distribution to give it the ability to produce error bars. To extend it, implement the `confidenceInterval(s: Double): (T, T)` method.
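
The shape of such a mixin can be sketched as follows. The trait `ErrorBars[T]` below is a local stand-in that mirrors the method signature quoted above so the example is self-contained; it is not DynaML's actual trait. The meaning of `s` is also an assumption here: it is read as a number of standard deviations around the mean.

```scala
object ErrorBarsSketch {
  // Stand-in mirroring the method signature of HasErrorBars[T]; not DynaML's trait.
  trait ErrorBars[T] {
    def confidenceInterval(s: Double): (T, T)
  }

  // A minimal Gaussian that carries only its parameters.
  // Assumption: s is interpreted as a number of standard deviations.
  case class SimpleGaussian(mean: Double, sigma: Double) extends ErrorBars[Double] {
    require(sigma > 0.0, "sigma must be positive")

    override def confidenceInterval(s: Double): (Double, Double) =
      (mean - s * sigma, mean + s * sigma)
  }

  def main(args: Array[String]): Unit = {
    val (lower, upper) = SimpleGaussian(1.5, 0.5).confidenceInterval(2.0)
    println(s"[$lower, $upper]") // prints [0.5, 2.5]
  }
}
```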

Skewness

The `SkewSymmDistribution[T]` class is the generic base implementation for the skew-symmetric family of distributions in DynaML.

## Distributions Library

Apart from the distributions defined in `breeze.stats.distributions`, users have access to the following distributions implemented in `dynaml.probability.distributions`.

### Multivariate Student's T

Defines a Student's t distribution over the domain of finite-dimensional vectors.

$\mathcal{X} \equiv \mathbb{R}^{p}$

$f(x) = \left[1+{\frac {1}{\nu }}({\mathbf {x} }-{\boldsymbol {\mu }})^{\rm {T}}{\boldsymbol {\Sigma }}^{-1}({\mathbf {x} }-{\boldsymbol {\mu }})\right]^{-(\nu +p)/2}$

$Z = \frac{\Gamma (\nu /2)\,\nu ^{p/2}\pi ^{p/2}\left|{\boldsymbol {\Sigma }}\right|^{1/2}}{\Gamma \left[(\nu +p)/2\right]}$
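
As a quick sanity check (a worked special case added for illustration), take $p = 1$, $\boldsymbol{\Sigma} = \sigma^{2}$ and $\nu = 1$. Using $\Gamma(1) = 1$ and $\Gamma(1/2) = \sqrt{\pi}$, the normalized density $\rho(x)$ reduces to the Cauchy density with location $\mu$ and scale $\sigma$:

$\rho(x) = \frac{1}{\pi \sigma} \left[1 + \left(\frac{x - \mu}{\sigma}\right)^{2}\right]^{-1}$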

Usage:

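```scala
import breeze.linalg.{DenseMatrix, DenseVector}

// Degrees of freedom, written as \nu in the formulas above
val nu = 2.5
val mean = DenseVector(1.0, 0.0)
val cov = DenseMatrix((1.5, 0.5), (0.5, 2.5))
val d = MultivariateStudentsT(nu, mean, cov)
```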


### Matrix T

Defines a Student's t distribution over the domain of matrices.

$\mathcal{X} \equiv \mathbb{R}^{n \times p}$

$f(\mathbf{X}) = \left|{\mathbf {I}}_{n}+{\boldsymbol \Sigma }^{{-1}}({\mathbf {X}}-{\mathbf {M}}){\boldsymbol \Omega }^{{-1}}({\mathbf {X}}-{\mathbf {M}})^{{{\rm {T}}}}\right|^{{-{\frac {\nu +n+p-1}{2}}}}$

$Z = {\frac {(\pi )^{{\frac {np}{2}}}\Gamma _{p}\left({\frac {\nu +p-1}{2}}\right)}{\Gamma_{p}\left({\frac {\nu +n+p-1}{2}}\right)}}|{\boldsymbol \Omega }|^{{\frac {n}{2}}}|{\boldsymbol \Sigma }|^{{\frac {p}{2}}}$

Usage:

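```scala
import breeze.linalg.DenseMatrix

// Degrees of freedom, written as \nu in the formulas above
val nu = 2.5
val mean = DenseMatrix((-1.5, -0.5), (3.5, -2.5))
val cov_rows = DenseMatrix((1.5, 0.5), (0.5, 2.5))
val cov_cols = DenseMatrix((0.5, 0.1), (0.1, 1.5))
val d = MatrixT(nu, mean, cov_rows, cov_cols)
```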


### Matrix Normal

Defines a Gaussian distribution over the domain of matrices.

$\mathcal{X} \equiv \mathbb{R}^{n \times p}$

$f(x) = \exp\left( -\frac{1}{2} \, \mathrm{tr}\left[ \mathbf{V}^{-1} (\mathbf{X} - \mathbf{M})^{T} \mathbf{U}^{-1} (\mathbf{X} - \mathbf{M}) \right] \right)$

$Z = (2\pi)^{np/2} |\mathbf{V}|^{n/2} |\mathbf{U}|^{p/2}$
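
To make the roles of $\mathbf{U}$ (row covariance, $n \times n$) and $\mathbf{V}$ (column covariance, $p \times p$) concrete, the sketch below evaluates $\log \rho(\mathbf{X}) = \log f(\mathbf{X}) - \log Z$ for the special case $n = p = 2$. It uses plain arrays and closed-form $2 \times 2$ inverses so that it runs without breeze; the helper names are hypothetical and this is not how `MatrixNormal` is implemented.

```scala
object MatrixNormalSketch {
  type Mat = Array[Array[Double]]

  def transpose(a: Mat): Mat =
    Array.tabulate(a(0).length, a.length)((i, j) => a(j)(i))

  def multiply(a: Mat, b: Mat): Mat =
    Array.tabulate(a.length, b(0).length) { (i, j) =>
      b.indices.map(k => a(i)(k) * b(k)(j)).sum
    }

  def subtract(a: Mat, b: Mat): Mat =
    Array.tabulate(a.length, a(0).length)((i, j) => a(i)(j) - b(i)(j))

  def trace(a: Mat): Double = a.indices.map(i => a(i)(i)).sum

  // Determinant and inverse, valid for 2 x 2 matrices only.
  def det2(a: Mat): Double = a(0)(0) * a(1)(1) - a(0)(1) * a(1)(0)

  def inv2(a: Mat): Mat = {
    val d = det2(a)
    Array(Array(a(1)(1) / d, -a(0)(1) / d), Array(-a(1)(0) / d, a(0)(0) / d))
  }

  // log rho(X) = -0.5 * tr[V^-1 (X - M)^T U^-1 (X - M)] - log Z, for n = p = 2.
  def logPdf(x: Mat, m: Mat, u: Mat, v: Mat): Double = {
    val (n, p) = (2, 2)
    val r = subtract(x, m)
    val quad = trace(multiply(multiply(inv2(v), transpose(r)), multiply(inv2(u), r)))
    val logZ = 0.5 * n * p * math.log(2 * math.Pi) +
      0.5 * n * math.log(det2(v)) + 0.5 * p * math.log(det2(u))
    -0.5 * quad - logZ
  }

  def main(args: Array[String]): Unit = {
    val m: Mat = Array(Array(-1.5, -0.5), Array(3.5, -2.5))
    val u: Mat = Array(Array(1.5, 0.5), Array(0.5, 2.5))
    val v: Mat = Array(Array(0.5, 0.1), Array(0.1, 1.5))
    println(s"log density at the mean = ${logPdf(m, m, u, v)}")
  }
}
```

At $\mathbf{X} = \mathbf{M}$ the trace term vanishes, so the value printed is just $-\log Z$.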

Usage:

```scala
import breeze.linalg.DenseMatrix

val mean = DenseMatrix((-1.5, -0.5), (3.5, -2.5))
val cov_rows = DenseMatrix((1.5, 0.5), (0.5, 2.5))
val cov_cols = DenseMatrix((0.5, 0.1), (0.1, 1.5))
val d = MatrixNormal(mean, cov_rows, cov_cols)
```


### Truncated Normal

Defines a univariate Gaussian distribution restricted to a finite interval.

$\mathcal{X} \equiv [a, b]$

$f(x) = \begin{cases} \phi \left({\frac {x-\mu }{\sigma }}\right) & a \leq x \leq b\\ 0 & \text{otherwise}\end{cases}$

$Z = \sigma \left(\Phi ({\frac {b-\mu }{\sigma }})-\Phi ({\frac {a-\mu }{\sigma }})\right)$

Here $\phi()$ and $\Phi()$ are the standard Gaussian density function and cumulative distribution function respectively.
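
One simple way to draw from such a distribution is rejection sampling: propose from the untruncated Gaussian and keep the first draw that lands in $[a, b]$. The sketch below shows this with the Scala standard library only. It is an illustration of the idea, not a description of how `TruncatedGaussian` samples internally, and the object name is hypothetical.

```scala
import scala.annotation.tailrec
import scala.util.Random

object TruncatedNormalSketch {
  // Propose from N(mu, sigma^2) and retry until the proposal falls inside [a, b].
  @tailrec
  def draw(mu: Double, sigma: Double, a: Double, b: Double, rng: Random): Double = {
    val x = mu + sigma * rng.nextGaussian()
    if (x >= a && x <= b) x else draw(mu, sigma, a, b, rng)
  }

  def main(args: Array[String]): Unit = {
    val rng = new Random(42L)
    val samples = Vector.fill(10000)(draw(1.5, 1.5, -0.5, 2.5, rng))
    println(s"min = ${samples.min}, max = ${samples.max}")
    println(s"sample mean = ${samples.sum / samples.size}")
  }
}
```

The acceptance probability equals $\Phi(\frac{b-\mu}{\sigma}) - \Phi(\frac{a-\mu}{\sigma})$, so this approach becomes slow when the interval lies far out in a tail.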

Usage:

```scala
val mean = 1.5
val sigma = 1.5
// Truncation interval [a, b]
val (a, b) = (-0.5, 2.5)
val d = TruncatedGaussian(mean, sigma, a, b)
```


### Skew Gaussian

#### Univariate

$\mathcal{X} \equiv \mathbb{R}$

$f(x) = \phi\left(\frac{x - \mu}{\sigma}\right) \Phi\left(\alpha \frac{x-\mu}{\sigma}\right)$

$Z = \frac{\sigma}{2}$

Here $\phi()$ and $\Phi()$ are the standard Gaussian density function and cumulative distribution function respectively.

#### Multivariate

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu}, {\Sigma}) \Phi(\mathbf{\alpha}^{\intercal} L^{-1}(\mathbf{x} - \mathbf{\mu}))$

$Z = \frac{1}{2}$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi()$ are the multivariate gaussian density function and standard gaussian univariate cumulative distribution function respectively and $L$ is the lower triangular Cholesky decomposition of $\Sigma$.

Skewness parameter $\alpha$

The parameter $\alpha$ determines the skewness of the distribution and its sign tells us in which direction the distribution has a fatter tail. In the univariate case the parameter $\alpha$ is a scalar, while in the multivariate case $\alpha \in \mathbb{R}^d$, so for the multivariate skew gaussian distribution, there is a skewness value for each dimension.
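
The effect of the sign of $\alpha$ can be seen by sampling. The sketch below uses a well-known stochastic representation of the univariate skew Gaussian, $X = \mu + \sigma(\delta |U_0| + \sqrt{1 - \delta^{2}}\, U_1)$ with $\delta = \alpha / \sqrt{1 + \alpha^{2}}$ and $U_0, U_1$ independent standard Gaussians. This representation is brought in for illustration and is not taken from the text above or from the `SkewGaussian` source; all names are hypothetical.

```scala
import scala.util.Random

object SkewnessSignSketch {
  // One draw from the skew Gaussian with skewness alpha, location mu and scale sigma.
  def draw(alpha: Double, mu: Double, sigma: Double, rng: Random): Double = {
    val delta = alpha / math.sqrt(1.0 + alpha * alpha)
    val u0 = rng.nextGaussian()
    val u1 = rng.nextGaussian()
    mu + sigma * (delta * math.abs(u0) + math.sqrt(1.0 - delta * delta) * u1)
  }

  // Average of n draws with a fixed seed, so repeated runs give the same value.
  def sampleMean(alpha: Double, mu: Double, sigma: Double, n: Int, seed: Long): Double = {
    val rng = new Random(seed)
    Iterator.fill(n)(draw(alpha, mu, sigma, rng)).sum / n
  }

  def main(args: Array[String]): Unit = {
    println(s"alpha =  3.0: mean = ${sampleMean(3.0, 0.0, 1.0, 100000, 42L)}")
    println(s"alpha = -3.0: mean = ${sampleMean(-3.0, 0.0, 1.0, 100000, 42L)}")
  }
}
```

A positive $\alpha$ pulls the sample mean above $\mu$ (heavier right tail), while a negative $\alpha$ pulls it below.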

Usage:

```scala
import breeze.linalg.{DenseMatrix, DenseVector}

// Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5 // skewness parameter \alpha
val d = SkewGaussian(a, mean, sigma)

// Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4) * 1.5
val md = MultivariateSkewNormal(alpha, mu, cov)
```


### Extended Skew Gaussian

#### Univariate

A generalization of the univariate skew Gaussian distribution.

$\mathcal{X} \equiv \mathbb{R}$

$f(x) = \phi\left(\frac{x - \mu}{\sigma}\right) \Phi\left(\alpha \frac{x-\mu}{\sigma} + \tau\sqrt{1 + \alpha^{2}}\right)$

$Z = \sigma \Phi(\tau)$

Here $\phi()$ and $\Phi()$ are the standard Gaussian density function and cumulative distribution function respectively.
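
The role of the factor $\sqrt{1 + \alpha^{2}}$ can be checked numerically: in the standardized variable $z = (x - \mu)/\sigma$, the weight $\phi(z)\,\Phi(\alpha z + \tau\sqrt{1 + \alpha^{2}})$ integrates to $\Phi(\tau)$. The sketch below verifies this with the trapezoidal rule and a polynomial approximation of the error function (Abramowitz and Stegun 7.1.26). It is a standalone check with hypothetical names, not code from DynaML.

```scala
object ExtendedSkewNormalizerSketch {
  // Standard Gaussian density.
  def phi(z: Double): Double = math.exp(-0.5 * z * z) / math.sqrt(2 * math.Pi)

  // erf via the Abramowitz-Stegun 7.1.26 approximation (absolute error about 1.5e-7).
  def erf(x: Double): Double = {
    val t = 1.0 / (1.0 + 0.3275911 * math.abs(x))
    val poly = t * (0.254829592 + t * (-0.284496736 +
      t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))))
    val y = 1.0 - poly * math.exp(-x * x)
    if (x >= 0) y else -y
  }

  // Standard Gaussian CDF.
  def Phi(z: Double): Double = 0.5 * (1.0 + erf(z / math.sqrt(2.0)))

  // Un-normalized weight in the standardized variable z.
  def weight(alpha: Double, tau: Double): Double => Double =
    z => phi(z) * Phi(alpha * z + tau * math.sqrt(1.0 + alpha * alpha))

  // Composite trapezoidal rule for the integral of g over [lo, hi].
  def integrate(g: Double => Double, lo: Double, hi: Double, steps: Int = 20000): Double = {
    val h = (hi - lo) / steps
    val interior = (1 until steps).map(i => g(lo + i * h)).sum
    h * (0.5 * (g(lo) + g(hi)) + interior)
  }

  // Total mass of the weight over z, to be compared with Phi(tau).
  def mass(alpha: Double, tau: Double): Double =
    integrate(weight(alpha, tau), -12.0, 12.0)

  def main(args: Array[String]): Unit = {
    val (alpha, tau) = (-0.5, 0.5)
    println(s"integral = ${mass(alpha, tau)}, Phi(tau) = ${Phi(tau)}")
  }
}
```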

#### Multivariate

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu}, {\Sigma}) \Phi(\mathbf{\alpha}^{\intercal} L^{-1}(\mathbf{x} - \mathbf{\mu}) + \tau\sqrt{1 + \mathbf{\alpha}^{\intercal}\mathbf{\alpha}})$

$Z = \Phi(\tau)$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi()$ are the multivariate gaussian density function and standard gaussian univariate cumulative distribution function respectively and $L$ is the lower triangular Cholesky decomposition of $\Sigma$.

Usage:

```scala
import breeze.linalg.{DenseMatrix, DenseVector}

// Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5 // skewness parameter \alpha
val c = 0.5  // cut-off parameter \tau
val d = ExtendedSkewGaussian(c, a, mean, sigma)

// Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4) * 1.5
val tau = 0.2
val md = ExtendedMultivariateSkewNormal(tau, alpha, mu, cov)
```


Confusing Nomenclature

The following distribution has a very similar form and name to the extended skew Gaussian distribution shown above. Despite its deceptively similar formula, it is a very different object.

We use the name MESN to denote the variant below instead of its expanded form.

### MESN

The Multivariate Extended Skew Normal (MESN) distribution was formulated by Adcock and Shutes. It is given by

$\mathcal{X} \equiv \mathbb{R}^d$

$f(x) = \phi_{d}(\mathbf{x}; \mathbf{\mu} + \mathbf{\alpha}\tau, {\Sigma} + \mathbf{\alpha}\mathbf{\alpha}^\intercal) \Phi\left(\frac{\mathbf{\alpha}^{\intercal} \Sigma^{-1}(\mathbf{x} - \mathbf{\mu}) + \tau}{\sqrt{1 + \mathbf{\alpha}^{\intercal}\Sigma^{-1}\mathbf{\alpha}}}\right)$

$Z = \Phi(\tau)$

$\phi_{d}(.; \mathbf{\mu}, {\Sigma})$ and $\Phi()$ are the multivariate gaussian density function and standard gaussian univariate cumulative distribution function respectively.
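
To see how the MESN weight differs from the extended skew Gaussian, it helps to write it out for $d = 1$, where $\Sigma = \sigma^{2}$ and $\alpha$ is a scalar. The sketch below evaluates that one-dimensional weight and checks numerically that it integrates to $\Phi(\tau)$. Note that the Gaussian factor has its mean shifted to $\mu + \alpha\tau$ and its variance inflated to $\sigma^{2} + \alpha^{2}$. The names are hypothetical and this is not the `UESN` or `MESN` implementation.

```scala
object MesnUnivariateSketch {
  // erf via the Abramowitz-Stegun 7.1.26 approximation (absolute error about 1.5e-7).
  def erf(x: Double): Double = {
    val t = 1.0 / (1.0 + 0.3275911 * math.abs(x))
    val poly = t * (0.254829592 + t * (-0.284496736 +
      t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))))
    val y = 1.0 - poly * math.exp(-x * x)
    if (x >= 0) y else -y
  }

  // Standard Gaussian CDF.
  def Phi(z: Double): Double = 0.5 * (1.0 + erf(z / math.sqrt(2.0)))

  // Gaussian density with mean m and variance s2.
  def gaussian(x: Double, m: Double, s2: Double): Double =
    math.exp(-0.5 * (x - m) * (x - m) / s2) / math.sqrt(2 * math.Pi * s2)

  // MESN weight f(x) for d = 1, with Sigma = sigma2.
  def weight(mu: Double, sigma2: Double, alpha: Double, tau: Double): Double => Double = { x =>
    val shifted = gaussian(x, mu + alpha * tau, sigma2 + alpha * alpha)
    val arg = (alpha * (x - mu) / sigma2 + tau) / math.sqrt(1.0 + alpha * alpha / sigma2)
    shifted * Phi(arg)
  }

  // Composite trapezoidal rule for the integral of g over [lo, hi].
  def integrate(g: Double => Double, lo: Double, hi: Double, steps: Int = 20000): Double = {
    val h = (hi - lo) / steps
    val interior = (1 until steps).map(i => g(lo + i * h)).sum
    h * (0.5 * (g(lo) + g(hi)) + interior)
  }

  // Total mass of the weight, integrated 12 standard deviations either side of the shifted mean.
  def mass(mu: Double, sigma2: Double, alpha: Double, tau: Double): Double = {
    val centre = mu + alpha * tau
    val sd = math.sqrt(sigma2 + alpha * alpha)
    integrate(weight(mu, sigma2, alpha, tau), centre - 12 * sd, centre + 12 * sd)
  }

  def main(args: Array[String]): Unit =
    println(s"integral = ${mass(1.5, 2.25, -0.5, 0.5)}, Phi(tau) = ${Phi(0.5)}")
}
```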

Usage:

```scala
import breeze.linalg.{DenseMatrix, DenseVector}

// Univariate
val mean = 1.5
val sigma = 1.5
val a = -0.5 // skewness parameter \alpha
val c = 0.5  // cut-off parameter \tau
val d = UESN(c, a, mean, sigma)

// Multivariate
val mu = DenseVector.ones[Double](4)
val alpha = DenseVector.fill[Double](4)(1.2)
val cov = DenseMatrix.eye[Double](4) * 1.5
val tau = 0.2
val md = MESN(tau, alpha, mu, cov)
```


Extended Skew Gaussian Process ESGP

The MESN distribution is used to define the finite-dimensional probabilities of the ESGP.