v1.4.1
Version 1.4.1 of DynaML, released March 26, 2017, implements a number of new models (extended skew GP, Student's T process, generalized least squares, etc.) and features.
Pipes API

Additions
The pipes API has been vastly extended with pipes that encapsulate functions of multiple arguments, leading to the following endpoints.
DataPipe2[A, B, C]
: Pipe which takes 2 arguments

DataPipe3[A, B, C, D]
: Pipe which takes 3 arguments

DataPipe4[A, B, C, D, E]
: Pipe which takes 4 arguments
Furthermore, it is now possible to create pipes which return pipes, akin to curried functions in functional programming; see the sketch after the list below.
MetaPipe
: Takes an argument and returns a DataPipe

MetaPipe21
: Takes 2 arguments and returns a DataPipe

MetaPipe12
: Takes an argument and returns a DataPipe2
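As an illustration, here is a minimal sketch of a two argument pipe and a meta pipe; the factory apply methods are assumed to mirror the familiar DataPipe(...) constructor.

import io.github.mandar2812.dynaml.pipes._

//A pipe of two arguments: adds its inputs
val addPipe = DataPipe2((x: Double, y: Double) => x + y)
addPipe(1.0, 2.0) //= 3.0

//A pipe which returns a pipe: fix a scale factor, get back a DataPipe
val scalePipe = MetaPipe((alpha: Double) => (x: Double) => alpha*x)
val doubler = scalePipe(2.0)
doubler(3.0) //= 6.0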
A new kind of stream data pipe, StreamFlatMapPipe, has been added to represent data pipelines which perform flatMap-like operations on streams.
//A function mapping each element of type I to a stream of type J
val mapFunc: (I) => Stream[J] = ...
val streamFMPipe = StreamFlatMapPipe(mapFunc)
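For instance, a hypothetical pipe which expands each integer n into the stream 1, ..., n:

val expand = StreamFlatMapPipe((n: Int) => (1 to n).toStream)
expand(Stream(2, 3)) //yields Stream(1, 2, 1, 2, 3)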
- Added Data Pipes API for Apache Spark RDDs.
val num = 20
//An RDD of the integers 1 to 20
val numbers = sc.parallelize(1 to num)
//Element-wise pipes: each function is mapped over the RDD
val convPipe = RDDPipe((n: Int) => n.toDouble)
val sqPipe = RDDPipe((x: Double) => x*x)
val sqrtPipe = RDDPipe((x: Double) => math.sqrt(x))
//A pipe acting on the RDD as a whole: sum the elements
val resultPipe = RDDPipe((r: RDD[Double]) => r.reduce(_+_).toInt)
//Compose the pipes with > and run the pipeline
val netPipeline = convPipe > sqPipe > sqrtPipe > resultPipe
netPipeline(numbers)
- Added UnivariateGaussianScaler class for Gaussian scaling of univariate data; a hypothetical usage sketch is given below.
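A minimal sketch, assuming the scaler is constructed from a mean and a standard deviation and applied like any other data pipe:

//Hypothetical constructor arguments: mean and standard deviation
val scaler = UnivariateGaussianScaler(2.5, 0.5)
scaler(3.0) //(3.0 - 2.5)/0.5 = 1.0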
Core API

Additions
Package dynaml.models.bayes
This new package houses stochastic prior models; there is currently support for GP and Skew GP priors. For a starting example, see stochasticPriors.sc in the scripts directory of the DynaML source.
Package dynaml.kernels
- Added evaluateAt(h)(x,y) and gradientAt(h)(x,y); evaluate(x,y) and gradient(x,y) are now expressed in terms of them.
- Added asPipe method for covariance functions.
- For backwards compatibility, users are advised to extend LocalSVMKernel in their custom kernel implementations in case they do not want to implement the evaluateAt API endpoints.
- Added FeatureMapKernel, representing kernels which can be explicitly decomposed into feature mappings.
- Added the Matern half-integer kernel GenericMaternKernel[I].
- Added block(S: String*) method to block any hyper-parameters of kernels (see the sketch after the example below).
- Added NeuralNetworkKernel and GaussianSpectralKernel.
- Added DecomposableCovariance:
import breeze.linalg.DenseVector
import io.github.mandar2812.dynaml.analysis.VectorField
import io.github.mandar2812.dynaml.DynaMLPipe._
import io.github.mandar2812.dynaml.kernels._

implicit val ev = VectorField(6)
//Splits an input vector into 2 parts, one per component kernel
implicit val sp = breezeDVSplitEncoder(2)
implicit val sumR = sumReducer

val kernel = new LaplacianKernel(1.5)
val other_kernel = new PolynomialKernel(1, 0.05)
val decompKernel = new DecomposableCovariance(kernel, other_kernel)(sp, sumReducer)

val other_kernel1 = new FBMKernel(1.0)
//Decomposable covariances can themselves be components
val decompKernel1 = new DecomposableCovariance(decompKernel, other_kernel1)(sp, sumReducer)

val veca = DenseVector.tabulate[Double](8)(math.sin(_))
val vecb = DenseVector.tabulate[Double](8)(math.cos(_))
decompKernel1.evaluate(veca, vecb)
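The new block method can be used to hold kernel hyper-parameters fixed during model selection; a brief sketch, assuming kernels continue to expose their hyper_parameters list:

//Inspect the hyper-parameter names and block the first one;
//blocked hyper-parameters are skipped during optimization
println(kernel.hyper_parameters)
kernel.block(kernel.hyper_parameters.head)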
Package dynaml.algebra
Partitioned matrices/vectors and the following operations:
- Addition, subtraction
- Matrix and vector multiplication
- LU and Cholesky decompositions
- Linear solve: A\y, A\Y
Added calculation of quadratic forms, namely:

quadraticForm
: calculates $\mathbf{x}^\intercal A^{-1} \mathbf{x}$

crossQuadraticForm
: calculates $\mathbf{y}^\intercal A^{-1} \mathbf{x}$

where $A$ is assumed to be a symmetric positive semi-definite matrix.
Usage:

import breeze.linalg.{DenseMatrix, DenseVector}
import io.github.mandar2812.dynaml.algebra._

val x: DenseVector[Double] = ...
val y: DenseVector[Double] = ...
val a: DenseMatrix[Double] = ...

quadraticForm(a, x)
crossQuadraticForm(y, a, x)
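As a quick sanity check (a worked instance, not DynaML output): for $A = I_2$, $\mathbf{x} = (1, 2)^\intercal$ and $\mathbf{y} = (3, 4)^\intercal$, we have $\mathbf{x}^\intercal A^{-1} \mathbf{x} = 1 + 4 = 5$ and $\mathbf{y}^\intercal A^{-1} \mathbf{x} = 3 + 8 = 11$.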
Package dynaml.modelpipe
New package: all classes inheriting from ModelPipe have been moved here.
Added the following:

GLMPipe2
: A pipe taking two arguments and returning a GeneralizedLinearModel instance

GeneralizedLeastSquaresPipe2
: A pipe taking two arguments and returning a GeneralizedLeastSquaresModel instance

GeneralizedLeastSquaresPipe3
: A pipe taking three arguments and returning a GeneralizedLeastSquaresModel instance
Package dynaml.models
- Added a new neural networks API: NeuralNet and GenericFFNeuralNet; for an example refer to TestNNDelve in dynaml-examples.
- GeneralizedLeastSquaresModel: the GLS model.
- ESGPModel: an implementation of a skew Gaussian process regression model.
- Warped Gaussian Process models (work in progress).
- Added mean function capability to Gaussian Process and Student T process models.
- Added Apache Spark implementations of Generalized Linear Models; see SparkGLM, SparkLogisticModel and SparkProbitGLM.
Package dynaml.probability
- MultivariateSkewNormal, as specified in Azzalini et al.
- ExtendedMultivariateSkewNormal
- UESN and MESN, representing an alternative formulation of the skew Gaussian family from Adcock and Shutes.
- TruncatedGaussian: a truncated version of the Gaussian distribution.
- Matrix normal and matrix T distributions (usage examples appear after the MCMC additions below).
- Added an Expectation operator for RandomVariable implementations in the io.github.mandar2812.dynaml.probability package object; a usage example is given below.
- SkewGaussian, ExtendedSkewGaussian: breeze implementations of the skew-Gaussian and extended skew-Gaussian distributions respectively.
- PushforwardMap, DifferentiableMap: PushforwardMap enables creating new random variables, with well defined densities, from base random variables.
import io.github.mandar2812.dynaml.analysis._
import io.github.mandar2812.dynaml.probability._
import io.github.mandar2812.dynaml.probability.distributions._
val g = GaussianRV(0.0, 0.25)
val sg = RandomVariable(SkewGaussian(1.0, 0.0, 0.25))
//Define a determinant implementation for the Jacobian type (Double in this case)
implicit val detImpl = identityPipe[Double]
//Defines a homeomorphism y = exp(x) x = log(y)
val h: PushforwardMap[Double, Double, Double] = PushforwardMap(
DataPipe((x: Double) => math.exp(x)),
DifferentiableMap(
(x: Double) => math.log(x),
(x: Double) => 1.0/x)
)
//Creates a log-normal random variable
val p = h->g
//Creates a log-skew-gaussian random variable
val q = h->sg
//Calculate expectation of q
println("E[Q] = "+E(q))
- Added ContinuousMCMC and the underlying sampling implementation in GeneralMetropolisHastings.
- Added an implementation of Approximate Bayesian Computation (ABC) in the ApproxBayesComputation class.
Usage of the matrix normal distribution:

//The mean
val center: DenseMatrix[Double] = ...
//Covariance (positive semi-def) matrix among rows
val sigmaRows: DenseMatrix[Double] = ...
//Covariance (positive semi-def) matrix among columns
val sigmaCols: DenseMatrix[Double] = ...

val matD = MatrixNormal(center, sigmaRows, sigmaCols)

Usage of the matrix T distribution:

//The degrees of freedom (must be > 2.0 for existence of finite moments)
val mu: Double = ...
//The mean
val center: DenseMatrix[Double] = ...
//Covariance (positive semi-def) matrix among rows
val sigmaRows: DenseMatrix[Double] = ...
//Covariance (positive semi-def) matrix among columns
val sigmaCols: DenseMatrix[Double] = ...

val matD = MatrixT(mu, center, sigmaCols, sigmaRows)
Package dynaml.optimization
- Added ProbGPCommMachine, which performs grid search or CSA and then, instead of selecting a single hyper-parameter configuration, calculates a weighted Gaussian Process committee, with weights corresponding to the probability of (or confidence in) each model instance (hyper-parameter configuration).
Package dynaml.utils
- Added mvlgamma, the logarithm of the multivariate gamma function.

//Returns logarithm of multivariate gamma function
val g = mvlgamma(5, 1.5)
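For reference, the multivariate gamma function has the standard form $\Gamma_p(a) = \pi^{p(p-1)/4} \prod_{j=1}^{p} \Gamma\left(a + \frac{1-j}{2}\right)$, so mvlgamma returns $\log \Gamma_p(a)$.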
Package dynaml.dataformat
- Added support for reading MATLAB .mat files in the MAT object.
Improvements/Bug Fixes
Package dynaml.probability
- Removed ProbabilityModel, replacing it with JointProbabilityScheme and BayesJointProbabilityScheme; major refactoring of the RandomVariable API.
Package dynaml.optimization
- Improved logging in CoupledSimulatedAnnealing.
- Refactored GPMLOptimizer into GradBasedGlobalOptimizer.
Package dynaml.utils
- Correction to the utils.getStats method used for calculating the mean and variance of data sets consisting of DenseVector[Double].
- Fixed minMaxScalingTrainTest and minMaxScaling in DynaMLPipe, which were using GaussianScaler instead of MinMaxScaler for processing features.
Package dynaml.kernels
- Fix to CoRegCauchyKernel: corrected a mismatch in the hyper-parameter string.
- Fix to the SVMKernel object's matrix gradient computation in the case when kernel dimensions are not multiples of the block size.
- Correction to the gradient calculation in the RBF kernel family.
- Speed-up of kernel gradient computation: kernel and kernel gradient matrices with respect to the model hyper-parameters are now calculated in a single pass through the data.