State of DynaML 2016
Summarizes some of the pet projects being tackled in DynaML
The past year has seen DynaML grow by leaps and bounds. This post gives an update on what has been achieved and a taste of what is to come.
Completed Features
A short tour of the enhancements which were completed.
January to June
 Released v1.3.x series with the following new additions
Models
 Regularized Least Squares
 Logistic and Probit Regression
 Feed Forward Neural Nets
 Gaussian Process (GP) classification and NARX based models
 Least Squares Support Vector Machines (LSSVM) for classification and regression
 Meta model API, committee models
Optimization Primitives
 Regularized Least Squares Solvers
 Gradient Descent
 Committee model solvers
 Linear Solvers for LSSVM
 Laplace approximation for GPs
Miscellaneous
 Data Pipes API

 Migration to Scala version 2.11.8

 Started work on release 1.4.x series with initial progress
Improvements
 Migrated from Maven to Sbt.
 Set Ammonite as default REPL.
June to December
 Released v1.4 with the following features.
Models
The following inference models have been added.
 LSSVM committees.
 Multi-output, multi-task Gaussian Process models as reviewed in Lawrence et al.
 Student-t processes: single- and multi-output, inspired by Shah, Ghahramani et al.
 Performance improvement to computation of marginal likelihood and posterior predictive distribution in Gaussian Process models.
 The posterior predictive distribution returned by the AbstractGPRegression base class is now a MultGaussianRV, which has been added to the dynaml.probability package.
Kernels

 Added StationaryKernel and LocallyStationaryKernel classes in the kernel APIs, and converted RBFKernel, CauchyKernel, RationalQuadraticKernel and LaplacianKernel to subclasses of StationaryKernel.

 Added MLPKernel, which implements the multi-layer perceptron (MLP) kernel as shown here.
 Added coregionalization kernels, which are used in Lawrence et al. to formulate kernels for vector-valued functions. The following coregionalization kernels were implemented:
  CoRegRBFKernel
  CoRegCauchyKernel
  CoRegLaplaceKernel
  CoRegDiracKernel

Improved performance when calculating kernel matrices for composite kernels.

 Added the :* operator to kernels so that one can create separable kernels used in coregionalization models.
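To give a feel for what a separable kernel is, here is a minimal sketch in plain Scala. It assumes the usual factorized form K((x, i), (y, j)) = k(x, y) * b(i, j), with k acting on inputs and b on output indices. The `Kernel` trait, the `rbf` and `dirac` helpers, and the `:*` method below are illustrative stand-ins written for this post, not DynaML's actual classes or signatures.

```scala
object SeparableKernelSketch {
  // Hypothetical minimal kernel abstraction (not DynaML's kernel trait).
  trait Kernel[T] { self =>
    def evaluate(x: T, y: T): Double

    // Illustrative product operator: combines a kernel on T with a kernel
    // on U into a separable kernel on pairs (T, U).
    def :*[U](other: Kernel[U]): Kernel[(T, U)] = new Kernel[(T, U)] {
      def evaluate(a: (T, U), b: (T, U)): Double =
        self.evaluate(a._1, b._1) * other.evaluate(a._2, b._2)
    }
  }

  // Squared-exponential kernel on real-valued inputs.
  def rbf(bandwidth: Double): Kernel[Double] = new Kernel[Double] {
    def evaluate(x: Double, y: Double): Double = {
      val d = x - y
      math.exp(-d * d / (2.0 * bandwidth * bandwidth))
    }
  }

  // Dirac kernel on output indices: each output correlates only with itself.
  val dirac: Kernel[Int] = new Kernel[Int] {
    def evaluate(i: Int, j: Int): Double = if (i == j) 1.0 else 0.0
  }

  // Separable kernel over (input, output index) pairs.
  val separable: Kernel[(Double, Int)] = rbf(1.0) :* dirac
}
```

Evaluating `SeparableKernelSketch.separable.evaluate((0.0, 1), (1.0, 1))` multiplies the input similarity by the output similarity, so two points belonging to different outputs get zero covariance under the Dirac choice.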
Optimization
 Improved performance of CoupledSimulatedAnnealing and enabled use of four variants of Coupled Simulated Annealing, adding the ability to set the annealing schedule using the so-called variance control scheme as outlined in deSouza, Suykens et al.
Pipes

 Added Scaler and ReversibleScaler traits to represent transformations which input and output into the same domain set; these traits are extensions of DataPipe.
Added Discrete Wavelet Transform based on the Haar wavelet.
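For readers unfamiliar with the Haar wavelet, one level of the transform can be sketched in a few lines of plain Scala. This is only an illustration of the technique, assuming the orthonormal convention (scaling by 1/sqrt(2)) and an even-length signal; the object and method names are made up for this post and do not reflect DynaML's pipe API.

```scala
object HaarSketch {
  private val invSqrt2 = 1.0 / math.sqrt(2.0)

  // One level of the orthonormal Haar transform.
  // Returns (approximation coefficients, detail coefficients).
  def forward(signal: Vector[Double]): (Vector[Double], Vector[Double]) = {
    require(signal.length % 2 == 0, "signal length must be even")
    val pairs = signal.grouped(2).toVector
    val approx = pairs.map(p => (p(0) + p(1)) * invSqrt2)
    val detail = pairs.map(p => (p(0) - p(1)) * invSqrt2)
    (approx, detail)
  }

  // Inverse of `forward`: rebuilds each pair of samples from its
  // approximation and detail coefficients.
  def inverse(approx: Vector[Double], detail: Vector[Double]): Vector[Double] = {
    require(approx.length == detail.length, "coefficient vectors must match in length")
    approx.zip(detail).flatMap { case (a, d) =>
      Vector((a + d) * invSqrt2, (a - d) * invSqrt2)
    }
  }
}
```

Because the transform is invertible, it fits naturally with the reversible scaler idea above: applying `inverse` to the output of `forward` recovers the original signal up to floating point error.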

 Started work on v1.4.1 with the following progress
Linear Algebra API

Partitioned Matrices/Vectors and the following operations
 Addition, Subtraction
 Matrix, vector multiplication
 LU, Cholesky
 A\y, A\Y
Probability API
 Added API end points for representing Measurable Functions of random variables.
Model Evaluation
 Added Matthews Correlation Coefficient calculation to BinaryClassificationMetrics via the matthewsCCByThreshold method
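As a reminder of what is being computed, the Matthews Correlation Coefficient is (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below evaluates it at a set of score thresholds in plain Scala. The function names, the (score, label) input format and the convention of returning 0.0 when the denominator vanishes are assumptions made for illustration; they are not the signature of matthewsCCByThreshold.

```scala
object MatthewsSketch {
  // MCC from confusion-matrix counts. Returns 0.0 when the denominator
  // is zero (a common convention, assumed here).
  def mcc(tp: Long, tn: Long, fp: Long, fn: Long): Double = {
    val denom = math.sqrt(
      (tp + fp).toDouble * (tp + fn).toDouble *
      (tn + fp).toDouble * (tn + fn).toDouble)
    if (denom == 0.0) 0.0 else (tp * tn - fp * fn).toDouble / denom
  }

  // For each threshold t, predict positive when score >= t and compute MCC.
  // Labels are assumed to be 1.0 (positive) or 0.0 (negative).
  def mccByThreshold(
      scoresAndLabels: Seq[(Double, Double)],
      thresholds: Seq[Double]): Seq[(Double, Double)] =
    thresholds.map { t =>
      val tp = scoresAndLabels.count { case (s, l) => s >= t && l == 1.0 }
      val fp = scoresAndLabels.count { case (s, l) => s >= t && l == 0.0 }
      val fn = scoresAndLabels.count { case (s, l) => s < t && l == 1.0 }
      val tn = scoresAndLabels.count { case (s, l) => s < t && l == 0.0 }
      (t, mcc(tp, tn, fp, fn))
    }
}
```

The coefficient ranges from -1 (total disagreement) to 1 (perfect prediction), which makes it a useful single-number summary when the classes are imbalanced.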
Data Pipes API
 Added Encoder[S,D] traits which are reversible data pipes representing an encoding between types S and D.
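The idea of a reversible pipe can be sketched roughly as follows. The `Pipe` and `Encoder` traits and the `encoder` helper here are simplified stand-ins written for this post, assuming an encoder is just a forward transformation paired with its inverse; DynaML's actual DataPipe and Encoder definitions may differ.

```scala
object EncoderSketch {
  // Hypothetical stand-in for a data pipe: a transformation from S to D.
  trait Pipe[S, D] { def run(data: S): D }

  // Hypothetical encoder: a pipe bundled with the pipe that undoes it.
  trait Encoder[S, D] extends Pipe[S, D] {
    def inverse: Pipe[D, S]
  }

  // Builds an encoder from a pair of functions. The caller is responsible
  // for making sure dec really inverts enc.
  def encoder[S, D](enc: S => D, dec: D => S): Encoder[S, D] =
    new Encoder[S, D] {
      def run(data: S): D = enc(data)
      val inverse: Pipe[D, S] = new Pipe[D, S] {
        def run(data: D): S = dec(data)
      }
    }

  // Example: encode class labels as integer indices and back.
  val labels: Vector[String] = Vector("red", "white")
  val labelToIndex: Encoder[String, Int] =
    encoder[String, Int](s => labels.indexOf(s), i => labels(i))
}
```

With this in place, `labelToIndex.run("white")` yields the index of the label and `labelToIndex.inverse.run(...)` maps it back, which is the round trip an encoding between S and D is meant to guarantee.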
Miscellaneous
 Updated ammonite version to 0.8.1
 Added support for compiling basic R code with renjin. Run R code in the following manner:
val toRDF = csvToRDF("dfWine", ';')
val wine_quality_red = toRDF("data/winequalityred.csv")
//Descriptive statistics
val commands: String = """
print(summary(dfWine))
print("\n")
print(str(dfWine))
"""
r(commands)
//Build Linear Model
val modelGLM = rdfToGLM("model", "quality", Array("fixed.acidity", "citric.acid", "chlorides"))
modelGLM("dfWine")
//Print goodness of fit
r("print(summary(model))")
Ongoing Work
Some projects being worked on right now are:
 Bayesian optimization using Gaussian Process models.
 Implementation of Neural Networks using the akka actor API.
 Implementation of kernels which can be decomposed on data dimensions k((x_1, x_2), (y_1, y_2)) = k_1(x_1, y_1) + k_2(x_2, y_2)
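
The decomposition in the last item can be sketched in plain Scala as below. Since this work is still in progress, everything here (the `Kernel` type alias, the `decomposable` combinator and the two example kernels) is a hypothetical illustration of the formula, not the API that will ship.

```scala
object AdditiveKernelSketch {
  // A kernel is modelled here simply as a function of two points.
  type Kernel[T] = (T, T) => Double

  // k((x_1, x_2), (y_1, y_2)) = k_1(x_1, y_1) + k_2(x_2, y_2)
  def decomposable[A, B](k1: Kernel[A], k2: Kernel[B]): Kernel[(A, B)] =
    (x: (A, B), y: (A, B)) => k1(x._1, y._1) + k2(x._2, y._2)

  // Example component kernels, one per data dimension.
  val rbf: Kernel[Double] = (x, y) => math.exp(-0.5 * (x - y) * (x - y))
  val linear: Kernel[Double] = (x, y) => x * y

  // Kernel on two-dimensional points that treats each dimension separately.
  val k: Kernel[(Double, Double)] = decomposable(rbf, linear)
}
```

Each component kernel only ever sees its own dimension, so the two parts can be given different functional forms and hyper-parameters.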