Model Pipes
Summary
Model pipes define pipelines which involve predictive models.
Note
The classes described here exist in the dynaml.modelpipe
package of the dynaml-core
module. Although they are not strictly part of the pipes module, they are included here for clarity and continuity.
The pipes module gives the user the ability to create workflows of arbitrary complexity. In order to enable end to end machine learning, we need pipelines which involve predictive models. These pipelines can be of two types.
-
Pipelines which take data as input and output a predictive model.
It is evident that the model creation itself is a common step in the data analysis workflow, therefore one needs library pipes which create machine learning models given the training data and other relevant inputs.
-
Pipelines which encapsulate predictive models and generate predictions for test data splits.
Once a model has been tuned/trained, it can be a part of a pipeline which generates predictions for previously unobserved data.
Model Creation¶
All pipelines which return predictive models as outputs extend the ModelPipe
trait.
Generalized Linear Model Pipe¶
//Pre-process data
val pre: (Source) => Stream[(DenseVector[Double], Double)] = _
val feature_map: (DenseVector[Double]) => (DenseVector[Double]) = _
val glm_pipe =
GLMPipe[(DenseMatrix[Double], DenseVector[Double]), Source](
pre, map, task = "regression",
modelType = "")
val dataSource: Source = _
val glm_model = glm_pipe(dataSource)
- Type:
DataPipe[Source, GeneralizedLinearModel[T]]
- Result: Takes as input a data of type
Source
and outputs a Generalized Linear Model.
Generalized Least Squares Model Pipe¶
val kernel: LocalScalarKernel[DenseVector[Double]]
val gls_pipe2 = GeneralizedLeastSquaresPipe2(kernel)
val featuremap: (DenseVector[Double]) => (DenseVector[Double]) = _
val data: Stream[(DenseVector[Double], Double)] = _
val gls_model = gls_pipe2(data, featuremap)
- Type:
DataPipe2[Stream[(DenseVector[Double], Double)], DataPipe[DenseVector[Double], DenseVector[Double]], GeneralizedLeastSquaresModel]]
- Result: Takes as inputs data and a feature mapping and outputs a Generalized Least Squares Model.
Gaussian Process Regression Model Pipe¶
//Pre-process data
val pre: (Source) => Stream[(DenseVector[Double], Double)] = _
//Declare kernel and noise
val kernel: LocalScalarKernel[DenseVector[Double]] = _
val noise: LocalScalarKernel[DenseVector[Double]] = _
GPRegressionPipe(
pre, kernel, noise,
order: Int = 0, ex: Int = 0)
- Type:
DataPipe[Source, M]
- Result: Takes as input data of type
Source
and outputs a Gaussian Process regression model as the output.
Dual LS-SVM Model Pipe¶
//Pre-process data
val pre: (Source) => Stream[(DenseVector[Double], Double)] = _
//Declare kernel
val kernel: LocalScalarKernel[DenseVector[Double]] = _
DLSSVMPipe(pre, kernel, task = "regression")
- Type:
DataPipe[Source, DLSSVM]
- Result: Takes as input data of type
Source
and outputs a LS-SVM regression/classification model as the output.
Model Prediction¶
Prediction pipelines encapsulate predictive models, the ModelPredictionPipe
class provides an expressive API for creating prediction pipelines.
//Any model
val model: Model[T, Q, R] = _
//Data pre and post processing
val preprocessing: DataPipe[P, Q] = _
val postprocessing: DataPipe[R, S] = _
val prediction_pipeline = ModelPredictionPipe(
preprocessing,
model,
postprocessing)
//In case no pre or post processing is done.
val prediction_pipeline2 = ModelPredictionPipe(model)
//Incase feature and target scaling is performed
val featureScaling: ReversibleScaler[Q] = _
val targetScaling: ReversibleScaler[R] = _
val prediction_pipeline3 = ModelPredictionPipe(
featureScaling,
model,
targetScaling)