Performance Evaluation
Model evaluation is the litmus test for knowing whether your modeling effort is headed in the right direction, and for comparing alternative models (or hypotheses) that attempt to explain a phenomenon. The evaluation package contains classes and traits to calculate performance metrics for DynaML models.
Classes which implement model performance calculation can extend the Metrics[P] trait. The Metrics trait requires that its sub-classes implement three methods or behaviors.
- Print out the performance metrics (whatever they may be) to the screen, i.e. the print method.
- Return the key performance indicators in the form of a breeze DenseVector[Double], i.e. the kpi() method.
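The behaviors listed above can be illustrated with a small, dependency-free sketch. The trait SketchMetrics and the class MaeOnly below are hypothetical names invented for illustration; they are not DynaML's actual Metrics[P] definition, and a plain Scala Vector stands in for breeze's DenseVector so that the example runs without extra libraries.

```scala
// Hypothetical stand-in for the shape described above.
// This is NOT DynaML's Metrics[P]; names and signatures are assumptions.
trait SketchMetrics[P] {
  // The (prediction, actual) pairs the metrics are computed from.
  protected def pairs: List[(P, P)]

  // Print the performance metrics to the screen.
  def print(): Unit

  // Return the key performance indicators.
  // A plain Vector is used here in place of breeze's DenseVector.
  def kpi(): Vector[Double]
}

// Example implementation reporting a single indicator: mean absolute error.
class MaeOnly(protected val pairs: List[(Double, Double)])
    extends SketchMetrics[Double] {

  private val mae: Double =
    pairs.map { case (p, y) => math.abs(p - y) }.sum / pairs.length

  override def print(): Unit = println(f"Mean Absolute Error: $mae%.4f")

  override def kpi(): Vector[Double] = Vector(mae)
}

object SketchMetricsDemo {
  def main(args: Array[String]): Unit = {
    val metrics = new MaeOnly(List((1.0, 2.0), (3.0, 3.0), (5.0, 4.0)))
    metrics.print()
    println(metrics.kpi())
  }
}
```

Returning the indicators as a vector, separately from printing them, lets calling code compare models programmatically instead of parsing screen output.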
Regression Models
Regression models are generally evaluated on a few standard metrics such as mean square error, mean absolute error, coefficient of determination (R^2), etc. DynaML has implementations for single output and multi-output regression models.
Single Output
Small Test Set
The RegressionMetrics class takes as input a Scala List of (prediction, actual output) pairs and calculates the following metrics.
- Mean Absolute Error (mae)
- Root Mean Square Error (rmse)
- Correlation Coefficient (\rho_{y \hat{y}})
- Coefficient of Determination (R^2)
//Predictions computed by any model.
val predictionAndOutputs: List[(Double, Double)] = ...
val metrics = new RegressionMetrics(predictionAndOutputs, predictionAndOutputs.length)
//Print results on screen
metrics.print
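To make the four metrics listed above concrete, here is a minimal sketch that computes them by hand from a list of (prediction, actual) pairs using their textbook definitions. The object RegressionSketch and its function names are hypothetical, and DynaML's own implementation may differ in details, so treat this as a reference for the formulas rather than a description of the library's internals.

```scala
// Textbook definitions of the four regression metrics.
// RegressionSketch is a hypothetical helper, not part of DynaML.
object RegressionSketch {
  // Each pair is (prediction, actual output).
  type Pairs = List[(Double, Double)]

  // Mean absolute error: average of |prediction - actual|.
  def mae(pairs: Pairs): Double =
    pairs.map { case (p, y) => math.abs(p - y) }.sum / pairs.length

  // Root mean square error: square root of the average squared error.
  def rmse(pairs: Pairs): Double =
    math.sqrt(pairs.map { case (p, y) => (p - y) * (p - y) }.sum / pairs.length)

  // Pearson correlation between predictions and actual outputs.
  def corr(pairs: Pairs): Double = {
    val n = pairs.length.toDouble
    val meanP = pairs.map(_._1).sum / n
    val meanY = pairs.map(_._2).sum / n
    val cov = pairs.map { case (p, y) => (p - meanP) * (y - meanY) }.sum
    val varP = pairs.map { case (p, _) => (p - meanP) * (p - meanP) }.sum
    val varY = pairs.map { case (_, y) => (y - meanY) * (y - meanY) }.sum
    cov / math.sqrt(varP * varY)
  }

  // Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
  def rSquared(pairs: Pairs): Double = {
    val meanY = pairs.map(_._2).sum / pairs.length
    val ssRes = pairs.map { case (p, y) => (y - p) * (y - p) }.sum
    val ssTot = pairs.map { case (_, y) => (y - meanY) * (y - meanY) }.sum
    1.0 - ssRes / ssTot
  }

  def main(args: Array[String]): Unit = {
    val pairs: Pairs = List((2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0))
    println(f"mae  = ${mae(pairs)}%.4f")
    println(f"rmse = ${rmse(pairs)}%.4f")
    println(f"corr = ${corr(pairs)}%.4f")
    println(f"R^2  = ${rSquared(pairs)}%.4f")
  }
}
```

Note that R^2 computed this way can be negative when the predictions fit worse than simply predicting the mean of the actual outputs.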
Large Test Set
The RegressionMetricsSpark class takes as input an Apache Spark RDD containing the predictions and actual outputs and calculates the same metrics as above.
//Predictions computed by any model.
val predictionAndOutputs: RDD[(Double, Double)] = ...
val metrics = new RegressionMetricsSpark(predictionAndOutputs, predictionAndOutputs.count())
//Print results on screen
metrics.print
Multiple Outputs
The MultiRegressionMetrics class calculates regression performance for multi-output models.
//Predictions computed by any model.
val predictionAndOutputs: List[(DenseVector[Double], DenseVector[Double])] = ...
val metrics = new MultiRegressionMetrics(predictionAndOutputs, predictionAndOutputs.length)
//Print results on screen
metrics.print
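One natural reading of multi-output evaluation is that each output dimension gets its own set of metrics. The sketch below works under that assumption, which the text above does not confirm; MultiOutputSketch and its functions are hypothetical names, and a plain Scala Vector stands in for breeze's DenseVector to keep the example free of dependencies.

```scala
// Per-dimension error metrics for vector-valued predictions.
// MultiOutputSketch is a hypothetical helper, not part of DynaML.
object MultiOutputSketch {
  // Each pair is (predicted vector, actual vector); all vectors share one length.
  type VecPairs = List[(Vector[Double], Vector[Double])]

  // Mean absolute error computed separately for every output dimension.
  def perOutputMae(pairs: VecPairs): Vector[Double] = {
    val n = pairs.length.toDouble
    val dims = pairs.head._1.length
    Vector.tabulate(dims) { d =>
      pairs.map { case (p, y) => math.abs(p(d) - y(d)) }.sum / n
    }
  }

  // Root mean square error computed separately for every output dimension.
  def perOutputRmse(pairs: VecPairs): Vector[Double] = {
    val n = pairs.length.toDouble
    val dims = pairs.head._1.length
    Vector.tabulate(dims) { d =>
      math.sqrt(pairs.map { case (p, y) => (p(d) - y(d)) * (p(d) - y(d)) }.sum / n)
    }
  }

  def main(args: Array[String]): Unit = {
    val pairs: VecPairs = List(
      (Vector(1.0, 2.0), Vector(2.0, 2.0)),
      (Vector(3.0, 0.0), Vector(3.0, 4.0))
    )
    println(s"mae per output:  ${perOutputMae(pairs)}")
    println(s"rmse per output: ${perOutputRmse(pairs)}")
  }
}
```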
Classification Models
Currently (as of v1.4) there is only a binary classification implementation for calculating model performance.
Binary Classification
Small Test Sets
The BinaryClassificationMetrics class calculates the following performance indicators.
- Classification accuracy
- F-measure
- Precision-Recall Curve (and area under it)
- Receiver Operating Characteristic (and area under it)
- Matthews Correlation Coefficient
val scoresAndLabels: List[(Double, Double)] = ...
//Set logisticFlag = true in case outputs are produced via logistic regression
val metrics = new BinaryClassificationMetrics(
          scoresAndLabels,
          scoresAndLabels.length,
          logisticFlag = true)
metrics.print
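As a reference for what several of the indicators above measure, the following sketch computes accuracy, F-measure, the Matthews correlation coefficient and the area under the ROC curve from a list of (score, label) pairs. It rests on assumptions that are not taken from DynaML: labels are encoded as 1.0 (positive) and 0.0 (negative), scores are probabilities, and a fixed threshold of 0.5 is used for the threshold-dependent metrics. BinarySketch and all of its members are hypothetical names.

```scala
// Textbook binary classification metrics.
// BinarySketch is a hypothetical helper, not part of DynaML.
// Assumed label encoding: 1.0 = positive, 0.0 = negative.
object BinarySketch {
  final case class Confusion(tp: Int, fp: Int, tn: Int, fn: Int)

  // Count true/false positives/negatives at the given score threshold.
  def confusion(scoresAndLabels: List[(Double, Double)], threshold: Double): Confusion =
    scoresAndLabels.foldLeft(Confusion(0, 0, 0, 0)) { case (c, (score, label)) =>
      (score >= threshold, label == 1.0) match {
        case (true, true)   => c.copy(tp = c.tp + 1)
        case (true, false)  => c.copy(fp = c.fp + 1)
        case (false, false) => c.copy(tn = c.tn + 1)
        case (false, true)  => c.copy(fn = c.fn + 1)
      }
    }

  def accuracy(c: Confusion): Double =
    (c.tp + c.tn).toDouble / (c.tp + c.fp + c.tn + c.fn)

  def precision(c: Confusion): Double = c.tp.toDouble / (c.tp + c.fp)

  def recall(c: Confusion): Double = c.tp.toDouble / (c.tp + c.fn)

  // F-measure (F1): harmonic mean of precision and recall.
  def fMeasure(c: Confusion): Double = {
    val (p, r) = (precision(c), recall(c))
    2.0 * p * r / (p + r)
  }

  // Matthews correlation coefficient, computed in Double to avoid Int overflow.
  def mcc(c: Confusion): Double = {
    val (tp, fp, tn, fn) = (c.tp.toDouble, c.fp.toDouble, c.tn.toDouble, c.fn.toDouble)
    (tp * tn - fp * fn) / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  }

  // Area under the ROC curve as the fraction of (positive, negative) pairs
  // ranked correctly, with ties counted as one half. Quadratic in the number
  // of examples, so suitable only for small test sets.
  def rocAuc(scoresAndLabels: List[(Double, Double)]): Double = {
    val pos = scoresAndLabels.collect { case (s, l) if l == 1.0 => s }
    val neg = scoresAndLabels.collect { case (s, l) if l != 1.0 => s }
    val wins = for (p <- pos; n <- neg)
      yield if (p > n) 1.0 else if (p == n) 0.5 else 0.0
    wins.sum / (pos.length.toDouble * neg.length)
  }

  def main(args: Array[String]): Unit = {
    val data = List((0.9, 1.0), (0.8, 1.0), (0.7, 0.0), (0.4, 1.0), (0.3, 0.0), (0.1, 0.0))
    val c = confusion(data, threshold = 0.5)
    println(f"accuracy  = ${accuracy(c)}%.4f")
    println(f"F-measure = ${fMeasure(c)}%.4f")
    println(f"MCC       = ${mcc(c)}%.4f")
    println(f"ROC AUC   = ${rocAuc(data)}%.4f")
  }
}
```

Accuracy, F-measure and MCC depend on the chosen threshold, while the ROC area summarizes ranking quality across all thresholds, which is why the two kinds of indicators are usually reported together.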
Large Test Sets
The BinaryClassificationMetricsSpark class takes as input an Apache Spark RDD containing the predictions and actual labels and calculates the same metrics as above.