Performance Evaluation
Model evaluation is the litmus test for knowing if your modeling effort is headed in the right direction and for comparing various alternative models (or hypothesis) attempting to explain a phenomenon. The evaluation
package contains classes and traits to calculate performance metrics for DynaML models.
Classes which implement model performance calculation can extend the Metrics[P]
trait. The Metrics
trait requires that its sub-classes implement three methods or behaviors.
- Print out the performance metrics (whatever they may be) to the screen i.e.
print
method. - Return the key performance indicators in the form of a breeze
DenseVector[Double]
, i.e. thekpi()
method.
Regression Models¶
Regression models are generally evaluated on a few standard metrics such as mean square error, mean absolute error, coefficient of determination (R^2), etc. DynaML has implementations for single output and multi-output regression models.
Single Output¶
Small Test Set
The RegressionMetrics
class takes as input a scala list containing the predictions and actual outputs and calculates the following metrics.
- Mean Absolute Error (mae)
- Root Mean Square Error (rmse)
- Correlation Coefficient (\rho_{y \hat{y}})
- Coefficient of Determination (R^2)
//Predictions computed by any model.
val predictionAndOutputs: List[(Double, Double)] = ...
val metrics = new RegressionMetrics(predictionAndOutputs, predictionAndOutputs.length)
//Print results on screen
metrics.print
Large Test Set
The RegressionMetricsSpark
class takes as input an Apache Spark RDD containing the predictions and actual outputs and calculates the same metrics as above.
//Predictions computed by any model.
val predictionAndOutputs: RDD[(Double, Double)] = ...
val metrics = new RegressionMetricsSpark(predictionAndOutputs, predictionAndOutputs.length)
//Print results on screen
metrics.print
Multiple Outputs¶
The MultiRegressionMetrics
class calculates regression performance for multi-output models.
//Predictions computed by any model.
val predictionAndOutputs: List[(DenseVector[Double], DenseVector[Double])] = ...
val metrics = new MultiRegressionMetrics(predictionAndOutputs, predictionAndOutputs.length)
//Print results on screen
metrics.print
Classification Models¶
Currently (as of v1.4) there is only a binary classification implementation for calculating model performance.
Binary Classification¶
Small Test Sets
The BinaryClassificationMetrics
class calculates the following performance indicators.
- Classification accuracy
- F-measure
- Precision-Recall Curve (and area under it).
- Receiver Operating Characteristic (and area under it)
- Matthew's Correlation Coefficient
val scoresAndLabels: List[(Double, Double)] = ...
//Set logisticFlag = true in case outputs are produced via logistic regression
val metrics = new BinaryClassificationMetrics(
scoresAndLabels,
scoresAndLabels.length,
logisticFlag = true)
metrics.print
Large Test Sets
The BinaryClassificationMetricsSpark
class takes as input an Apache Spark RDD containing the predictions and actual labels and calculates the same metrics as above.