Building Blocks

Summary

The dtflearn object makes it easy to create and train neural networks of varying complexity.

Activation Functions

Apart from the activation functions defined in TensorFlow for Scala, DynaML defines some additional activations.
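For instance, the Phi activation (used later in this section to build feed-forward stacks) is instantiated like any other layer; a minimal sketch:

import io.github.mandar2812.dynaml.tensorflow._

//Instantiate one of DynaML's additional activations.
val act = dtflearn.Phi("Act_1")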

Layers

DynaML aims to supplement and extend the collection of layers available in org.platanios.tensorflow.api.layers. All the layers defined in DynaML's tensorflow package extend the Layer[T, R] class in org.platanios.tensorflow.api.layers.

Radial Basis Function Network

Radial Basis Function (RBF) networks are built from an important class of basis functions, each of which decays with distance from a defined central node.

\begin{align} f(x) & = \sum_{i}{w_{i} \varphi(||x - c_{i}||/\sigma_{i})} \\ \varphi(u) & = \exp(-u^2/2) \end{align}

The RBF layer implementation in DynaML treats the node center positions c_i and length scales \sigma_i as parameters to be learned via gradient-based back-propagation.

import io.github.mandar2812.dynaml.tensorflow._

val rbf = dtflearn.rbf_layer(name = "rbf1", num_units = 10)
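Like any other layer, the RBF layer composes with downstream layers via >>; in this minimal sketch the linear read-out layer and its name are illustrative.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._

//Compose the RBF layer with a linear read-out layer.
val rbf_net = dtflearn.rbf_layer("rbf1", 10) >> tf.learn.Linear("out", 1)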

Continuous Time RNN

Continuous time recurrent neural networks (CTRNN) are an important class of recurrent neural networks. They enable the modelling of non-linear and potentially complex dynamical systems of multiple variables, with feedback.

Each state variable is modeled by a single neuron y_i; the evolution of the system y = (y_1, \cdots, y_n)^T is governed by a set of coupled ordinary differential equations, which can be expressed in vector form as follows.

\begin{align} \frac{dy}{dt} & = - \Lambda \cdot y + W \cdot \sigma(G \cdot y + b) \end{align}

The parameters of the system above are:

  • Time Constant/Decay Rate
\begin{equation} \Lambda = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix} \end{equation}
  • Gain
\begin{equation} G = \begin{pmatrix} g_{11} & \cdots & g_{1n} \\ \vdots & \ddots & \vdots \\ g_{n1} & \cdots & g_{nn} \end{pmatrix} \end{equation}
  • Bias
\begin{equation} b = \begin{pmatrix} b_{1}\\ \vdots\\ b_{n} \end{pmatrix} \end{equation}
  • Weights
\begin{equation} W = \begin{pmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & \ddots & \vdots \\ w_{n1} & \cdots & w_{nn} \end{pmatrix} \end{equation}

In order to use the CTRNN model for modelling sequences of finite length, we need to solve its governing equations numerically. This gives us the trajectory of the state up to T steps: y^{0}, \cdots, y^{T}.

\begin{equation} y^{k+1} = y^{k} + \Delta t \left(- \Lambda \cdot y^{k} + W \cdot \sigma(G \cdot y^{k} + b)\right) \end{equation}
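To make the discretization concrete, here is a minimal plain-Scala sketch of a single Euler step; the helper names and the choice of a logistic sigmoid for \sigma are illustrative assumptions, not part of the DynaML API.

//One Euler step of the CTRNN dynamics, written out explicitly.
//lambda: decay rates, g: gain matrix, w: weight matrix, b: bias.
def sigmoid(x: Double): Double = 1.0/(1.0 + math.exp(-x))

def eulerStep(
  y: Array[Double], lambda: Array[Double],
  w: Array[Array[Double]], g: Array[Array[Double]],
  b: Array[Double], dt: Double): Array[Double] = {
  val n = y.length
  //Pre-activation: G.y + b
  val pre = Array.tabulate(n)(i =>
    (0 until n).map(j => g(i)(j)*y(j)).sum + b(i))
  //Driving term: W.sigma(G.y + b)
  val drive = Array.tabulate(n)(i =>
    (0 until n).map(j => w(i)(j)*sigmoid(pre(j))).sum)
  //Update: y^{k+1} = y^{k} + dt*(-Lambda.y^{k} + drive)
  Array.tabulate(n)(i => y(i) + dt*(-lambda(i)*y(i) + drive(i)))
}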

DynaML's implementation of the CTRNN can be used to learn the trajectory of dynamical systems up to a predefined time horizon. The parameters \Lambda, G, b, W are learned using gradient-based loss minimization.

The CTRNN implementations are also instances of Layer[Output, Output]; they take as input tensors of shape n and produce tensors of shape (n, T). There are two variants that users can choose from.

Fixed Time Step Integration

Use this variant when the integration time step \Delta t is user-defined and fixed.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._ 

val ctrnn_layer = dtflearn.ctrnn(
  name = "CTRNN_1", units = 10,
  horizon = 5, timestep = 0.1)
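Since the result is an ordinary Layer[Output, Output], it composes with other layers via >>; the sketch below is illustrative (the cast layer and its name are assumptions, not required by the API).

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._

//Cast inputs to FLOAT64, then unroll the CTRNN for 5 time steps;
//per the shape convention above, an input of shape 10 yields
//an output of shape (10, 5).
val arch = tf.learn.Cast("Cast", FLOAT64) >> ctrnn_layer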

Dynamic Time Step Integration

Use this variant when the integration time step \Delta t is a parameter that is learned during the training process.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._ 

val dctrnn_layer = dtflearn.dctrnn(
  name = "DCTRNN_1", units = 10,
  horizon = 5)

Stack & Concatenate

Often one needs to combine the outputs of previous layers in some manner; the following layers enable such operations.

Stack Inputs

This is a computational layer which performs the function of dtf.stack().

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._ 

val stk_layer = dtflearn.stack_outputs("StackTensors", axis = 1)

Concatenate Inputs

This is a computational layer which performs the function of dtf.concatenate().

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._ 

val concat_layer = dtflearn.concat_outputs("ConcatenateTensors", axis = 1)

Collect Layers

A sequence of layers can be collected into a single layer which accepts a sequence of symbolic tensors.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._ 

val layers = Seq(
  tf.learn.Linear("l1", 10),
  dtflearn.identity("Identity"),
  dtflearn.ctrnn(
    name = "CTRNN_1", units = 10, 
    horizon = 5, timestep = 0.1
  )
)

val combined_layer = dtflearn.stack_layers("Collect", layers)

Input Pairs

To handle inputs consisting of pairs of elements, one can provide a separate layer for processing each of the elements.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._

val sl = dtflearn.tuple2_layer(
  "tuple2layer", 
  dtflearn.rbf_layer("rbf1", 10), 
  tf.learn.Linear("lin1", 10)) 

Combining the elements of a Tuple2 can be done as follows.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._

//Stack elements of the tuple into one tensor
val layer1 = dtflearn.stack_tuple2("stackTuple2", axis = 1)
//Concatenate elements of the tuple into one tensor
val layer2 = dtflearn.concat_tuple2("concatTuple2", axis = 1)

Stoppage Criteria

In order to train tensorflow models using iterative gradient-based methods, the user must define some stoppage criteria for the training process. This can be done via the method tf.learn.StopCriteria(). The following preset stop criteria call tf.learn.StopCriteria() under the hood.
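For reference, the equivalent direct call for an iteration budget is shown below; tf.learn.StopCriteria also exposes loss-change tolerances, which are set analogously.

import org.platanios.tensorflow.api._

//Direct use of the underlying class: halt training
//after at most 10000 iterations.
val stopc = tf.learn.StopCriteria(maxSteps = Some(10000L))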

Iterations Based

val stopc1 = dtflearn.max_iter_stop(10000)

Change in Loss

Absolute Change in Loss

val stopc2 = dtflearn.abs_loss_change_stop(0.1)

Relative Change in Loss

val stopc3 = dtflearn.rel_loss_change_stop(0.1)

Network Building Blocks

To make it convenient to build deeper stacks of neural networks, DynaML includes some common layer design patterns as ready-made, easy-to-use methods.

Convolutional Neural Nets

Convolutional neural networks (CNN) are a crucial building block of deep neural architectures for visual pattern recognition. In practice, CNN layers are combined with other computational units such as rectified linear unit (ReLU) activations, dropout and max-pooling layers.

Currently, two abstractions are offered for building large CNN-based network stacks.

Convolutional Unit

A single CNN unit is expressed as a convolutional layer, followed by a ReLU activation and then a dropout layer.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._

//Learn 16 filters of shape (2, 2, 4), suitable for 4 channel jpeg images.
//Slide the filters over the image in steps of 1 pixel in each direction.
val cnn_unit = dtflearn.conv2d_unit(
    shape = Shape(2, 2, 4, 16), stride = (1, 1),
    relu_param = 0.05f, dropout = true,
    keep_prob = 0.55f)(i = 1)

Convolutional Pyramid

A CNN pyramid builds a stack of CNN units, where each unit's stride is multiplied by a factor of 2, and its depth divided by a factor of 2, with respect to the previous unit.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._

//Start with a CNN unit of shape (2, 2, 3, 16) stride (1, 1)
//End with a CNN unit of shape (2, 2, 8, 4) and stride of (8, 8)
val cnn_stack = dtflearn.conv2d_pyramid(
  size = 2, num_channels_input = 3)(
  start_num_bits = 4, end_num_bits = 2)(
  relu_param = 0.1f, dropout = true, 
  keep_prob = 0.6F)

Feed-forward Neural Nets

Feed-forward networks are the oldest and most frequently used components of neural network architectures; they are often stacked into a number of layers. With dtflearn.feedforward_stack(), you can define feed-forward stacks of arbitrary width and depth.

import io.github.mandar2812.dynaml.tensorflow._
import org.platanios.tensorflow.api._

val net_layer_sizes = Seq(10, 20, 13, 15)

val architecture = dtflearn.feedforward_stack(
    (i: Int) => dtflearn.Phi("Act_"+i), FLOAT64)(
    net_layer_sizes)
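The resulting stack composes with further layers like any other; for instance, a single output unit can be appended as follows (the layer id chosen here is illustrative).

val net = architecture >> dtflearn.feedforward(1)(id = net_layer_sizes.length + 1)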

Building Tensorflow Models

After defining the key ingredients needed to build a tensorflow model, one can call dtflearn.build_tf_model(), which builds a new computational graph and creates a tensorflow model and estimator that is trained on the provided data. In the following example, we bring together all the elements of model training: data, architecture, loss, etc.

import ammonite.ops._
import io.github.mandar2812.dynaml.tensorflow.dtflearn
import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.ops.NN.SamePadding
import org.platanios.tensorflow.data.image.CIFARLoader
import java.nio.file.Paths


val tempdir = home/"tmp"

val dataSet = CIFARLoader.load(
  Paths.get(tempdir.toString()), 
  CIFARLoader.CIFAR_10)

val trainImages = tf.data.TensorSlicesDataset(dataSet.trainImages)
val trainLabels = tf.data.TensorSlicesDataset(dataSet.trainLabels)

val trainData = 
  trainImages.zip(trainLabels)
    .repeat()
    .shuffle(10000)
    .batch(128)
    .prefetch(10)


println("Building the classification model.")
val input = tf.learn.Input(
  UINT8, 
  Shape(
    -1, 
    dataSet.trainImages.shape(1), 
    dataSet.trainImages.shape(2), 
    dataSet.trainImages.shape(3))
)

val trainInput = tf.learn.Input(UINT8, Shape(-1))

val architecture = tf.learn.Cast("Input/Cast", FLOAT32) >>
  dtflearn.conv2d_pyramid(2, 3)(4, 2)(0.1f, true, 0.6F) >>
  tf.learn.MaxPool(
    "Layer_3/MaxPool", 
    Seq(1, 2, 2, 1), 
    1, 1, SamePadding) >>
  tf.learn.Flatten("Layer_3/Flatten") >>
  dtflearn.feedforward(256)(id = 4) >>
  tf.learn.ReLU("Layer_4/ReLU", 0.1f) >>
  dtflearn.feedforward(10)(id = 5)

val trainingInputLayer = tf.learn.Cast("TrainInput/Cast", INT64)

val loss = tf.learn.SparseSoftmaxCrossEntropy("Loss/CrossEntropy") >>
  tf.learn.Mean("Loss/Mean") >> 
  tf.learn.ScalarSummary("Loss/Summary", "Loss")

val optimizer = tf.train.AdaGrad(0.1)

println("Training the linear regression model.")
val summariesDir = java.nio.file.Paths.get(
  (tempdir/"cifar_summaries").toString()
)

val (model, estimator) = dtflearn.build_tf_model(
  architecture, input, trainInput, trainingInputLayer,
  loss, optimizer, summariesDir, 
  dtflearn.max_iter_stop(1000),
  100, 100, 100)(trainData)
