A simple multilayer perceptron (MLP), also known as a fully connected feedforward artificial neural network, written from scratch in Julia.
First, initialize the neural network by filling the `NN` structure from `mlp.jl` according to the network's dimensions.
include("mlp.jl")
# size of each layer
dims = [784, 100, 20, 10]
nn = init(dims)
Then train the network on a data batch of type `Data` (defined in `mlp.jl`). The `train!()` function modifies the network's parameters based on the average gradient across all data points. Optionally, the learning rate `η` can be passed (default `η=1.0`). The function returns the average loss of the network.
```julia
train!(nn, batch, η=0.001)
```
To achieve stochastic gradient descent, the `train!()` function can be called from a `for` loop, as sketched below. The `forward!()` and `loss()` functions can also be called manually. Have a look at the examples.
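A minimal sketch of such a training loop (the `batches` collection, the epoch count, and the learning rate are assumptions for illustration; see the examples in the repository for real usage):

```julia
# Hypothetical SGD loop; `batches` is assumed to be a collection of
# Data batches prepared beforehand.
for epoch in 1:30
    epoch_loss = 0.0
    for batch in batches
        # train!() returns the average loss over the batch
        epoch_loss += train!(nn, batch, η=0.01)
    end
    println("epoch $epoch: avg loss = $(epoch_loss / length(batches))")
end
```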
Based on the above equation, one can infer the partial derivatives of the loss / cost with respect to the biases, weights, and activations using the chain rule.
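For reference, a sketch of the standard per-layer backpropagation relations in common textbook notation (not copied from `mlp.jl`), where σ denotes the activation function, z^l = W^l a^(l-1) + b^l the pre-activation of layer l, and C the cost:

```math
\begin{aligned}
\delta^L &= \nabla_{a^L} C \odot \sigma'(z^L) \\
\delta^l &= \left( (W^{l+1})^\top \delta^{l+1} \right) \odot \sigma'(z^l) \\
\frac{\partial C}{\partial b^l} &= \delta^l \\
\frac{\partial C}{\partial W^l} &= \delta^l \, (a^{l-1})^\top
\end{aligned}
```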
The `backprop!()` function from `mlp.jl` is optimized and vectorized, so the equations look different from the ones above.
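For intuition, here is a minimal, self-contained sketch of what a vectorized backward pass for a single layer can look like. The names (`W_next`, `δ_next`, `z`, `a_prev`) and the sigmoid activation are assumptions for illustration and do not mirror the actual `backprop!()` in `mlp.jl`:

```julia
# Hypothetical vectorized backward pass for one layer (illustration only;
# not the actual backprop!() from mlp.jl). Assumes sigmoid activations.
σ(z)  = 1 ./ (1 .+ exp.(-z))
σ′(z) = σ(z) .* (1 .- σ(z))

# Given the error δ_next of layer l+1 and its weights W_next, the error of
# layer l is δ = (W_next' * δ_next) .* σ′(z), and the parameter gradients
# follow directly from the chain rule.
function backward_layer(W_next, δ_next, z, a_prev)
    δ  = (W_next' * δ_next) .* σ′(z)
    ∇b = δ                   # ∂C/∂b = δ
    ∇W = δ * a_prev'         # ∂C/∂W = δ (a_prev)ᵀ, outer product for one sample
    return δ, ∇W, ∇b
end
```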