Understanding neural networks
I wanted to understand neural networks and deep learning, so I read and watched some really great tutorials, which are listed at the end. You can find my custom implementations of neural networks here. The goal is to understand the concepts and the mathematics and to implement a neural network from scratch.
Other work
Bottom to top
Andrej Karpathy's explanation takes a bottom-to-top approach. He first describes how circuits are used to do calculations, then moves on to backpropagation and calculus, and finally to neural networks.
Top to bottom
3Blue1Brown's explanation takes a top-to-bottom approach. He first describes a neural network, then backpropagation, and then the calculus behind it.
My try
For my neural network, I use the classic hello world program of machine learning: predicting handwritten digits.
Neuron
A neuron is just a ‘box’ that computes an output value from its input values. It does this by taking the input values from previous neurons, multiplying each by a weight, and adding a bias. As a formula this looks like: ax + c, where a is the input from a previous neuron, which gets multiplied by a weight x, and c is the bias. Like a biological neuron, an artificial neuron does not fire all the time. To specify when it fires, you wrap the calculation of the output value in an activation function.
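A minimal sketch of a single neuron, using the notation from the formula above (a is the input, x the weight, c the bias). The sigmoid activation and the function names are assumptions for illustration; the text does not prescribe a specific activation function.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(a, x, c):
    """Compute one neuron: weighted input plus bias, passed through the activation.

    a: input from the previous neuron
    x: weight applied to that input
    c: bias
    """
    return sigmoid(a * x + c)


# A strongly negative weighted input keeps the neuron almost silent,
# a strongly positive one makes it fire with a value close to 1.
quiet = neuron_output(a=1.0, x=-6.0, c=0.0)
firing = neuron_output(a=1.0, x=6.0, c=0.0)
```

With sigmoid, "firing" is not an on/off switch but a smooth value between 0 and 1, which is what makes the calculus in backpropagation possible.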
Neural network
A neural network consists of multiple layers, each containing multiple neurons. The first layer is the input layer and holds the input data. The layers in between are called hidden layers. The last layer is the output layer and computes the predictions.
Steps
Data
First you have to get data. This step often involves cleaning up and transforming the data, as in ETL processes. For the handwritten digits, this means you have to transform them in a way that lets you pass them into the network, for example by resizing all images to one size, e.g. 28 px x 28 px, and centering the digits. Then you can convert the images to a numeric format, such as a value between zero and one for each pixel, depending on the darkness of the pixel. In the MNIST data set, this preprocessing is already done for you.
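The conversion to values between zero and one could look like the sketch below. It assumes the image is already resized and centered and arrives as a grid of grayscale integers from 0 to 255, where a larger number means a darker pixel; the function name normalize_image is hypothetical.

```python
def normalize_image(pixels, max_value=255):
    """Flatten a 2D grid of grayscale pixels into one list of floats in [0, 1]."""
    return [value / max_value for row in pixels for value in row]


# A tiny 2 x 2 "image" stands in for a 28 x 28 digit.
image = [
    [0, 255],
    [51, 204],
]
inputs = normalize_image(image)
```

For a real 28 x 28 image the same function returns a list of 784 values, one per pixel.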
Training
Forwarding data
Then the data has to be loaded into the input layer. That means the neurons of the input layer are set to the data values. As the pixels of a digit are loaded as an array of floating-point numbers, each of them is loaded into a single input neuron.
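A rough sketch of forwarding one image through a network. The layer sizes (784 inputs, 16 hidden neurons, 10 outputs), the sigmoid activation, the dictionary layout, and the helper names are all assumptions chosen for illustration, not the layout of the linked implementations.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def create_layer(n_inputs, n_neurons, rng):
    """Give every neuron one random weight per input and one random bias."""
    return {
        "weights": [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_neurons)],
        "biases": [rng.uniform(-1, 1) for _ in range(n_neurons)],
    }


def forward(layers, inputs):
    """Feed the input values through each layer and return the final activations."""
    activations = inputs  # the input layer simply holds the data
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in zip(layer["weights"], layer["biases"])
        ]
    return activations


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# 784 input values (28 x 28 pixels), 16 hidden neurons, 10 output neurons.
layers = [create_layer(784, 16, rng), create_layer(16, 10, rng)]
pixels = [rng.random() for _ in range(784)]  # stand-in for one normalized image
outputs = forward(layers, pixels)
```

The input layer has no weights of its own here: its "neurons" are just the 784 pixel values that the first hidden layer reads.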
Backpropagation + Maths
At the start of training, all weights and biases are random. After forwarding the data, every neuron has a computed value and activation, and the output neurons have computed a prediction. Now you have to turn the prediction into an understandable value, based on the goal of the network. For example, you can create ten output neurons for the possible outputs of the network: the digits zero to nine. Depending on the output value of each neuron, you can calculate how right or wrong the network is. For example, this can be done with a formula called mean squared error.
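As a sketch of how the ten outputs can be scored: the correct answer is encoded as a target vector with a 1 for the right digit and 0 elsewhere, and the mean squared error compares it with the actual outputs. Taking the neuron with the highest output as the predicted digit is a common convention assumed here, and the output values are made up.

```python
def one_hot(digit, n_classes=10):
    """Target vector: 1.0 for the correct digit, 0.0 for every other output neuron."""
    return [1.0 if i == digit else 0.0 for i in range(n_classes)]


def mean_squared_error(outputs, targets):
    """Average of the squared differences between outputs and targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def predicted_digit(outputs):
    """Assumed convention: the prediction is the index of the largest output."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


# Hypothetical outputs of the ten output neurons for an image of a 3.
outputs = [0.1, 0.0, 0.2, 0.8, 0.1, 0.0, 0.0, 0.3, 0.0, 0.1]
error = mean_squared_error(outputs, one_hot(3))
```

The smaller the error, the closer the outputs are to the target; training tries to push this number toward zero.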
Testing
To check how accurate your network is, you have to test it against data that it has never seen. Then you can simply calculate the error rate.
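The error rate is simply the share of wrong predictions on the unseen data. A minimal sketch, with made-up predictions and labels:

```python
def error_rate(predictions, labels):
    """Fraction of test examples where the predicted digit differs from the label."""
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


# Hypothetical predictions on five unseen test images.
predictions = [7, 2, 1, 0, 9]
labels = [7, 2, 1, 0, 4]
rate = error_rate(predictions, labels)
```

Here one of five predictions is wrong, so the error rate is 20 percent and the accuracy is 80 percent.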
Implementing neural networks
One hidden layer with one neuron
The full code can be found here.
One hidden layer with multiple neurons
The full code can be found here.
Multiple layers with multiple neurons
The challenge with multiple layers is that backpropagation gets more complex: a change in the first hidden layer affects many neurons, because each neuron is connected to the neurons of the following layers. The challenge with multiple neurons per layer is that instead of a single weight and input pair, a neuron has to calculate the sum over multiple pairs of weights and input values. Therefore you have to store the weights somewhere. The full code can be found here.
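Both challenges can be sketched on a tiny network with two inputs, three hidden neurons, and two outputs. This is not the linked implementation: the nested-list weight storage, the sigmoid activation, the mean squared error loss, and all names and numbers are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, inputs):
    """Each neuron sums all of its weight * input pairs, adds its bias, and activates."""
    return [
        sigmoid(sum(w * a for w, a in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def loss(w_hidden, b_hidden, w_out, b_out, inputs, targets):
    """Mean squared error of the network output for one example."""
    hidden = layer_forward(w_hidden, b_hidden, inputs)
    outputs = layer_forward(w_out, b_out, hidden)
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def hidden_weight_gradients(w_hidden, b_hidden, w_out, b_out, inputs, targets):
    """Gradient of the mean squared error with respect to the hidden-layer weights."""
    hidden = layer_forward(w_hidden, b_hidden, inputs)
    outputs = layer_forward(w_out, b_out, hidden)
    n = len(outputs)
    # Error signal of each output neuron (chain rule through MSE and sigmoid).
    delta_out = [2 * (o - t) / n * o * (1 - o) for o, t in zip(outputs, targets)]
    # A hidden neuron feeds every output neuron, so its error signal is the
    # sum of the output error signals, weighted by the connecting weights.
    delta_hidden = [
        sum(w_out[k][j] * delta_out[k] for k in range(n)) * h * (1 - h)
        for j, h in enumerate(hidden)
    ]
    return [[d * x for x in inputs] for d in delta_hidden]


# Weights are stored as one list per neuron: w[j][i] connects input i to neuron j.
w_hidden = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
b_hidden = [0.0, 0.1, -0.1]
w_out = [[0.3, -0.1, 0.2], [-0.4, 0.5, 0.1]]
b_out = [0.05, -0.05]
inputs, targets = [0.5, 0.9], [1.0, 0.0]
grads = hidden_weight_gradients(w_hidden, b_hidden, w_out, b_out, inputs, targets)
```

The sum inside delta_hidden is where the extra complexity shows up: each hidden weight influences the loss through every output neuron it is connected to, so all of those paths have to be added together.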
Adding layers and neurons dynamically
The full code can be found here.
Review
My neural network can be optimized by using
More information
- http://karpathy.github.io/neuralnets/
- http://neuralnetworksanddeeplearning.com
- https://www.youtube.com/watch?v=u4alGiomYP4 and https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/