Sometime in the last few weeks, while I was writing the explanations of how neural networks learn and of the backpropagation algorithm, I realized that I had never tried to implement these algorithms in any programming language. Then it struck me that I had never implemented a whole Artificial Neural Network from scratch. I had always used libraries that hid that implementation from me so I could focus on the mathematical model and the problem I was trying to solve. One thing led to another, and I decided to implement my own Neural Network from scratch, without using third-party libraries. I also decided to use the object-oriented programming language I prefer – C#.

This means that a more object-oriented approach was taken, rather than the scripting style we would get with Python or R. One very good article did the implementation that way, and I strongly recommend skimming through it. What I wanted to do was separate every component and every operation. What started as just a thought exercise grew into quite a cool mini side-project, so I decided to share it with the world. Before we dive into the code, I would like to emphasize that this is not really how you would generally implement a network. More math, in the form of matrix multiplication, should be used to optimize the entire process.

Apart from that, the implemented network represents a simplified, most basic form of a Neural Network. Nevertheless, this way one can see all the components and elements of an Artificial Neural Network and become more familiar with the concepts from previous articles.

Artificial Neural Network Structure

Before we dive into the code, let’s run through the structure of an ANN. In general, Artificial Neural Networks are biologically motivated, meaning that they try to mimic the behavior of real nervous systems. Just as the smallest building unit of a real nervous system is the neuron, so it is with artificial neural networks – the smallest building unit is the artificial neuron. In a real nervous system, these neurons are connected to each other by synapses, which gives the entire system enormous processing power, the ability to learn, and huge flexibility. Artificial neural networks apply the same principle.

By connecting artificial neurons, they aim to create a similar system. Neurons are grouped into layers, and connections are then created among the neurons of adjacent layers. Also, by assigning a weight to each connection, the network is able to separate important connections from unimportant ones. The structure of the artificial neuron mirrors the structure of the real neuron, too. Since a neuron can have multiple inputs, i.e. input connections, a special function that collects that data is used – the input function. The function usually used as the input function is one that sums all the weighted inputs active on the input connections – the weighted sum function.

Another important part of each artificial neuron is the activation function. This function defines whether the neuron will send any signal to its outputs and which value will be propagated to them. Basically, this function receives a value from the input function and, according to this value, generates an output value and propagates it to the outputs. If you need more details on the architecture of the artificial neural network, you can find them here.

Implementation

So, as you can see from the previous chapter, there are a few important entities that we need to pay attention to and that we can abstract: neurons, connections, layers, and functions. In this solution, a separate class will implement each of these entities. Then, by putting it all together and adding the backpropagation algorithm on top of it, we will have our implementation of this simple neural network.

Input Functions

As mentioned before, the crucial parts of the neuron are the input function and the activation function. Let’s examine the input function. First, I created an interface for this function so it can be easily swapped in the neuron implementation later on:
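The original listing isn’t reproduced here; a minimal sketch consistent with the description might look like this (ISynapse is the connection abstraction covered later in the article):

```csharp
using System.Collections.Generic;

// Abstraction of a neuron's input function.
// ISynapse represents a connection between neurons (see the Connections chapter).
public interface IInputFunction
{
    double CalculateInput(List<ISynapse> inputs);
}
```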

These functions have only one method – CalculateInput, which receives a list of connections described by the ISynapse interface. We will cover this abstraction later; for now, all we need to know is that this interface represents connections among neurons. The CalculateInput method needs to return some sort of value based on the data contained in the list of connections. Then, I did the concrete implementation of the input function – the weighted sum function.
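A sketch of that implementation, reconstructed from the description (the Weight property and GetOutput method on ISynapse are described in the Connections chapter below):

```csharp
using System.Collections.Generic;
using System.Linq;

public class WeightedSumFunction : IInputFunction
{
    public double CalculateInput(List<ISynapse> inputs)
    {
        // Sum of (weight * output) over all input connections.
        return inputs.Select(x => x.Weight * x.GetOutput()).Sum();
    }
}
```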

This function sums weighted values on all connections that are passed in the list.

Activation Functions

Taking the same approach as with the input function, the interface for activation functions is implemented first:
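Based on the description of CalculateOutput below, the interface would look something like this:

```csharp
// Abstraction of a neuron's activation function: maps the value
// produced by the input function to the neuron's output value.
public interface IActivationFunction
{
    double CalculateOutput(double input);
}
```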

After that, concrete implementations can be done. The CalculateOutput method should return the output value of the neuron based on the input value it got from the input function. I like to have options, so I implemented all the functions mentioned in one of the previous blog posts. Here is how the step function looks:
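A sketch matching the description below (threshold passed in through the constructor):

```csharp
public class StepActivationFunction : IActivationFunction
{
    private readonly double _threshold;

    // The threshold value is defined during construction of the object.
    public StepActivationFunction(double threshold)
    {
        _threshold = threshold;
    }

    // Returns 1 if the input exceeds the threshold, otherwise 0.
    public double CalculateOutput(double input)
    {
        return input > _threshold ? 1 : 0;
    }
}
```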

Pretty straightforward, isn’t it? A threshold value is defined during the construction of the object, and then the CalculateOutput returns 1 if the input value exceeds the threshold value, otherwise, it returns 0.

Other functions are easy as well. Here is the Sigmoid activation function implementation:
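A sketch of the sigmoid; the steepness coefficient passed to the constructor is an assumption on my part, but a common way to parameterize it:

```csharp
using System;

public class SigmoidActivationFunction : IActivationFunction
{
    private readonly double _coefficient;

    public SigmoidActivationFunction(double coefficient)
    {
        _coefficient = coefficient;
    }

    // Classic logistic function: squashes any input into the (0, 1) range.
    public double CalculateOutput(double input)
    {
        return 1 / (1 + Math.Exp(-input * _coefficient));
    }
}
```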

And here is Rectifier activation function implementation:
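The rectifier (ReLU) might be sketched like this:

```csharp
using System;

public class RectifiedActivationFunction : IActivationFunction
{
    // max(0, x) – passes positive values through, clamps negatives to zero.
    public double CalculateOutput(double input)
    {
        return Math.Max(0, input);
    }
}
```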

So far so good – we have implementations of the input and activation functions, and we can proceed to implement the trickier parts of the network – neurons and connections.

Neuron

The workflow that a neuron should follow goes like this: receive input values from one or more weighted input connections; collect those values and pass them to the activation function, which calculates the output value of the neuron; send that value to the neuron’s outputs. Based on that workflow, this abstraction of the neuron is created:
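A sketch of the interface, reconstructed from the properties and methods described below (using Guid for the Id is my assumption):

```csharp
using System;
using System.Collections.Generic;

public interface INeuron
{
    Guid Id { get; }
    double PreviousPartialDerivate { get; set; }

    List<ISynapse> Inputs { get; set; }
    List<ISynapse> Outputs { get; set; }

    void AddInputNeuron(INeuron inputNeuron);
    void AddOutputNeuron(INeuron outputNeuron);
    void AddInputSynapse(double inputValue);

    double CalculateOutput();
}
```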

Before we explain each property and method, let’s see the concrete implementation of a neuron, since that will make the way it works far clearer:
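The concrete class, sketched along the lines described below (Synapse and InputSynapse are covered in the Connections chapter):

```csharp
using System;
using System.Collections.Generic;

public class Neuron : INeuron
{
    private readonly IActivationFunction _activationFunction;
    private readonly IInputFunction _inputFunction;

    public List<ISynapse> Inputs { get; set; }
    public List<ISynapse> Outputs { get; set; }

    public Guid Id { get; private set; }
    public double PreviousPartialDerivate { get; set; }

    // Functions are injected through the constructor, so neurons with
    // different input and activation functions can be created.
    public Neuron(IActivationFunction activationFunction, IInputFunction inputFunction)
    {
        Id = Guid.NewGuid();
        Inputs = new List<ISynapse>();
        Outputs = new List<ISynapse>();

        _activationFunction = activationFunction;
        _inputFunction = inputFunction;
    }

    // Adds an input connection from another neuron to this one.
    public void AddInputNeuron(INeuron inputNeuron)
    {
        var synapse = new Synapse(inputNeuron, this);
        Inputs.Add(synapse);
        inputNeuron.Outputs.Add(synapse);
    }

    // Adds an output connection from this neuron to another one.
    public void AddOutputNeuron(INeuron outputNeuron)
    {
        var synapse = new Synapse(this, outputNeuron);
        Outputs.Add(synapse);
        outputNeuron.Inputs.Add(synapse);
    }

    // Special connection used only for feeding input into the network.
    public void AddInputSynapse(double inputValue)
    {
        Inputs.Add(new InputSynapse(this, inputValue));
    }

    // Triggers the chain reaction: the input function pulls values from all
    // input connections, which in turn activate the previous layer's neurons.
    public double CalculateOutput()
    {
        return _activationFunction.CalculateOutput(_inputFunction.CalculateInput(Inputs));
    }
}
```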

Each neuron has its own unique identifier – Id. This property is used in the backpropagation algorithm later. Another property added for backpropagation purposes is PreviousPartialDerivate, but this will be examined in detail further on. A neuron has two lists, one for input connections – Inputs, and another for output connections – Outputs. It also has two fields, one for each of the functions described in the previous chapters. They are initialized through the constructor. This way, neurons with different input and activation functions can be created.

This class has some interesting methods, too. AddInputNeuron and AddOutputNeuron are used to create a connection between neurons. The first one adds an input connection to a neuron and the second one adds an output connection. AddInputSynapse adds an InputSynapse to the neuron, which is a special type of connection. These special connections are used only for the input layer of the network, i.e. only for feeding input into the system as a whole. This will be covered in more detail in the next chapter.

Last but not least, the CalculateOutput method is used to set off a chain reaction of output calculation. What happens when this function is called? Well, it calls the input function, which requests values from all input connections. In turn, these connections request output values from their input neurons, i.e. the output values of the neurons in the previous layer. This process continues until the input layer is reached and the input values are propagated through the system.

Connections

Connections are abstracted through the ISynapse interface:
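A sketch matching the members described below; the parameters of UpdateWeight are my assumption about how the weight delta is applied:

```csharp
using System;

public interface ISynapse
{
    double Weight { get; set; }
    double PreviousWeight { get; set; }

    // Output value carried by this connection.
    double GetOutput();

    // Detects whether the given neuron is the input neuron of this connection.
    bool IsFromNeuron(Guid fromNeuronId);

    // Stores the current weight as PreviousWeight and applies the update.
    void UpdateWeight(double learningRate, double delta);
}
```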

Every connection has its weight, represented by the property of the same name. An additional property, PreviousWeight, is used during backpropagation of the error through the system. Updating the current weight and storing the previous one is done in the helper function UpdateWeight. There is another helper function – IsFromNeuron, which detects whether a certain neuron is the input neuron of the connection. Of course, there is also a method that gets the output value of the connection – GetOutput.

Here is the implementation of the connection:
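A sketch of the class, reconstructed from the description (random weight initialization is my assumption; as one of the comments below points out, a single static Random instance would be more reliable than creating a new one per synapse):

```csharp
using System;

public class Synapse : ISynapse
{
    internal INeuron _fromNeuron;
    internal INeuron _toNeuron;

    public double Weight { get; set; }
    public double PreviousWeight { get; set; }

    public Synapse(INeuron fromNeuron, INeuron toNeuron)
    {
        _fromNeuron = fromNeuron;
        _toNeuron = toNeuron;

        // Assumed: weights start at small random values.
        Weight = new Random().NextDouble();
        PreviousWeight = 0;
    }

    // Requesting this connection's output triggers the output
    // calculation of the neuron it comes from.
    public double GetOutput()
    {
        return _fromNeuron.CalculateOutput();
    }

    public bool IsFromNeuron(Guid fromNeuronId)
    {
        return _fromNeuron.Id.Equals(fromNeuronId);
    }

    public void UpdateWeight(double learningRate, double delta)
    {
        PreviousWeight = Weight;
        Weight += learningRate * delta;
    }
}
```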

Notice the fields _fromNeuron and _toNeuron, which define neurons that this synapse connects.

Apart from this implementation of the connection, there is another one that I’ve mentioned in the previous chapter about neurons. It is InputSynapse and it is used as an input to the system. The weight of these connections is always 1 and it is not updated during the training process. Here is the implementation of it:
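A sketch of InputSynapse along the lines described above – it carries a fixed value into the network with a weight that is always 1:

```csharp
using System;

public class InputSynapse : ISynapse
{
    internal INeuron _toNeuron;

    public double Weight { get; set; }
    public double Output { get; set; }
    public double PreviousWeight { get; set; }

    public InputSynapse(INeuron toNeuron, double output)
    {
        _toNeuron = toNeuron;
        Output = output;
        Weight = 1;  // input connections always have weight 1
    }

    public double GetOutput()
    {
        return Output;
    }

    public bool IsFromNeuron(Guid fromNeuronId)
    {
        // Input synapses have no source neuron.
        return false;
    }

    public void UpdateWeight(double learningRate, double delta)
    {
        // The weight of an input connection is never updated during training.
        throw new InvalidOperationException("Weights of input synapses are fixed at 1.");
    }
}
```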

Layer

Implementation of the neural layer is quite easy:
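A sketch consistent with the description below:

```csharp
using System.Collections.Generic;

public class NeuralLayer
{
    public List<INeuron> Neurons;

    public NeuralLayer()
    {
        Neurons = new List<INeuron>();
    }

    // Glues two layers together by connecting every neuron of this
    // layer to every neuron of the layer beneath it.
    public void ConnectLayers(NeuralLayer inputLayer)
    {
        foreach (var neuron in Neurons)
        {
            foreach (var inputNeuron in inputLayer.Neurons)
            {
                neuron.AddInputNeuron(inputNeuron);
            }
        }
    }
}
```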

It contains the list of neurons used in that layer and the ConnectLayers method, which is used to glue two layers together.

Simple Artificial Neural Network

Now, let’s put all that together and add backpropagation to it. Take a look at the implementation of the Network itself:
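A reconstructed skeleton of the class, based on the description that follows. The backpropagation bodies are left as stubs here, the learning-rate value is an assumption, and the NeuralLayerFactory listing is not shown in this article, so its CreateNeuralLayer signature is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SimpleNeuralNetwork
{
    private NeuralLayerFactory _layerFactory;

    internal List<NeuralLayer> _layers;
    internal double _learningRate;
    internal double[][] _expectedResult;

    public SimpleNeuralNetwork(int numberOfInputNeurons)
    {
        _layers = new List<NeuralLayer>();
        _layerFactory = new NeuralLayerFactory();

        // The input layer is created during construction of the network.
        CreateInputLayer(numberOfInputNeurons);
        _learningRate = 2.95;  // assumed value
    }

    // Adds a passed layer on top of the current layer list.
    public void AddLayer(NeuralLayer newLayer)
    {
        if (_layers.Any())
        {
            newLayer.ConnectLayers(_layers.Last());
        }
        _layers.Add(newLayer);
    }

    // Sets certain input to the network.
    public void PushInputValues(double[] inputs)
    {
        var inputLayer = _layers.First();
        for (int i = 0; i < inputLayer.Neurons.Count; i++)
        {
            inputLayer.Neurons[i].AddInputSynapse(inputs[i]);
        }
    }

    // Sets desired values for the training set.
    public void PushExpectedValues(double[][] expectedOutputs)
    {
        _expectedResult = expectedOutputs;
    }

    // Activating the output layer triggers the chain reaction
    // back through all previous layers.
    public List<double> GetOutput()
    {
        return _layers.Last().Neurons.Select(n => n.CalculateOutput()).ToList();
    }

    public void Train(double[][] inputs, int numberOfEpochs)
    {
        for (int epoch = 0; epoch < numberOfEpochs; epoch++)
        {
            for (int i = 0; i < inputs.Length; i++)
            {
                PushInputValues(inputs[i]);
                var outputs = GetOutput();

                // Backpropagation: output layer first, then the hidden layers.
                HandleOutputLayer(i, outputs);
                HandleHiddenLayers(i);
            }
        }
    }

    private void CreateInputLayer(int numberOfInputNeurons)
    {
        // Hypothetical factory call – the factory listing is not shown here.
        var inputLayer = _layerFactory.CreateNeuralLayer(
            numberOfInputNeurons, new RectifiedActivationFunction(), new WeightedSumFunction());
        _layers.Add(inputLayer);
    }

    private void HandleOutputLayer(int row, List<double> outputs) { /* output-layer weight updates */ }
    private void HandleHiddenLayers(int row) { /* hidden-layer weight updates */ }
}
```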

This class contains a list of neural layers and a layer factory, a class that is used to create new layers. During construction of the object, the initial input layer is added to the network. Other layers are added through the AddLayer function, which adds a passed layer on top of the current layer list. The GetOutput method activates the output layer of the network, thus initiating a chain reaction through the network. This class also has a few helper methods, such as PushExpectedValues, which sets the desired values for the training set that will be passed during training, and PushInputValues, which sets a certain input to the network.

The most important method of this class is the Train method. It receives the training set and the number of epochs. For each epoch, it runs the whole training set through the network as explained in this article. Then, the output is compared with desired output and functions HandleOutputLayer and HandleHiddenLayer are called. These functions implement backpropagation algorithm as described in this article.

Typical Workflow

A typical workflow can be seen in one of the tests implemented in the code in the repository – Train_RuningTraining_NetworkIsTrained. It goes something like this:

Firstly, a neural network object is created. In the constructor, it is defined that there will be three neurons in the input layer. After that, two layers are added using the AddLayer function and the layer factory. For each layer, the number of neurons and the functions for each neuron are defined. After this part is completed, the expected outputs are defined, and the Train function is called with the input training set and the number of epochs.
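The steps above might look something like this in code. This is an illustrative sketch, not the exact test: the training values are made up, and the CreateNeuralLayer signature on the factory is an assumption, since its listing is not shown in this article:

```csharp
// Network with three neurons in the input layer.
var network = new SimpleNeuralNetwork(3);

var layerFactory = new NeuralLayerFactory();

// Hidden layer and output layer, each with its own functions
// (hypothetical factory signature: neuron count, activation, input function).
network.AddLayer(layerFactory.CreateNeuralLayer(3, new RectifiedActivationFunction(), new WeightedSumFunction()));
network.AddLayer(layerFactory.CreateNeuralLayer(1, new SigmoidActivationFunction(0.7), new WeightedSumFunction()));

// Desired outputs for the training set (illustrative values).
network.PushExpectedValues(new double[][] { new double[] { 0 }, new double[] { 1 } });

// Run the training set through the network for a number of epochs.
network.Train(new double[][]
{
    new double[] { 1, 0, 0 },
    new double[] { 0, 1, 1 }
}, 1000);
```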

Conclusion

This implementation of the neural network is far from optimal. You will notice plenty of nested for loops, which certainly hurt performance. Also, in order to simplify this solution, some components of a neural network were not introduced in this first iteration – momentum and bias, for example. Nevertheless, the goal was not to implement a high-performance network, but to analyze and display the important elements and abstractions that every Artificial Neural Network has.

1. Thanks for writing this up! I learned many new things today.

Few comments on the C# side:
– Your Random likely won’t work; it should be a static instance shared across calls to really produce random values
– You may want to replace `ForEach()` either with a simple `foreach` or with `Select().ToList()`
– Replace List with IEnumerable or IReadOnlyCollection whenever possible (in ctors and method parameters, for instance)

1. Thanks for reading and thanks for the tips!
There is a lot of room for improvement, I’ll definitely fix things you’ve mentioned.

1. Toni Fasth says:

There’s nothing wrong with you using Lambdas instead of breaking ForEach() loops into separate foreach loops. Makes the code much easier to follow this way.

I agree that the Random() might not be that reliable using it the way it was implemented. Either way the code block explains what should be done and optimizing it for reliability is beside the point.

Same for IReadOnlyCollection VS List…for production code where the code is starting to become really complex this would be preferred but makes absolutely no difference in this case.

Keep it Simple and stay on track. There is always someone who thinks some code should be implemented in a “better” way. I see no point in wasting time with optimizing code when only proof of concept and readability are the main points to be made.

2. Thanks for the post,

pretty good for a start off

3. John Woakes says:

In the Train method in SimpleNeuralNetwork class you derive a totalError value but don’t use it. I would have thought that would be used to help in the training process.

1. Yes, I was thinking I was going to use it, but I haven’t in this first iteration.

1. Makes sense 😉 How far are you planning to take this project. I have learnt so much from it and would like to see it progress. Great work.

2. Thank you, glad to hear that 🙂
I think I will keep it simple, just introduce some common concepts I skipped in the first run.

4. Toni Fasth says:

Very well written article and descriptive implementations with very clean and well documented code optimized for clarity. When explaining things, simple is the way to do it, even if performance suffers.

I’ve read a lot of AI articles with theory and code in the past, but the code was so heavily optimized or otherwise complex, without proper explanations, so that it was very difficult to understand the actual points the author wanted to make. This article made everything extremely clear.

I am hoping to see more similar implementations from scratch in C# for future AI and Statistics theory topics.

Really great work! Thank you!

1. Thank you for reading and for the lovely comment!

5. Avinash says:

This is awesome! A first small step into neural networks. Perhaps you should try implementing CNNs (Convolutional Neural Networks) next.

1. Thanks! Glad you liked it 🙂
Yes, that could be fun.

6. Hamed says:

Where is the NeuralLayerFactory class definition?

7. Great article! Especially since I am quite green in the ANN field. From this I am learning a lot. Except for one thing. In this example, it is not clear what your ANN is learning. What is it actually thinking about? It’s as if your code is missing a Main function, where it would become clear how this ANN can be implemented to solve a practical problem.

1. rubikscode says:

Thank you, I am glad that this post helped you learn.
But you are right – in this implementation, I focused too much on the abstract solution in order to display certain concepts (and I didn’t cover them all). There is a test method that calls all of this – Train_RuningTraining_NetworkIsTrained – but it doesn’t solve any concrete problem. It will be improved soon.

Thank you for reading!

8. Thanks for sharing !
I’ve learned a lot thanks to this 😀

1. rubikscode says:

Hey, thanks for reading!
Glad you liked it!
