Often we forget that we stand on the shoulders of giants. Machine learning, deep learning and AI have gained so much traction that many frameworks are now available. Today it is easy to pick a framework of our choosing (ML.NET, for example) and start a project fully focused on the problem we are trying to solve. However, sometimes it is good to stop and consider what we are using and how it actually works.

Since I am a software developer who switched to machine learning, back in the day I decided to build a neural network from scratch using object-oriented programming and C#. I wanted to split neural networks into their building blocks and learn about them with the tools I already knew. That way I was learning one new thing, not two. Since then I have often used this solution to explain deep learning concepts to .NET developers. So, in this article, we go through that solution.

In this article, we cover:

  1. Artificial Neural Networks and Object-Oriented Programming?
  2. Artificial Neural Networks – A Biology-Inspired Idea
  3. Main Components of Artificial Neural Networks
  4. Implementation
    • Input Function
    • Activation Function
    • Neuron
    • Connections
    • Layer
    • Putting it all together
    • Workflow

1. Artificial Neural Networks and Object-Oriented Programming?

Whenever I present this solution to someone who is deep (pun intended) in the field, the question is always “Yes, but why?”. Indeed, this is not how neural networks are built professionally. Yes, we can abstract all the weights and biases as matrices. Yes, we can use a framework that handles operations on those matrices exceptionally well, PyTorch for example.

Yes, some may consider this thought exercise to be redundant. And they might be right. However, time and time again I have used this solution to introduce basic concepts to people with a software development background, and they loved it. They simply learned faster. Further abstractions came later, once a good foundation had been laid. So, bear with me 🙂

In this article, I took the less standard, object-oriented approach rather than the usual scripting point of view we would have with Python or R. One very good article implements a network in that scripting style, and I strongly recommend skimming through it.

What I wanted to do is to separate every component and every operation. What was initially just a thought exercise grew into quite a cool mini side-project. Again, I would like to emphasize that this is not really the way you would generally implement the network. More math and forms of matrix multiplication should be used to optimize this entire process.

Apart from that, the implemented network represents a simplified, most basic form of a Neural Network. Nevertheless, this way one can see all the components and elements of one Artificial Neural Network and get more familiar with the concepts.

2. Artificial Neural Networks – A Biology-Inspired Idea

Artificial Neural Networks are inspired by our nervous system. The general idea is that if we copy the brain structure, we will build a learning machine. As you are probably aware, the smallest unit of the nervous system is a neuron. These are cells with similar and simple structures.

Yet, by continuous communication, these cells achieve enormous processing power. If you put it in simple terms, neurons are just switches. These switches generate an output signal if they receive a certain amount of input stimuli. This output signal is input for another neuron.

Each neuron has these components:

  • Body, also known as soma
  • Dendrites
  • Axon

The body (soma) of a neuron carries out the basic life processes of the neuron. Every neuron has a single axon. This is a long part of the cell; in fact, some axons run the entire length of the spine. The axon acts like a wire and is the output of the neuron. Dendrites, on the other hand, are the inputs of the neuron, and each neuron has multiple dendrites. The axons and dendrites of different neurons never touch each other, even though they come close.

These gaps between axons and dendrites are called synapses. Through these synapses signals are carried by neurotransmitter molecules. There are various neurotransmitter chemicals and each serves a different type of neuron. Among them are the famous serotonin and dopamine. The amount and type of these chemicals will dictate how “strong” the input to the neuron is. And, if there is enough input on all dendrites, the soma will “fire up” the signal on the axon, and transmit it to the next neuron.

3. Main Components of Artificial Neural Networks

Before we dive into the code, let’s run through the structure of Artificial Neural Networks. As we mentioned, Artificial Neural Networks are biologically motivated, meaning that they are trying to mimic the behavior of the real nervous systems.

Just as the smallest building unit of the real nervous system is the neuron, the smallest building unit of an artificial neural network is the artificial neuron. In a real nervous system, neurons are connected to each other by synapses, which gives the entire system enormous processing power, the ability to learn and huge flexibility. Artificial neural networks apply the same principle.

Artificial Neuron

By connecting artificial neurons, artificial neural networks aim to create a similar system. Neurons are grouped into layers, and connections are created between the neurons of those layers. Also, by assigning a weight to each connection, the network is able to separate important connections from unimportant ones.

The structure of the artificial neuron mirrors the structure of the real neuron, too. Since a neuron can have multiple inputs, i.e. input connections, a special function is used to collect that data – the input function. The function usually used for this purpose sums all weighted inputs that are active on the input connections – the weighted input function.

Another important part of each artificial neuron is the activation function. This function defines whether the neuron will send any signal to its outputs and which value will be propagated to them. Basically, this function receives a value from the input function, generates an output value according to it, and propagates that value to the outputs.
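
In symbols, the two functions can be written as follows. The notation is an assumption of this sketch and does not come from the original solution: x_i is the value arriving on input connection i, w_i is the weight of that connection, n is the number of input connections, and f is the activation function.

```latex
\begin{aligned}
net &= \sum_{i=1}^{n} w_i \, x_i && \text{(weighted input function)} \\
y &= f(net) && \text{(activation function)}
\end{aligned}
```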

Artificial Neural Network

4. Implementation

So, as you can see from the previous chapter there are a few important entities that we need to pay attention to and that we can abstract. They are neurons, connections, layers, and functions. In this solution, a separate class will implement each of these entities. Then, by putting it all together and adding a backpropagation algorithm on top of it, we will have our implementation of this simple neural network.

4.1 Input Functions

As mentioned before, crucial parts of the neuron are the input function and activation function. Let’s examine the input function. First I created an interface for this function so it can be easily changed in the neuron implementation later on:

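using System.Collections.Generic;

public interface IInputFunction
{
    // Combines the values arriving on all input connections into a single number.
    double CalculateInput(List<ISynapse> inputs);
}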

These functions have only one method – CalculateInput, which receives a list of connections described by the ISynapse interface. We will cover this abstraction later; for now, all we need to know is that this interface represents connections among neurons. The CalculateInput method needs to return a value based on the data contained in the list of connections. Then, I did the concrete implementation of the input function – the weighted sum function.

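using System.Collections.Generic;
using System.Linq;

public class WeightedSumFunction : IInputFunction
{
    public double CalculateInput(List<ISynapse> inputs)
    {
        // Multiply each incoming value by the weight of its connection and add the results.
        return inputs.Sum(x => x.Weight * x.GetOutput());
    }
}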

This function sums weighted values on all connections that are passed in the list.
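
To make the arithmetic concrete, here is a minimal worked example. It is only a sketch: instead of ISynapse objects it uses a hypothetical array of (weight, value) pairs, and the numbers are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for a list of connections: (weight, incoming value) pairs.
var connections = new (double Weight, double Value)[]
{
    (0.5, 2.0),
    (0.25, 4.0),
    (1.0, -1.0),
};

// Same idea as WeightedSumFunction: multiply each value by its weight and add everything up.
double weightedSum = connections.Sum(c => c.Weight * c.Value);

Console.WriteLine(weightedSum); // 0.5*2.0 + 0.25*4.0 + 1.0*(-1.0) = 1
```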

4.2 Activation Functions

Taking the same approach as in input function implementation, the interface for activation functions is implemented first:

public interface IActivationFunction
{
    double CalculateOutput(double input);
}

After that, concrete implementations can be done. The CalculateOutput method should return the output value of the neuron based on the input value that it got from the input function. I like to have options, so I’ve implemented all the functions mentioned in one of the previous blog posts. Here is how the step function looks:

public class StepActivationFunction : IActivationFunction
{
    private readonly double _threshold;

    public StepActivationFunction(double threshold)
    {
        _threshold = threshold;
    }

    public double CalculateOutput(double input)
    {
        // 1 when the input exceeds the threshold, 0 otherwise.
        return input > _threshold ? 1.0 : 0.0;
    }
}

Pretty straightforward, isn’t it? A threshold value is defined during the construction of the object, and then the CalculateOutput returns 1 if the input value exceeds the threshold value, otherwise, it returns 0.

Other functions are easy as well. Here is the Sigmoid activation function implementation:

public class SigmoidActivationFunction : IActivationFunction
{
    private readonly double _coefficient;

    public SigmoidActivationFunction(double coefficient)
    {
        _coefficient = coefficient;
    }

    public double CalculateOutput(double input)
    {
        // Logistic function; the coefficient controls how steep the curve is.
        return 1 / (1 + Math.Exp(-input * _coefficient));
    }
}

And here is Rectifier activation function implementation:

public class RectifiedActivationFuncion : IActivationFunction
{
    public double CalculateOutput(double input)
    {
        return Math.Max(0, input);
    }
}
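
To compare the three functions side by side, the sketch below re-implements them as plain local functions and prints their values for a few inputs. The function names, the threshold of 0.5 and the coefficient of 1.0 are assumptions chosen for this illustration, not values taken from the solution.

```csharp
using System;

// Plain-function versions of the three activation classes, for illustration only.
static double Step(double input, double threshold) => input > threshold ? 1.0 : 0.0;
static double Sigmoid(double input, double coefficient) => 1.0 / (1.0 + Math.Exp(-input * coefficient));
static double Rectifier(double input) => Math.Max(0.0, input);

foreach (var input in new[] { -2.0, 0.0, 2.0 })
{
    Console.WriteLine(
        $"input={input,4}  step={Step(input, 0.5)}  sigmoid={Sigmoid(input, 1.0):F3}  rectifier={Rectifier(input)}");
}
```

The step function jumps from 0 to 1 at the threshold, the sigmoid passes smoothly through 0.5 at input 0, and the rectifier cuts off everything below zero while leaving positive inputs unchanged.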

So far so good –  we have implementations for input and activation functions, and we can proceed to implement the trickier parts of the network – neurons and connections.

4.3 Neuron

The workflow that a neuron should follow goes like this: receive input values from one or more weighted input connections, collect those values and pass them to the activation function, which calculates the output value of the neuron, and finally send that value to the outputs of the neuron. Based on this workflow, the following abstraction of the neuron is created:

    public interface INeuron
    {
        Guid Id { get; }
        double PreviousPartialDerivate { get; set; }

        List<ISynapse> Inputs { get; set; }
        List<ISynapse> Outputs { get; set; }

        void AddInputNeuron(INeuron inputNeuron);
        void AddOutputNeuron(INeuron outputNeuron);
        double CalculateOutput();

        void AddInputSynapse(double inputValue);
        void PushValueOnInput(double inputValue);
    }

Before we explain each property and method, let’s see the concrete implementation of a neuron, since that will make the way it works far clearer:

public class Neuron : INeuron
{
    private IActivationFunction _activationFunction;
    private IInputFunction _inputFunction;

    /// <summary>
    /// Input connections of the neuron.
    /// </summary>
    public List<ISynapse> Inputs { get; set; }

    /// <summary>
    /// Output connections of the neuron.
    /// </summary>
    public List<ISynapse> Outputs { get; set; }

    public Guid Id { get; private set; }

    /// <summary>
    /// Partial derivative calculated in the previous iteration of the training process.
    /// </summary>
    public double PreviousPartialDerivate { get; set; }

    public Neuron(IActivationFunction activationFunction, IInputFunction inputFunction)
    {
        Id = Guid.NewGuid();
        Inputs = new List<ISynapse>();
        Outputs = new List<ISynapse>();

        _activationFunction = activationFunction;
        _inputFunction = inputFunction;
    }

    /// <summary>
    /// Connect two neurons. 
    /// This neuron is the output neuron of the connection.
    /// </summary>
    /// <param name="inputNeuron">Neuron that will be input neuron of the newly created connection.</param>
    public void AddInputNeuron(INeuron inputNeuron)
    {
        var synapse = new Synapse(inputNeuron, this);
        Inputs.Add(synapse);
        inputNeuron.Outputs.Add(synapse);
    }

    /// <summary>
    /// Connect two neurons. 
    /// This neuron is the input neuron of the connection.
    /// </summary>
    /// <param name="outputNeuron">Neuron that will be output neuron of the newly created connection.</param>
    public void AddOutputNeuron(INeuron outputNeuron)
    {
        var synapse = new Synapse(this, outputNeuron);
        Outputs.Add(synapse);
        outputNeuron.Inputs.Add(synapse);
    }

    /// <summary>
    /// Calculate output value of the neuron.
    /// </summary>
    /// <returns>
    /// Output of the neuron.
    /// </returns>
    public double CalculateOutput()
    {
        return _activationFunction.CalculateOutput(_inputFunction.CalculateInput(this.Inputs));
    }

    /// <summary>
    /// Input Layer neurons just receive input values.
    /// For this they need to have connections.
    /// This function adds this kind of connection to the neuron.
    /// </summary>
    /// <param name="inputValue">
    /// Initial value that will be "pushed" as an input to connection.
    /// </param>
    public void AddInputSynapse(double inputValue)
    {
        var inputSynapse = new InputSynapse(this, inputValue);
        Inputs.Add(inputSynapse);
    }

    /// <summary>
    /// Sets new value on the input connections.
    /// </summary>
    /// <param name="inputValue">
    /// New value that will be "pushed" as an input to connection.
    /// </param>
    public void PushValueOnInput(double inputValue)
    {
        ((InputSynapse)Inputs.First()).Output = inputValue;
    }
}

Each neuron has its unique identifier – Id. This property is used in the backpropagation algorithm later. Another property that is added for backpropagation purposes is the PreviousPartialDerivate, but this will be examined in detail further on. A neuron has two lists, one for input connections – Inputs, and another one for output connections – Outputs. Also, it has two fields, one for each of the functions described in previous chapters. They are initialized through the constructor. This way, neurons with different input and activation functions can be created.

This class has some interesting methods, too. AddInputNeuron and AddOutputNeuron are used to create a connection between neurons. The first one adds an input connection from some neuron, and the second one adds an output connection to some neuron. AddInputSynapse adds an InputSynapse to the neuron, which is a special type of connection used only in the input layer of the network, i.e. only for feeding input into the system. This will be covered in more detail in the next chapter.

Last but not least, the CalculateOutput method is used to activate a chain reaction of output calculation. What will happen when this function is called? Well, this will call the input function, which will request values from all input connections. In turn, these connections will request output values from input neurons of these connections, i.e. output values of neurons from the previous layer. This process will be done until the input layer is reached and input values are propagated through the system.
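
The chain reaction can be sketched in isolation. The Node class below is a hypothetical stand-in, not the Neuron class from this solution: all weights are 1, there is no activation function, and the Calls counter exists only to show how often each node is asked for its output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Tiny network: 1 input node -> 2 hidden nodes -> 2 output nodes.
var input = new Node("input") { FixedValue = 1.0 };
var hidden = new[] { new Node("h1"), new Node("h2") };
var outputs = new[] { new Node("o1"), new Node("o2") };

foreach (var h in hidden) h.Sources.Add(input);
foreach (var o in outputs) o.Sources.AddRange(hidden);

// Asking the output nodes for a value pulls values through the whole network.
foreach (var o in outputs)
    Console.WriteLine($"{o.Name} = {o.CalculateOutput()}");

Console.WriteLine($"input evaluated {input.Calls} times");

// Hypothetical, simplified stand-in for a neuron (weights fixed at 1, no activation).
class Node
{
    public string Name { get; }
    public List<Node> Sources { get; } = new List<Node>();
    public double? FixedValue { get; set; }
    public int Calls { get; private set; }

    public Node(string name) => Name = name;

    public double CalculateOutput()
    {
        Calls++;
        // An input node returns its pushed value; any other node asks all of its sources.
        return FixedValue ?? Sources.Sum(s => s.CalculateOutput());
    }
}
```

In this sketch the single input node is evaluated four times during one pass, because each of the two output nodes asks both hidden nodes, and each hidden node asks the input node again. Nothing is cached, which keeps the code simple but makes the cost grow quickly with the size of the network.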

4.4 Connections

Connections are abstracted through the ISynapse interface:

public interface ISynapse
{
    double Weight { get; set; }
    double PreviousWeight { get; set; }
    double GetOutput();

    bool IsFromNeuron(Guid fromNeuronId);
    void UpdateWeight(double learningRate, double delta);
}

Every connection has its weight, represented through the property of the same name. An additional property, PreviousWeight, is used during the backpropagation of the error through the system. Updating the current weight and storing the previous one is done in the helper function UpdateWeight.

There is another helper function – IsFromNeuron, which detects if a certain neuron is an input neuron to the connection. Of course, there is a method that gets an output value of the connection – GetOutput. Here is the implementation of the connection:

public class Synapse : ISynapse
{
    internal INeuron _fromNeuron;
    internal INeuron _toNeuron;

    /// <summary>
    /// Weight of the connection.
    /// </summary>
    public double Weight { get; set; }

    /// <summary>
    /// Weight that the connection had in the previous iteration.
    /// Used in training process.
    /// </summary>
    public double PreviousWeight { get; set; }

    public Synapse(INeuron fromNeuron, INeuron toNeuron, double weight)
    {
        _fromNeuron = fromNeuron;
        _toNeuron = toNeuron;

        Weight = weight;
        PreviousWeight = 0;
    }

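    // Shared generator: creating a new Random per synapse can produce identical
    // weights on runtimes that seed it from the system clock (such as .NET Framework).
    private static readonly Random _random = new Random();

    public Synapse(INeuron fromNeuron, INeuron toNeuron)
    {
        _fromNeuron = fromNeuron;
        _toNeuron = toNeuron;

        Weight = _random.NextDouble();
        PreviousWeight = 0;
    }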

    /// <summary>
    /// Get output value of the connection.
    /// </summary>
    /// <returns>
    /// Output value of the connection.
    /// </returns>
    public double GetOutput()
    {
        return _fromNeuron.CalculateOutput();
    }

    /// <summary>
    /// Checks whether the given neuron is the input neuron of this connection.
    /// </summary>
    /// <param name="fromNeuronId">Neuron Id.</param>
    /// <returns>
    /// True - if the neuron is the input of the connection.
    /// False - if the neuron is not the input of the connection. 
    /// </returns>
    public bool IsFromNeuron(Guid fromNeuronId)
    {
        return _fromNeuron.Id.Equals(fromNeuronId);
    }

    /// <summary>
    /// Update weight.
    /// </summary>
    /// <param name="learningRate">Chosen learning rate.</param>
    /// <param name="delta">Gradient of the error with respect to this weight, as calculated by the caller.</param>
    public void UpdateWeight(double learningRate, double delta)
    {
        PreviousWeight = Weight;
        // The callers pass the gradient of the error, so gradient descent moves against it.
        Weight -= learningRate * delta;
    }
}

Notice the fields _fromNeuron and _toNeuron, which define neurons that this synapse connects. Apart from this implementation of the connection, there is another one that I’ve mentioned in the previous chapter about neurons. It is InputSynapse and it is used as an input to the system. The weight of these connections is always 1 and it is not updated during the training process. Here is the implementation of it:

public class InputSynapse : ISynapse
{
    internal INeuron _toNeuron;

    public double Weight { get; set; }
    public double Output { get; set; }
    public double PreviousWeight { get; set; }

    public InputSynapse(INeuron toNeuron)
    {
        _toNeuron = toNeuron;
        Weight = 1;
    }

    public InputSynapse(INeuron toNeuron, double output)
    {
        _toNeuron = toNeuron;
        Output = output;
        Weight = 1;
        PreviousWeight = 1;
    }

    public double GetOutput()
    {
        return Output;
    }

    public bool IsFromNeuron(Guid fromNeuronId)
    {
        return false;
    }

    public void UpdateWeight(double learningRate, double delta)
    {
        throw new InvalidOperationException("It is not allowed to call this method on an input connection.");
    }
}

4.5 Layer

From here, the implementation of the neural layer is quite easy:

public class NeuralLayer
{
    public List<INeuron> Neurons;

    public NeuralLayer()
    {
        Neurons = new List<INeuron>();
    }

    /// <summary>
    /// Connecting two layers.
    /// </summary>
    public void ConnectLayers(NeuralLayer inputLayer)
    {
        var combos = Neurons.SelectMany(neuron => inputLayer.Neurons, (neuron, input) => new { neuron, input });
        combos.ToList().ForEach(x => x.neuron.AddInputNeuron(x.input));
    }
}

It contains the list of neurons used in that layer and the ConnectLayers method, which is used to glue two layers together.

4.6 Simple Artificial Neural Network

Now, let’s put all that together and add backpropagation to it. Take a look at the implementation of the Network itself:

public class SimpleNeuralNetwork
{
    private NeuralLayerFactory _layerFactory;

    internal List<NeuralLayer> _layers;
    internal double _learningRate;
    internal double[][] _expectedResult;

    /// <summary>
    /// Constructor of the Neural Network.
    /// Note:
    /// Initially, an input layer with the defined number of inputs is created.
    /// </summary>
    /// <param name="numberOfInputNeurons">
    /// Number of neurons in input layer.
    /// </param>
    public SimpleNeuralNetwork(int numberOfInputNeurons)
    {
        _layers = new List<NeuralLayer>();
        _layerFactory = new NeuralLayerFactory();

        // Create input layer that will collect inputs.
        CreateInputLayer(numberOfInputNeurons);

        _learningRate = 2.95;
    }

    /// <summary>
    /// Add layer to the neural network.
    /// Layer will automatically be added as the output layer to the last layer in the neural network.
    /// </summary>
    public void AddLayer(NeuralLayer newLayer)
    {
        if (_layers.Any())
        {
            var lastLayer = _layers.Last();
            newLayer.ConnectLayers(lastLayer);
        }

        _layers.Add(newLayer);
    }

    /// <summary>
    /// Push input values to the neural network.
    /// </summary>
    public void PushInputValues(double[] inputs)
    {
        var inputNeurons = _layers.First().Neurons;
        for (int i = 0; i < inputNeurons.Count; i++)
        {
            inputNeurons[i].PushValueOnInput(inputs[i]);
        }
    }

    /// <summary>
    /// Set expected values for the outputs.
    /// </summary>
    public void PushExpectedValues(double[][] expectedOutputs)
    {
        _expectedResult = expectedOutputs;
    }

    /// <summary>
    /// Calculate output of the neural network.
    /// </summary>
    /// <returns></returns>
    public List<double> GetOutput()
    {
        var returnValue = new List<double>();

        _layers.Last().Neurons.ForEach(neuron =>
        {
             returnValue.Add(neuron.CalculateOutput());
        });

        return returnValue;
    }

    /// <summary>
    /// Train neural network.
    /// </summary>
    /// <param name="inputs">Input values.</param>
    /// <param name="numberOfEpochs">Number of epochs.</param>
    public void Train(double[][] inputs, int numberOfEpochs)
    {
        double totalError = 0;

        for(int i = 0; i < numberOfEpochs; i++)
        {
            for(int j = 0; j < inputs.GetLength(0); j ++)
            {
                PushInputValues(inputs[j]);

                var outputs = new List<double>();

                // Get outputs.
                _layers.Last().Neurons.ForEach(x =>
                {
                    outputs.Add(x.CalculateOutput());
                });

                // Calculate error by summing errors on all output neurons.
                totalError = CalculateTotalError(outputs, j);
                HandleOutputLayer(j);
                HandleHiddenLayers();
            }
        }
    }

    /// <summary>
    /// Helper function that creates input layer of the neural network.
    /// </summary>
    private void CreateInputLayer(int numberOfInputNeurons)
    {
        var inputLayer = _layerFactory.CreateNeuralLayer(numberOfInputNeurons, new RectifiedActivationFuncion(), new WeightedSumFunction());
        inputLayer.Neurons.ForEach(x => x.AddInputSynapse(0));
        this.AddLayer(inputLayer);
    }

    /// <summary>
    /// Helper function that calculates total error of the neural network.
    /// </summary>
    private double CalculateTotalError(List<double> outputs, int row)
    {
        double totalError = 0;

        // Index by position: IndexOf(output) would return the first match
        // when two output neurons produce the same value.
        for (int i = 0; i < outputs.Count; i++)
        {
            totalError += Math.Pow(outputs[i] - _expectedResult[row][i], 2);
        }

        return totalError;
    }

    /// <summary>
    /// Helper function that runs backpropagation algorithm on the output layer of the network.
    /// </summary>
    /// <param name="row">
    /// Input/Expected output row.
    /// </param>
    private void HandleOutputLayer(int row)
    {
        var outputNeurons = _layers.Last().Neurons;

        for (int i = 0; i < outputNeurons.Count; i++)
        {
            var neuron = outputNeurons[i];

            // Calculate these once per neuron, before any of its weights change.
            var output = neuron.CalculateOutput();
            var expectedOutput = _expectedResult[row][i];
            var nodeDelta = (expectedOutput - output) * output * (1 - output);

            neuron.Inputs.ForEach(connection =>
            {
                var netInput = connection.GetOutput();
                var delta = -1 * netInput * nodeDelta;

                connection.UpdateWeight(_learningRate, delta);
            });

            neuron.PreviousPartialDerivate = nodeDelta;
        }
    }

    /// <summary>
    /// Helper function that runs backpropagation algorithm on the hidden layers of the network.
    /// </summary>
    private void HandleHiddenLayers()
    {
        for (int k = _layers.Count - 2; k > 0; k--)
        {
            _layers[k].Neurons.ForEach(neuron =>
            {
                // Calculate these once per neuron, before any of its weights change.
                var output = neuron.CalculateOutput();
                double sumPartial = 0;

                _layers[k + 1].Neurons.ForEach(outputNeuron =>
                {
                    outputNeuron.Inputs.Where(i => i.IsFromNeuron(neuron.Id))
                    .ToList()
                    .ForEach(outConnection =>
                    {
                        sumPartial += outConnection.PreviousWeight * outputNeuron.PreviousPartialDerivate;
                    });
                });

                var nodeDelta = sumPartial * output * (1 - output);

                neuron.Inputs.ForEach(connection =>
                {
                    var netInput = connection.GetOutput();
                    var delta = -1 * netInput * nodeDelta;
                    connection.UpdateWeight(_learningRate, delta);
                });

                // Stored so that the next layer towards the input can use it.
                neuron.PreviousPartialDerivate = nodeDelta;
            });
        }
    }
}

This class contains a list of neural layers and a layer factory, a class that is used to create new layers. During the construction of the object, the initial input layer is added to the network. Other layers are added through the function AddLayer, which adds a passed layer on top of the current layer list. The GetOutput method will activate the output layer of the network, thus initiating a chain reaction through the network.

Also, this class has a few helper methods such as PushExpectedValues, which is used to set desired values for the training set that will be passed during training, as well as PushInputValues, which is used to set certain input to the network.

The most important method of this class is the Train method. It receives the training set and the number of epochs. For each epoch, it runs every row of the training set through the network, as explained in this article. The output is then compared with the desired output, and the functions HandleOutputLayer and HandleHiddenLayers are called. These functions implement the backpropagation algorithm as described in this article.
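
For reference, the quantities these two functions work with can be written in the standard textbook form of backpropagation. This is a sketch under stated assumptions, not a derivation taken from the solution: the activation is a sigmoid with coefficient 1, so its derivative is o(1 − o); the error carries the conventional factor of ½, which only rescales the gradient; t_k is the expected output, o_k the actual output, x_i the value carried by the connection, w_jk the weight from neuron j to neuron k, and η the learning rate.

```latex
\begin{aligned}
E &= \tfrac{1}{2} \sum_{k} (t_k - o_k)^2 \\
\delta_k &= (t_k - o_k)\, o_k (1 - o_k) && \text{output neuron } k \\
\delta_j &= o_j (1 - o_j) \sum_{k} w_{jk}\, \delta_k && \text{hidden neuron } j \\
\frac{\partial E}{\partial w_{ij}} &= -\, x_i \, \delta_j && \text{connection from neuron } i \text{ to neuron } j \\
w_{ij} &\leftarrow w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}}
\end{aligned}
```

In the code, nodeDelta plays the role of δ, sumPartial is the sum over the next layer, netInput is x_i, and delta, computed as −1 · netInput · nodeDelta, corresponds to the partial derivative of the error with respect to the weight.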

4.7 Workflow

A typical workflow can be seen in one of the tests – Train_RuningTraining_NetworkIsTrained. It goes something like this:

var network = new SimpleNeuralNetwork(3);

var layerFactory = new NeuralLayerFactory();
network.AddLayer(layerFactory.CreateNeuralLayer(3, new RectifiedActivationFuncion(), new WeightedSumFunction()));
network.AddLayer(layerFactory.CreateNeuralLayer(1, new SigmoidActivationFunction(0.7), new WeightedSumFunction()));

network.PushExpectedValues(
    new double[][] {
        new double[] { 0 },
        new double[] { 1 },
        new double[] { 1 },
        new double[] { 0 },
        new double[] { 1 },
        new double[] { 0 },
        new double[] { 0 },
    });

network.Train(
    new double[][] {
        new double[] { 150, 2, 0 },
        new double[] { 1002, 56, 1 },
        new double[] { 1060, 59, 1 },
        new double[] { 200, 3, 0 },
        new double[] { 300, 3, 1 },
        new double[] { 120, 1, 0 },
        new double[] { 80, 1, 0 },
    }, 10000);

network.PushInputValues(new double[] { 1054, 54, 1 });
var outputs = network.GetOutput();

First, a neural network object is created. The constructor argument defines that there will be three neurons in the input layer. After that, two layers are added using the AddLayer function and the layer factory. For each layer, the number of neurons and the functions for each neuron are defined. Once this is done, the expected outputs are defined and the Train function is called with the input training set and the number of epochs. Finally, a new input is pushed to the network and GetOutput returns its prediction.
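
The NeuralLayerFactory class itself is not listed in this article. Judging only by how CreateNeuralLayer is called above, it could look roughly like the sketch below. This is an assumption, not the actual implementation, and the Neuron, NeuralLayer, IdentityActivation and PlaceholderInput types here are minimal hypothetical stand-ins so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

var factory = new NeuralLayerFactory();
var layer = factory.CreateNeuralLayer(3, new IdentityActivation(), new PlaceholderInput());
Console.WriteLine(layer.Neurons.Count); // 3

// Hypothetical factory: builds a layer of neurons that all use the same functions.
public class NeuralLayerFactory
{
    public NeuralLayer CreateNeuralLayer(
        int numberOfNeurons, IActivationFunction activationFunction, IInputFunction inputFunction)
    {
        var layer = new NeuralLayer();

        for (int i = 0; i < numberOfNeurons; i++)
        {
            layer.Neurons.Add(new Neuron(activationFunction, inputFunction));
        }

        return layer;
    }
}

// Minimal stand-ins for the types from the article, reduced to what the factory needs.
public interface IActivationFunction { double CalculateOutput(double input); }
public interface IInputFunction { }

public class IdentityActivation : IActivationFunction
{
    public double CalculateOutput(double input) => input;
}

public class PlaceholderInput : IInputFunction { }

public class Neuron
{
    public IActivationFunction ActivationFunction { get; }
    public IInputFunction InputFunction { get; }

    public Neuron(IActivationFunction activationFunction, IInputFunction inputFunction)
    {
        ActivationFunction = activationFunction;
        InputFunction = inputFunction;
    }
}

public class NeuralLayer
{
    public List<Neuron> Neurons { get; } = new List<Neuron>();
}
```

In this sketch all neurons of a layer share the same function instances, which is fine as long as those functions hold no per-neuron state.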

Conclusion

This implementation of the neural network is far from optimal. You will notice plenty of nested loops, which certainly hurt performance. Also, in order to simplify this solution, some components of a neural network, momentum and bias for example, were not introduced in this first iteration of the implementation. Nevertheless, the goal was not to implement a high-performance network, but to analyze and display the important elements and abstractions that each Artificial Neural Network has.
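
For contrast, here is a rough sketch of how the forward pass of one layer looks once weights are stored as a matrix and a bias is added. It is not part of this solution: the Forward function, the sigmoid activation and all the numbers are assumptions made for illustration.

```csharp
using System;

// Forward pass of one fully connected layer:
// outputs[j] = sigmoid(sum over i of weights[j, i] * inputs[i] + biases[j]).
static double[] Forward(double[,] weights, double[] biases, double[] inputs)
{
    int neuronCount = weights.GetLength(0);
    int inputCount = weights.GetLength(1);
    var outputs = new double[neuronCount];

    for (int j = 0; j < neuronCount; j++)
    {
        double net = biases[j];
        for (int i = 0; i < inputCount; i++)
        {
            net += weights[j, i] * inputs[i];
        }
        outputs[j] = 1.0 / (1.0 + Math.Exp(-net));
    }

    return outputs;
}

// Two neurons, two inputs; row j of the matrix holds the weights of neuron j.
var weights = new double[,] { { 0.5, -0.5 }, { 1.0, 1.0 } };
var biases = new[] { 0.0, -2.0 };

var result = Forward(weights, biases, new[] { 1.0, 1.0 });
Console.WriteLine(string.Join(", ", result));
```

Both neurons end up with a net input of 0 here, so both outputs are 0.5. The loops are still there, but they run over plain arrays rather than an object graph, and a linear algebra library could replace them with a single matrix-vector product.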
Thanks for reading!
