In the previous article, we got familiar with the main concepts of Self-Organizing Maps. We explored how they use a different type of learning than the one we had seen so far during our trip through the world of artificial neural networks: unsupervised learning. In this type of learning, the network doesn’t get the expected result for a given input; it has to figure out the inner relationships in the data on its own. Self-Organizing Maps use this approach for clustering and classification purposes, and they are quite good at it.
Another important thing we saw is that the concepts of neurons, connections and weights have a different meaning in the world of Self-Organizing Maps. Neurons are usually organized in two big groups. The first group is a collection of input neurons, and their number corresponds to the number of features in the dataset being used.
The second group is a collection of output neurons. These neurons are usually organized as one- or two-dimensional arrays and are triggered only by certain input values. While there is no concept of neuron location in standard artificial neural networks, that is not the case with Self-Organizing Maps. Not only does each neuron have a location, but neurons that lie close to each other are considered to have similar properties and actually represent a cluster.
Every input neuron is connected to every output neuron. The learning process is also different from that of standard feed-forward neural networks, since unsupervised learning is used. These are the main steps of this process for Self-Organizing Maps:
1. The weights are initialized
2. An input vector is selected from the dataset and used as an input for the network
3. The Best Matching Unit (BMU) is calculated
4. The radius of neighbors that will be updated is calculated
5. The weights of the neurons within the radius are adjusted to make them more like the input vector
6. Steps 2 to 5 are repeated for each input vector of the dataset
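The steps above can be sketched in plain NumPy. This is only an illustration, not the TensorFlow class built later in this article: the function name `train_som`, the linear decay of the learning rate and radius, and the Gaussian-shaped neighbourhood are all assumptions made for the example.

```python
import numpy as np

def train_som(data, map_x, map_y, learning_rate=0.5, radius=None,
              num_epochs=10, seed=0):
    """Train a SOM on data of shape (n_samples, n_features).

    Returns (weights, locations): one weight vector and one grid
    position per output neuron.
    """
    rng = np.random.default_rng(seed)
    if radius is None:
        radius = max(map_x, map_y) / 2.0  # assumed default, not from the article

    # Step 1: weight initialization, one random weight vector per output neuron
    weights = rng.random((map_x * map_y, data.shape[1]))
    # Grid position [row, column] of every output neuron
    locations = np.array([[i, j] for i in range(map_x) for j in range(map_y)],
                         dtype=float)

    for epoch in range(num_epochs):
        # Assumed linear decay of the learning rate and radius
        decay = 1.0 - epoch / num_epochs
        lr_now = learning_rate * decay
        radius_now = radius * decay

        for x in data:  # Steps 2 and 6: go through every input vector
            # Step 3: the BMU is the neuron whose weights are closest to x
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Step 4: squared grid distance of every neuron to the BMU,
            # turned into an influence that fades with that distance
            grid_dist_sq = np.sum((locations - locations[bmu]) ** 2, axis=1)
            influence = np.exp(-grid_dist_sq / radius_now ** 2)
            # Step 5: pull the weights towards the input vector
            weights += lr_now * influence[:, np.newaxis] * (x - weights)

    return weights, locations
```

Calling `train_som(data, 6, 6)` on an array of shape `(n_samples, n_features)` returns the trained weights together with the grid location of each neuron. Note that the Gaussian influence used here is a soft version of the radius: neurons far from the BMU are still updated, but only by a negligible amount.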
You can check out the previous article, in which this process is explained in detail.
There are many existing implementations of Self-Organizing Maps available online, and we will check out some of them in the next section. However, the idea behind this article is to create our own implementation using TensorFlow and then use it in future articles to solve real-world problems.
As we already mentioned, there are many implementations of Self-Organizing Maps for Python available on PyPI. To name a few:
The last implementation in the list, MiniSOM, is one of the most popular ones. It is a minimalistic, NumPy-based implementation of Self-Organizing Maps, and it is very user-friendly. It can be installed using pip:
pip install minisom
or using the downloaded setup:
python setup.py install
As mentioned, usage of this library is quite easy and straightforward. In general, all you have to do is create an object of the SOM class and define its size, the size of the input, the learning rate and the radius (sigma). After that, you can use one of the two training options that this implementation provides: train_batch or train_random. The first one uses samples in the order in which they are recorded in the dataset, while the second one picks samples from the dataset at random. Here is an example:
In this example, a 6×6 Self-Organizing Map is created with 4 input nodes (because the dataset in this example has 4 features). The learning rate and radius (sigma) are both initialized to 0.5. Then the Self-Organizing Map is trained on the input data for 100 iterations using train_random.
For this implementation, TensorFlow version 1.10.0 is used. Here you can find a quick guide on how to install it and how to start working with it. In general, the low-level API of this library is used for the implementation. So, let’s check out the code:
That is quite a lot of code, so let’s dissect it into smaller chunks and explain what each piece means. The majority of the code is in the constructor of the class which, similar to the MiniSOM implementation, takes the dimensions of the Self-Organizing Map, the input dimensions, the radius and the learning rate as input parameters. The first thing that is done is the initialization of all the fields with the values passed into the class constructor:
Note that we created a TensorFlow graph as the _graph field. In the next part of the code, we essentially add operations to this graph and initialize our Self-Organizing Map. If you need more information on how TensorFlow’s graphs and sessions work, you can find it here. Anyway, the first step is to initialize the variables and placeholders:
Basically, we created _weights as a randomly initialized tensor. To make the neurons easy to manipulate, a matrix of their indexes, _locations, is created. It is generated by _generate_index_matrix, which looks like this:
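As a rough NumPy stand-in for that helper, the index matrix could be built as shown here. The standalone function name `generate_index_matrix` and the row-by-row ordering of the neurons are assumptions made for this sketch.

```python
import numpy as np

def generate_index_matrix(x, y):
    """Return an (x * y, 2) array with the [row, column] position of each neuron."""
    return np.array([[i, j] for i in range(x) for j in range(y)])

locations = generate_index_matrix(2, 3)
# locations holds [0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2]
```

Each row of the result is the grid position of one output neuron, so row `k` of the locations lines up with row `k` of the weights.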
Also, notice that _input (the input vector) and _iter_input (the iteration number, which is used for the radius calculations) are defined as placeholders. This is because this information is filled in during the training phase, not the construction phase. Once all variables and placeholders are initialized, we can start with the Self-Organizing Map learning algorithm. First, the BMU is calculated and its location is determined:
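To make this step concrete, here is a small NumPy sketch of the BMU search on a toy 2×2 map. It uses NumPy calls in place of TensorFlow graph operations, and all the values and variable names are made up for illustration.

```python
import numpy as np

# Toy setup: a 2x2 map (4 neurons) with 3 input features
locations = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
weights = np.array([[0.0, 0.0, 0.0],
                    [1.0, 1.0, 1.0],
                    [0.5, 0.5, 0.5],
                    [0.9, 0.1, 0.4]])
x = np.array([0.6, 0.5, 0.4])

# Repeat the input vector once per neuron so it lines up with the weights
x_matrix = np.tile(x, (weights.shape[0], 1))

# Euclidean distance between every neuron's weights and the input
distances = np.sqrt(np.sum((weights - x_matrix) ** 2, axis=1))

# The BMU is the neuron with the smallest distance
bmu_index = np.argmin(distances)
bmu_location = locations[bmu_index]
# bmu_index is 2 and bmu_location is [1, 0]
```

In NumPy the explicit `np.tile` is not strictly needed, because broadcasting would handle `weights - x` directly. It is kept here to mirror the idea of repeating the input vector into a matrix.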
The first part calculates the Euclidean distances between all neurons and the input vector. Don’t get confused by the first line of this code. In essence, the input sample vector is repeated to create a matrix, so that it can be used in calculations with the weights tensor. Once the distances are calculated, the index of the BMU is returned. In the second part of the gist, this index is used to get the BMU location. We relied on the slice function for this. Once that is done, we need to calculate the values of the learning rate and the radius for the current iteration. That is done like this:
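A minimal sketch of such a schedule, assuming a simple linear decay from the initial value towards zero, is shown here. The helper name `decayed` and the linear form are assumptions for this illustration.

```python
def decayed(initial_value, iteration, num_iterations):
    """Linearly shrink initial_value towards zero as the iterations progress."""
    decay = 1.0 - iteration / num_iterations
    return initial_value * decay

# A quarter of the way through 100 iterations, 75% of each value is left
learning_rate = decayed(0.5, iteration=25, num_iterations=100)  # 0.375
radius = decayed(3.0, iteration=25, num_iterations=100)         # 2.25
```

With this kind of schedule, early iterations move large neighbourhoods by large steps, while later iterations only fine-tune neurons close to the BMU.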
The variable decay_function is created based on the iteration number. This “function” determines how much the learning rate and radius shrink in a given iteration. After that, the fields _learning_rate and _radius are updated accordingly. The next step is to create learning rates for all neurons, based on the iteration number and each neuron’s location relative to the BMU location. That is handled like this:
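As an illustration of this step, the following NumPy sketch computes one learning rate per neuron on a toy 2×2 map. The Gaussian-shaped neighbourhood and all the values and names are assumptions made for the example.

```python
import numpy as np

locations = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # toy 2x2 map
bmu_location = np.array([0, 0])
learning_rate, radius = 0.4, 1.0  # values for the current iteration

# Matrix with the BMU location repeated once per neuron
bmu_matrix = np.tile(bmu_location, (locations.shape[0], 1))

# Squared grid distance of every neuron to the BMU: [0, 1, 1, 2]
bmu_distance_sq = np.sum((locations - bmu_matrix) ** 2, axis=1)

# Neighbourhood value: 1 at the BMU, falling off with grid distance
neighbourhood = np.exp(-bmu_distance_sq / radius ** 2)

# Per-neuron learning rate: largest at the BMU, smaller further away
neuron_learning_rates = learning_rate * neighbourhood
```

The BMU itself gets the full learning rate for the current iteration, and the rate for every other neuron shrinks the further that neuron sits from the BMU on the grid.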
First, a matrix of BMU location values is created. Then the distance of each neuron to the BMU is calculated. After that, the so-called neighbourhood_func is created. This function basically defines how the weights of each individual neuron will change. Finally, the weights are updated accordingly, and the TensorFlow session is initialized and run:
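The weight update itself can be sketched in NumPy as a single line. The values here are made up for illustration, and the per-neuron learning rates are assumed to have been computed already from the neighbourhood function.

```python
import numpy as np

weights = np.array([[0.0, 0.0],
                    [1.0, 1.0],
                    [0.2, 0.8]])
x = np.array([1.0, 0.0])                            # current input vector
neuron_learning_rates = np.array([0.5, 0.1, 0.0])   # one rate per neuron

# Move every neuron towards the input, scaled by its own learning rate:
# W_new = W + rate * (x - W)
weights = weights + neuron_learning_rates[:, np.newaxis] * (x - weights)
# weights is now [[0.5, 0.0], [1.0, 0.9], [0.2, 0.8]]
```

A neuron with a rate of 0 stays where it is, while a neuron with a rate of 1 would jump straight onto the input vector.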
Apart from the _generate_index_matrix function that you saw previously, this class also has two important functions: train and map_input. The first one, as its name suggests, is used to train the Self-Organizing Map with the proper input. Here is what that function looks like:
Essentially, we just run the defined number of iterations on the passed input data. For that, we use the _training operation that we created during class construction. Notice that the placeholders for the iteration number and the input sample are filled in here. That is how we run the created session with the correct data.
The second function that this class has is map_input. This function maps a given input sample to the correct output. Here is what it looks like:
In the end, we got a Self-Organizing Map with a pretty straightforward API that can be easily used. In the next article, we will use this class to solve a real-world problem. To sum it up, it can be used something like this:
As you can see, we tried to keep the API very similar to the one from the MiniSOM implementation.
In this article, we learned how to implement the Self-Organizing Map algorithm using TensorFlow. We used the flexibility of the lower-level API to get into even more detail on the learning process and got comfortable with it. To sum it up, we applied all the theoretical knowledge that we learned in the previous article. Apart from that, we saw how we can use already available Self-Organizing Map implementations, namely MiniSOM. The next step is to use this implementation to solve some real-world problems, which we will do in the future.
Thank you for reading!
This article is a part of the Artificial Neural Networks Series, which you can check out here.
Read more posts from the author at Rubik’s Code.