The code that accompanies this article can be downloaded here.


The open-source library TensorFlow.js was introduced about a year ago, but I did not manage to try it out until now. In this article, we are going to get to know this technology by applying it to a real-world classification problem. The idea is to use TensorFlow.js to build and run our machine learning and deep learning models in a browser or under Node.js. To be honest, I was a bit skeptical at first. However, it turned out to be a cool way to bring web developers and data scientists closer together.

In essence, there are several perks to consider when using TensorFlow.js. For me, the main gain is that you can build models directly in a browser. Apart from that, you can import existing pre-trained models from Python and re-train them as well. Imagine that you are working in a JavaScript stack with a NoSQL, JSON-based database (e.g. MongoDB). This is certainly a use case in which we should consider using TensorFlow.js.

TensorFlow.js includes the Keras API and exposes it as its high-level API. This is very nice, and it eases the process of building machine learning and deep learning models. It also includes a lower-level API, previously called deeplearn.js, which can be used for linear algebra and automatic differentiation. Eager execution is supported as well. Underneath it all, TensorFlow.js is powered by WebGL, a JavaScript API for rendering 2D and 3D graphics within any web browser without plug-ins.

TensorFlow.js API

In this article, we are going to build a simple neural network using TensorFlow.js which will solve a simple classification problem. However, before that let’s see how we can install TensorFlow.js.

Installation

There are several ways in which we can use TensorFlow.js. The first one, of course, is simply adding a script tag to our main HTML file:

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.0.0/dist/tf.min.js"></script>

You can also install it using npm or yarn for setting it up under Node.js:

npm install @tensorflow/tfjs
yarn add @tensorflow/tfjs

As you may remember from previous posts, TensorFlow has GPU support for higher performance. You can install it like this:

npm install @tensorflow/tfjs-node-gpu
yarn add @tensorflow/tfjs-node-gpu

Use this option only if your system has an NVIDIA® GPU with CUDA support and you use Linux. If that is not the case, you can still get better results by using TensorFlow with native C++ bindings:

npm install @tensorflow/tfjs-node
yarn add @tensorflow/tfjs-node

Wine Quality Classification Problem

If you have read some of our previous articles, you may have noticed that we like using this dataset. That is because it is really good for simple classification analysis, yet it comes from the real world. Our goal is to predict the quality of the wine based on the provided chemical data. The data itself is about Vinho Verde, a unique product from the Minho region of Portugal. This product accounts for up to 15% of the total Portuguese wine production, and Portugal is the tenth biggest wine producer in the world.

Information was collected from May/2004 to February/2007 and due to privacy and logistic issues, only physicochemical and sensory variables are available. Price and origin of data are not provided in the dataset. The dataset contains two .csv files, one for red wine (1599 samples) and one for white wine (4898 samples). For the purpose of this article, we will use only the white wine samples.

Every sample contains these features:

  • Fixed acidity
  • Volatile acidity
  • Citric acid
  • Residual sugar
  • Chlorides
  • Free sulfur dioxide
  • Total sulfur dioxide
  • Density
  • pH
  • Sulfates
  • Alcohol
  • Quality (score between 0 and 10)

The dataset presented in tabular form looks something like this:

Data Analysis

I know that in the JavaScript universe there must be libraries for the purposes of this chapter. However, I was too lazy to look for them, so the analysis of the data is done in Python. If you have suggestions about which JavaScript modules I could use for these purposes, please send me the info; I will be very grateful.

Data analysis consists of several sub-steps:

  • Univariate Analysis – Analysing types and nature of every feature.
  • Missing Data Treatment – Detecting missing data and making a strategy about it.
  • Outlier Detection – Detecting anomalies in the data. Outliers are samples that diverge from an overall pattern in some data.
  • Correlation Analysis – Comparing features among each other.

During the univariate analysis, we noticed that the output, quality, is actually an integer and not a category. This will be handled during implementation. Apart from that, we noticed that the features are not on the same scale. This can cause problems, so we will need to handle it during implementation as well.

Wine Quality Dataset

During the Missing Data Treatment phase, we noticed that some samples have an empty Fixed acidity feature. Our strategy is to replace the missing values with the mean value of that feature. Other options are available too, like replacing missing values with the maximum feature value or some default value. Let’s check the quality distribution and detect outliers:
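As a brief aside before the distribution plot, the mean-replacement strategy can be sketched in plain JavaScript. This is only an illustration, not the Python code used for the analysis; the function name `imputeMean` and the choice of which values count as missing (`null`, `undefined`, empty string, `NaN`) are assumptions.

```javascript
// Hypothetical helper: replace missing values of one feature with the mean
// of the values that are present. `samples` is an array of plain objects.
function imputeMean(samples, key) {
  // Assumption: these four representations all mean "missing".
  const isMissing = (v) =>
    v === null || v === undefined || v === "" || Number.isNaN(v);

  const present = samples.map((s) => s[key]).filter((v) => !isMissing(v));
  if (present.length === 0) {
    throw new Error(`No values available to compute a mean for "${key}"`);
  }
  const mean = present.reduce((sum, v) => sum + v, 0) / present.length;

  // Return new objects so the original samples are left untouched.
  return samples.map((s) => (isMissing(s[key]) ? { ...s, [key]: mean } : s));
}

// Tiny made-up example: the middle sample has no fixed_acidity value.
const imputed = imputeMean(
  [{ fixed_acidity: 7 }, { fixed_acidity: null }, { fixed_acidity: 6 }],
  "fixed_acidity"
);
console.log(imputed.map((s) => s.fixed_acidity));
```

Swapping the mean for the maximum or a default value only changes the line that computes `mean`.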

Quality Distribution

From the picture above, we can see that most of the wines fall in the categories 5 and 6. This means that most of the wines are average and we have just a few wines with high or low quality. Finally, let’s check the correlation matrix:

Correlation Matrix

As you can see, we cannot detect a single feature that affects quality very much. The only thing that can raise our suspicion is the high correlation between the residual sugar feature and the density feature. However, we will leave both features in the game and see where we land.
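For readers who would rather stay in JavaScript, a correlation matrix like the one discussed above can be computed without any library. The sketch below uses the Pearson correlation coefficient; it is not the Python analysis behind the figure, and the function names `pearson` and `correlationMatrix` are hypothetical.

```javascript
// Pearson correlation coefficient between two equally long numeric arrays.
// Returns NaN when one of the arrays has zero variance.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("Both arrays need the same length (at least 2)");
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;

  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Pairwise correlations for the given feature names.
// Entry [i][j] is the correlation between features[i] and features[j].
function correlationMatrix(samples, features) {
  const columns = features.map((f) => samples.map((s) => s[f]));
  return columns.map((a) => columns.map((b) => pearson(a, b)));
}
```

Calling `correlationMatrix(samples, ["residual_sugar", "density"])` on the parsed dataset would give the 2×2 block for the pair of features that raised suspicion above.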

Implementation

The whole code that accompanies this blog post can be found here.

The dataset itself comes in .csv file format, so the first thing we had to do was convert it into a JSON file and upload it. You can find the newly created JSON file here. In general, every sample from the .csv file is now a separate JSON node. Here is what that looks like:

 [
{
"fixed_acidity":7,
"volatile_acidity":0.27,
"citric_acid":0.36,
"residual_sugar":20.7,
"chlorides":0.045,
"free_sulfur_dioxide":45,
"total_sulfur_dioxide":170,
"density":1.001,
"pH":3,
"sulphates":0.45,
"alcohol":8.8,
"quality":6
},
{
"fixed_acidity":6.3,
"volatile_acidity":0.3,
"citric_acid":0.34,
"residual_sugar":1.6,
"chlorides":0.049,
"free_sulfur_dioxide":14,
"total_sulfur_dioxide":132,
"density":0.994,
"pH":3.3,
"sulphates":0.49,
"alcohol":9.5,
"quality":6
},
...
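The conversion step itself is not shown in the article. A minimal sketch of how it could be done in JavaScript follows, assuming the .csv file is semicolon-separated with quoted header names such as "fixed acidity" (so spaces become underscores to match the keys above). The function name `csvToJson` is hypothetical, and an empty cell is mapped to `null`.

```javascript
// Hypothetical converter: turn CSV text into an array of sample objects.
// Assumptions: one header row, numeric cells, no delimiter inside a cell.
function csvToJson(text, delimiter = ";") {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");

  // "fixed acidity" -> fixed_acidity
  const headers = lines[0]
    .split(delimiter)
    .map((h) => h.trim().replace(/^"|"$/g, "").replace(/\s+/g, "_"));

  return lines.slice(1).map((line) => {
    const cells = line.split(delimiter);
    const sample = {};
    headers.forEach((header, i) => {
      const cell = cells[i] === undefined ? "" : cells[i].trim();
      sample[header] = cell === "" ? null : Number(cell);
    });
    return sample;
  });
}
```

Under Node.js, the result could then be written out with `JSON.stringify(csvToJson(text))`.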

We have done this because, as you know, working with JSON files in JavaScript is a natural thing. Now, let’s check out our index.html file:

As you can see, we added the mentioned script tag for TensorFlow.js and an additional one for tfjs-vis, a small library for in-browser visualization. Apart from that, you may notice that we referenced script.js. This file is located in the same folder as index.html. To run the whole process, all you have to do is open index.html in your browser. Here is how the main run function from the script.js file looks:

This function essentially reveals our workflow. First, we get the data from this location, using the getData method:
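The embedded listing is not reproduced here, so the following is only a minimal sketch of what such a function could look like. The `fetchFn` parameter (handy for testing) and the status check are assumptions added for illustration, and the URL is left to the caller.

```javascript
// Sketch of a getData-style helper: download a JSON file and parse it.
// `fetchFn` defaults to the browser's global fetch; it is a parameter only
// so that the function can be exercised without a network connection.
async function getData(url, fetchFn = fetch) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // Resolves to the array of wine samples described earlier.
  return response.json();
}
```

In the browser, `const data = await getData(url)` inside an async function would then yield the array of samples.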

This is achieved simply by using the fetch method. After that, we use displayData to plot some interesting graphs:
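Since the original listing is missing, here is a hedged sketch of a single-plot helper built on `tfvis.render.scatterplot`. The signature `singlePlot(tfvis, samples, feature)`, the helper `toPoints`, the use of a scatter plot, and the plot height are all assumptions; the article does not say which chart type or features were used.

```javascript
// Map samples to the {x, y} points that tfvis.render.scatterplot expects:
// the chosen feature on the x axis, quality on the y axis.
function toPoints(samples, feature) {
  return samples.map((s) => ({ x: s[feature], y: s.quality }));
}

// Render one scatter plot of quality against a single feature.
// `tfvis` is the tfjs-vis namespace (a global when loaded via script tag).
function singlePlot(tfvis, samples, feature) {
  return tfvis.render.scatterplot(
    { name: `${feature} vs. quality` },
    { values: toPoints(samples, feature) },
    { xLabel: feature, yLabel: "quality", height: 300 }
  );
}
```

A displayData-style function could then call `singlePlot` once per feature of interest, for example `singlePlot(tfvis, data, "alcohol")`, where the choice of feature is purely illustrative.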

Note that the singlePlot function is presented in the gist above as well. This method wraps tfjs-vis functionality and displays only one graph. The displayData function uses it to plot three graphs. Here they are:

Here we can see the distribution of quality by different features. Once we have visualized the data, we can create our model. This is done in the function createModel:
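The original listing is not reproduced here. A minimal sketch that follows the layer sizes given in the overview below (11 inputs, hidden layers of 50, 30 and 20 neurons, 10 outputs) might look like this. The activation functions (`relu` for hidden layers, `softmax` for the output) are assumptions, as is passing `tf` in as a parameter instead of relying on the global.

```javascript
// Sketch of a createModel-style function.
// `tf` is the TensorFlow.js namespace: the global from the script tag,
// or the result of require("@tensorflow/tfjs") under Node.js.
function createModel(tf) {
  const model = tf.sequential();

  // First hidden layer; inputShape declares the 11 input features.
  model.add(tf.layers.dense({ inputShape: [11], units: 50, activation: "relu" }));

  // Two more hidden layers.
  model.add(tf.layers.dense({ units: 30, activation: "relu" }));
  model.add(tf.layers.dense({ units: 20, activation: "relu" }));

  // Output layer: one neuron per quality category.
  model.add(tf.layers.dense({ units: 10, activation: "softmax" }));

  return model;
}
```

With tfjs-vis loaded, `tfvis.show.modelSummary({ name: "Model Summary" }, createModel(tf))` would print a layer table.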

The main goal here was not to generate a perfect model for this problem, but to try out some of the TensorFlow.js possibilities. If you are familiar with building neural network models with Keras, this API will be easy to understand. However, if you want to learn more about neural networks, you can check our huge series on them here.

Let’s have a quick overview. We use sequential to create a placeholder for our model. This way we can add different layers to it. Then we add layers of densely connected neurons by using dense. In the first call, we create one hidden layer of 50 neurons, with an input layer of 11 neurons. We have 11 inputs because our dataset has 11 features. Then we add two more hidden layers with 30 and 20 neurons respectively. Finally, our output layer has 10 neurons, because we have 10 possible categories of wine. Once we print the summary of the model with tfjs-vis, here is what we get:

Awesome! Now, we prepare the data for the model itself. Our model will not work with JSON objects, or with arrays for that matter. We need to create tensor objects. This is done in the prepareData function:
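As the listing is missing, the sketch below shows one way such a function could be written. Per-feature min-max scaling is an assumption (the article only says the data is normalized), the helper names `toArrays` and `minMaxScale` are hypothetical, and `tf` is passed in as a parameter for illustration.

```javascript
// Feature keys as they appear in the JSON samples.
const FEATURES = [
  "fixed_acidity", "volatile_acidity", "citric_acid", "residual_sugar",
  "chlorides", "free_sulfur_dioxide", "total_sulfur_dioxide", "density",
  "pH", "sulphates", "alcohol",
];

// Step 1: JSON objects -> plain arrays, split into inputs and outputs.
function toArrays(samples) {
  return {
    inputs: samples.map((s) => FEATURES.map((f) => s[f])),
    labels: samples.map((s) => s.quality),
  };
}

// Step 2: scale every column to [0, 1] (constant columns become 0).
function minMaxScale(rows) {
  if (rows.length === 0) return [];
  const columns = rows[0].map((_, j) => rows.map((r) => r[j]));
  const mins = columns.map((c) => c.reduce((m, v) => Math.min(m, v), Infinity));
  const maxs = columns.map((c) => c.reduce((m, v) => Math.max(m, v), -Infinity));
  return rows.map((r) =>
    r.map((v, j) => (maxs[j] === mins[j] ? 0 : (v - mins[j]) / (maxs[j] - mins[j])))
  );
}

// Step 3: arrays -> tensors; integer labels are one-hot encoded.
function prepareData(tf, samples, numClasses = 10) {
  const { inputs, labels } = toArrays(samples);
  return {
    inputs: tf.tensor2d(minMaxScale(inputs)),
    labels: tf.oneHot(tf.tensor1d(labels, "int32"), numClasses),
  };
}
```

Scaling per column keeps a feature with a large range, such as total sulfur dioxide, from dominating one with a small range, such as density.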

In this function, we first convert JSON objects into simple arrays. We split the data into inputs and outputs. In this particular example, we haven’t split the data into train and test sets, which is something that can be improved. Once this is done, we convert the arrays into tensors. Finally, we normalize the data, meaning we put it on the same scale. This is something that we noticed during the data analysis phase. Also, note that we use the oneHot method to convert the output data into a categorical variable.
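The missing train/test split mentioned above could be added before the tensors are created. This is a hedged sketch and not part of the original code: the function name `trainTestSplit`, the 20% default, and the injectable `random` parameter are all assumptions.

```javascript
// Hypothetical helper: shuffle the samples and hold out a fraction for testing.
// `random` can be replaced by a seeded generator for reproducible splits.
function trainTestSplit(samples, testFraction = 0.2, random = Math.random) {
  if (testFraction <= 0 || testFraction >= 1) {
    throw new Error("testFraction must be between 0 and 1 (exclusive)");
  }
  const shuffled = samples.slice(); // copy, so the input order is preserved

  // Fisher-Yates shuffle.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }

  const testSize = Math.round(shuffled.length * testFraction);
  return {
    test: shuffled.slice(0, testSize),
    train: shuffled.slice(testSize),
  };
}
```

The model would then be trained on `train` and evaluated on `test`, which gives a more honest accuracy estimate than evaluating on the training data.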

So, we have done all the preparation steps and we can train our model using the trainModel function:
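Without the original listing, the following is only a sketch of such a function, using the Adam optimizer and categorical cross-entropy named in the text. The epoch count, batch size, shuffling, and the `tf` parameter are assumptions, and the tfjs-vis training callbacks are left out.

```javascript
// Sketch of a trainModel-style function.
// `inputs` and `labels` are tensors; `labels` must be one-hot encoded.
async function trainModel(tf, model, inputs, labels, { epochs = 50, batchSize = 32 } = {}) {
  model.compile({
    optimizer: tf.train.adam(),
    loss: "categoricalCrossentropy",
    metrics: ["accuracy"],
  });

  // fit returns a promise that resolves to the training history.
  return model.fit(inputs, labels, { epochs, batchSize, shuffle: true });
}
```

Adding `"accuracy"` to `metrics` is what later makes the evaluate call report accuracy next to the loss.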

Once again, you can notice that TensorFlow.js kept an API similar to the TensorFlow API in Python. We compile our model with the Adam optimizer and categorical cross-entropy loss. Then we run the training process with the fit method call. This process is visualized as well:

Finally, we use the evaluateModel method to evaluate the accuracy of our neural network. This method is just a wrapper for the evaluate method of the created model:
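A wrapper of this kind could be sketched as follows. It assumes the model was compiled with a single accuracy metric, so that `evaluate` returns the loss and the accuracy as two scalars; the exact signature of the original function is not known.

```javascript
// Sketch of an evaluateModel-style wrapper around model.evaluate.
// Assumption: the model was compiled with metrics: ["accuracy"], in which
// case evaluate returns [loss, accuracy] as scalar tensors.
function evaluateModel(model, inputs, labels) {
  const result = model.evaluate(inputs, labels);
  if (!Array.isArray(result) || result.length < 2) {
    throw new Error("Expected [loss, accuracy]; was the model compiled with an accuracy metric?");
  }
  // dataSync copies the scalar value out of the tensor.
  const accuracy = result[1].dataSync()[0];

  console.log("Accuracy is:");
  console.log(accuracy);
  return accuracy;
}
```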

The output of evaluation is printed in the console:

Accuracy is:
0.5163332223892212

We got an accuracy of just 51%, meaning there is a lot of room for improvement of our model. However, we were able to do all this in the browser, which is awesome.

Conclusion

In this article we learned how to use JavaScript for machine learning and deep learning. We figured out how to make basic models using TensorFlow.js and train them. In future articles, we will investigate this library even further and try to use it to implement more sophisticated architectures.

Thank you for reading!


Read more posts from the author at Rubik’s Code.

