The code presented in this article can be found here.

COVID-19 hit us hard. Nassim Nicholas Taleb's warnings that our interconnectedness could fuel a widespread pandemic have proven true. Schools are closed and most of us are working from home, spending time in isolation and trying not to spread the virus. At the moment of writing, all the borders in my home country are closed, all bars and malls are shut, and you cannot go out after 5 PM. On top of that, the pandemic is having a huge impact on the economy. By many measures we were on the verge of a big economic crisis even without the coronavirus, and now it seems certain. We can only hope it will not be as catastrophic as the crisis of 2008. Optimistic predictions say that the ICT industry will recover by Q4 2020, while pessimistic ones say ICT will be back on track in Q2 2021. It all sounds like a diary entry from some post-apocalyptic world.

However, there is some good news too. More than 70% of the roughly 80,000 people infected with COVID-19 in China have recovered and been discharged from hospitals. China has also announced that vaccines are in clinical trials. Apart from this, scientists have figured out how the coronavirus breaks into human cells, which will help significantly in developing treatments. One of the two coronavirus clusters in Italy, Codogno, has reported significantly fewer infections per day thanks to high levels of self-quarantine. So, there is light at the end of this tunnel.

We don’t usually do sales, but given the circumstances and the severity of the situation, we decided to change that. Make no mistake, this sale isn’t meant for profit and it certainly wasn’t planned. It is here to help people who want to get better, learn new skills and be more productive than ever before. Our book offers are 50% off.

As mentioned, most of us are working from home, in isolation. This can be extremely stressful, so we are also sharing productivity (and sanity) tips on our LinkedIn page. And so you don’t go crazy at home, we are running a 50% “emergency discount” on our ebook offers. Other, bigger platforms like the Ivy League schools and Pluralsight are offering free content, so you can use this isolation period to learn something new. What all of us can do is make the best of this time and come out better and stronger on the other end. Let’s use this period to reflect on our behavior and figure out how to repair the damage that this pandemic has caused.

Led by this notion, we created this educational article to show how to apply neural networks and transfer learning to X-ray images in order to detect coronavirus. The models presented here should not be used in any real-world scenario and are not ready for production; once again, this article is purely educational. We are also aware that COVID-19 can be detected more reliably from CT scans than from X-rays, but again, this is just for education and the research can be extended further. We use Python, TensorFlow and deep learning to make these detections. To be more precise, we utilize transfer learning as described in Bonus #2. All trained models can be found in this GitHub repository, along with the code and data that we used.

Data

There is only a small number of coronavirus X-ray images available in the world. In fact, there is only one good, genuine source for this type of data, and you can find it here. The dataset also contains images of similar diseases like MERS, SARS, and ARDS. COVID-19 presents several distinctive features that are hard for humans to detect, which is why we utilize pre-trained models to create an automatic detection system. First, we went through the metadata of the mentioned dataset and picked only the posteroanterior (PA) views of infected patients. To be more precise, this is what the metadata looks like:
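If you want to inspect the metadata yourself, a minimal sketch (assuming the covid-chestxray-dataset repository is checked out next to this project) could look like this; the ‘finding’, ‘view’ and ‘filename’ columns are the ones we rely on:

import pandas as pd

# Load the metadata that ships with the covid-chestxray-dataset repository
metadata = pd.read_csv('../covid-chestxray-dataset/metadata.csv')

# Peek at the columns we use: diagnosis, view and image file name
print(metadata[['finding', 'view', 'filename']].head())

# Count the posteroanterior COVID-19 images
print(len(metadata[(metadata['finding'] == 'COVID-19') & (metadata['view'] == 'PA')]))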

So, we picked only the coronavirus images and copied them to a separate folder using this script:

import pandas as pd
import shutil

# Load the dataset metadata
metadata = pd.read_csv('../covid-chestxray-dataset/metadata.csv')

# Keep only COVID-19 cases with a posteroanterior (PA) view
images = metadata[(metadata['finding'] == 'COVID-19') & (metadata['view'] == 'PA')]
images = images.reset_index()

# Copy the selected images into our own data folder
for _, row in images.iterrows():
    shutil.copy(f"../covid-chestxray-dataset/images/{row.filename}",
                f"./data/covid-19/{row.filename}")

We ended up with 68 images, which is really not that much. We also picked 70 images of healthy X-ray scans from this dataset. We then separated them into two folders within the data folder: ‘covid-19’ and ‘healthy’. This is an important step, because the folder names will serve as class labels. Here are the infected images:
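If you want to take a quick look at them yourself, a minimal sketch for displaying the first few copied images (assuming matplotlib and Pillow are installed) could look like this:

import os
import matplotlib.pyplot as plt
from PIL import Image

covid_dir = "./data/covid-19"
filenames = sorted(os.listdir(covid_dir))[:9]

# Show the first nine infected X-rays in a 3 x 3 grid
fig, axes = plt.subplots(3, 3, figsize=(9, 9))
for ax, filename in zip(axes.flat, filenames):
    ax.imshow(Image.open(os.path.join(covid_dir, filename)), cmap="gray")
    ax.set_title(filename, fontsize=6)
    ax.axis("off")
plt.show()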

One thing we can notice here is that this dataset is not exactly clean. Some of the images have red arrows on them, and some have additional text, like a date or an index. When you work with this kind of data it is important to remove such images or find a different way to use them, because neural networks can learn the wrong things from them. This is out of the scope of this article, so we will continue with the dataset as it is.

Building the Dataset

The first step is pulling all that data into memory and pre-processing it so the models can use it. A while back we wrote an article about building a dataset from images using TensorFlow, and we follow the same principle here. For that, we use the tf.data module and our own DataProcessor class. Here is what that class looks like:

import os
import tensorflow as tf

class DataProcessor():
    def __init__(self, data_location):
        # List all image files from the '<data_location>/<class>/<image>' structure
        self.labeled_dataset = tf.data.Dataset.list_files(f"{data_location}/*/*")

    def _get_label(self, file_path):
        # The class label is the name of the parent folder ('covid-19' or 'healthy')
        parts = tf.strings.split(file_path, os.path.sep)
        return parts[-2] == CLASS_NAMES

    def _decode_image(self, img):
        img = tf.image.decode_jpeg(img, channels=3)
        img = tf.image.convert_image_dtype(img, tf.float32)
        return tf.image.resize(img, [IMAGE_SHAPE[0], IMAGE_SHAPE[1]])

    def _pre_process_images(self, file_path):
        label = self._get_label(file_path)
        img = tf.io.read_file(file_path)
        img = self._decode_image(img)
        return img, label

    def prepare_dataset(self):
        self.labeled_dataset = self.labeled_dataset.map(self._pre_process_images)
        self.labeled_dataset = self.labeled_dataset.cache()
        self.labeled_dataset = self.labeled_dataset.shuffle(buffer_size=10)
        self.labeled_dataset = self.labeled_dataset.repeat()
        self.labeled_dataset = self.labeled_dataset.batch(BATCH_SIZE)
        self.labeled_dataset = self.labeled_dataset.prefetch(
            buffer_size=tf.data.experimental.AUTOTUNE)

        # Split into train (70%), validation (15%) and test (15%) subsets
        train_size = int(0.7 * DATASET_SIZE)
        val_size = int(0.15 * DATASET_SIZE)
        test_size = int(0.15 * DATASET_SIZE)

        train_dataset = self.labeled_dataset.take(train_size)
        test_dataset = self.labeled_dataset.skip(train_size)
        val_dataset = test_dataset.skip(test_size)
        test_dataset = test_dataset.take(test_size)

        return train_dataset, test_dataset, val_dataset

This class is pretty simple. As a parameter, the constructor receives the path to the data. This path is used to create a so-called listed dataset: we use the list_files method of tf.data.Dataset to load a list of all images in the given folder. This method uses pattern matching, which is why we added ‘/*/*’ after the path to the data. The DataProcessor class has three private methods and one public method:

  • _get_label – Extracts the label from the location of the file. In our case, the labels are ‘covid-19’ and ‘healthy’, because those are the folders we created. We also created the CLASS_NAMES global (see the snippet below), which contains the same information.
  • _decode_image – Loads the image into memory and resizes it. The target size is controlled by another global, IMAGE_SHAPE. We use a 256 x 256 size for each image, to help the models perform better.
  • _pre_process_images – Combines the two previous methods and returns the pre-processed image together with its label.
  • prepare_dataset – The most important method of this class. It loads all the images with their labels and creates a unified dataset from them, then shuffles the data and splits it into train, validation and test datasets.
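The class also relies on a few globals. As a rough sketch, they could be defined as follows; the image size and dataset size follow from the numbers above, while the batch size and number of epochs are just assumed values:

import pathlib
import numpy as np

data_location = './data'

# Class names are simply the sub-folder names: ['covid-19', 'healthy']
CLASS_NAMES = np.array([item.name for item in pathlib.Path(data_location).glob('*')])

IMAGE_SHAPE = (256, 256, 3)   # images are resized to 256 x 256 with 3 channels
BATCH_SIZE = 8                # assumed value, kept small because the dataset is tiny
DATASET_SIZE = 138            # 68 covid-19 images + 70 healthy images
EPOCHS = 20                   # assumed number of training epochs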

Finally, we can utilize this class like this:

processor = DataProcessor(data_location)
train_dataset, test_dataset, val_dataset = processor.prepare_dataset()

As output, we get prepared datasets, ready to be processed by the architectures we are about to build.

Building and Training Models

Let’s start with the Wrapper class. This class receives a base model, which we load from TensorFlow, and adds additional layers on top of it. To be more clear: since large architectures are hard to train from scratch, TensorFlow provides a number of pre-trained models you can use, such as VGG16, ResNet and DenseNet. These models come with weights trained on the ImageNet dataset. However, the top layers of these models can be removed, and we can add the layers necessary for our specific problem. That way we utilize the lower levels of these models, where the networks detect generic features like straight lines and curves, while replacing the higher levels so they learn features specific to our problem. Here is what the Wrapper class looks like:

from tensorflow.keras.layers import AveragePooling2D, Flatten, Dense, Dropout

class Wrapper(tf.keras.Model):
    def __init__(self, base_model):
        super(Wrapper, self).__init__()

        # Frozen pre-trained model used as a feature extractor
        self.base_model = base_model
        # Classification head added on top of the base model
        self.average_pooling_layer = AveragePooling2D(name="pooling")
        self.flatten = Flatten(name="flatten")
        self.dense = Dense(64, activation="relu")
        self.dropout = Dropout(0.5)
        self.output_layer = Dense(2, activation="softmax")

    def call(self, inputs):
        x = self.base_model(inputs)
        x = self.average_pooling_layer(x)
        x = self.flatten(x)
        x = self.dense(x)
        x = self.dropout(x)
        output = self.output_layer(x)
        return output

In essence, it takes the passed pre-trained model (base_model) and adds Average Pooling, Flatten, Dropout and two Dense layers on top. Note that the output is a Dense layer with two nodes, because we have two classes: covid-19 and healthy. Now we can load the pre-trained models like this:

from tensorflow.keras.applications import MobileNetV2, ResNet101V2, DenseNet201
from tensorflow.keras.optimizers import Adam

base_learning_rate = 0.0001
steps_per_epoch = DATASET_SIZE // BATCH_SIZE
validation_steps = 20

# MobileNetV2 with ImageNet weights, without the top classification layers
mobile_net = MobileNetV2(input_shape=IMAGE_SHAPE, include_top=False, weights='imagenet')
mobile_net.trainable = False
mobile = Wrapper(mobile_net)
mobile.compile(optimizer=Adam(lr=base_learning_rate),
               loss='binary_crossentropy',
               metrics=['accuracy'])

# ResNet101V2 with ImageNet weights, without the top classification layers
res_net = ResNet101V2(input_shape=IMAGE_SHAPE, include_top=False, weights='imagenet')
res_net.trainable = False
res = Wrapper(res_net)
res.compile(optimizer=Adam(lr=base_learning_rate),
            loss='binary_crossentropy',
            metrics=['accuracy'])

# DenseNet201 with ImageNet weights, without the top classification layers
dense_net = DenseNet201(input_shape=IMAGE_SHAPE, include_top=False, weights='imagenet')
dense_net.trainable = False
dense = Wrapper(dense_net)
dense.compile(optimizer=Adam(lr=base_learning_rate),
              loss='binary_crossentropy',
              metrics=['accuracy'])

We loaded and compiled three architectures: MobileNet, ResNet and DenseNet. Note that for each of them we set the trainable parameter to False, which means that the base models are frozen during training and only the additional layers we added are trained. Now we can run training for each of the models:

# Because the dataset repeats indefinitely, we tell fit how many steps make up one epoch
history_mobile = mobile.fit(train_dataset,
                    epochs=EPOCHS,
                    steps_per_epoch=steps_per_epoch,
                    validation_data=val_dataset,
                    validation_steps=validation_steps)

history_resnet = res.fit(train_dataset,
                    epochs=EPOCHS,
                    steps_per_epoch=steps_per_epoch,
                    validation_data=val_dataset,
                    validation_steps=validation_steps)

history_densenet = dense.fit(train_dataset,
                    epochs=EPOCHS,
                    steps_per_epoch=steps_per_epoch,
                    validation_data=val_dataset,
                    validation_steps=validation_steps)
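Each call to fit returns a history object holding the accuracy and loss values per epoch. A minimal sketch for plotting them (assuming matplotlib; plot_history is just a helper defined here) could look like this:

import matplotlib.pyplot as plt

def plot_history(history, title):
    # Plot training/validation accuracy and loss over epochs
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    ax1.plot(history.history['accuracy'], label='train accuracy')
    ax1.plot(history.history['val_accuracy'], label='validation accuracy')
    ax1.set_xlabel('epoch')
    ax1.legend()
    ax2.plot(history.history['loss'], label='train loss')
    ax2.plot(history.history['val_loss'], label='validation loss')
    ax2.set_xlabel('epoch')
    ax2.legend()
    fig.suptitle(title)
    plt.show()

plot_history(history_mobile, 'MobileNet')
plot_history(history_resnet, 'ResNet')
plot_history(history_densenet, 'DenseNet')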

Here are the plotted results of the training process of the MobileNet:

Here are the plotted results of the training process of the ResNet:

Here are the plotted results of the training process of the DenseNet:

We can see that MobileNet performs poorly on the validation set, meaning it probably overfitted. However, we need to confirm that with the test dataset. ResNet and DenseNet both show good results, with DenseNet coming out slightly ahead on validation accuracy.

Evaluation

Finally, we can evaluate the models like this:

loss, accuracy = mobile.evaluate(test_dataset, steps=validation_steps)

print("--------MobileNet---------")
print("Loss: {:.2f}".format(loss))
print("Accuracy: {:.2f}".format(accuracy))
print("---------------------------")

loss, accuracy = res.evaluate(test_dataset, steps=validation_steps)

print("--------ResNet---------")
print("Loss: {:.2f}".format(loss))
print("Accuracy: {:.2f}".format(accuracy))
print("---------------------------")

loss, accuracy = dense.evaluate(test_dataset, steps=validation_steps)

print("--------DenseNet---------")
print("Loss: {:.2f}".format(loss))
print("Accuracy: {:.2f}".format(accuracy))
print("---------------------------")

And here are the results:

--------MobileNet---------
Loss: 0.74
Accuracy: 0.69
---------------------------

--------ResNet---------
Loss: 0.11
Accuracy: 0.98
---------------------------

--------DenseNet---------
Loss: 0.07
Accuracy: 0.96
---------------------------

As expected, DenseNet achieved the best loss, while ResNet performed slightly better on accuracy. Apart from that, we can see that these models are not performing badly, but we cannot trust them completely, simply because of the small dataset. With a bigger dataset we would get better results and more confidence in them. Over the next couple of weeks, as more X-ray images become available, these models can be improved. However, once again, these models are for education only; they should be tested further and shouldn’t be used in production.
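If you want to try one of the trained models on a single X-ray, a rough sketch could look like the following; the image path is just a placeholder and load_and_prepare is a helper introduced here, mirroring the pre-processing from the DataProcessor class:

import tensorflow as tf

def load_and_prepare(image_path):
    # Read, decode and resize a single image the same way the training pipeline does
    img = tf.io.read_file(image_path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, [IMAGE_SHAPE[0], IMAGE_SHAPE[1]])
    return tf.expand_dims(img, axis=0)  # add a batch dimension

sample = load_and_prepare("./data/covid-19/example.jpg")  # placeholder path
probabilities = dense.predict(sample)[0]
print(dict(zip(CLASS_NAMES, probabilities)))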

Conclusion

This pandemic will pass, like every pandemic before it. However, a lot of people will not make it through. It is our duty to them to learn from the mistakes we made and be ready for the next situation like this, because there will be a next one. We live in a global village. We are more connected than ever before. We travel more than ever before. I am not saying that this is wrong, or that we should stop it, but it brings certain responsibilities. We must pay more attention to hygiene, be more careful with the elderly, and we MUST build sustainable healthcare systems. I will end this article with a dramatic quote from a Spanish scientist:

“You give the footballer one million euros a month and a biological researcher 1,800 euros. You are looking for a treatment now. Go to Cristiano Ronaldo or Messi and they will find you a cure.”

Let’s all learn from this and focus on the things that are really important.
See you on the other end. Thank you for reading.

Nikola M. Zivkovic

CAIO at Rubik's Code

Nikola M. Zivkovic is CAIO at Rubik’s Code and the author of the book “Deep Learning for Programmers“. He loves sharing knowledge and is an experienced speaker. You can find him speaking at meetups and conferences, and as a guest lecturer at the University of Novi Sad.

Rubik’s Code is a boutique data science and software service company with more than 10 years of experience in Machine Learning, Artificial Intelligence & Software development. Check out the services we provide.