At the beginning of every month, we decipher three research papers from the fields of machine learning, deep learning, and artificial intelligence that made the biggest impact on us in the previous month. Apart from that, at the end of the article, we add links to other papers that we found interesting but that were not in our focus that month, so you can check those out as well.

Eager to learn how to build Deep Learning systems using Tensorflow 2 and Python? Get the ebook here!


In January, we focused mainly on healthcare and computer vision papers. In our humble opinion, a big leap forward has been made in the past couple of years in bringing deep learning into the world of healthcare, which is a fairly restricted field. For years, this industry had a certain aversion towards it, but it seems that in the past couple of years things have changed. Here we were able to see some really cool papers that sparked various ideas and possible applications in our minds. Have fun!

An Adversarial Approach for the Robust Classification of Pneumonia from Chest Radiographs

Rubik’s Code team loves computer vision projects in health-tech. These projects are always super interesting and super demanding. The biggest problem that we can face with this type of project is that the created models can perform badly on test data due to dataset shift. For example, if we train our models on a dataset from one hospital, they may not generalize well, and we can get bad results when we test with data from another hospital. We can face this problem even if we use data from the same hospital for training and testing. This problem manifests itself in radiology classification tasks as well. One reason for this is that models come to depend on patient-level confounders and base their predictions on those, instead of using pathology indications. Another reason is that scanners take shots from different angles; there are two views: anterior-posterior and posterior-anterior. Finally, there are different types of scanners, some portable and some fixed. These may add some kind of text to the image itself, which may affect the model and the decisions it makes.
This paper addresses these problems and tries to make models for pneumonia classification independent of those confounders. To be more precise, the authors propose an approach based on adversarial neural networks to address dataset shift. This way they are able to train models that are more reliable and that can be used across hospitals. In general, what the authors found is that potential model confounding can be identified by evaluating how well the confounders can be predicted from a model’s output. Apart from that, by using an adversarial training process, they are able to make the classification independent of the radiograph’s view and obtain better generalization performance on test data from a new hospital. The starting point was extending each sample with an indicator of the view, so each sample is essentially a tuple containing: image, label and view indicator. Then they used transfer learning with DenseNet-121 and an SGD optimizer. As we mentioned, adversarial training is used. During this training process, two connected neural networks train on the same data. The first network is the classifier, f. This network is trained to predict a pneumonia label y from a radiograph x. The second network is an adversary, d. This network is trained to predict the view v from the output score s of the classifier f. This is a very interesting use of this training approach. The optimization procedure consists of alternating between training the adversary network until it is optimal, then training the classifier to fool the adversary while still predicting pneumonia well.
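The alternating procedure described above can be sketched as follows. This is a minimal toy sketch in PyTorch: the tiny networks, learning rates, batch, and penalty weight `lam` are illustrative assumptions, not the paper’s values (the real classifier is a DenseNet-121).

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two networks: f maps an image x to a pneumonia
# score s, and d tries to predict the view v from s alone.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(16 * 16, 1))         # f: x -> s
adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))  # d: s -> v

bce = nn.BCEWithLogitsLoss()
opt_f = torch.optim.SGD(classifier.parameters(), lr=0.01)
opt_d = torch.optim.SGD(adversary.parameters(), lr=0.01)

x = torch.randn(32, 1, 16, 16)             # radiographs
y = torch.randint(0, 2, (32, 1)).float()   # pneumonia labels
v = torch.randint(0, 2, (32, 1)).float()   # view indicator (AP vs. PA)

lam = 1.0  # weight of the adversarial penalty (hypothetical value)
for step in range(20):
    # 1) train the adversary (approximately) to optimality on the
    #    classifier's current, detached output scores
    for _ in range(5):
        opt_d.zero_grad()
        d_loss = bce(adversary(classifier(x).detach()), v)
        d_loss.backward()
        opt_d.step()
    # 2) train the classifier to predict pneumonia well while fooling the
    #    adversary (subtracting the adversary's loss pushes f to hide v)
    opt_f.zero_grad()
    s = classifier(x)
    f_loss = bce(s, y) - lam * bce(adversary(s), v)
    f_loss.backward()
    opt_f.step()
```

Note how `detach()` keeps step 1 from updating the classifier, so each network only optimizes its own objective.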
Read the complete paper here. The code that accompanies this paper can be found here.

Convolutional Neural Networks with Intermediate Loss for 3D Super-Resolution of CT and MRI Scans

In general, MRI and CT scanners create low-resolution images. In order to be readable, these images need some improvement. This paper proposes a way to obtain super-resolution versions of these images. What the authors propose is the use of two Convolutional Neural Networks (CNNs), the first one for increasing height and width and the second one for depth. Gaussian smoothing with various standard deviations is used too. The main goal of this research is to allow radiologists and oncologists to accurately segment tumors and plan better treatments. The contribution comes down to three points:
  • The architecture of two CNNs with Gaussian smoothing is proposed.
  • A human evaluation study is done.
  • The code can be found online.
For the purpose of this research, two datasets are used. The first one was acquired from Coltea Hospital (CH), and the second one is publicly available online – NAMIC. In the first stage of this process, the image is up-sampled using the first deep CNN. In the second stage, the image is further up-sampled on the third axis. Both CNNs have the same architecture, with one difference: the second CNN up-samples in only one direction, since it works on only one axis.
The CNNs are composed of 10 convolutional layers, each followed by a ReLU layer. The filters of the convolutional layers are 3×3. After the first six convolutional layers, there is an up-scaling convolutional layer. The first layers are formed of 32 filters, while in the following three the number of filters is equal to the scaling factor; for example, if the scaling factor is 4x, the number of filters is 4. The loss of this architecture is based on the mean absolute error. Commonly, if we want a CNN to produce sharper images, we blur its training inputs with Gaussian smoothing of some standard deviation during the training process. This approach has one side effect: while the CNN will perform better on non-blurred images after training, the distribution gap between training and testing data is increased. This is why the authors used a randomly chosen standard deviation for each training image.
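A network in this spirit can be sketched in PyTorch as below. This is a rough approximation of one of the two CNNs, under our own assumptions: we use a sub-pixel (`PixelShuffle`) step for the up-scaling layer and guess the exact layer wiring, so treat it as an illustration of the shape of the architecture, not the authors’ implementation.

```python
import torch
import torch.nn as nn

def build_sr_cnn(scale=2):
    """Rough sketch of one super-resolution CNN: six 3x3 conv + ReLU layers
    with 32 filters, an up-scaling step, then a few more conv layers."""
    layers, ch = [], 1
    for _ in range(6):
        layers += [nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU()]
        ch = 32
    # Up-scaling layer: a conv producing scale**2 channels, which
    # PixelShuffle rearranges into a (scale*H, scale*W) single-channel map.
    layers += [nn.Conv2d(32, scale ** 2, 3, padding=1), nn.PixelShuffle(scale)]
    layers += [nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),   # remaining layers
               nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
               nn.Conv2d(32, 1, 3, padding=1)]
    return nn.Sequential(*layers)

net = build_sr_cnn(scale=2)
low_res = torch.randn(1, 1, 32, 32)   # in training, this input would first be
                                      # blurred with a Gaussian filter whose
                                      # sigma is drawn at random per image
high_res = net(low_res)               # height and width are doubled
loss = nn.L1Loss()(high_res, torch.randn(1, 1, 64, 64))  # mean absolute error
```

The depth-up-sampling network would follow the same pattern but rearrange channels along a single axis only.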
Read the complete paper here. The code that accompanies this paper can be found here.

Segmentation with Residual Attention U-Net and an Edge-Enhancement Approach Preserves Cell Shape Features

This paper aims to solve a really specific but interesting domain problem. In general, it is hard to extrapolate gene expression dynamics in a living single cell. This task requires robust cell segmentation that is not influenced by the shape of the cell boundaries. When we think of segmentation, U-Net is the first thing that pops into our heads. The authors of this paper modified U-Net so that it is better suited to this problem. The task is hard due to the heterogeneity among cells, and it often requires manual annotation by experts. The approach that makes the most sense is instance segmentation, which, unlike semantic segmentation, treats multiple objects of the same category as individual objects. For this purpose, data was acquired from Columbia University, where images were collected from the same baby hamster kidney cell cultures. These are fluorescence wide-field microscopy images of nuclear and cytoplasmic live-cell stains, differential interference contrast (DIC), and fluorescent proteins from cytoplasmically localized reporters.
As you can see, there are multiple sources of images, which makes this problem very interesting. These three sources are treated as three channels of the image. They are first bias-field corrected and normalized. Then the three channels are concatenated to form a pseudo-RGB image. Finally, data augmentation is done by random horizontal and vertical flipping. The architecture used for this task is called Residual Attention U-Net. The main difference between this architecture and vanilla U-Net is that it has two additional parts: residual blocks and an attention mechanism.
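The two extra components can be sketched as follows. This is a generic PyTorch sketch of a residual block and an attention gate (in the style of Attention U-Net); channel counts and exact layer choices are our assumptions, not the paper’s configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Residual block: the input is added back to the convolutional output,
    easing gradient flow through a deeper U-Net encoder/decoder."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.body(x) + x)   # skip connection

class AttentionGate(nn.Module):
    """Attention gate: a gating signal g re-weights the skip feature x so the
    decoder can focus on relevant regions such as cell boundaries."""
    def __init__(self, ch):
        super().__init__()
        self.wg = nn.Conv2d(ch, ch, 1)
        self.wx = nn.Conv2d(ch, ch, 1)
        self.psi = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())

    def forward(self, x, g):
        attn = self.psi(torch.relu(self.wg(g) + self.wx(x)))  # map in [0, 1]
        return x * attn

# Pseudo-RGB input: the three imaging modalities stacked as channels.
pseudo_rgb = torch.randn(2, 3, 64, 64)
feat = nn.Conv2d(3, 16, 3, padding=1)(pseudo_rgb)
out = AttentionGate(16)(ResidualBlock(16)(feat), feat)
```

In the full network, blocks like these replace the plain convolutions along the U-Net’s contracting and expanding paths.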
Read the complete paper here. The code that accompanies this paper can be found here.

Other Amazing Papers from this Month


In this article, we started the practice of analyzing the research papers that we liked the most in the previous month. Did you have any favorites this month? Let us know.
Thank you for reading!
Nikola M. Zivkovic

CAIO at Rubik's Code

Nikola M. Zivkovic is a CAIO at Rubik’s Code and the author of the book “Deep Learning for Programmers“. He loves knowledge sharing and is an experienced speaker. You can find him speaking at meetups and conferences, and as a guest lecturer at the University of Novi Sad. Rubik’s Code is a boutique data science and software service company with more than 10 years of experience in Machine Learning, Artificial Intelligence & Software development. Check out the services we provide.
Deep Learning for Programmers
Stay relevant in the rising AI economy!
  • 41 Chapters including Mathematics and Python basics
  • 12 Neural Network Architectures with code examples
  • 253 Pages of Deep Learning brilliance