With its recent rise in the IT industry, computer vision (CV) has become one of the most promising fields of artificial intelligence (AI). Image processing is the foundation of CV, and in this series of articles we are going to cover its basics. As you follow this tutorial, you will learn some core concepts of image processing with the help of the OpenCV library in the Python programming language.
1. Installation and Setup
For this article, you’ll need the following libraries: NumPy and OpenCV. Pip is the simplest way to install external libraries in Python. Install with the following steps:
1.1 Installation of OpenCV
To install OpenCV and NumPy in Windows, use the following commands:
pip install numpy
pip install opencv-python
To do the same in Linux, use these commands in a terminal:
$ pip3 install opencv-python
$ pip3 install numpy
To import OpenCV and NumPy, add these lines to your Python file:
import numpy as np
import cv2
2. Image Representation (Mathematical representation of an Image)
The twenty-first century is a century of digitalization. Digital signals, such as telecommunication, audio, and electrical signals, are taking over, and images are no exception. Converting images into digital form makes image processing much easier. Let’s dive into how images are represented in digital form in computer memory.
2.1 Image as a Matrix
Images are represented as two-dimensional matrices. Each element of the matrix represents a pixel, and each pixel value determines the intensity of a color. A pixel’s position is determined by the row and column in which it is placed.
There are many ways to represent an image, the two most popular being grayscale and RGB. A grayscale picture is what we commonly call a black-and-white picture, while an RGB picture is a color image. We will cover those concepts in depth later on.
3. Loading an Image with OpenCV
In this section we are going to load, display and save images. All images used in this article can be found here.
3.1 Loading an Image
Loading an image is a very simple task using OpenCV’s imread function. imread takes two parameters: the first is the path of the picture we want to load, and the second is a flag that specifies how the picture should be read (grayscale, color, or unchanged).
When we want to load the picture in color, the second parameter is 1:
# 1 == cv2.IMREAD_COLOR, so either form can be passed
img_color = cv2.imread('/content/lena_color.png', 1)
When we want to load the picture in grayscale, the second parameter is 0:
# 0 == cv2.IMREAD_GRAYSCALE, so either form can be passed
img_gray = cv2.imread('/content/lena_color.png', 0)
3.2 Displaying an Image
Displaying an image can be done with the help of OpenCV’s imshow function. imshow also takes two parameters: a string with the name of the window in which the image is going to be displayed, and the variable in which the image is stored. In a standalone script, follow imshow with a call to cv2.waitKey(0) so that the window stays open until a key is pressed.
Displaying the color picture:
# Showing the color image
cv2.imshow('Colored picture', img_color)
Displaying the grayscale picture:
# Showing the gray image
cv2.imshow('Gray picture', img_gray)
3.3 Saving an Image
After performing certain operations, we would like to save the changed image. Luckily, there is a function for that as well. The function is called imwrite, and it also takes two parameters. The first is the name of the new image file, and the second is the variable that stores the image we would like to save.
# Saving the color image
cv2.imwrite('img_color.png', img_color)
# Saving the gray image
cv2.imwrite('img_gray.png', img_gray)
4. Image as an Array
As we already mentioned, images are represented as matrices, which are two-dimensional data structures where numbers are arranged into rows and columns.
This matrix has 3 rows and 4 columns.
Python has no built-in matrix type, but we can write a matrix as a list of lists. The number of inner lists is the number of rows, and the length of each inner list is the number of columns.
For this concrete example we would write it as:
# Defining a list of lists
A = [[2, -5, -11, 0], [-9, 4, 6, 13], [4, 7, 12, -2]]
We can also use the NumPy library to create new matrices.
import numpy as np
A = np.array([[2, -5, -11, 0], [-9, 4, 6, 13], [4, 7, 12, -2]])
Images in OpenCV are stored as NumPy arrays, the difference being that the matrices used for storing pictures are much larger than this one.
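To make the comparison between the two representations concrete, here is a short sketch that reads the dimensions and one element from both. The variable name `A_list` is introduced only for this example.

```python
import numpy as np

# The same 3x4 matrix as a list of lists and as a NumPy array
A_list = [[2, -5, -11, 0], [-9, 4, 6, 13], [4, 7, 12, -2]]
A = np.array(A_list)

# With a list of lists, rows and columns have to be counted by hand
rows, cols = len(A_list), len(A_list[0])
print(rows, cols)    # 3 4

# A NumPy array reports its dimensions directly
print(A.shape)       # (3, 4)

# Element in the second row, third column (indices start at 0)
print(A_list[1][2])  # 6
print(A[1, 2])       # 6
```

The `A[1, 2]` form with a single pair of brackets is the one used for images in the rest of the article.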
So far we’ve seen that we can display both gray scale and color images. So what is the difference between them when it comes to storing?
4.1 Grayscale Image
A grayscale image consists of one channel, or more precisely one matrix, in which every value represents the intensity of a pixel. Pixel values range from 0 to 255 in uint8 (8-bit) representation, where each pixel is stored in eight bits. Values closer to 0 are darker shades, with 0 being black, and values closer to 255 are brighter, with 255 being white.
Values between 0 and 255 represent shades of gray. The image we loaded before, when stored in a matrix, would look something like this:
img_gray = cv2.imread('/content/lena_color.png', 0)
# The shape attribute returns the number of rows and columns in the matrix
print(img_gray.shape)  # => (512, 512)
We can see that the gray picture of Lena, which was displayed before, is actually a 512×512 matrix.
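The 0 to 255 range can be illustrated without loading a file. The sketch below builds a tiny synthetic grayscale image by hand; the six gray levels are arbitrary values chosen for the example.

```python
import numpy as np

# One row of six gray levels, from black (0) to white (255)
levels = np.array([0, 51, 102, 153, 204, 255], dtype=np.uint8)

# Repeat the row twice to get a 2x6 single-channel image
gradient = np.tile(levels, (2, 1))

print(gradient.shape)  # (2, 6)
print(gradient.dtype)  # uint8

# The limits of the uint8 type are exactly the black and white values
print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)  # 0 255
```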
4.2 RGB Image
RGB stands for Red-Green-Blue. We call them primary colors because the human eye has three types of color receptors, which respond most strongly to roughly red, green, and blue light. Every other color we perceive is a combination of those three. Color images use that characteristic of human eyes to imitate all colors.
Basically, an RGB image is three matrices stacked together, where the pixel values of each matrix represent the intensity of one color (red, green, or blue). Combining all three, we get a color picture.
Loading a color image gives us an array that looks like this:
img_color = cv2.imread('/content/lena_color.png', 1)
print(img_color.shape)  # => (512, 512, 3)
As we can see, the image is represented as three matrices stacked in one array. Unlike grayscale images, color images have three channels, and every channel represents one color. It is very important to mention that in OpenCV the order of channels is reversed: Blue comes first, Green second, and Red third (BGR).
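The channel order is easy to get wrong, so here is a small sketch on a hand-built two-pixel image. The pixel values are made up for illustration, and the conversion uses plain NumPy slicing rather than an OpenCV call.

```python
import numpy as np

# 1x2 image in OpenCV's BGR order: a pure red pixel and a pure blue pixel
img_bgr = np.array([[[0, 0, 255], [255, 0, 0]]], dtype=np.uint8)
print(img_bgr.shape)  # (1, 2, 3)

# Channel 0 is blue, channel 1 is green, channel 2 is red
blue = img_bgr[:, :, 0]
green = img_bgr[:, :, 1]
red = img_bgr[:, :, 2]

# Reversing the last axis converts BGR to RGB
img_rgb = img_bgr[:, :, ::-1]
print(img_rgb[0, 0])  # [255   0   0]
```

OpenCV offers the same conversion as `cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)`, which is useful when passing images to libraries that expect RGB order.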
5. Image Indexing
Sometimes we would like to change one part of an image or just one pixel, so we need a way to access those parts and change their values. As we mentioned before, the position of every pixel is defined by its row and column in the image matrix. With that in mind, we are going to show exactly how image indexing is done.
5.1 Accessing pixel values
Image indexing is very similar to list indexing in Python. The only difference is that in images we have two coordinates. Let’s select one pixel from an image.
import numpy as np
import cv2
# Index the image to get the value of a single pixel:
# the first index is the row, the second is the column
img_gray = cv2.imread('/content/lena_color.png', 0)
pixel_value = img_gray[156, 264]
print(pixel_value)  # => 175
As you can see, the value of the pixel in the 157th row and the 265th column is 175 (indices start at 0, so index 156 is the 157th row). It is very important to note that rows come first and columns second. To select multiple pixel values, we index exactly as we do with lists, but again with two coordinates.
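Because the test image is square, swapping the two indices would go unnoticed. The sketch below uses a small non-square synthetic array (an assumption made for the example) to show why the row-first order matters.

```python
import numpy as np

# Non-square stand-in image: 2 rows (height) by 5 columns (width)
img = np.arange(10, dtype=np.uint8).reshape(2, 5)
height, width = img.shape

# Valid: row 1, column 4
value = img[1, 4]
print(value)  # 9

# Swapping the two indices asks for row 4, which does not exist
try:
    img[4, 1]
    swapped_ok = True
except IndexError:
    swapped_ok = False
print(swapped_ok)  # False
```

In other words, a pixel at horizontal position x and vertical position y is read as `img[y, x]`.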
For example, let’s select the fourth row of our image. As we previously saw, our image is an array with dimensions 512×512, so for one row we are going to get a one-dimensional array of 512 values.
import numpy as np
import cv2
# Index the image to get the fourth row (index 3, because indices start at 0)
img_gray = cv2.imread('/content/lena_color.png', 0)
fourth_row = img_gray[3, :]
print(fourth_row.shape)  # => (512,)
We can also get part of an image by slicing both rows and columns. We are going to take rows 156 to 158 and columns 4 to 6; as with lists, the end index of a slice is excluded. Let’s see what we get.
import numpy as np
import cv2
# Slice the image to get a part of it
img_gray = cv2.imread('/content/lena_color.png', 0)
snipped_img = img_gray[156:159, 4:7]
print(snipped_img)
# => [[111 110 110]
#     [110 109 109]
#     [106 110 112]]
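One detail worth knowing about this kind of slicing is that NumPy returns a view of the original array, not a copy. The following sketch, on a synthetic 10×10 array assumed only for illustration, shows the slice bounds and the difference between a view and a copy.

```python
import numpy as np

# Synthetic 10x10 image whose pixel value is 10 * row + column
img = np.arange(100, dtype=np.uint8).reshape(10, 10)

# End indices are excluded: rows 2, 3, 4 and columns 4, 5, 6
patch = img[2:5, 4:7]
print(patch.shape)  # (3, 3)

# A copy is independent of the original image
independent = img[2:5, 4:7].copy()
independent[0, 0] = 0
print(img[2, 4])    # 24

# A plain slice is a view: writing to it changes the original image
patch[0, 0] = 99
print(img[2, 4])    # 99
```

Calling `.copy()` on a snippet is the safer choice whenever the original image has to stay untouched.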
Snipping a part of an image with this kind of indexing is very useful. NumPy also offers logical (boolean) indexing, which returns all the pixels that satisfy some condition. Let’s select all pixels that are equal to 255.
import numpy as np
import cv2
img_gray = cv2.imread('/content/lena_color.png', 0)
# The comparison produces a logical matrix of True/False values
print(img_gray == 255)  # => a matrix filled with False for this image
# Indexing with that matrix returns the pixels that satisfy the condition
print(img_gray[img_gray == 255])  # => [], none of the pixels have the value 255
As we can see, our logical matrix is filled with False values, meaning none of the pixels satisfy our condition.
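Since the test image contains no pixels equal to 255, the example above returns nothing. Here is a sketch on a small hand-made array (values chosen for illustration) where the condition does match, along with two common ways of inspecting the mask.

```python
import numpy as np

img = np.array([[10, 255, 30],
                [255, 50, 255]], dtype=np.uint8)

# Logical matrix: True where the pixel equals 255
mask = img == 255

# How many pixels satisfy the condition
print(np.count_nonzero(mask))      # 3

# The matching pixel values, flattened into a one-dimensional array
print(img[mask])                   # [255 255 255]

# The (row, column) position of every matching pixel
print(np.argwhere(mask).tolist())  # [[0, 1], [1, 0], [1, 2]]
```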
5.2 Changing pixel values
Now that we know how to access pixel values, we can easily change them. For example, we can assign a new, constant pixel value for every position in an image, or we can do some mathematical operations on selected pixels to transform them the way we need. This is how both methods work.
import numpy as np
import cv2
# Index the image to change its values
img_gray = cv2.imread('/content/lena_color.png', 0)
# Assign a new value to a single image position
img_gray[75, 4] = 231
print(img_gray[75, 4])  # => 231
# Do a mathematical operation on part of the image
img_gray[156:159, 4:7] = img_gray[156:159, 4:7] - 6
print(img_gray[156:159, 4:7])
# => [[105 104 104]
#     [104 103 103]
#     [100 104 106]]
5.3 OpenCV Examples
Now that we’ve learned how to change pixel values, let’s see how it actually looks when we display an image. First, we’re going to load our picture and see what happens if we increase the value of all the pixels by a certain number.
img_gray = cv2.imread('/content/baboon.png', 0)
cv2.imshow('Original', img_gray)
new_img = img_gray + 100
cv2.imshow('Brightened', new_img)
By increasing all the pixel values by 100, the picture as a whole should be brighter. So why are certain areas darker than before?
That’s because pixel values are stored as 8-bit unsigned integers. If the result of the operation exceeds 255, it does not go over that limit but wraps around and starts again from the bottom of the scale.
So if a pixel previously had a value of 240 and we added 100, it will now have the value 84 (340 - 256 = 84). Usually we do not want this to happen; we would rather cap the value at 255. We can do this with OpenCV’s add function. The first parameter specifies the image, and the second one determines the amount we want to add.
img_gray_cap = cv2.add(img_gray, 100)
cv2.imshow('Capped', img_gray_cap)
As we can see, there are no darker colors, which means that the values that would have exceeded the limit were capped. Let’s now perform the simple task of increasing the contrast within the image. Basically, we want to make bright pixels even brighter, and dark ones even darker.
img_gray[img_gray > 150] = 255
img_gray[img_gray < 100] = 0
cv2.imshow('High contrast', img_gray)
Pixels greater than 150 were given the value 255, and pixels lower than 100 were given the value 0. The result is as follows.
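The same thresholding can be traced on a handful of values. The sketch below uses a small synthetic array and works on a copy, which is an addition for this example so that the original pixel values remain available for comparison.

```python
import numpy as np

img = np.array([[20, 99, 100],
                [150, 151, 240]], dtype=np.uint8)

# Work on a copy so the original pixel values are kept
contrast = img.copy()
contrast[img > 150] = 255  # bright pixels become white
contrast[img < 100] = 0    # dark pixels become black

print(contrast.tolist())   # [[0, 0, 100], [150, 255, 255]]
print(img[0, 0])           # 20
```

Note that the boundary values 100 and 150 are left unchanged, because both comparisons are strict.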
Depending on the situation, you are going to use different methods of visualization, indexing, and value assignment. Image processing takes a lot of experience and knowledge. With this article, you have made the first step in learning image processing and computer vision. We encourage you to stay tuned to our series, in which we are going to cover more of the basic concepts of OpenCV and image processing. Once you have mastered the basics, we are going to move on to projects and more advanced concepts.
Author at Rubik's Code
Stefan Nidzovic is a student at the Faculty of Technical Sciences, University of Novi Sad, in the department of biomedical engineering, focusing mostly on applying computer vision and machine learning in medicine. He is also a member of “Creative Engineering Center”, where he works on various projects, mostly in computer vision.
Author at Rubik's Code
Miloš Marinković is a student of Biomedical Engineering, at the Faculty of Technical Sciences, University of Novi Sad. Before he enrolled at the university, Miloš graduated from the gymnasium “Jovan Jovanović Zmaj” in 2019 in Novi Sad. Currently he is a member of “Creative Engineering Center”, where he was involved in a couple of image processing and embedded electronic projects. Also, Miloš works as an intern at BioSense Institute in Novi Sad, on projects which include bioinformatics, DNA sequence analysis and machine learning. When he was younger he was a member of the Serbian judo national team and he holds the black belt in judo.