Especially with the advent of Social Media, a lot of images are shared on the Internet. As a lot of image data is available, these are subject of a lot of analysis. However, images of high quality require a reasonable amount of storage space. There are many applications which stream images so that they could be analysed in a server. This means that higher bandwidth is required for transmitting the images so that sufficient data can reach the servers.
In all such applications, it would be advantageous if we could store or transfer images of smaller size without loosing critical information from the images. This implies that if we can compress the images before storing or transmitting, it will be an advantage.
Images are stored as a 2-dimensional matrix with each element representing a pixel. So, image compression could be achieved using Matrix Decomposition techniques. A matrix decomposition technique we could use for image compression is Singular Value Decomposition.
In this notebook, we explore how we could apply Singular Value Decomposition (SVD) for image compression.
import warnings
warnings.filterwarnings('ignore')
Singular Value Decomposition (SVD)¶
Let A be an m x n matrix with rank r. Then there exists an m x n matrix Σ for which the diagonal entries in D are the first r singular values of A, σ1 ≥ σ2 ≥ … ≥σr>0, and there exists an m x m orthogonal matrix U and an n x n orthogonal matrix V such that
A = U Σ VT
Any factorization A = U Σ VT, with U and V orthogonal, Σ with positive diagonal entries in D, is called a singular value decomposition (or SVD) of A. The matrices U and V are not uniquely determined by A, but the diagonal entries of Σ are necessarily the singular values of A. The columns of U in such a decomposition are called left singular vectors of A, and the columns of V are called right singular vectors of A.
Image compression using SVD¶
Low-rank approximations of A
Let k a natural number, where k <= rank(A) <= min(n, m).
Using SVD-decomposition of A, if A = UDVT, then we keep the first k values in D as it is and set the subsequent singular values to zero. Let us denote the resulting diagonal matrix by Dk. It is easy to see that we only have to keep the first k columns of U and the first k rows of V since their other columns would be multiplied by zeros anyway as shown in the figure below. To sum up, the matrix Ak = UkDkVkT is the closest matrix to A having rank k, where Uk and Vk consist of the first k columns and rows of U and V, respectively.
If A is a large matrix, that is n,m are large and k is relatively small, then the information we need to store to approximate the information content stored in A is much smaller. That is, we can reduce the storage space significantly and we are still able to store almost the same information that the original matrix has.
Here we will see how low-rank approximation of a matrix provides a solution to compress an image.
Images are represented in a rectangular array where each element corresponds to the grayscale value for that pixel. For colored images we have a 3-dimensional array of size n * m * 3, where n and m represents the number of pixels vertically and horizontally, respectively, and for each pixel we store the intensity for colors red, green and blue.
We will repeat the low-rank approximation procedure above on a larger matrix, that is, we create the low-rank approximation of a matrix that represents an image for each color separately. The resulting 3-dimensional array will be a good approximation of the original image.
Different values of k results in different compression quality, the higher the k is, the closer the compressed image to the original, but increasing k means larger matrices.
import numpy as np
from PIL import Image
# Read image and store it as array
imageArray = np.array(Image.open('./TempleBW.jpeg'))
We check the shape of the array in which the image has been uploaded.
imageArray.shape
(4032, 3024)
So, there are 4032 * 3024 pixels in this image.
We see that this is a 2-dimensional series of bytes. If we study the bytes, we will find that each byte contains a number between 0 and 255.
imageArray[:20]
array([[227, 227, 226, ..., 190, 190, 190], [226, 226, 228, ..., 192, 193, 193], [224, 226, 227, ..., 193, 195, 196], ..., [227, 226, 224, ..., 190, 190, 190], [228, 227, 225, ..., 192, 191, 191], [230, 229, 227, ..., 191, 190, 191]], dtype=uint8)
Each data point is an intensity value. First, let us normalise the intensity values. We will divide each data point by 255 so that each data point is number between 0 and 1.
image = imageArray / 255
Let us plot the image.
import matplotlib.pyplot as plt
%matplotlib inline
# Display image
fig = plt.figure(figsize = (12, 10))
a = fig.add_subplot(1, 1, 1)
plt.tick_params(bottom = False, left = False, labelbottom = False, labelleft = False)
imgplot = plt.imshow(image, cmap = 'gray')
plt.show()

Let us check the space required to store the image.
originalBytes = image.nbytes
print("The space (in bytes) needed to store this image is", originalBytes, 'bytes')
print("The space (in kilo bytes) needed to store this image is", originalBytes / 1024, 'KB')
print("The space (in mega bytes) needed to store this image is", originalBytes / 1024 / 1024, 'MB')
The space (in bytes) needed to store this image is 97542144 bytes The space (in kilo bytes) needed to store this image is 95256.0 KB The space (in kilo bytes) needed to store this image is 93.0234375 MB
Now we start the process of compressing the image using the SVD method.
# SVD decomposition
U, D, VT = np.linalg.svd(image, full_matrices = True)
# Check for the bytes to be stored
bytesToBeStored = sum([matrix.nbytes for matrix in [U, D, VT]])
print("The matrix that we store has a total size (in bytes):", bytesToBeStored)
The matrix that we store has a total size (in bytes): 203236992
Now, we will keep only first k values in the matrix D and set the remaining to 0.
Let us set k = 1000.
k = 1000
# Selecting k columns from U matrix and k rows from VT matrix
U_k = U[:, 0:k]
VT_k = VT[0:k, :]
D_k = D[0:k]
compressedBytes = sum([matrix.nbytes for matrix in [U_k, D_k, VT_k]])
print("The compressed matrix that we store now has a total size (in bytes):", compressedBytes, 'bytes')
print("The compressed matrix that we store now has a total size (in KB):", compressedBytes/1024, 'KB')
print("The compressed matrix that we store now has a total size (in MB):", compressedBytes/1024/1024, 'MB')
The compressed matrix that we store now has a total size (in bytes): 56456000 bytes The compressed matrix that we store now has a total size (in KB): 55132.8125 KB The compressed matrix that we store now has a total size (in MB): 53.84063720703125 MB
Let us see how much compression we have achieved.
ratio = compressedBytes / originalBytes * 100
print("The compression ratio between the original image size and the compressed image is %.2f%s" % (ratio, '%'))
The compression ratio between the original image size and the compressed image is 57.88%
Now let us reconstruct the compressed image.
imageApprox = np.dot(U_k, np.dot(np.diag(D_k), VT_k))
# Correct the pixels where intensity value is outside the range [0,1]
imageApprox[imageApprox < 0] = 0
imageApprox[imageApprox > 1] = 1
Now let us display our compressed image and save the compressed image.
import math
# Display the reconstructed image
fig = plt.figure(figsize=(math.ceil(12.0 * ratio / 100), math.ceil(10.0 * ratio / 100)))
a = fig.add_subplot(1, 1, 1)
plt.tick_params(bottom = False, left = False, labelbottom = False, labelleft = False)
imgplot = plt.imshow(imageApprox, cmap = 'gray')
plt.savefig('compressedBW.png')
plt.show()
plt.close()

Compressing Colour Images¶
Now, we walk through the process of compressing colour images. The process would be the same except that in colour images we have 3 channels per pixel – one for Red, one for Green and one for Blue.
The image used for this demonstation can be found at this link.
Let us load the image first and visualise it.
imageArray = np.array(Image.open('./Temple.jpeg'))
imageArray.shape
(4032, 3024, 3)
# Normalize the intensity values in each pixel
image = imageArray / 255
# Display the image
fig = plt.figure(figsize = (12, 10))
a = fig.add_subplot(1, 1, 1)
plt.tick_params(bottom = False, left = False, labelbottom = False, labelleft = False)
imgplot = plt.imshow(image)
plt.show()

Let us check the space required to store the image.
originalBytes = image.nbytes
print("The space (in bytes) needed to store this image is", originalBytes, 'bytes')
print("The space (in kilo bytes) needed to store this image is", originalBytes / 1024, 'KB')
print("The space (in mega bytes) needed to store this image is", originalBytes / 1024 / 1024, 'MB')
The space (in bytes) needed to store this image is 292626432 bytes The space (in kilo bytes) needed to store this image is 285768.0 KB The space (in kilo bytes) needed to store this image is 279.0703125 MB
Now we start the process of compressing the image using the SVD method.
First we break the image matrix into 3 matrices – one for each colour channel.
# Break the image into three different arrays based on colors
image_red = image[:, :, 0]
image_green = image[:, :, 1]
image_blue = image[:, :, 2]
Then we perform SVD for each matrix.
# SVD of three matrices
U_r, D_r, VT_r = np.linalg.svd(image_red, full_matrices = True)
U_g, D_g, VT_g = np.linalg.svd(image_green, full_matrices = True)
U_b, D_b, VT_b = np.linalg.svd(image_blue, full_matrices = True)
# Check for the bytes to be stored
bytesToBeStored = sum([matrix.nbytes for matrix in [U_r, D_r, VT_r, U_g, D_g, VT_g, U_b, D_b, VT_b]])
print("The matrices that we store have total size (in bytes):", bytesToBeStored, 'bytes')
The matrices that we store have total size (in bytes): 609710976 bytes
Now, we will keep only first k values in the matrix D and set the remaining to 0.
Let us set k = 1000.
k = 1000
# Selecting k columns from U matrix and k rows from VT matrix
U_r_k = U_r[:, 0:k]
VT_r_k = VT_r[0:k, :]
U_g_k = U_g[:, 0:k]
VT_g_k = VT_g[0:k, :]
U_b_k = U_b[:, 0:k]
VT_b_k = VT_b[0:k, :]
D_r_k = D_r[0:k]
D_g_k = D_g[0:k]
D_b_k = D_b[0:k]
compressedBytes = sum([matrix.nbytes for matrix in [U_r_k, D_r_k, VT_r_k, U_g_k, D_g_k, VT_g_k, U_b_k, D_b_k, VT_b_k]])
print("The compressed matrices that we store now have total size (in bytes):", compressedBytes, 'bytes')
print("The compressed matrices that we store now have total size (in KB):", compressedBytes/1024, 'KB')
print("The compressed matrices that we store now have total size (in MB):", compressedBytes/1024/1024, 'MB')
ratio = compressedBytes / originalBytes * 100
print("\nThe compression ratio between the original image size and the compressed image is %.2f%s" % (ratio, '%'))
The compressed matrices that we store now have total size (in bytes): 169368000 bytes The compressed matrices that we store now have total size (in KB): 165398.4375 KB The compressed matrices that we store now have total size (in MB): 161.52191162109375 MB The compression ratio between the original image size and the compressed image is 57.88%
Now let us reconstruct the compressed image.
# Reconstruct matrices for each color
image_red_approx = np.dot(U_r_k, np.dot(np.diag(D_r_k), VT_r_k))
image_green_approx = np.dot(U_g_k, np.dot(np.diag(D_g_k), VT_g_k))
image_blue_approx = np.dot(U_b_k, np.dot(np.diag(D_b_k), VT_b_k))
# Reconstruct the original image from color matrices
imageReconstructed = np.zeros((image.shape[0], image.shape[1], 3))
imageReconstructed[:, :, 0] = image_red_approx
imageReconstructed[:, :, 1] = image_green_approx
imageReconstructed[:, :, 2] = image_blue_approx
# Correct the pixels where intensity value is outside the range [0,1]
imageReconstructed[imageReconstructed < 0] = 0
imageReconstructed[imageReconstructed > 1] = 1
Now let us display and save the compressed image.
# Display the reconstructed image
fig = plt.figure(figsize=(math.ceil(12.0 * ratio / 100), math.ceil(10.0 * ratio / 100)))
a = fig.add_subplot(1, 1, 1)
plt.tick_params(bottom = False, left = False, labelbottom = False, labelleft = False)
imgplot = plt.imshow(imageReconstructed)
plt.savefig('compressed.png')
plt.show()
plt.close()

Request you to kindly leave a reply.