# 11. Image recognition

## 11.1. Introduction

In previous chapters, we saw examples of ‘classification’, ‘regression’, ‘preprocessing’, ‘dimensionality reduction’ and ‘clustering’, where the features were numeric or categorical values. In this chapter, we will again use ‘numerical features’, but this time the features will represent images.

Note

In Chapter 2, we used the Iris dataset, which ships with the SciKit library package; the loaders for such bundled datasets start with the prefix ‘load_’, e.g. ‘load_iris’.

In this chapter, we will use a dataset which is not bundled with the library, therefore an Internet connection is needed to download it to the computer. The loaders for these datasets start with the prefix ‘fetch_’, e.g. ‘fetch_olivetti_faces’, as shown in the next section.

When ‘fetch_olivetti_faces’ is instantiated for the first time, the data is downloaded and saved in ~/scikit_learn_data. On subsequent runs, the dataset is read from this directory instead of being downloaded again.
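The cache directory can be inspected with ‘get_data_home’ from ‘sklearn.datasets’; it returns the directory used for downloaded datasets (by default ~/scikit_learn_data, unless overridden) without downloading anything. A minimal sketch:

```python
# Sketch: inspect where scikit-learn caches the fetched datasets.
# get_data_home() only reports the cache directory; it does not download data.
from sklearn.datasets import get_data_home

cache_dir = get_data_home()
print(cache_dir)  # e.g. /home/user/scikit_learn_data
```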

## 11.2. Fetch the dataset

Let’s download the dataset and see its contents. Note that the dataset is downloaded when ‘fetch_olivetti_faces()’ is called, not when it is imported.

Note

In the dataset, there are images of 40 people with 10 different poses each, e.g. smiling and angry faces etc. Therefore, there are 400 samples in total (i.e. 40×10).

Listing 11.1 Fetch the dataset

```python
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces

faces = fetch_olivetti_faces()  # download the dataset at ~/scikit_learn_data

print("Keys:", faces.keys())  # display keys
print("Total samples and image size:", faces.images.shape)
print("Total samples and features:", faces.data.shape)
print("Total samples and targets:", faces.target.shape)
```

Following is the output of the above code. Note that there are 400 samples in total and the image size is (64, 64), which is stored as a feature vector of size 4096 (i.e. 64×64).

```text
$ python faces_ex.py
Keys: dict_keys(['data', 'images', 'target', 'DESCR'])
Total samples and image size: (400, 64, 64)
Total samples and features: (400, 4096)
Total samples and targets: (400,)
```

Note

Please look at the values of the ‘images’, ‘data’ and ‘targets’ as well, as shown below,

```text
$ python -i faces_ex.py
>>> # Sizes
>>> print(faces.images[0].shape)
(64, 64)

>>> print(faces.data[0].shape)
(4096,)

>>> print(faces.target[0].size)
1

>>> # Contents
>>> print(faces.images[0])
[[ 0.30991736  0.36776859  0.41735536 ...,  0.37190083  0.33057851
   0.30578512]
 [ 0.3429752   0.40495867  0.43801653 ...,  0.37190083  0.33884299
   0.3140496 ]
 [ 0.3429752   0.41735536  0.45041323 ...,  0.38016528  0.33884299
   0.29752067]
 ...,
 [ 0.21487603  0.20661157  0.22314049 ...,  0.15289256  0.16528925
   0.17355372]
 [ 0.20247933  0.2107438   0.2107438  ...,  0.14876033  0.16115703
   0.16528925]
 [ 0.20247933  0.20661157  0.20247933 ...,  0.15289256  0.16115703
   0.1570248 ]]

>>> print(faces.data[0])  # list size =
[ 0.30991736  0.36776859  0.41735536 ...,  0.15289256  0.16115703
  0.1570248 ]

>>> print(faces.target[0])  # person 0
0
```
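The relation between ‘images’ and ‘data’ can be verified independently: each (64, 64) image is flattened row by row into a 4096-element feature vector. Below is a small sketch with a toy array of the same size (not the Olivetti data), just to illustrate the flattening,

```python
import numpy as np

# a toy "image" with the same shape as one Olivetti face
img = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)

row = img.ravel()            # flatten row by row, like faces.data
print(row.shape)             # (4096,)
print(row[64] == img[1, 0])  # True: row 1 of the image starts at index 64
```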


## 11.3. Plot the images

Let’s plot the first 20 images of the dataset, which are shown in Fig. 11.1,

Listing 11.2 Plot the images
```python
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces

faces = fetch_olivetti_faces()  # download the dataset at ~/scikit_learn_data

# print("Keys:", faces.keys())  # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images  # save images

# note that images can not be saved as features, as we need 2D data for
# features, whereas faces.images are 3D data i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target  # targets

fig = plt.figure()  # create a new figure window
for i in range(20):  # display 20 images
    # subplot : 4 rows and 5 columns
    img_grid = fig.add_subplot(4, 5, i+1)
    # plot features as image
    img_grid.imshow(images[i])

plt.show()
```
Before moving further, let’s convert Listing 11.2 into a function, so that the code can be reused. Listing 11.3 defines the function ‘plot_images’, which can plot any number of images with the desired number of rows and columns; at the end of the listing, it is called to plot 10 images with 2 rows and 5 columns.
Listing 11.3 Function for plotting the images
```python
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure()  # create a new figure window
    for i in range(total_images):  # display 'total_images' images
        # subplot : 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot features as image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces()  # download the dataset at ~/scikit_learn_data

# print("Keys:", faces.keys())  # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images  # save images

# note that images can not be saved as features, as we need 2D data for
# features, whereas faces.images are 3D data i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target  # targets

# plot 10 images with 2 rows and 5 columns
plot_images(images, 10, 2, 5)
plt.show()
```

## 11.4. Prediction using SVM model

Since the images belong to 40 different people, the number of distinct target values is fixed; hence the problem is a ‘classification’ problem. In Chapter 2 and Chapter 3, we used the ‘KNeighborsClassifier’ and ‘LogisticRegression’ models for classification problems; in this chapter, we will use the ‘Support Vector Machine (SVM)’ model for the classification.

Note

SVM looks for the line (more generally, the hyperplane) that separates the classes in the best way, i.e. with the maximum margin.
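The idea behind the linear kernel can be sketched with plain NumPy: once training is done, a linear SVM classifies a point x by the sign of w·x + b, where the weights w and the bias b are learned during fitting. In the toy sketch below, w and b are hand-picked (not learned), only to illustrate the decision rule:

```python
import numpy as np

# hypothetical separating line for 2D points: w.x + b = 0
# (hand-picked values; a real SVM learns w and b from the training data)
w = np.array([1.0, -1.0])
b = 0.0

def predict(x):
    # class 1 if the point lies on the positive side of the hyperplane
    return 1 if np.dot(w, x) + b > 0 else 0

print(predict(np.array([2.0, 1.0])))  # 1 (positive side of the line)
print(predict(np.array([1.0, 2.0])))  # 0 (negative side of the line)
```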

The code for prediction is exactly the same as in Chapter 2 and Chapter 3; the only difference is that the ‘SVC’ model (from the ‘svm’ module) is used with ‘kernel="linear"’. Note that, by default, ‘kernel="rbf"’ is used in SVC, which is required for non-linear problems.

Listing 11.4 Prediction using SVC
```python
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure()  # create a new figure window
    for i in range(total_images):  # display 'total_images' images
        # subplot : 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot features as image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces()  # download the dataset at ~/scikit_learn_data

# print("Keys:", faces.keys())  # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images  # save images

# note that images can not be saved as features, as we need 2D data for
# features, whereas faces.images are 3D data i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target  # targets

# # plot 10 images with 2 rows and 5 columns
# plot_images(images, 10, 2, 5)
# plt.show()

# split the training and test data
train_features, test_features, train_targets, test_targets = train_test_split(
        features, targets,
        train_size=0.8, test_size=0.2,
        # random but same for all runs; also the accuracy depends on the
        # selection of data e.g. if we put 10 then accuracy will be 1.0
        # in this example
        random_state=23,
        # keep same proportion of 'target' in test and train data
        stratify=targets
    )

# use SVC
classifier = SVC(kernel="linear")  # default kernel=rbf

# training using 'training data'
classifier.fit(train_features, train_targets)  # fit the model for training data

# predict the 'target' for 'training data'
prediction_training_targets = classifier.predict(train_features)
self_accuracy = accuracy_score(train_targets, prediction_training_targets)
print("Accuracy for training data (self accuracy):", self_accuracy)

# predict the 'target' for 'test data'
prediction_test_targets = classifier.predict(test_features)
test_accuracy = accuracy_score(test_targets, prediction_test_targets)
print("Accuracy for test data:", test_accuracy)
```
Below is the output of the above code,

```text
$ python faces_ex.py
Accuracy for training data (self accuracy): 1.0
Accuracy for test data: 0.9875
```

Let’s print the locations, among the first 20 test samples, where the test-targets and the predicted-targets differ from each other. Also, plot the test-images and the predicted-images to see the differences between them.

Listing 11.5 Plot first 20 images from the test-images and predicted-images

```python
# faces_ex.py

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure()  # create a new figure window
    for i in range(total_images):  # display 'total_images' images
        # subplot : 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot features as image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces()  # download the dataset at ~/scikit_learn_data

images = faces.images  # save images

# note that images can not be saved as features, as we need 2D data for
# features, whereas faces.images are 3D data i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target  # targets

# # plot 10 images with 2 rows and 5 columns
# plot_images(images, 10, 2, 5)
# plt.show()

# split the training and test data
train_features, test_features, train_targets, test_targets = train_test_split(
        features, targets,
        train_size=0.8, test_size=0.2,
        # random but same for all runs; also the accuracy depends on the
        # selection of data e.g. if we put 10 then accuracy will be 1.0
        # in this example
        random_state=23,
        # keep same proportion of 'target' in test and train data
        stratify=targets
    )

# use SVC
classifier = SVC(kernel="linear")  # default kernel=rbf

# training using 'training data'
classifier.fit(train_features, train_targets)  # fit the model for training data

# predict the 'target' for 'training data'
prediction_training_targets = classifier.predict(train_features)
self_accuracy = accuracy_score(train_targets, prediction_training_targets)
print("Accuracy for training data (self accuracy):", self_accuracy)

# predict the 'target' for 'test data'
prediction_test_targets = classifier.predict(test_features)
test_accuracy = accuracy_score(test_targets, prediction_test_targets)
print("Accuracy for test data:", test_accuracy)

# location of error for first 20 images in test data
print("Wrongly detected image-locations: ", end=' ')
for i in range(20):
    # if targets are not same then print the location
    if test_targets[i] != prediction_test_targets[i]:
        print(i)

# store test images in list
faces_test = []
for i in test_targets:
    faces_test.append(images[i])

# store predicted images in list
faces_predict = []
for i in prediction_test_targets:
    faces_predict.append(images[i])

# plot the first 20 images from the list
plot_images(faces_test, total_images=20)
plot_images(faces_predict, total_images=20)
plt.show()
```

Below are the outputs of the above code. The plotted test-images and predicted-images are shown in Fig. 11.2 and Fig. 11.3 respectively, where we can see that the image at location 14 (see red boxes) is at error.

```text
$ python faces_ex.py
Accuracy for training data (self accuracy): 1.0
Accuracy for test data: 0.9875
Wrongly detected image-locations:  14
```
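The reported test accuracy can be cross-checked by hand: the test set contains 20% of the 400 samples, i.e. 80 images, and exactly one of them (at location 14) is wrongly detected; therefore the accuracy is 79/80 = 0.9875. The same arithmetic as a quick sketch,

```python
total_samples = 400                      # 40 people x 10 poses
test_samples = int(0.2 * total_samples)  # test_size=0.2 gives 80 images
wrong = 1                                # one mismatch, at location 14

accuracy = (test_samples - wrong) / test_samples
print(accuracy)  # 0.9875
```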


## 11.5. Convert features to images

Note

In Listing 11.5, we used the ‘images’ (i.e. faces_test.append(images[i])) to plot the test-images and the predicted-images.

Alternatively, we can convert the ‘features’ into images for plotting, i.e. faces_test.append(np.reshape(features[i], (64, 64))), as shown in Listing 11.6.

Listing 11.6 Convert features to images
```python
# faces_ex.py

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure()  # create a new figure window
    for i in range(total_images):  # display 'total_images' images
        # subplot : 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot features as image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces()  # download the dataset at ~/scikit_learn_data

images = faces.images  # save images

# note that images can not be saved as features, as we need 2D data for
# features, whereas faces.images are 3D data i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target  # targets

# # plot 10 images with 2 rows and 5 columns
# plot_images(images, 10, 2, 5)
# plt.show()

# split the training and test data
train_features, test_features, train_targets, test_targets = train_test_split(
        features, targets,
        train_size=0.8, test_size=0.2,
        # random but same for all runs; also the accuracy depends on the
        # selection of data e.g. if we put 10 then accuracy will be 1.0
        # in this example
        random_state=23,
        # keep same proportion of 'target' in test and train data
        stratify=targets
    )

# use SVC
classifier = SVC(kernel="linear")  # default kernel=rbf

# training using 'training data'
classifier.fit(train_features, train_targets)  # fit the model for training data

# predict the 'target' for 'training data'
prediction_training_targets = classifier.predict(train_features)
self_accuracy = accuracy_score(train_targets, prediction_training_targets)
print("Accuracy for training data (self accuracy):", self_accuracy)

# predict the 'target' for 'test data'
prediction_test_targets = classifier.predict(test_features)
test_accuracy = accuracy_score(test_targets, prediction_test_targets)
print("Accuracy for test data:", test_accuracy)

# location of error for first 20 images in test data
print("Wrongly detected image-locations: ", end=' ')
for i in range(20):
    # if targets are not same then print the location
    if test_targets[i] != prediction_test_targets[i]:
        print(i)

# store test images in list
faces_test = []
for i in test_targets:
    # faces_test.append(images[i])
    # convert 'features' to images
    faces_test.append(np.reshape(features[i], (64, 64)))

# store predicted images in list
faces_predict = []
for i in prediction_test_targets:
    # faces_predict.append(images[i])
    # convert 'features' to images
    faces_predict.append(np.reshape(features[i], (64, 64)))

# plot the first 20 images from the list
plot_images(faces_test, total_images=20)
plot_images(faces_predict, total_images=20)
plt.show()
```
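The conversion used in Listing 11.6 can be checked in isolation: reshaping a 4096-element feature vector with np.reshape(..., (64, 64)) is the exact inverse of flattening the image, so no information is lost. A toy sketch (using a made-up feature vector, not the Olivetti data),

```python
import numpy as np

feature = np.linspace(0.0, 1.0, 64 * 64)  # a toy 4096-element feature vector
img = np.reshape(feature, (64, 64))       # convert 'features' to an image
back = img.ravel()                        # flatten it again

print(img.shape)                          # (64, 64)
print(np.array_equal(feature, back))      # True: the round trip is lossless
```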