11. Image recognition

11.1. Introduction

In previous chapters, we saw examples of ‘classification’, ‘regression’, ‘preprocessing’, ‘dimensionality reduction’ and ‘clustering’. In those examples we worked with numeric and categorical features. In this chapter we will again use numerical features, but these features will represent images.

Note

In Chapter 2, we used the Iris dataset, which is bundled with the SciKit library package; the loaders for datasets that ship with the library start with the prefix ‘load_’, e.g. load_iris.

In this chapter, we will use a dataset which is not bundled with the library and therefore needs an Internet connection to be loaded on the computer. The loaders for such datasets start with ‘fetch_’, e.g. ‘fetch_olivetti_faces’, as shown in the next section.

When ‘fetch_olivetti_faces’ is called for the first time, the data is downloaded and saved in ~/scikit_learn_data. Once the dataset has been downloaded, subsequent calls read it from this directory instead of downloading it again.
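
For reference, the small sketch below shows how this cache location can be inspected or changed; ‘get_data_home’ and the ‘data_home’ argument are standard parts of ‘sklearn.datasets’, while the path ‘./my_data’ is only an example,

# cache_location.py (optional sketch)

from sklearn.datasets import get_data_home, fetch_olivetti_faces

print(get_data_home()) # default cache directory, usually ~/scikit_learn_data

# store/read the data in a custom directory instead of the default one
faces = fetch_olivetti_faces(data_home="./my_data")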

11.2. Fetch the dataset

Let’s download the dataset and look at its contents. Note that the dataset is downloaded when ‘fetch_olivetti_faces()’ is called (Line 6), not by the import at Line 4.

Note

The dataset contains images of 40 people with 10 different poses each, e.g. smiling and angry faces etc. Therefore, there are 400 samples in total (i.e. 40x10).

Listing 11.1 Download the data
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces

faces = fetch_olivetti_faces() # download the dataset at ~/scikit_learn_data
print("Keys:", faces.keys()) # display keys
print("Total samples and image size:", faces.images.shape)
print("Total samples and features:", faces.data.shape)
print("Total samples and targets:", faces.target.shape)

Following is the output of the above code. Note that there are 400 samples in total and the image size is (64, 64), which is stored as a feature vector of size 4096 (i.e. 64x64).

$ python faces_ex.py

Keys: dict_keys(['data', 'images', 'target', 'DESCR'])
Total samples and image size: (400, 64, 64)
Total samples and features: (400, 4096)
Total samples and targets: (400,)

Note

Have a look at the values of ‘images’, ‘data’ and ‘target’ as well, as shown below,

$ python -i faces_ex.py

>>> # Sizes
>>> print(faces.images[0].shape)
(64, 64)

>>> print(faces.data[0].shape)
(4096,)

>>> print(faces.target[0].size)
1

>>> # Contents
>>> print(faces.images[0])
[[ 0.30991736  0.36776859  0.41735536 ...,  0.37190083  0.33057851
   0.30578512]
 [ 0.3429752   0.40495867  0.43801653 ...,  0.37190083  0.33884299
   0.3140496 ]
 [ 0.3429752   0.41735536  0.45041323 ...,  0.38016528  0.33884299
   0.29752067]
 ...,
 [ 0.21487603  0.20661157  0.22314049 ...,  0.15289256  0.16528925
   0.17355372]
 [ 0.20247933  0.2107438   0.2107438  ...,  0.14876033  0.16115703
   0.16528925]
 [ 0.20247933  0.20661157  0.20247933 ...,  0.15289256  0.16115703
   0.1570248 ]]

>>> print(faces.data[0]) # 1D array of size 4096
[ 0.30991736  0.36776859  0.41735536 ...,  0.15289256  0.16115703
  0.1570248 ]

>>> print(faces.target[0]) # person 0
0
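
We can further verify that the dataset contains 40 persons with 10 poses each, and that ‘data’ is simply the flattened form of ‘images’ (an optional check in the same interactive session; numpy is imported here only for this check),

>>> import numpy as np

>>> # 40 distinct persons, 10 images per person
>>> people, counts = np.unique(faces.target, return_counts=True)
>>> print(people.size, counts.min(), counts.max())
40 10 10

>>> # 'data' is 'images' flattened to (samples, 4096)
>>> print(np.allclose(faces.images.reshape(400, -1), faces.data))
True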

11.3. Plot the images

Let’s plot the first 20 images of the dataset, which are shown in Fig. 11.1,

Listing 11.2 Plot the images
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces

faces = fetch_olivetti_faces() # download the dataset at ~/scikit_learn_data
# print("Keys:", faces.keys()) # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images # save images

# note that 'images' cannot be used directly as features, because the features
# must be a 2D array i.e. (samples, features), whereas faces.images is a 3D
# array i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target # targets

fig = plt.figure() # create a new figure window
for i in range(20): # display 20 images
    # subplot : 4 rows and 5 columns
    img_grid = fig.add_subplot(4, 5, i+1)
    # plot the image
    img_grid.imshow(images[i])

plt.show()

Fig. 11.1 First 20 images in the dataset

  • Before moving further, let’s convert Listing 11.2 into a function so that the code can be reused. Listing 11.3 defines a function which can plot any number of images with the desired number of rows and columns, e.g. Line 29 plots 10 images with 2 rows and 5 columns.
Listing 11.3 Function for plotting the images
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure() # create a new figure window
    for i in range(total_images): # display 'total_images' images
        # subplot grid with 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot the image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces() # download the dataset at ~/scikit_learn_data
# print("Keys:", faces.keys()) # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images # save images

# note that 'images' cannot be used directly as features, because the features
# must be a 2D array i.e. (samples, features), whereas faces.images is a 3D
# array i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target # targets

# plot 10 images with 2 rows and 5 columns
plot_images(images, 10, 2, 5)
plt.show()
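
Note that for 2D arrays ‘imshow’ applies matplotlib’s default colour map, so the plotted faces may not appear in grayscale. If grayscale images with the person-id shown as title are preferred, a small variation of ‘plot_images’ can be used, as sketched below (this optional variant is not used in the listings that follow),

# optional variant of plot_images: grayscale images with the target as title
def plot_images_labeled(images, targets, total_images=20, rows=4, cols=5):
    fig = plt.figure() # create a new figure window
    for i in range(total_images): # display 'total_images' images
        img_grid = fig.add_subplot(rows, cols, i+1)
        img_grid.imshow(images[i], cmap='gray') # grayscale colour map
        img_grid.set_title(str(targets[i]))     # person-id above each face
        img_grid.axis('off')                    # hide the pixel ticks

# e.g. plot the first 10 faces along with their person-ids
plot_images_labeled(images, targets, 10, 2, 5)
plt.show()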

11.4. Prediction using SVM model

Since there are images of 40 different people here, the number of distinct target values is fixed (i.e. 40 classes); hence the problem is a ‘classification’ problem. In Chapter 2 and Chapter 3, we used the ‘KNeighborsClassifier’ and ‘LogisticRegression’ models for classification problems; in this chapter we will use the ‘Support Vector Machine (SVM)’ model for the classification.

Note

SVM looks for the line (or hyperplane) that separates the two classes in the best possible way.

The code for prediction is exactly the same as in Chapter 2 and Chapter 3; the only difference is that the ‘SVC’ model (from the SVM module) is used with ‘kernel="linear"’ (Line 49). Note that, by default, ‘kernel="rbf"’ is used in SVC, which is required for non-linear problems.
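
Before looking at the complete listing, the two kernels can also be compared quickly with cross-validation. The short optional sketch below uses ‘cross_val_score’ from scikit-learn on the ‘features’ and ‘targets’ arrays of the previous listings (the exact scores will depend on the data splits),

# optional: compare the 'linear' and 'rbf' kernels with 5-fold cross-validation
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

for kernel in ("linear", "rbf"):
    scores = cross_val_score(SVC(kernel=kernel), features, targets, cv=5)
    print(kernel, "mean accuracy:", scores.mean())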

Listing 11.4 Prediction using SVC
# faces_ex.py

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure() # create a new figure window
    for i in range(total_images): # display 'total_images' images
        # subplot grid with 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot the image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces() # download the dataset at ~/scikit_learn_data
# print("Keys:", faces.keys()) # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images # save images

# note that 'images' cannot be used directly as features, because the features
# must be a 2D array i.e. (samples, features), whereas faces.images is a 3D
# array i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target # targets

# # plot 10 images with 2 rows and 5 columns
# plot_images(images, 10, 2, 5)
# plt.show()

# split the training and test data
train_features, test_features, train_targets, test_targets = train_test_split(
        features, targets,
        train_size=0.8,
        test_size=0.2,
        # random split, but reproducible across runs; note that the accuracy
        # depends on the split e.g. if we put random_state=10 then the test
        # accuracy will be 1.0 in this example
        random_state=23,
        # keep the same proportion of each 'target' in the train and test data
        stratify=targets
    )

# use SVC
classifier = SVC(kernel="linear") # default kernel=rbf
# training using 'training data'
classifier.fit(train_features, train_targets) # fit the model for training data

# predict the 'target' for 'training data'
prediction_training_targets = classifier.predict(train_features)
self_accuracy = accuracy_score(train_targets, prediction_training_targets)
print("Accuracy for training data (self accuracy):", self_accuracy)

# predict the 'target' for 'test data'
prediction_test_targets = classifier.predict(test_features)
test_accuracy = accuracy_score(test_targets, prediction_test_targets)
print("Accuracy for test data:", test_accuracy)

  • Below is the output of the above code,
$ python faces_ex.py

Accuracy for training data (self accuracy): 1.0
Accuracy for test data: 0.9875
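
Besides the overall accuracy, scikit-learn can also report the per-person precision and recall. The optional sketch below can be appended to the above script; ‘classification_report’ and ‘confusion_matrix’ are standard functions of ‘sklearn.metrics’,

# optional: detailed report for the test data
from sklearn.metrics import classification_report, confusion_matrix

# precision, recall and f1-score for each of the 40 persons
print(classification_report(test_targets, prediction_test_targets))
# 40x40 matrix: rows = true person, columns = predicted person
print(confusion_matrix(test_targets, prediction_test_targets))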

  • Let’s print the locations, among the first 20 test images, where the test-images and the predicted-images differ from each other. Also, let’s plot the images to see the differences.
Listing 11.5 Plot first 20 images from the test-images and predicted-images
# faces_ex.py

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure() # create a new figure window
    for i in range(total_images): # display 'total_images' images
        # subplot grid with 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot the image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces() # download the dataset at ~/scikit_learn_data
# print("Keys:", faces.keys()) # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images # save images

# note that 'images' cannot be used directly as features, because the features
# must be a 2D array i.e. (samples, features), whereas faces.images is a 3D
# array i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target # targets

# # plot 10 images with 2 rows and 5 columns
# plot_images(images, 10, 2, 5)
# plt.show()

# split the training and test data
train_features, test_features, train_targets, test_targets = train_test_split(
        features, targets,
        train_size=0.8,
        test_size=0.2,
        # random split, but reproducible across runs; note that the accuracy
        # depends on the split e.g. if we put random_state=10 then the test
        # accuracy will be 1.0 in this example
        random_state=23,
        # keep the same proportion of each 'target' in the train and test data
        stratify=targets
    )

# use SVC
classifier = SVC(kernel="linear") # default kernel=rbf
# training using 'training data'
classifier.fit(train_features, train_targets) # fit the model for training data

# predict the 'target' for 'training data'
prediction_training_targets = classifier.predict(train_features)
self_accuracy = accuracy_score(train_targets, prediction_training_targets)
print("Accuracy for training data (self accuracy):", self_accuracy)

# predict the 'target' for 'test data'
prediction_test_targets = classifier.predict(test_features)
test_accuracy = accuracy_score(test_targets, prediction_test_targets)
print("Accuracy for test data:", test_accuracy)


# location of error for first 20 images in test data
print("Wrongly detected image-locations: ", end=' ')
for i in range (20):
    # if images are not same then print location of images
    if test_targets[i] != prediction_test_targets[i]:
        print(i)

# store the image corresponding to each test target in a list
faces_test = []
for i in test_targets:
    faces_test.append(images[i])

# store the image corresponding to each predicted target in a list
faces_predict = []
for i in prediction_test_targets:
    faces_predict.append(images[i])

# plot the first 20 images from the list
plot_images(faces_test, total_images=20)
plot_images(faces_predict, total_images=20)
plt.show()

  • Below are the outputs of the above code. The plotted test-images and predicted-images are shown in Fig. 11.2 and Fig. 11.3 respectively, where we can see that the image at location 14 (see red boxes) is in error.
$ python faces_ex.py
Accuracy for training data (self accuracy): 1.0
Accuracy for test data: 0.9875
Wrongly detected image-locations:  14

Fig. 11.2 Test-images


Fig. 11.3 Predicted images
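
As a side note, the error locations can also be found for the complete test set (not only the first 20 samples) without writing a loop, by comparing the two arrays directly with numpy; the sketch below assumes the arrays from Listing 11.5,

# optional: vectorized search for the wrongly detected test samples
import numpy as np # already imported as np in Listing 11.5

errors = np.where(test_targets != prediction_test_targets)[0]
print("Wrongly detected image-locations:", errors)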

11.5. Convert features to images

Note

In Listing 11.5, we used the ‘images’ (i.e. faces_test.append(images[i])) at Lines 75 and 80 to plot the images.

Alternatively, we can convert the ‘features’ back into images for plotting, as shown at Lines 77 and 84 of Listing 11.6; a quick check of this reshaping is shown below.
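
For instance, reshaping the first feature vector gives back exactly the first image (an optional check, assuming numpy is imported as np),

# optional check: a 4096-element feature vector reshaped to (64, 64)
# is identical to the corresponding entry of 'images'
print(np.array_equal(np.reshape(features[0], (64, 64)), images[0])) # expected: True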

Listing 11.6 Convert features to images
# faces_ex.py

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# function for plotting images
def plot_images(images, total_images=20, rows=4, cols=5):
    fig = plt.figure() # create a new figure window
    for i in range(total_images): # display 'total_images' images
        # subplot grid with 'rows' rows and 'cols' columns
        img_grid = fig.add_subplot(rows, cols, i+1)
        # plot the image
        img_grid.imshow(images[i])

faces = fetch_olivetti_faces() # download the dataset at ~/scikit_learn_data
# print("Keys:", faces.keys()) # display keys
# print("Total samples and image size:", faces.images.shape)
# print("Total samples and features:", faces.data.shape)
# print("Total samples and targets:", faces.target.shape)

images = faces.images # save images

# note that 'images' cannot be used directly as features, because the features
# must be a 2D array i.e. (samples, features), whereas faces.images is a 3D
# array i.e. (samples, pixel-x, pixel-y)
features = faces.data  # features
targets = faces.target # targets

# # plot 10 images with 2 rows and 5 columns
# plot_images(images, 10, 2, 5)
# plt.show()

# split the training and test data
train_features, test_features, train_targets, test_targets = train_test_split(
        features, targets,
        train_size=0.8,
        test_size=0.2,
        # random split, but reproducible across runs; note that the accuracy
        # depends on the split e.g. if we put random_state=10 then the test
        # accuracy will be 1.0 in this example
        random_state=23,
        # keep the same proportion of each 'target' in the train and test data
        stratify=targets
    )

# use SVC
classifier = SVC(kernel="linear") # default kernel=rbf
# training using 'training data'
classifier.fit(train_features, train_targets) # fit the model for training data

# predict the 'target' for 'training data'
prediction_training_targets = classifier.predict(train_features)
self_accuracy = accuracy_score(train_targets, prediction_training_targets)
print("Accuracy for training data (self accuracy):", self_accuracy)

# predict the 'target' for 'test data'
prediction_test_targets = classifier.predict(test_features)
test_accuracy = accuracy_score(test_targets, prediction_test_targets)
print("Accuracy for test data:", test_accuracy)


# location of error for first 20 images in test data
print("Wrongly detected image-locations: ", end=' ')
for i in range (20):
    # if images are not same then print location of images
    if test_targets[i] != prediction_test_targets[i]:
        print(i)

# store the image corresponding to each test target in a list
faces_test = []
for i in test_targets:
    # faces_test.append(images[i])
    # convert 'features' to images
    faces_test.append(np.reshape(features[i], (64, 64)))

# store the image corresponding to each predicted target in a list
faces_predict = []
for i in prediction_test_targets:
    # faces_predict.append(images[i])
    # convert 'features' to images
    faces_predict.append(np.reshape(features[i], (64, 64)))

# plot the first 20 images from the list
plot_images(faces_test, total_images=20)
plot_images(faces_predict, total_images=20)
plt.show()
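
Finally, note that the two conversion loops in Listing 11.6 can also be written in a vectorized form using numpy indexing; the optional sketch below builds the same stacks of 64x64 images in a single step and plots them as before,

# optional: vectorized form of the two conversion loops of Listing 11.6
faces_test = features[test_targets].reshape(-1, 64, 64)                # test targets
faces_predict = features[prediction_test_targets].reshape(-1, 64, 64)  # predictions

plot_images(faces_test, total_images=20)
plot_images(faces_predict, total_images=20)
plt.show()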