CNNs are primarily used for object recognition: they take images as input and classify them into categories. Handwritten digit recognition is one such task. We will have a set of images of handwritten digits together with their labels from 0 to 9. Read my other post to get started with CNNs.

For this, we will use the MNIST database of handwritten digits. This dataset has a training set of 60,000 examples and a test set of 10,000 examples.

We will build a model, train it on the 60,000 training examples, and then check its accuracy on the 10,000 test examples. We will use the Keras library with a TensorFlow backend to build the model, and download the dataset from Keras itself with from keras.datasets import mnist.

So first we import everything we will be using in our code:

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
import numpy as np

So we have imported Keras and the MNIST dataset. After that we imported the model type we will be using: Sequential, which is prebuilt in Keras. We have imported the Dense layer, which will be used to predict the labels; the Dropout layer, which reduces overfitting; and Flatten, which converts a 3-D array to 1-D (a small illustration follows). Finally, we imported the convolutional layer, the pooling layer, and NumPy.
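Before moving on, here is a minimal sketch (using NumPy, with a hypothetical all-zero array standing in for one image) of what Flatten will do per image later in the model:

a = np.zeros((28, 28, 1))    # one image: height x width x channels
print(a.reshape(-1).shape)   # (784,) -- the 1-D vector Flatten produces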

batch_size = 128              # number of samples per gradient update
num_classes = 10              # one class per digit, 0-9
epochs = 12                   # passes over the full training set

img_rows, img_cols = 28, 28   # MNIST images are 28x28 pixels

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# the CNN expects 4-D input: (samples, rows, cols, channels)
x_train = x_train.reshape(60000,28,28,1)
x_test = x_test.reshape(10000,28,28,1)

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert the labels to one-hot vectors
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 49s 4us/step
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples

In the code above, we initialise some variables: batch_size is 128; num_classes is 10 because we have 10 classes, the digits 0-9; and epochs is the number of passes we will make over the full training set. In img_rows, img_cols = 28, 28 we give the dimensions of the image. We then load the data into (x_train, y_train), (x_test, y_test) and reshape x_train and x_test, because the CNN accepts only 4-D input: in .reshape(60000,28,28,1), 60000 is the number of images, 28 appears twice because the image size is 28×28, and 1 is the number of channels. The same applies to .reshape(10000,28,28,1), with 10000 because the test set has only 10,000 images.
After this, in the last two lines, we convert our target values into binary class matrices (one-hot encoding): since num_classes is set to 10, the label 2 becomes [0,0,1,0,0,0,0,0,0,0].
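You can verify this encoding with a quick check, using the same to_categorical utility from keras.utils:

print(keras.utils.to_categorical([2], num_classes=10))
# [[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]]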

Now comes our model part:

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(28,28,1)))          # 32 filters of size 3x3
model.add(Conv2D(64, (3, 3), activation='relu'))  # 64 filters of size 3x3
model.add(MaxPooling2D(pool_size=(2, 2)))         # downsample feature maps by 2
model.add(Dropout(0.25))                          # randomly drop 25% of units
model.add(Flatten())                              # 3-D feature maps -> 1-D vector
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))                           # drop 50% before the output layer
model.add(Dense(num_classes, activation='softmax'))  # one probability per digit

We have used a Sequential model and added convolutional and pooling layers to it. For the activation we have used ReLU, which is used in almost all convolutional neural networks and deep learning models. We added a Dropout layer in between to reduce overfitting, and a Dense layer at the end for class prediction.
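To follow how the feature-map shapes change through the network, you can call model.summary(). With the default 'valid' padding and stride 1, the sizes work out as sketched in the comments below:

model.summary()
# Conv2D(32, 3x3): (26, 26, 32)    since 28 - 3 + 1 = 26
# Conv2D(64, 3x3): (24, 24, 64)
# MaxPooling2D:    (12, 12, 64)
# Flatten:         (9216,)         since 12 * 12 * 64 = 9216
# Dense:           (128,), then (10,) from the final softmax layer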

Now we will compile the model with the categorical cross-entropy loss function, the Adadelta optimizer, and an accuracy metric, and then train it for 12 epochs:

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)  # returns [test loss, test accuracy]
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
60000/60000 [==============================] - 247s 4ms/step - loss: 0.9695 - acc: 0.8849 - val_loss: 0.0556 - val_acc: 0.9831
Epoch 2/12
60000/60000 [==============================] - 219s 4ms/step - loss: 0.1036 - acc: 0.9716 - val_loss: 0.0445 - val_acc: 0.9864
Epoch 3/12
60000/60000 [==============================] - 233s 4ms/step - loss: 0.0734 - acc: 0.9783 - val_loss: 0.0417 - val_acc: 0.9862
Epoch 4/12
60000/60000 [==============================] - 213s 4ms/step - loss: 0.0586 - acc: 0.9831 - val_loss: 0.0330 - val_acc: 0.9887
Epoch 5/12
60000/60000 [==============================] - 207s 3ms/step - loss: 0.0498 - acc: 0.9852 - val_loss: 0.0295 - val_acc: 0.9913
Epoch 6/12
60000/60000 [==============================] - 224s 4ms/step - loss: 0.0423 - acc: 0.9872 - val_loss: 0.0416 - val_acc: 0.9880
Epoch 7/12
60000/60000 [==============================] - 271s 5ms/step - loss: 0.0378 - acc: 0.9886 - val_loss: 0.0360 - val_acc: 0.9896
Epoch 8/12
60000/60000 [==============================] - 199s 3ms/step - loss: 0.0355 - acc: 0.9894 - val_loss: 0.0306 - val_acc: 0.9924
Epoch 9/12
60000/60000 [==============================] - 195s 3ms/step - loss: 0.0317 - acc: 0.9906 - val_loss: 0.0316 - val_acc: 0.9911
Epoch 10/12
60000/60000 [==============================] - 215s 4ms/step - loss: 0.0303 - acc: 0.9912 - val_loss: 0.0356 - val_acc: 0.9908
Epoch 11/12
60000/60000 [==============================] - 198s 3ms/step - loss: 0.0302 - acc: 0.9911 - val_loss: 0.0343 - val_acc: 0.9918
Epoch 12/12
60000/60000 [==============================] - 209s 3ms/step - loss: 0.0296 - acc: 0.9918 - val_loss: 0.0435 - val_acc: 0.9891
Test loss: 0.043464561576
Test accuracy: 0.9891

As seen from the output, we trained our model for 12 epochs and got a test accuracy of 98.91% with a test loss of about 0.0435.
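As a final sanity check (a minimal sketch, not part of the run above), we can ask the trained model to classify a single test image and compare it against the true label:

pred = model.predict(x_test[:1])              # shape (1, 10): one probability per class
print('Predicted digit:', np.argmax(pred))
print('Actual digit:', np.argmax(y_test[0]))  # y_test is one-hot encoded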

Summary

We built a model that trains on the MNIST dataset, and with 12 epochs it reached 98.91% test accuracy.