What will we cover?
- Understand what a Convolutional Neural Network (CNN) is
- The strengths of CNNs
- How to use it to detect handwriting
- Extract features from pictures
- Learn Convolution, Pooling and Flatten
- How to create a CNN
Step 1: What is Computer Vision?
Computer Vision covers computational methods for analyzing and understanding digital images.
An example could be detecting handwriting.
Assuming familiarity with Deep Neural Networks, a naive approach would be to map each pixel to one node in the input layer, add some hidden layers, and then detect the digit in the output layer.
A Deep Neural Network could be applied to images in this way.
However, the network is not really interested in the specifics of individual pixels in the image. Also, if an image is moved 1 pixel to the left, the same shape lands on different input nodes, and this influences the network. Hence, this approach does not seem to be very good.
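To make the shift problem concrete, here is a small sketch in plain Python. The 5x5 image and the helpers `shift_left` and `flatten` are made up for illustration and are not part of the tutorial's own code. It flattens an image the way the naive approach would and counts how many input values change when the same stroke is moved one pixel to the left.

```python
# A tiny 5x5 "image" with a vertical stroke (1 = ink, 0 = background).
# Hypothetical example data, chosen only for illustration.
image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]


def shift_left(img):
    """Move every row one pixel to the left, padding with background on the right."""
    return [row[1:] + [0] for row in img]


def flatten(img):
    """Map the 2D image to one long input vector, one pixel per input node."""
    return [pixel for row in img for pixel in row]


original = flatten(image)
shifted = flatten(shift_left(image))

# Count the input nodes that now receive a different value.
changed = sum(1 for a, b in zip(original, shifted) if a != b)
print(changed, "of", len(original), "inputs changed")
```

In this example every ink pixel ends up on a different input node, even though a human would say the picture shows the same stroke.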
Step 2: What is Image Convolution?
Image Convolution is applying a filter that adds each pixel value of an image to its neighbors, weighted according to a kernel matrix.
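The definition above can be sketched in a few lines of plain Python. The function `convolve2d`, the 4x4 image and the vertical-edge kernel are assumptions chosen for illustration. The sketch uses no padding, and, as is common in deep learning libraries, applies the kernel without flipping it (strictly speaking a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and return the weighted sums.

    No padding is used, so the output is smaller than the input.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Weighted sum of the pixel neighborhood under the kernel.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image: dark left column, bright everywhere else.
image = [
    [0, 10, 10, 10],
    [0, 10, 10, 10],
    [0, 10, 10, 10],
    [0, 10, 10, 10],
]

# A simple vertical-edge kernel (chosen for illustration).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[30, 0], [30, 0]]
```

The response is large where the window covers the dark-to-bright edge and zero where the window covers a flat region, which is the kind of feature a convolution filter can pick out.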
A related technique is pooling.
- Pooling: reducing the size of an input by sampling from regions in the input
- Basically, this reduces the size of the image
- Max-pooling: pooling by choosing the maximum value in each region
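As a minimal sketch of choosing the maximum value in each region, the following plain-Python function splits an input into non-overlapping 2x2 regions. The name `max_pool` and the example values are hypothetical and only serve as an illustration.

```python
def max_pool(image, size=2):
    """Split the image into non-overlapping size x size regions and keep each maximum."""
    pooled = []
    for r in range(0, len(image) - size + 1, size):
        row = []
        for c in range(0, len(image[0]) - size + 1, size):
            region = [image[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(region))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 input, for example the output of a convolution.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input becomes a 2x2 output, so the amount of data passed on is reduced to a quarter while the strongest response in each region is kept.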
Step 3: What is a Convolutional Neural Network (CNN)?
- We have an input image
- Apply convolution – possibly several filters, to get several features of the image (feature maps)
- Apply pooling (this reduces the size of the input)
- Then flatten the result and feed it into a traditional network
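The steps above can be followed by tracking how the shape of the data changes. The sketch below assumes a 28x28 grayscale input, 32 filters of size 3x3 without padding, and 2x2 pooling, which are the values used by the model in Step 4. The helper names are made up for illustration.

```python
def conv_output_size(size, kernel, stride=1):
    """Output width/height of a convolution without padding."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool):
    """Output width/height of non-overlapping pooling."""
    return size // pool


height = width = 28   # input image: 28 x 28 x 1
filters = 32          # number of feature maps

conv_h = conv_output_size(height, 3)
conv_w = conv_output_size(width, 3)
print("after convolution:", (conv_h, conv_w, filters))

pool_h = pool_output_size(conv_h, 2)
pool_w = pool_output_size(conv_w, 2)
print("after pooling:", (pool_h, pool_w, filters))

flattened = pool_h * pool_w * filters
print("after flatten:", flattened)
```

Under these assumptions the traditional network at the end receives 5408 input values instead of raw pixels.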
Step 4: Handwriting detection with CNN
We will use the MNIST database, a classic large dataset of handwritten digits.
The code is given below with some comments.
    import tensorflow as tf
    from tensorflow.keras.utils import to_categorical
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

    # https://en.wikipedia.org/wiki/MNIST_database
    mnist = tf.keras.datasets.mnist

    # Read the data
    (x_train, y_train), (x_test, y_test) = mnist.load_data()

    # Scale it to values 0 - 1
    x_train = x_train / 255.0
    x_test = x_test / 255.0

    # One-hot encode the labels (digits 0 - 9)
    y_train = to_categorical(y_train)
    y_test = to_categorical(y_test)

    # Add a channel dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
    x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
    x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)

    # Creating a model
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(10, activation='softmax'))

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Train on the training data, then evaluate on the test data
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
This gives an accuracy of around 98% on the test data.
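Two details of the code may be easier to see in isolation: the labels are one-hot encoded with `to_categorical`, and the last layer outputs 10 softmax values, one per digit. The plain-Python sketch below mimics both ideas without TensorFlow. The helpers `one_hot`, `softmax` and `predicted_digit` and the raw scores are hypothetical stand-ins, not output from the trained model.

```python
import math


def one_hot(label, num_classes=10):
    """Encode a digit as a vector with a single 1.0, similar in spirit to to_categorical."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_digit(probs):
    """The detected digit is the index with the highest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


target = one_hot(7)

# Made-up raw scores for one image; a real model would compute these.
scores = [0.1, 0.0, 0.2, 0.1, 0.0, 0.3, 0.1, 4.0, 0.2, 0.5]
probs = softmax(scores)
digit = predicted_digit(probs)

print("target:", target)
print("predicted digit:", digit)
```

With a trained Keras model, the same idea would be applied to the rows returned by `model.predict`, taking the index of the largest value in each row as the detected digit.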
Want to learn more?
Want to compare your result with a model using PyTorch?
This is part of a FREE 10h Machine Learning course with Python.
- 15 video lessons – which explain Machine Learning concepts, demonstrate models on real data, introduce projects and show a solution (YouTube playlist).
- 30 Jupyter Notebooks – with the full code and explanations from the lectures and projects (GitHub).
- 15 projects – with step-by-step guides to help you structure your solutions, and the solutions explained at the end of the video lessons (GitHub).