## What will we cover?

- Understand what a Convolutional Neural Network (CNN) is
- The strengths of CNNs
- How to use a CNN to detect handwriting
- Extract features from pictures
- Learn Convolution, Pooling and Flatten
- How to create a CNN

## Step 1: What is Computer Vision?

Computer Vision is the field of computational methods for analyzing and understanding digital images.

An example could be detecting handwriting.

Assuming familiarity with Deep Neural Networks, a naive approach would be to map each pixel to one input of the network, add some hidden layers, and then classify the image in the output layer.

If you are new to **Artificial Neural Networks** or **Deep Neural Networks**, it helps to review them first.

Such a Deep Neural Network for images would have one input unit per pixel, followed by hidden layers and an output layer.

But actually, we (the network) are not interested in the specifics of individual pixels in the image. Also, if the image is moved 1 pixel to the left, this changes the input to the network considerably. Hence, this approach does not seem to be very good.
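To make the shifting problem concrete, here is a minimal sketch in plain Python. The 5x5 "image" and the helper names `make_image` and `flatten` are made up for illustration; they are not part of any library. It shows that moving a stroke by a single pixel changes every input value that carried the stroke.

```python
def make_image(column, size=5):
    # Tiny image with a vertical stroke in the given column (1 = ink, 0 = background).
    return [[1 if c == column else 0 for c in range(size)] for _ in range(size)]


def flatten(image):
    # One input value per pixel, row by row, as a dense network would see it.
    return [pixel for row in image for pixel in row]


original = flatten(make_image(column=2))
shifted = flatten(make_image(column=1))  # same stroke moved 1 pixel to the left

# Number of input positions that are "on" in both versions.
overlap = sum(a * b for a, b in zip(original, shifted))
# Number of input positions whose value differs.
changed = sum(a != b for a, b in zip(original, shifted))

print(overlap)  # 0
print(changed)  # 10
```

To a human both images show the same stroke, but the two input vectors share no active pixel, so a network fed raw pixels has to learn each position separately.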

## Step 2: What is Image Convolution?

Image Convolution is applying a filter that adds each pixel value of an image to its neighbors, weighted according to a kernel matrix.
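As a rough illustration, the sketch below applies a 3x3 kernel to a small image in plain Python. The function name `convolve2d`, the image values and the vertical-edge kernel are assumptions chosen for the example. It uses no padding and a stride of 1, and, like most deep learning libraries, it does not flip the kernel (strictly speaking this is a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide a square kernel over the image (no padding, stride 1)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the pixel neighborhood under the kernel.
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
]
# Kernel that responds to vertical edges (right side minus left side).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 30, 30], [0, 30, 30]]
```

The output is 0 where the image is flat and large where the kernel overlaps the edge, which is the sense in which a convolution extracts a feature.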

A few techniques used together with convolution are given here.

### Pooling

- Reducing the size of an input by sampling from regions in the input
- Basically reducing the size of the image

### Max-Pooling

- Pooling by choosing the maximum value in each region
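A minimal sketch of max-pooling in plain Python is shown below. The function name `max_pool2d` and the example values are assumptions for illustration. It uses non-overlapping regions and ignores leftover rows or columns that do not fill a complete region.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max-pooling over size x size regions."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values of one region and keep only the largest.
            region = [
                feature_map[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(region))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input becomes a 2x2 output: each value is the maximum of one 2x2 region, so the strongest response in each region survives while the size is reduced by a factor of 4.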

## Step 3: What is Convolutional Neural Network (CNN)?

A Convolutional Neural Network (CNN) is a Neural Network that uses convolution for analyzing images.

### The idea of a CNN is as follows

- We have an input image
- Apply Convolution, possibly several times, to get several features of the image (feature maps)
- Apply pooling (this reduces the size of the input)
- Then flatten it out and feed it to a traditional network
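The steps above can be followed by tracking how the shape of the data changes. The sketch below is plain arithmetic in Python, with assumed example values: a 28x28 grayscale image, 32 filters of size 3x3 with no padding and stride 1, and 2x2 non-overlapping pooling. The helper names are hypothetical.

```python
def conv_output(height, width, kernel, filters):
    # No padding, stride 1: each side shrinks by kernel - 1,
    # and every filter produces one feature map.
    return (height - kernel + 1, width - kernel + 1, filters)


def pool_output(shape, size):
    # Non-overlapping pooling divides height and width by the pool size.
    height, width, channels = shape
    return (height // size, width // size, channels)


def flatten_output(shape):
    # Flattening turns all values into a single vector.
    height, width, channels = shape
    return height * width * channels


conv_shape = conv_output(28, 28, kernel=3, filters=32)
pool_shape = pool_output(conv_shape, size=2)
flat_size = flatten_output(pool_shape)

print(conv_shape)  # (26, 26, 32)
print(pool_shape)  # (13, 13, 32)
print(flat_size)   # 5408
```

Under these assumptions the traditional network at the end receives a vector of 5408 values instead of the raw 784 pixels, and each value describes a local feature rather than a single pixel.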

## Step 4: Handwriting detection with CNN

We will use the **MNIST** database, which is a classic large dataset of handwritten digits.

Here is the code, with some comments.

```
import tensorflow as tf
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# https://en.wikipedia.org/wiki/MNIST_database
mnist = tf.keras.datasets.mnist

# Read the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Scale pixel values from 0 - 255 to 0 - 1
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode the labels (each digit 0 - 9 becomes a vector of length 10)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Add a channel dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)

# Create the model: convolution, pooling, flatten, then a traditional dense network
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
# Dropout randomly sets half of the inputs to zero during training to reduce overfitting
model.add(Dropout(.5))
# One output per digit; softmax turns the outputs into probabilities
model.add(Dense(10, activation='softmax'))

# Train on the training set, then evaluate on the held-out test set
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10)
model.evaluate(x_test, y_test)
```

This gives an accuracy of about 98%.
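The labels in the code above are one-hot encoded with `to_categorical`, and the last layer outputs one probability per digit. The plain Python sketch below shows the idea behind both steps. The helpers `one_hot` and `decode` and the example probabilities are made up for illustration; they are not the Keras implementation.

```python
def one_hot(label, num_classes=10):
    # Vector of zeros with a single 1.0 at the position of the label.
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


def decode(probabilities):
    # The predicted digit is the index with the highest probability.
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


encoded = one_hot(3)
print(encoded)

# Hypothetical softmax output for one image.
prediction = [0.01, 0.02, 0.05, 0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02]
predicted_digit = decode(prediction)
print(predicted_digit)  # 3
```

With a trained Keras model, the same decoding step would be applied to each row returned by `model.predict`.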

## Want to learn more?

Want to compare your result with a model using PyTorch?

**This is part of a FREE 10h Machine Learning course with Python.**

- **15 video lessons**: explain Machine Learning concepts, demonstrate models on real data, introduce projects and show a solution (YouTube playlist).
- **30 Jupyter Notebooks**: with the full code and explanation from the lectures and projects (GitHub).
- **15 projects**: with step guides to help you structure your solutions, and solutions explained at the end of the video lessons (GitHub).