## What will we cover?

• Understand the Recurrent Neural Network (RNN)
• Build an RNN on a time series
• Touch on the theory of RNNs (LSTM cells)
• Use the MinMaxScaler from sklearn
• Create an RNN model with TensorFlow
• Apply the Dropout technique
• Predict stock prices and make weather forecasts using an RNN

## Step 1: Feed-forward vs Recurrent Neural Network

A Neural Network that has connections in only one direction is called a Feed-Forward Neural Network (examples: Artificial Neural Networks, Deep Neural Networks, and Convolutional Neural Networks).

A Recurrent Neural Network is a Neural Network that generates output that feeds back into its own inputs. This enables it to model one-to-many and many-to-many relationships (not possible for feed-forward neural networks).

An example of one-to-many is a network that can generate sentences (while a feed-forward neural network can only generate “words” or fixed sets of outputs).

Another example is working with time-series data, which we will explore in this tutorial.

A Recurrent Neural Network can be illustrated as follows.

Other applications of Recurrent Neural Networks include:

• Voice recognition
• Video copyright violation detection
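The core of the recurrence can be sketched in a few lines of NumPy: the hidden state computed at one time step is fed back in at the next step. All names and sizes here are illustrative, not part of the tutorial's code.

```python
import numpy as np

# Minimal sketch of a recurrent step: the hidden state feeds back in
rng = np.random.default_rng(1)
W_x = rng.normal(size=(3, 2))  # input -> hidden weights
W_h = rng.normal(size=(2, 2))  # hidden -> hidden weights (the feedback loop)

h = np.zeros(2)  # hidden state, carried across time steps
sequence = [rng.normal(size=3) for _ in range(4)]
for x in sequence:
    # The new state depends on the current input AND the previous state
    h = np.tanh(x @ W_x + h @ W_h)
print(h.shape)
```

This feedback connection (`W_h`) is exactly what a feed-forward network lacks.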

## Step 2: Is RNN too complex to understand?

Recurrent Neural Networks (RNNs) are complex – but luckily – you do not need to understand them in depth.

You don’t need to understand everything about the specific architecture of an LSTM cell […] just that the LSTM cell is meant to allow past information to be reinjected at a later time.

Quote from the author of Keras (François Chollet).

Let’s just leave it at that and get started.

## Step 3: RNN predicting stock price

For the purpose of this tutorial we will use the Apple stock price and try to make an RNN predict the stock price the day after.

For that we will use this file of historic Apple stock prices. You do not need to download it; we will use it directly in the code.

```
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dropout
import matplotlib.pyplot as plt

file_url = 'https://raw.githubusercontent.com/LearnPythonWithRune/MachineLearningWithPython/main/files/aapl.csv'
data = pd.read_csv(file_url, index_col=0, parse_dates=True)

# Create a train and test set (the closing price column is assumed to be 'Close')
prices = data['Close'].values
train_size = int(len(prices) * 0.8)
data_train = prices[:train_size]
data_test = prices[train_size:]

# Use the MinMaxScaler to scale the data
scaler = MinMaxScaler()
data_train = scaler.fit_transform(data_train.reshape(-1, 1))
data_test = scaler.transform(data_test.reshape(-1, 1))

# Divide data into x (the 40 previous prices) and y (the next price)
def data_preparation(data):
    x = []
    y = []
    for i in range(40, len(data)):
        x.append(data[i-40:i, 0])
        y.append(data[i])
    x = np.array(x)
    y = np.array(y)
    # LSTM layers expect input of shape (samples, timesteps, features)
    x = x.reshape(x.shape[0], x.shape[1], 1)
    return x, y

x_train, y_train = data_preparation(data_train)
x_test, y_test = data_preparation(data_test)

# Create the model (layer sizes are illustrative)
model = Sequential()
model.add(LSTM(units=45, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=45))
model.add(Dropout(0.2))
model.add(Dense(units=1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(x_train, y_train, epochs=5, batch_size=32)

# Predict with the model
y_pred = model.predict(x_test)

# Unscale it
y_unscaled = scaler.inverse_transform(y_pred)

# See the prediction accuracy
y_real = scaler.inverse_transform(data_test)
fig, ax = plt.subplots()
ax.plot(y_real[40:])
ax.plot(y_unscaled)
plt.show()
```

The resulting plot looks more like a moving average of the price and does not do a particularly good job.

I am not surprised, as predicting stock prices is not easy. If you could do it with a simple model like this, you would get rich really fast.

This is part of a FREE 10h Machine Learning course with Python.

• 15 video lessons – which explain Machine Learning concepts, demonstrate models on real data, introduce projects and show a solution (YouTube playlist).
• 30 Jupyter Notebooks – with the full code and explanation from the lectures and projects (GitHub).
• 15 projects – with step guides to help you structure your solutions and solution explained in the end of video lessons (GitHub).

## What will we cover?

• What is PyTorch
• PyTorch vs Tensorflow
• Get started with PyTorch
• Work with image classification

## Step 1: What is PyTorch?

PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.

What does that mean?

Well, PyTorch is an open source machine learning library and is used for computer vision and natural language processing. It is primarily developed by Facebook’s AI Research Lab.
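To get a feel for what “optimized tensor library” means, here is a tiny sketch: tensors behave much like NumPy arrays, with GPU support and automatic differentiation on top. The values below are just illustrative.

```python
import torch

# Tensors work much like NumPy arrays
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.ones(2, 2)

print(a + b)            # element-wise addition
print(a @ b)            # matrix multiplication
print(a.mean().item())  # 2.5

# Tensors can track gradients - the basis for training networks
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()
print(x.grad)  # dy/dx = 2x = 6
```

The automatic gradient tracking is what later lets `loss.backward()` train a whole network.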

## Step 2: PyTorch and Tensorflow

People often worry about which framework to use, so as not to waste time learning the wrong one.

You probably do the same – but don’t worry. If you use either PyTorch or TensorFlow, you are on the right track. They are the most popular Deep Learning frameworks; if you learn one, you will have an easy time switching to the other later.

PyTorch was released in 2016 by Facebook’s AI Research Lab, while TensorFlow was released in 2015 by the Google Brain team.

Both are good choices for Deep Learning.

## Step 3: PyTorch and prepared datasets

PyTorch comes with a long list of prepared datasets and you can see them all here.

We will look at the MNIST dataset for handwritten digit-recognition.

In the video above we also look at the CIFAR10 dataset, which consists of 32×32 images in 10 classes.

You can get a dataset by using torchvision.

```
from torchvision import datasets
```

## Step 4: Getting and preparing the data

First we need to get the data and prepare it by turning the images into tensors and normalizing them.

### Transforming and Normalizing

• Images are PIL objects in the MNIST dataset
• They need to be transformed to tensors (the datatype for PyTorch)
• torchvision has the transformation transforms.ToTensor(), which turns NumPy arrays and PIL images into tensors
• Then you need to normalize the images
• First determine the mean value and the standard deviation
• Then apply the normalization
• torchvision has transforms.Normalize, which takes the mean and standard deviation
```
from torchvision import datasets
from torchvision import transforms
import torch
import torch.nn as nn
from torch import optim
import matplotlib.pyplot as plt

data_path = 'data'
# Download MNIST with the images converted to tensors
mnist = datasets.MNIST(data_path, train=True, download=True,
                       transform=transforms.ToTensor())

# Stack all images to compute the mean and standard deviation
imgs = torch.stack([img_t for img_t, _ in mnist], dim=3)
print('get mean')
print(imgs.view(1, -1).mean(dim=1))
print('get standard deviation')
print(imgs.view(1, -1).std(dim=1))
```

Then we can use those values to make the transformation.

```
mnist = datasets.MNIST(data_path, train=True, download=False,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,),
                                                (0.3081,))]))
```

## Step 5: Creating and testing a Model

The model we will use has an input layer of 784 nodes (the 28×28 pixels flattened), two hidden layers of 128 and 64 units with ReLU activations, and an output layer with 10 classes. We can model it as follows.

```
input_size = 784  # 28*28 pixels flattened to one vector
hidden_sizes = [128, 64]
output_size = 10
model = nn.Sequential(nn.Linear(input_size, hidden_sizes[0]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[0], hidden_sizes[1]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[1], output_size),
                      nn.LogSoftmax(dim=1))
```

Then we can train the model as follows.

```
train_loader = torch.utils.data.DataLoader(mnist, batch_size=64,
                                           shuffle=True)
optimizer = optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.NLLLoss()

n_epochs = 10
for epoch in range(n_epochs):
    for imgs, labels in train_loader:
        batch_size = imgs.shape[0]
        # Flatten each 28x28 image to a 784 vector
        output = model(imgs.view(batch_size, -1))
        loss = loss_fn(output, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("Epoch: %d, Loss: %f" % (epoch, float(loss)))
```

And finally, test our model.

```
mnist_val = datasets.MNIST(data_path, train=False, download=True,
                           transform=transforms.Compose([
                               transforms.ToTensor(),
                               transforms.Normalize((0.1307,),
                                                    (0.3081,))]))
val_loader = torch.utils.data.DataLoader(mnist_val, batch_size=64,
                                         shuffle=True)

correct = 0
total = 0
# No gradients are needed for validation
with torch.no_grad():
    for imgs, labels in val_loader:
        batch_size = imgs.shape[0]
        outputs = model(imgs.view(batch_size, -1))
        _, predicted = torch.max(outputs, dim=1)
        total += labels.shape[0]
        correct += int((predicted == labels).sum())
print("Accuracy: %f" % (correct / total))
```

Reaching an accuracy of 96.44%.

Want better results? Try using a CNN model.


## What will we cover?

• Understand what Convolutional Neural Network (CNN) is
• The strength of CNN
• How to use it to detect handwriting
• Extract features from pictures
• Learn Convolution, Pooling and Flatten
• How to create a CNN

## Step 1: What is Computer Vision?

Computational methods for analyzing and understanding digital images.

An example could be detecting handwriting.

Assuming familiarity with Deep Neural Networks, a naive approach would be to map each pixel to an input node, add some hidden layers, and then detect the digit from the output layer.

If you are new to Artificial Neural Networks or Deep Neural Networks, see the earlier guides in this course.

But the network is not really interested in the specifics of individual pixels. Also, if an image is shifted one pixel to the left, every input changes and influences the network. Hence, this approach is not very good.

## Step 2: What is Image Convolution?

Image Convolution is applying a filter that adds each pixel value of an image to its neighbors, weighted according to a kernel matrix.

A few techniques are given here.
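The weighted-sum idea can be sketched directly in NumPy. Strictly speaking this computes cross-correlation, which is what CNN libraries actually compute; the image and kernel below are illustrative (the kernel is a simple vertical-edge detector).

```python
import numpy as np

# A tiny 4x4 "image": bright on the left, dark on the right
image = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [1, 1, 0, 0]], dtype=float)

# A 3x3 kernel that responds to vertical edges
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

def convolve(img, k):
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel is the weighted sum of a neighborhood
            out[i, j] = (img[i:i+kh, j:j+kw] * k).sum()
    return out

print(convolve(image, kernel))  # strong negative response at the edge
```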

### Pooling

• Reducing the size of an input by sampling from regions in the input
• Basically reducing the size of the image

### Max-Pooling

• Pooling by choosing the maximum value in each region
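A minimal NumPy sketch of 2×2 max-pooling (the values are made up for illustration):

```python
import numpy as np

image = np.array([[1, 3, 2, 4],
                  [5, 6, 1, 2],
                  [7, 2, 9, 1],
                  [3, 4, 1, 8]])

# Split the image into 2x2 blocks and keep the maximum of each block
pooled = image.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 4]
               #  [7 9]]
```

The 4×4 input becomes a 2×2 output, keeping only the strongest response in each region.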

## Step 3: What is Convolutional Neural Network (CNN)?

A Convolutional Neural Network (CNN) is a Neural Network that uses convolution for analyzing images.

### The idea of a CNN is as follows.

• We have an input image
• Apply convolution – possibly several kernels to get several features of the image (feature maps)
• Apply pooling (this reduces the input)
• Then flatten it out into a traditional dense network

## Step 4: Handwriting detection with CNN

We will use the MNIST database, which is a classic, large dataset of handwritten digits.

Here is the code with some comments.

```
import tensorflow as tf
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# https://en.wikipedia.org/wiki/MNIST_database
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Scale pixel values to 0 - 1
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode the labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Add a channel dimension: (samples, 28, 28, 1)
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)

# Creating a model (layer sizes are illustrative)
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10)
model.evaluate(x_test, y_test)
```

Which gives an accuracy of about 98%.

Want to compare your result with a model using PyTorch?


## What will we cover?

• Understand Deep Neural Network (DNN)
• How algorithms calculate weights in DNN
• Show tools to visually understand what DNN can solve

## Step 1: What is Deep Neural Network?

Be sure to read the Artificial Neural Network Guide.

The adjective “deep” in deep learning refers to the use of multiple layers in the network (Wiki).

Usually having two or more hidden layers counts as deep.

Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning.

## Step 2: How to train and difficulties in training DNN

Training an Artificial Neural Network only relies on finding weights from input to output nodes. In a Deep Neural Network (DNN) this becomes a bit more complex and requires more techniques.

To do that we need backpropagation, which is an algorithm for training Neural Networks with hidden layers (DNN).

• Algorithm:
  • Repeat:
    • Calculate the error for the output layer
    • For each layer – starting with the output layer:
      • Propagate the error back one layer
      • Update the weights
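The steps above can be sketched by hand in NumPy. This is a minimal, illustrative backpropagation loop (one hidden layer, sigmoid activations, XOR data); the layer sizes and learning rate are assumptions, not part of the course code.

```python
import numpy as np

# Minimal backpropagation sketch: one hidden layer learning XOR
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Calculate the error for the output layer
    d_out = (out - y) * out * (1 - out)
    # Propagate the error back one layer
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Update the weights
    W2 -= 0.5 * h.T @ d_out
    b2 -= 0.5 * d_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ d_h
    b1 -= 0.5 * d_h.sum(axis=0, keepdims=True)

print(out.round().ravel())
```

Whether it converges to a perfect XOR fit depends on the random initialization, which is exactly why frameworks like TensorFlow handle this loop for you.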

A problem you will encounter is overfitting, which means the model fits too closely to the training data and does not generalize well.

That is, you fit the model to the training data, but the model will not predict well on data not coming from your training data.

To deal with that, dropout is a common technique.

• Temporarily remove units – selected at random – from the network to prevent over-reliance on certain units
• Typical dropout values are 20%–50%
• Performance is often better when dropout is used on a larger network
• Dropout at each layer of the network has shown good results
• Original Paper
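The mechanism itself is simple enough to sketch by hand. This illustrates inverted dropout, the variant commonly applied at training time; the rate and array here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
activations = np.ones(10)
rate = 0.5  # fraction of units to drop

# Temporarily zero a random subset of units, and scale the rest
# so the expected value of the output stays the same
mask = rng.random(10) >= rate
dropped = activations * mask / (1 - rate)
print(dropped)  # kept units become 2.0, dropped units 0.0
```

Layers like `tensorflow.keras.layers.Dropout(0.5)` apply exactly this kind of mask during training and disable it at prediction time.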

## Step 3: Play around with it

Some ideas to check:

• If you have no hidden layers then you can only fit with straight lines.
• If you add hidden layers you can model the XOR function.

## Step 4: A DNN model of XOR

Let’s go crazy and fit an XOR dataset with a DNN model.

```
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# The original data file is not shown here; generate a noisy XOR dataset instead
rng = np.random.default_rng(42)
points = rng.uniform(-1, 1, (1000, 2))
data = pd.DataFrame(points, columns=['x', 'y'])
data['class id'] = ((data['x'] > 0) ^ (data['y'] > 0)).astype(int)

fig, ax = plt.subplots()
ax.scatter(x=data['x'], y=data['y'], c=data['class id'])
plt.show()
```

This is the data we want to fit.

Then let’s create the model.

Remember to insert the dropout and play around with it.

```
X_train, X_test, y_train, y_test = train_test_split(data[['x', 'y']], data['class id'], random_state=42)

accuracies = []
for i in range(5):
    tf.random.set_seed(i)

    # Layer sizes and dropout rate are illustrative - play around with them
    model = Sequential()
    model.add(Dense(8, input_dim=2, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)
    _, accuracy = model.evaluate(X_test, y_test)
    accuracies.append(accuracy*100)

print(sum(accuracies)/len(accuracies))
```

Resulting in an accuracy of about 98%.