## What will we cover?

• Understand Neural Networks
• How you can model other machine learning techniques
• Activation functions
• How to model a simple OR function
• Different ways to calculate weights
• What batch sizes and epochs are

## Step 1: What is an Artificial Neural Network?

Artificial Neural Networks are computing systems inspired by the biological neural networks that constitute animal brains.

They are often just called Neural Networks.

The simplest Neural Network connects input nodes directly to a single output node: the nodes on the left represent the input nodes, the node on the right is the output node, and w1 and w2 are the weights on the connections.

It can also be represented with a function: h(x1, x2) = w0 + w1*x1 + w2*x2

This is a simple calculation, and the goal of the network is to find optimal weights. But we are still missing something: an activation function, which tells us how to interpret the output.

Here are some possible activation functions.

• Step function: 𝑔(𝑥)=1 if 𝑥≥0, else 0
• Rectified linear unit (ReLU): 𝑔(𝑥)=max(0,𝑥)
• Sigmoid activation function: 𝑔(𝑥)=1/(1+exp(−𝑥))
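As a quick sketch (the function names are our own), all three activation functions can be written with NumPy:

```python
import numpy as np

def step(x):
    # Step function: 1 if x >= 0, else 0
    return np.where(x >= 0, 1, 0)

def relu(x):
    # Rectified linear unit: max(0, x)
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid: squashes any input into the open interval (0, 1)
    return 1 / (1 + np.exp(-x))

print(step(np.array([-2.0, 0.0, 3.0])))    # [0 1 1]
print(relu(np.array([-2.0, 0.0, 3.0])))    # [0. 0. 3.]
print(sigmoid(0.0))                        # 0.5
```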

## Step 2: How to model the OR function

To model the OR function, set the bias weight w0 = -1 and the weights w1 = w2 = 1. Then let’s analyse it with the activation function g given by the step function.

• x1 = 0 and x2=0 then we have g(-1 + x1 + x2) = g(-1 + 0 + 0) = g(-1) = 0
• x1 = 1 and x2=0 then we have g(-1 + x1 + x2) = g(-1 + 1 + 0) = g(0) = 1
• x1 = 0 and x2=1 then we have g(-1 + x1 + x2) = g(-1 + 0 + 1) = g(0) = 1
• x1 = 1 and x2=1 then we have g(-1 + x1 + x2) = g(-1 + 1 + 1) = g(1) = 1

Exactly like the OR function.
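The four cases above can be checked directly in code – a minimal sketch with the weights w0 = -1 and w1 = w2 = 1:

```python
def g(x):
    # Step activation: 1 if x >= 0, else 0
    return 1 if x >= 0 else 0

def or_network(x1, x2, w0=-1, w1=1, w2=1):
    # h(x1, x2) = w0 + w1*x1 + w2*x2, passed through the activation g
    return g(w0 + w1 * x1 + w2 * x2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, '->', or_network(x1, x2))  # matches the OR truth table
```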

## Step 3: Neural Network in the General Case and how to Calculate Weights

In general, a Neural Network can have any number of input and output nodes, where each input node is connected to each output node.

We will later learn about Deep Neural Networks – where we can have any number of layers – but for now, let’s focus only on Neural Networks with an input and an output layer.

To calculate the weights there are several options. The most common is Gradient Descent: an algorithm for minimizing the loss when training neural networks.

### Gradient Descent

Pseudo algorithm

• Start with a random choice of weights
• Repeat:
• Calculate the gradient based on all data points, in the direction that will lead to decreasing loss
• Update the weights according to the gradient

The drawback is that the gradient is expensive to calculate over all data points.

### Stochastic Gradient Descent

Pseudo algorithm

• Start with a random choice of weights
• Repeat:
• Calculate the gradient based on one data point, in the direction that will lead to decreasing loss
• Update the weights according to the gradient

### Mini-Batch Gradient Descent

Pseudo algorithm

• Start with a random choice of weights
• Repeat:
• Calculate the gradient based on one small batch of data points, in the direction that will lead to decreasing loss
• Update the weights according to the gradient
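The three variants differ only in how much data each gradient step uses. As a minimal sketch – fitting a toy linear model rather than a neural network – mini-batch gradient descent can look like this:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: y = 2*x + 1 plus a little noise (an illustrative example)
X = rng.uniform(-1, 1, size=200)
y = 2 * X + 1 + rng.normal(0, 0.05, size=200)

w, b = 0.0, 0.0             # starting weights (random in general; zero here)
lr, batch_size = 0.1, 16

for epoch in range(200):
    idx = rng.permutation(len(X))          # shuffle before each pass
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        xb, yb = X[batch], y[batch]
        pred = w * xb + b
        # Gradient of the mean squared error over this mini-batch
        grad_w = 2 * np.mean((pred - yb) * xb)
        grad_b = 2 * np.mean(pred - yb)
        # Update the weights in the direction that decreases the loss
        w -= lr * grad_w
        b -= lr * grad_b

print(w, b)  # w ≈ 2.0, b ≈ 1.0
```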

## Step 4: Perceptron

The perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class.

• Only capable of learning linearly separable decision boundary.
• It cannot model the XOR function – for that we need a multi-layer perceptron (a multi-layer Neural Network)
• It can take multiple inputs and map linearly to one output with an activation function.

Let’s try an example to show it.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt

# Two classes of 100 points each: class 0 shifted to (10, 10), class 1 around (0, 0)
data = np.random.randn(200, 3)
data[:100, :2] += (10, 10)
data[:100, 2] = 0
data[100:, 2] = 1

fig, ax = plt.subplots()
ax.scatter(x=data[:, 0], y=data[:, 1], c=data[:, 2])
plt.show()
```


This makes it simple to validate whether a Neural Network model can separate the two classes.

## Step 5: Creating a Neural Network

First let’s create a train and test set.

```python
X = data[:, :2]
y = data[:, 2]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
```


Then we need to create the model and set batch size and epochs.

• Batch size: a set of N samples.
• Epoch: an arbitrary cutoff, generally defined as “one pass over the entire dataset”.
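To make the two terms concrete: with the 150 training samples from the split above (75% of 200 points) and a batch size of 32, each epoch performs ceil(150/32) weight updates:

```python
import math

n_samples = 150     # training samples after the 75/25 split
batch_size = 32
epochs = 1000

steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)            # 5 weight updates per epoch
print(steps_per_epoch * epochs)   # 5000 updates in total
```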
```python
model = Sequential()
model.add(Input(shape=(2,)))
# A single output node with sigmoid activation – the perceptron from Step 4
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1000, batch_size=32, verbose=0)
model.evaluate(X_test, y_test)
```


This should give 1.000 (100%) accuracy.

The predictions can be visualized as follows.

```python
y_pred = model.predict(X)
y_pred = np.where(y_pred < .5, 0, 1)

fig, ax = plt.subplots()
ax.scatter(x=X[:, 0], y=X[:, 1], c=y_pred)
plt.show()
```


In the video we also show how to visualize the prediction in a different way.

This is part of a FREE 10h Machine Learning course with Python.

• 15 video lessons – which explain Machine Learning concepts, demonstrate models on real data, introduce projects and show a solution (YouTube playlist).
• 30 Jupyter Notebooks – with the full code and explanation from the lectures and projects (GitHub).
• 15 projects – with step guides to help you structure your solutions and solution explained in the end of video lessons (GitHub).

## What will we cover?

In this lesson we will learn about Unsupervised learning.

• Understand how Unsupervised Learning is different from Supervised Learning
• How it can organize data without prior labels
• Understand how k-Means Clustering works
• Train a 𝑘-Means Cluster model

## Step 1: What is Unsupervised Learning?

Machine Learning is often divided into 3 main categories.

• Supervised: where you tell the algorithm what categories each data item is in. Each data item from the training set is tagged with the right answer.
• Unsupervised: where the learning algorithm is not told what the data represents and must discover the structure itself.
• Reinforcement: teaches the machine to think for itself based on past action rewards.

Where we see that Unsupervised is one of the main groups.

Unsupervised learning is a type of algorithm that learns patterns from untagged data. The hope is that through mimicry, which is an important mode of learning in people, the machine is forced to build a compact internal representation of its world and then generate imaginative content from it. In contrast to supervised learning where data is tagged by an expert, e.g. as a “ball” or “fish”, unsupervised methods exhibit self-organization that captures patterns as probability densities…

https://en.wikipedia.org/wiki/Unsupervised_learning

## Step 2: k-Means Clustering

What is clustering?

Organize a set of objects into groups in such a way that similar objects tend to be in the same group.

What is k-Means Clustering?

Algorithm for clustering data based on repeatedly assigning points to clusters and updating those clusters’ centers. Example of how it works in steps.
• First we choose random cluster centroids (hollow points), then assign each point to the nearest centroid.
• Then we update each centroid to be the center of its assigned points.
• Repeat

This can be repeated a fixed number of times or until the centroids change only slightly between iterations.
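The steps above can be sketched from scratch with NumPy (a minimal illustration – scikit-learn’s KMeans, used in the next step, is the robust version):

```python
import numpy as np

def kmeans(data, k, iterations=10, seed=1):
    rng = np.random.default_rng(seed)
    # Step 1: pick k random data points as the initial centroids
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iterations):
        # Step 2: assign each point to its nearest centroid
        dists = np.linalg.norm(data[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points
        # (keep the old centroid if a cluster ends up empty)
        centroids = np.array([data[labels == i].mean(axis=0) if np.any(labels == i)
                              else centroids[i] for i in range(k)])
    return centroids, labels

# Two well-separated blobs around (0, 0) and (10, 10)
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(10, 0.5, (50, 2))])
centers, labels = kmeans(points, k=2)
print(np.sort(centers[:, 0]))  # first value ≈ 0, second ≈ 10
```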

## Step 3: Create an Example

Let’s create some random data to demonstrate it.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Generate some random points in 4 clusters
data = np.random.randn(400, 2)
data[:100] += 5, 5
data[100:200] += 10, 10
data[200:300] += 10, 5
data[300:] += 5, 10

fig, ax = plt.subplots()
ax.scatter(x=data[:, 0], y=data[:, 1])
plt.show()
```


This shows some random data in 4 clusters.

Then the following code demonstrates how it works. You can change max_iter to be the number of iterations – try it with 1, 2, 3, etc.

```python
model = KMeans(n_clusters=4, init='random', random_state=42, max_iter=10, n_init=1)
model.fit(data)
y_pred = model.predict(data)

fig, ax = plt.subplots()
ax.scatter(x=data[:, 0], y=data[:, 1], c=y_pred)
ax.scatter(x=model.cluster_centers_[:, 0], y=model.cluster_centers_[:, 1], c='r')
plt.show()
```

After the 1st iteration the cluster centers are not yet optimal; after 10 iterations everything is in place.


## What will we cover?

• Understand how Reinforcement Learning works
• Learn about Agent and Environment
• How it iterates and gets rewards based on action
• How to continuously learn new things
• Create own Reinforcement Learning from scratch

## Step 1: Reinforcement Learning simply explained

Reinforcement Learning is like training a dog. You and the dog speak different languages, which makes it difficult to explain to the dog what you want.

A common way to train a dog resembles Reinforcement Learning: when the dog does something good, it gets a reward. This teaches the dog what you want it to do.

Said differently, if we relate it to the illustration above: the Agent is the dog. The dog is exposed to an Environment called a state. Based on this, the Agent (the dog) takes an Action. Based on whether you (the owner) like the Action, you Reward the Agent.

The goal of the Agent is to get the most Reward. This makes it possible for you, the owner, to obtain the desired behaviour by adjusting the Reward according to the Actions.

## Step 2: Markov Decision Process

The model for decision-making represents States (from the Environment), Actions (from the Agent), and the Rewards.

Written a bit more mathematically:

• S is the set of States
• Actions(s) is the set of Actions when in state s
• The transition model is P(s′ | s, a)
• The Reward function is R(s, a, s′)
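These four pieces can be written down directly. The states, actions, and numbers below are a made-up example, not from the lesson:

```python
# A tiny hand-made MDP with two states and two actions.
transitions = {
    # The transition model P(s' | s, a): probability of landing in s'
    # after taking action a in state s
    ('cold', 'wait'): {'cold': 1.0},
    ('cold', 'move'): {'warm': 0.8, 'cold': 0.2},
    ('warm', 'wait'): {'warm': 1.0},
    ('warm', 'move'): {'cold': 1.0},
}

rewards = {
    # The reward function R(s, a, s'); unlisted triples give reward 0
    ('cold', 'move', 'warm'): 1.0,
    ('warm', 'move', 'cold'): -1.0,
}

def actions(s):
    # Actions(s): the set of actions available in state s (the same everywhere here)
    return ['wait', 'move']

print(transitions[('cold', 'move')])               # {'warm': 0.8, 'cold': 0.2}
print(rewards.get(('warm', 'move', 'cold'), 0.0))  # -1.0
```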

## Step 3: Q-Learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence “model-free”), and it can handle problems with stochastic transitions and rewards without requiring adaptations. (wiki)

This can be modeled by a learning function Q(s, a), which estimates the value of performing action a when in state s.

It works as follows

• Start with Q(s, a) = 0 for all s, a
• Update Q when we take an action

Q(s, a) = Q(s, a) + α(reward + γ·max_a′ Q(s′, a′) − Q(s, a)) = (1 − α)·Q(s, a) + α·(reward + γ·max_a′ Q(s′, a′))
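The update rule in isolation looks like this (a minimal sketch with a dict-based Q-table; the full example in the code step below uses a NumPy array instead):

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.5):
    # Q(s,a) = (1 - alpha) * Q(s,a) + alpha * (reward + gamma * max_a' Q(s',a'))
    best_next = max(q_table[next_state].values())
    q_table[state][action] = ((1 - alpha) * q_table[state][action]
                              + alpha * (reward + gamma * best_next))

# Two states, two actions, all values start at 0
q = {0: {'left': 0.0, 'right': 0.0}, 1: {'left': 0.0, 'right': 0.0}}
q_update(q, state=0, action='right', reward=1.0, next_state=1)
print(q[0]['right'])  # 0.5  (= (1 - 0.5)*0 + 0.5*(1 + 0.5*0))
```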

### The ϵ-Greedy Decision Making

The idea behind it is to either explore or exploit:

• With probability ϵ take a random move
• Otherwise, take the action a with maximum Q(s, a)

Let’s demonstrate it with code.

## Step 4: Code Example

Assume we have the following Environment

• You start at a random point.
• You can either move left or right.
• You lose if you hit a red box
• You win if you hit the green box

Quite simple, but how can you program an Agent using Reinforcement Learning? And how can you do it from scratch?

A great way is to use an object representing the field (environment).

To implement it all there are some background resources if needed.


```python
import numpy as np
import random


class Field:
    def __init__(self):
        # -1: red box (lose), 1: green box (win), 0: empty
        self.states = [-1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
        self.state = random.randrange(0, len(self.states))

    def done(self):
        return self.states[self.state] != 0

    # action: 0 => left
    # action: 1 => right
    def get_possible_actions(self):
        actions = [0, 1]
        if self.state == 0:
            actions.remove(0)
        if self.state == len(self.states) - 1:
            actions.remove(1)
        return actions

    def update_next_state(self, action):
        if action == 0:
            if self.state == 0:
                return self.state, -10
            self.state -= 1
        if action == 1:
            if self.state == len(self.states) - 1:
                return self.state, -10
            self.state += 1

        reward = self.states[self.state]
        return self.state, reward


field = Field()
q_table = np.zeros((len(field.states), 2))

alpha = .5
epsilon = .5
gamma = .5

for _ in range(10000):
    field = Field()
    while not field.done():
        actions = field.get_possible_actions()
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit
        if random.uniform(0, 1) < epsilon:
            action = random.choice(actions)
        else:
            action = np.argmax(q_table[field.state])

        cur_state = field.state
        next_state, reward = field.update_next_state(action)

        # Q-learning update rule
        q_table[cur_state, action] = (1 - alpha)*q_table[cur_state, action] + alpha*(reward + gamma*np.max(q_table[next_state]))
```
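Once training finishes, the learned behaviour can be read directly out of the Q-table: in each state, the greedy policy picks the action with the highest Q-value. A self-contained sketch with made-up values (run np.argmax on the trained q_table above instead):

```python
import numpy as np

# A hypothetical trained Q-table for a 4-state field (the values are made up)
q_table = np.array([
    [0.0, 1.0],   # state 0: going right scores higher
    [0.2, 0.9],   # state 1: going right scores higher
    [0.8, 0.1],   # state 2: going left scores higher
    [1.0, 0.0],   # state 3: going left scores higher
])

# Greedy policy per state: 0 = left, 1 = right
policy = np.argmax(q_table, axis=1)
print(policy)  # [1 1 0 0]
```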


## Step 5: A more complex Example

Check out the video to see a more complex example.