Reinforcement Learning offers several advantages and opens up exciting opportunities in the field of machine learning:
In this lesson, you will explore the foundations and practical aspects of Reinforcement Learning. The following topics will be covered:
By the end of this lesson, you will have a strong foundation in Reinforcement Learning and be equipped to apply these techniques to solve complex problems, paving the way for exciting opportunities in the field of machine learning.
Reinforcement Learning is like training a dog. You and the dog speak different languages, which makes it difficult to explain to the dog what you want.
A common way to train a dog works just like Reinforcement Learning. When the dog does something good, it gets a reward. This teaches the dog that you want it to repeat that behaviour.
Put differently, relating this to the illustration above: the Agent is the dog. The dog observes the Environment, which is described by a State. Based on this State, the Agent (the dog) takes an Action. Depending on whether you (the owner) like the Action, you Reward the Agent.
The goal of the Agent is to collect the most Reward. This lets you, the owner, get the desired behaviour by adjusting the Reward according to the Actions.
The model for decision-making represents States (from the Environment), Actions (from the Agent), and the Rewards.
Written a bit more mathematically.
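One common way to write this down is the notation of a Markov decision process. The symbols below are an assumption, chosen to match the Q(s, a) used later in this lesson, and are meant as a sketch rather than a full definition:

```latex
\begin{aligned}
s &\in S && \text{a State of the Environment} \\
a &\in A(s) && \text{an Action available to the Agent in state } s \\
r &= R(s, a, s') && \text{the Reward for taking } a \text{ in } s \text{ and ending up in } s'
\end{aligned}
```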
Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence “model-free”), and it can handle problems with stochastic transitions and rewards without requiring adaptations. (wiki)
This can be modeled by a learning function Q(s, a), which estimates the value of performing action a when in state s.
It works as follows:

$$Q(s, a) = Q(s, a) + \alpha \left( \text{reward} + \gamma \max_{a'} Q(s', a') - Q(s, a) \right) = (1 - \alpha)\, Q(s, a) + \alpha \left( \text{reward} + \gamma \max_{a'} Q(s', a') \right)$$
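To make the update rule concrete, here is a small numeric sketch. The function names `q_update` and `q_update_weighted` and the sample numbers are illustrative assumptions, not part of the lesson's code. It only checks that the two forms of the rule give the same value.

```python
def q_update(q_sa, reward, max_q_next, alpha, gamma):
    """Return the updated Q(s, a) using the incremental form of the rule."""
    return q_sa + alpha * (reward + gamma * max_q_next - q_sa)


def q_update_weighted(q_sa, reward, max_q_next, alpha, gamma):
    """Same update written as a weighted average of old value and new target."""
    return (1 - alpha) * q_sa + alpha * (reward + gamma * max_q_next)


# Hypothetical numbers: old estimate 0.2, reward 1, best next-state value 0.4.
new_value = q_update(0.2, 1.0, 0.4, alpha=0.5, gamma=0.5)
same_value = q_update_weighted(0.2, 1.0, 0.4, alpha=0.5, gamma=0.5)
print(new_value, same_value)  # both are approximately 0.7
```

With alpha = 0.5 the new estimate lands halfway between the old value (0.2) and the target (1 + 0.5 * 0.4 = 1.2).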
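The idea behind it is to either explore or exploit: with some probability the Agent explores by trying a random Action, and otherwise it exploits what it has learned by picking the Action with the highest Q-value.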
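The explore-or-exploit choice is often implemented as an epsilon-greedy rule. The sketch below assumes a hypothetical helper `choose_action` and made-up Q-values. Unlike a plain argmax over all actions, it restricts the exploit step to the allowed actions, which is a design choice of this sketch.

```python
import random


def choose_action(q_row, possible_actions, epsilon, rng=random):
    """Pick an action with an epsilon-greedy rule.

    q_row: Q-values for the current state, one entry per action.
    possible_actions: action indices allowed in the current state.
    """
    if rng.random() < epsilon:
        # Explore: try a random allowed action.
        return rng.choice(possible_actions)
    # Exploit: take the allowed action with the highest Q-value.
    return max(possible_actions, key=lambda a: q_row[a])


q_row = [0.1, 0.8]  # hypothetical Q-values for actions 0 and 1
print(choose_action(q_row, [0, 1], epsilon=0.0))  # epsilon 0 always exploits, prints 1
```

A larger `epsilon` means more random exploration; a smaller one means the Agent relies more on what it has already learned.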
Let’s demonstrate it with code.
Assume we have the following Environment.
Quite simple, but how can you program an Agent using Reinforcement Learning? And how can you do it from scratch?
A good way is to use an object representing the field (environment).
To implement it all, there are some background resources if needed.
```python
import numpy as np
import random


class Field:
    def __init__(self):
        # Non-zero entries are terminal fields and double as the reward (-1 or 1)
        self.states = [-1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
        # Start at a random position
        self.state = random.randrange(0, len(self.states))

    def done(self):
        # The episode ends when the Agent stands on a non-zero field
        return self.states[self.state] != 0

    # action: 0 => left
    # action: 1 => right
    def get_possible_actions(self):
        actions = [0, 1]
        if self.state == 0:
            actions.remove(0)
        if self.state == len(self.states) - 1:
            actions.remove(1)
        return actions

    def update_next_state(self, action):
        if action == 0:
            if self.state == 0:
                # Trying to leave the field on the left: stay put, penalty
                return self.state, -10
            self.state -= 1
        if action == 1:
            if self.state == len(self.states) - 1:
                # Trying to leave the field on the right: stay put, penalty
                return self.state, -10
            self.state += 1
        reward = self.states[self.state]
        return self.state, reward
```
```python
field = Field()
# One row per state, one column per action (0 => left, 1 => right)
q_table = np.zeros((len(field.states), 2))

alpha = .5    # learning rate
epsilon = .5  # probability of exploring
gamma = .5    # discount factor

for _ in range(10000):
    field = Field()
    while not field.done():
        actions = field.get_possible_actions()
        if random.uniform(0, 1) < epsilon:
            # Explore: random allowed action
            action = random.choice(actions)
        else:
            # Exploit: best known action for the current state
            action = np.argmax(q_table[field.state])

        cur_state = field.state
        next_state, reward = field.update_next_state(action)
        # Q-learning update
        q_table[cur_state, action] = (1 - alpha)*q_table[cur_state, action] + alpha*(reward + gamma*np.max(q_table[next_state]))
```
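Once training has finished, the learned behaviour can be read off the Q-table by taking the best action in each row. The helper `greedy_policy` and the small example table below are hypothetical and only meant as a sketch of that idea.

```python
import numpy as np


def greedy_policy(q_table):
    """Return the best-known action name for every state (row) of a Q-table."""
    action_names = ["left", "right"]  # matches action 0 and action 1
    return [action_names[int(np.argmax(row))] for row in q_table]


# Hypothetical 3-state Q-table, only for illustration.
example_q = np.array([[0.0, 0.5],
                      [0.2, 0.1],
                      [0.0, 0.0]])
print(greedy_policy(example_q))  # ['right', 'left', 'left']
```

Calling `greedy_policy(q_table)` on the trained table would list one direction per field. Rows that were never updated, such as those of the terminal fields, stay at zero, and because `np.argmax` returns the first index on ties they simply report "left".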
Check out the video to see a more complex example.
In the next lesson you will learn Unsupervised Learning with k-Means Clustering.
This is part of a FREE 10h Machine Learning course with Python.