What will we cover in this tutorial?
How do you detect movements in a webcam stream? Also, how do you insert objects in a live webcam stream? Further, how do you change the position of the object based on the movements?
We will learn all that in this tutorial. The end result can be seen in the video below.
Step 1: Understand the flow of webcam processing
A webcam stream is processed frame-by-frame.

As the above illustration shows, when the webcam captures the next frame, the actual processing often happens on a copy of the original frame. When all the updates and calculations are done, they are inserted in the original frame.
To extract information from a webcam frame, we need to work on the frame and find the features we are looking for.
In our example, we need to find movement and, based on that, check whether the movement touches our object.
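The copy-then-insert flow described above can be tried without a webcam. The following is only a minimal sketch: process_frame is a hypothetical helper, the "analysis" (finding the brightest pixel) is a placeholder for real feature detection, and a synthetic NumPy array stands in for a webcam frame.

```python
import numpy as np

def process_frame(frame):
    """Analyse a copy of the frame, then draw the result on the original."""
    # Work on a copy so the analysis cannot alter the pixels we display
    work = frame.copy()
    # Placeholder analysis step: reduce the copy to a single gray channel
    gray = work.mean(axis=2)
    # Find the brightest pixel in the gray copy
    y, x = np.unravel_index(np.argmax(gray), gray.shape)
    # Insert the result in the original frame: mark the spot in red (BGR order)
    frame[y, x] = (0, 0, 255)
    return int(x), int(y)

# A synthetic 480 x 640 black frame with one white pixel stands in for a webcam frame
frame = np.zeros((480, 640, 3), dtype=np.uint8)
frame[100, 200] = (255, 255, 255)

print(process_frame(frame))
```

In the webcam loop, the same pattern applies: calculations run on a copy (or on a derived image such as a mask), and only the final result is drawn on the frame that is shown.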
A simple flow without any processing would look like this.
import cv2

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# Request 640 x 480 (if your webcam does not support it, another resolution will be used)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    ret, frame = cap.read()
    # Stop if no frame could be read
    if not ret:
        break
    # Flip the frame horizontally (mirror view)
    frame = cv2.flip(frame, 1)
    # Show the frame in a window
    cv2.imshow('WebCam', frame)
    # Check if q has been pressed to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
The above code will create a direct stream from your webcam to a window.
Step 2: Insert a logo – do it with a class that we will extend later
Here we want to insert a logo at a fixed position in our webcam stream. This can be achieved by the following code. The main difference is the new class Object, which is defined and then instantiated.
The object briefly explained
- The object represents the logo we want to insert.
- It keeps the current position (which is static so far).
- It keeps the logo itself.
- It keeps the mask used to insert the logo later (when insert_object is called).
- The constructor (__init__(…)) does the work that is only needed once: it reads the logo (it assumes you have a file named logo.png in the same folder), resizes it, creates a mask (by gray scaling and thresholding), and sets the initial position of the logo.
Before the while-loop the object obj is created. All that is needed at this stage is to insert the logo in each frame.
import cv2
import numpy as np

# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        # Read the logo (expects logo.png in the same folder)
        self.logo_org = cv2.imread('logo.png')
        if self.logo_org is None:
            raise FileNotFoundError('Could not read logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        # Create a mask: every pixel that is not (almost) black belongs to the logo
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        # Initial position (top-left corner of the logo)
        self.x = start_x
        self.y = start_y

    def insert_object(self, frame):
        # roi is a view into frame, so changing roi changes frame
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        # Clear the pixels covered by the logo, then add the logo
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# Request 640 x 480 (if your webcam does not support it, another resolution will be used)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# This will create an object
obj = Object()

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    ret, frame = cap.read()
    # Stop if no frame could be read
    if not ret:
        break
    # Flip the frame horizontally (mirror view)
    frame = cv2.flip(frame, 1)
    # Insert the object into the frame
    obj.insert_object(frame)
    # Show the frame in a window
    cv2.imshow('WebCam', frame)
    # Check if q has been pressed to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
This will result in the following output (here with me in front of the webcam; when you run it, expect to see yourself in the picture, not me).

For more details on how to insert a logo in a live webcam stream, you can read this tutorial.
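The masking step in insert_object can be explored on small synthetic arrays. This is a sketch under assumptions: the 4 x 4 logo and the gray frame are made up for illustration, and the mask is built with a simple "not pure black" test that stands in for the gray scaling and thresholding done in the constructor.

```python
import numpy as np

# Synthetic 4 x 4 "logo": black background with a 2 x 2 blue square (BGR order)
logo = np.zeros((4, 4, 3), dtype=np.uint8)
logo[1:3, 1:3] = (255, 0, 0)

# Mask: every pixel that is not pure black belongs to the logo.
# This stands in for the grayscale + threshold step in __init__.
logo_mask = logo.max(axis=2) > 0

# Synthetic gray frame standing in for a webcam frame
frame = np.full((10, 10, 3), 128, dtype=np.uint8)

# roi is a view into frame, so changing roi changes frame
x, y, size = 3, 2, 4
roi = frame[y:y + size, x:x + size]
roi[logo_mask] = 0  # clear the pixels the logo will cover
roi += logo         # covered pixels are 0 and the logo is 0 elsewhere, so no overflow

print(frame[3, 4], frame[2, 3])
```

The first printed pixel lies inside the blue square and takes the logo color. The second lies in the black border of the logo, so the frame shows through unchanged. This is why a logo on a black background appears without a visible box around it.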
Step 3: Detect movement in the frame
Detecting movement is not a simple task, but depending on your needs, it can be solved quite simply. In this tutorial we only need to detect simple movement. That is, if you are in the frame and sit still, we do not want to detect you. We only want to detect actual movement.
We can solve that problem by using the library function createBackgroundSubtractorMOG2(), which can “remove” the background from your frame. It is far from a perfect solution, but it is sufficient for what we want to achieve.
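To get a feeling for what "removing" the background means, here is a deliberately simplified stand-in. It is not MOG2 (which models each pixel with a mixture of Gaussians) but a plain running-average background model written in NumPy. The function name foreground_mask and the parameters alpha and threshold are assumptions made for this illustration.

```python
import numpy as np

def foreground_mask(frame_gray, background, alpha=0.05, threshold=30):
    """Return (mask, updated background) for one grayscale frame."""
    # Pixels that differ clearly from the background model count as movement
    diff = np.abs(frame_gray.astype(np.float32) - background)
    mask = np.where(diff > threshold, 255, 0).astype(np.uint8)
    # Slowly blend the new frame into the background model, so an object
    # that stops moving fades into the background over time
    background = (1 - alpha) * background + alpha * frame_gray
    return mask, background

# Synthetic scene: a static gray background
background = np.full((6, 6), 100, dtype=np.float32)

# A bright 2 x 2 "hand" appears in the next frame
frame_gray = np.full((6, 6), 100, dtype=np.uint8)
frame_gray[2:4, 2:4] = 220

mask, background = foreground_mask(frame_gray, background)
print(np.count_nonzero(mask))
```

The idea carries over to the library function: the subtractor keeps a model of what each pixel normally looks like and reports the pixels that deviate from it. If you sit still long enough, you become part of the model and disappear from the mask.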
As we only want to know whether there is movement, and not how much a pixel differs from the previously detected background, we will use a threshold function to turn the mask into a pure black-and-white image. We set the threshold quite high, as this also removes noise from the image.
Depending on your setup (lighting etc.), you might need to adjust that value. See the comments in the code for how to do that.
import cv2
import numpy as np

# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        # Read the logo (expects logo.png in the same folder)
        self.logo_org = cv2.imread('logo.png')
        if self.logo_org is None:
            raise FileNotFoundError('Could not read logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        # Create a mask: every pixel that is not (almost) black belongs to the logo
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        # Initial position (top-left corner of the logo)
        self.x = start_x
        self.y = start_y

    def insert_object(self, frame):
        # roi is a view into frame, so changing roi changes frame
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        # Clear the pixels covered by the logo, then add the logo
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# Request 640 x 480 (if your webcam does not support it, another resolution will be used)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To detect movement (to get the background)
background_subtractor = cv2.createBackgroundSubtractorMOG2()

# This will create an object
obj = Object()

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    ret, frame = cap.read()
    # Stop if no frame could be read
    if not ret:
        break
    # Flip the frame horizontally (mirror view)
    frame = cv2.flip(frame, 1)
    # Get the foreground mask (it is gray scale)
    fg_mask = background_subtractor.apply(frame)
    # Convert the gray scale to black and white with a threshold
    # Change the 250 threshold to fit your webcam and needs
    # - Setting it lower will make it more sensitive (also to noise)
    _, fg_mask = cv2.threshold(fg_mask, 250, 255, cv2.THRESH_BINARY)
    # Insert the object into the frame
    obj.insert_object(frame)
    # Show the frame in a window
    cv2.imshow('WebCam', frame)
    # To see the foreground mask
    cv2.imshow('fg_mask', fg_mask)
    # Check if q has been pressed to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
This results in the following output.

As you can see, it does a decent job of detecting movement. Sometimes your movements leave a shadow behind them. Hence, it is not perfect.
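The effect of the threshold value can be checked on a tiny synthetic mask. The sketch assumes the subtractor is created with its default settings, where shadow detection is on and shadow pixels are marked with the gray value 127 while foreground pixels are 255. binary_threshold is a hypothetical NumPy helper that mimics cv2.threshold with cv2.THRESH_BINARY (values above the threshold become the maximum value, everything else becomes 0).

```python
import numpy as np

# Synthetic foreground mask with the three values assumed above:
# 0 = background, 127 = shadow, 255 = foreground
fg_mask = np.array([[0, 127, 255],
                    [127, 255, 0]], dtype=np.uint8)

def binary_threshold(mask, thresh, maxval=255):
    """NumPy stand-in for cv2.threshold(mask, thresh, maxval, cv2.THRESH_BINARY)."""
    return np.where(mask > thresh, maxval, 0).astype(np.uint8)

high = binary_threshold(fg_mask, 250)  # shadow pixels are dropped
low = binary_threshold(fg_mask, 100)   # shadow pixels count as movement

print(high.tolist())
print(low.tolist())
```

With the high threshold only the pixels the subtractor is sure about survive. Lowering it makes the detection more sensitive, at the price of picking up shadows and noise.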
Step 4: Detecting movement where the object is and move it accordingly
This is the tricky part, but let's break it down.
- We need to detect whether the mask we created in the previous step overlaps with the object (logo).
- If so, we want to move the object (logo).
That is what we want to achieve.
How do we do that?
- Detect whether there is an overlap by using the same mask we created for the logo and checking whether it overlaps with any points in the movement mask.
- If so, we move the object by trying a random movement and measuring how much overlap remains. Then we try another random movement and see if the overlap is smaller.
- We continue this a few times and choose the random movement with the least overlap.
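The steps above can be sketched as two standalone functions that run on synthetic masks. The names overlap and pick_move and the parameters tries and max_step are hypothetical. One deliberate difference from the class in this tutorial: staying put is also treated as a candidate, so the returned position is never worse than the current one.

```python
import numpy as np

def overlap(mask, logo_mask, x, y):
    """Count movement pixels under the logo when its top-left corner is (x, y)."""
    size = logo_mask.shape[0]
    roi = mask[y:y + size, x:x + size]
    return int(np.count_nonzero(roi[logo_mask > 0]))

def pick_move(mask, logo_mask, x, y, rng, tries=8, max_step=15):
    """Try a few random moves and return the (dx, dy) with the least overlap."""
    height, width = mask.shape
    size = logo_mask.shape[0]
    # Start with "do not move" as the best known candidate
    best, best_fit = (0, 0), overlap(mask, logo_mask, x, y)
    for _ in range(tries):
        dx, dy = (int(v) for v in rng.integers(-max_step, max_step + 1, size=2))
        # Skip candidates that would leave the frame
        if not (0 <= x + dx <= width - size and 0 <= y + dy <= height - size):
            continue
        fit = overlap(mask, logo_mask, x + dx, y + dy)
        if fit < best_fit:
            best, best_fit = (dx, dy), fit
    return best, best_fit

# Synthetic movement mask: a 20 x 20 block of "movement" in a 100 x 100 frame
mask = np.zeros((100, 100), dtype=np.uint8)
mask[40:60, 40:60] = 255
# A fully opaque 10 x 10 logo sitting in the middle of the movement
logo_mask = np.full((10, 10), 255, dtype=np.uint8)

rng = np.random.default_rng(0)
(dx, dy), fit = pick_move(mask, logo_mask, 45, 45, rng)
print(dx, dy, fit)
```

Called once per frame, this nudges the logo step by step out of the moving area, without ever computing a direction explicitly.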
By chance, this tends to move the object away from the overlapping areas. This is the power of introducing some randomness, which simplifies the algorithm a lot.
A more precise approach would be to calculate the direction in which the least movement is close to the object (logo). This becomes quite complicated and needs a lot of calculations. Hence, we chose this simple approach, which has both a speed element and a direction element and works fairly well.
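For comparison, here is one possible reading of such a directional approach. It is not the method used in this tutorial and only a rough sketch: the hypothetical helper step_away looks at the movement in a window around the logo, computes its centroid, and returns a step pointing from that centroid towards the logo center. The parameters margin and step are assumptions.

```python
import numpy as np

def step_away(mask, x, y, size, margin=15, step=10):
    """Return (dx, dy) pointing away from the movement around the logo."""
    height, width = mask.shape
    # Window around the logo, clipped to the frame
    x0, y0 = max(x - margin, 0), max(y - margin, 0)
    x1, y1 = min(x + size + margin, width), min(y + size + margin, height)
    ys, xs = np.nonzero(mask[y0:y1, x0:x1])
    if len(xs) == 0:
        return 0, 0  # no movement nearby, stay put
    # Vector from the centroid of the movement to the center of the logo
    vx = (x + (size - 1) / 2) - (x0 + xs.mean())
    vy = (y + (size - 1) / 2) - (y0 + ys.mean())
    norm = np.hypot(vx, vy)
    if norm == 0:
        return 0, 0  # movement is centered on the logo, no preferred direction
    return int(round(step * vx / norm)), int(round(step * vy / norm))

# Synthetic movement mask: movement directly to the left of a logo at (50, 50)
mask = np.zeros((100, 100), dtype=np.uint8)
mask[50:60, 20:40] = 255

print(step_away(mask, 50, 50, 10))
```

Here the movement is to the left of the logo, so the step points to the right. The sketch ignores the frame borders and the shape of the logo, and it gives no direction when the movement surrounds the logo evenly. Handling such cases is part of what makes the precise approach more involved than the random one.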
All we need to do is add an update_position function to our class and call it before we insert the logo.
import cv2
import numpy as np

# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        # Read the logo (expects logo.png in the same folder)
        self.logo_org = cv2.imread('logo.png')
        if self.logo_org is None:
            raise FileNotFoundError('Could not read logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        # Create a mask: every pixel that is not (almost) black belongs to the logo
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        # Initial position (top-left corner of the logo)
        self.x = start_x
        self.y = start_y
        # Not used by the code in this tutorial
        self.on_mask = False

    def insert_object(self, frame):
        # roi is a view into frame, so changing roi changes frame
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        # Clear the pixels covered by the logo, then add the logo
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

    def update_position(self, mask):
        height, width = mask.shape
        # Check if object is overlapping with moving parts
        roi = mask[self.y:self.y + self.size, self.x:self.x + self.size]
        is_touched = np.any(roi[np.where(self.logo_mask)])
        # If there is no movement on the object, keep the position
        if not is_touched:
            return
        # To save the best possible movement
        best_delta_x = 0
        best_delta_y = 0
        best_fit = np.inf
        # Try 8 different positions
        for _ in range(8):
            # Pick a random movement of -15 to 15 pixels (upper bound is exclusive)
            delta_x = np.random.randint(-15, 16)
            delta_y = np.random.randint(-15, 16)
            # Ensure we are inside the frame, if outside, skip and continue
            if self.y + self.size + delta_y > height or self.y + delta_y < 0 or \
                    self.x + self.size + delta_x > width or self.x + delta_x < 0:
                continue
            # Calculate how much overlap
            roi = mask[self.y + delta_y:self.y + delta_y + self.size,
                       self.x + delta_x:self.x + delta_x + self.size]
            overlap = np.count_nonzero(roi[np.where(self.logo_mask)])
            # If perfect fit (no overlap), move there and stop searching
            if overlap == 0:
                self.x += delta_x
                self.y += delta_y
                return
            # If a better fit found, save it
            elif overlap < best_fit:
                best_fit = overlap
                best_delta_x = delta_x
                best_delta_y = delta_y
        # After for-loop, update to best fit (if any found)
        if best_fit < np.inf:
            self.x += best_delta_x
            self.y += best_delta_y

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# Request 640 x 480 (if your webcam does not support it, another resolution will be used)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To detect movement (to get the background)
background_subtractor = cv2.createBackgroundSubtractorMOG2()

# This will create an object
obj = Object()

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    ret, frame = cap.read()
    # Stop if no frame could be read
    if not ret:
        break
    # Flip the frame horizontally (mirror view)
    frame = cv2.flip(frame, 1)
    # Get the foreground mask (it is gray scale)
    fg_mask = background_subtractor.apply(frame)
    # Convert the gray scale to black and white with a threshold
    # Change the 250 threshold to fit your webcam and needs
    # - Setting it lower will make it more sensitive (also to noise)
    _, fg_mask = cv2.threshold(fg_mask, 250, 255, cv2.THRESH_BINARY)
    # Find a new position for object (logo)
    # - fg_mask contains all moving parts
    # - updated position will be the one with least moving parts
    obj.update_position(fg_mask)
    # Insert the object into the frame
    obj.insert_object(frame)
    # Show the frame in a window
    cv2.imshow('WebCam', frame)
    # To see the fg_mask uncomment the line below
    # cv2.imshow('fg_mask', fg_mask)
    # Check if q has been pressed to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
Step 5: Test it
Well, this is the fun part. See a live demo in the video below.
What is the next step?
I would be happy to hear any suggestions from you. I see a lot of potential improvements, but the conceptual idea is explained and shown in this tutorial.
