What will we cover in this tutorial?
How do you detect movement in a webcam stream? How do you insert an object into a live webcam stream? And how do you change the position of that object based on the detected movement?
We will learn all that in this tutorial. The end result can be seen in the video below.
Step 1: Understand the flow of webcam processing
A webcam stream is processed frame-by-frame.

As the illustration above shows, when the webcam captures the next frame, the actual processing often happens on a copy of the original frame. When all the updates and calculations are done, the results are drawn back onto the original frame.
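To make that pattern concrete, here is a minimal sketch (not the tutorial's final code): the analysis runs on a copy of the frame, and only the result – here just a made-up brightness text overlay – is drawn back onto the original frame.

import cv2

cap = cv2.VideoCapture(0)

while True:
    _, frame = cap.read()
    frame = cv2.flip(frame, 1)

    # Do the analysis on a copy, so it does not disturb the original frame
    work = frame.copy()
    gray = cv2.cvtColor(work, cv2.COLOR_BGR2GRAY)

    # Draw only the result of the analysis back onto the original frame
    text = f"mean brightness: {gray.mean():.0f}"
    cv2.putText(frame, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    cv2.imshow('WebCam', frame)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()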
To extract information from the webcam frame, we need to process the frame and find the features we are looking for.
In our example, we need to detect movement and, based on that, see whether the movement is touching our object.
A simple flow without any processing would look like this.
import cv2

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)

# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()

    # Flip the frame
    frame = cv2.flip(frame, 1)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
The above code will create a direct stream from your webcam to a window.
Step 2: Insert a logo – do it with a class that we will extend later
Here we want to insert a logo at a fixed position in our webcam stream. This can be achieved by the following code. The main difference is the new class Object, which is defined and then instantiated as obj.
The object briefly explained
- The object represents the logo we want to insert.
- It keeps the current position (which is static so far).
- It holds the logo itself.
- It holds the mask used to insert the logo later (when insert_object is called) – a small standalone sketch of this masking trick follows this list.
- The constructor (__init__(…)) does the work that is only needed once: it reads the logo (it assumes you have a file named logo.png in the same folder), resizes it, creates a mask (by gray scaling and thresholding), and sets the initial position of the logo.
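To see why the mask trick used in insert_object works, here is a small standalone sketch on tiny, made-up NumPy arrays (a single gray channel instead of a color frame): the masked pixels in the region of interest are set to zero first, so adding the logo afterwards replaces them instead of blending with the background.

import numpy as np

# A 4 x 4 "frame" region of interest and a 4 x 4 "logo" (values made up)
roi = np.full((4, 4), 200, dtype=np.uint8)   # bright background pixels
logo = np.zeros((4, 4), dtype=np.uint8)      # black logo background
logo[1:3, 1:3] = 90                          # the visible part of the logo

# The mask is white (255) where the logo has content, black (0) elsewhere
mask = np.where(logo > 1, 255, 0).astype(np.uint8)

# Black out the pixels where the logo goes, then add the logo on top
roi[np.where(mask)] = 0
roi += logo

print(roi)
# The center pixels now show the logo (90), the rest keep the frame (200)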
Before the while-loop the object obj is created. All that is needed at this stage is to insert the logo in each frame.
import cv2
import numpy as np

# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        # Read the logo (expects logo.png in the same folder) and resize it
        self.logo_org = cv2.imread('logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        # Create a mask of the logo by gray scaling and thresholding
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        # Initial position of the logo
        self.x = start_x
        self.y = start_y

    def insert_object(self, frame):
        # Black out the masked pixels in the region of interest and add the logo
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)

# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# This will create an object
obj = Object()

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()

    # Flip the frame
    frame = cv2.flip(frame, 1)

    # Insert the object into the frame
    obj.insert_object(frame)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
This results in the following output (with me in front of the webcam – if you run it yourself, expect to see yourself in the window instead of me, just to avoid any uncomfortable surprises).




For more details on how to insert a logo in a live webcam stream, you can read this tutorial.
Step 3: Detect movement in the frame
Detecting movement is not a simple task, but depending on your needs, it can be solved quite simply. In this tutorial we only need to detect simple movement: if you are in the frame and sit still, we do not care; we only want to detect actual movement.
We can solve that problem by using the library function cv2.createBackgroundSubtractorMOG2(), which can “remove” the background from the frame. It is far from a perfect solution, but it is sufficient for what we want to achieve.
As we only want to know whether there is movement, and not how much a pixel differs from the detected background, we apply a threshold to turn the mask into a black-and-white image. We set the threshold quite high, as that also removes noise from the image.
It might happen that you need to adjust that value for your setup (lighting etc.). See the comments in the code for how to do that.
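If you want to find a good threshold for your own lighting, one way (not part of the tutorial's code, just a sketch using OpenCV's trackbars) is to attach a slider to the mask window and adjust the value while the stream is running:

import cv2

cap = cv2.VideoCapture(0)
background_subtractor = cv2.createBackgroundSubtractorMOG2()

# A window with a slider to tune the threshold live
cv2.namedWindow('fg_mask')
cv2.createTrackbar('threshold', 'fg_mask', 250, 255, lambda value: None)

while True:
    _, frame = cap.read()
    frame = cv2.flip(frame, 1)

    # Foreground mask, thresholded with the current slider value
    fg_mask = background_subtractor.apply(frame)
    thresh = cv2.getTrackbarPos('threshold', 'fg_mask')
    _, fg_mask = cv2.threshold(fg_mask, thresh, 255, cv2.THRESH_BINARY)

    cv2.imshow('fg_mask', fg_mask)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()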
import cv2
import numpy as np

# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        self.logo_org = cv2.imread('logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        self.x = start_x
        self.y = start_y

    def insert_object(self, frame):
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)

# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To detect movement (to get the background)
background_subtractor = cv2.createBackgroundSubtractorMOG2()

# This will create an object
obj = Object()

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()

    # Flip the frame
    frame = cv2.flip(frame, 1)

    # Get the foreground mask (it is gray scale)
    fg_mask = background_subtractor.apply(frame)

    # Convert the gray scale to black and white with a threshold
    # Change the 250 threshold to fit your webcam and needs
    # - Setting it lower will make it more sensitive (also to noise)
    _, fg_mask = cv2.threshold(fg_mask, 250, 255, cv2.THRESH_BINARY)

    # Insert the object into the frame
    obj.insert_object(frame)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)

    # To see the foreground mask
    cv2.imshow('fg_mask', fg_mask)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
This results in the following output.




As you can see, it does a decent job of detecting movement. Sometimes your movements leave a shadow-like trail behind, so it is not perfect.
Step 4: Detecting movement where the object is and move it accordingly
This is the tricky part, but let’s break it down into simple steps.
- We need to detect whether the mask we created in the previous step overlaps with the object (logo).
- If so, we want to move the object (logo).
That is what we want to achieve.
How do we do that?
- Detect whether there is an overlap by taking the mask we created for the logo and checking whether it overlaps with any points in the movement mask.
- If so, move the object by picking a random movement and measuring how much overlap remains. Then pick another random movement and check whether the overlap is smaller.
- Repeat this a few times and choose the random movement with the least overlap (a small sketch of this search follows the list).
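Stripped of the webcam details, the search looks roughly like this. It is only a sketch: overlap_at is a hypothetical scoring function standing in for the mask comparison, which the real update_position method below does inline.

import numpy as np

def least_overlap_move(overlap_at, x, y, tries=8, max_step=15):
    """Try a few random offsets and return the one with the least overlap."""
    best_move = (0, 0)
    best_score = np.inf
    for _ in range(tries):
        # Pick a random offset
        dx = np.random.randint(-max_step, max_step)
        dy = np.random.randint(-max_step, max_step)
        score = overlap_at(x + dx, y + dy)
        if score == 0:
            return dx, dy        # no overlap at all: take it immediately
        if score < best_score:
            best_move = (dx, dy)
            best_score = score
    return best_move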
By chance, this tends to move the object away from the overlapping areas. That is the power of introducing some randomness: it simplifies the algorithm a lot.
A more precise approach would be to calculate the direction in which the movement mask is least close to the object (logo). That becomes quite complicated and requires a lot of calculation. Hence, we chose this simple approach, which gives both a speed element and a direction element and works fairly well.
All we need to do is add an update_position method to our class and call it before we insert the logo.
import cv2
import numpy as np

# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        self.logo_org = cv2.imread('logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        self.x = start_x
        self.y = start_y
        self.on_mask = False

    def insert_object(self, frame):
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

    def update_position(self, mask):
        height, width = mask.shape

        # Check if object is overlapping with moving parts
        roi = mask[self.y:self.y + self.size, self.x:self.x + self.size]
        check = np.any(roi[np.where(self.logo_mask)])

        # If object has moving parts, then find new position
        if check:
            # To save the best possible movement
            best_delta_x = 0
            best_delta_y = 0
            best_fit = np.inf

            # Try 8 different positions
            for _ in range(8):
                # Pick a random position
                delta_x = np.random.randint(-15, 15)
                delta_y = np.random.randint(-15, 15)

                # Ensure we are inside the frame, if outside, skip and continue
                if self.y + self.size + delta_y > height or self.y + delta_y < 0 or \
                        self.x + self.size + delta_x > width or self.x + delta_x < 0:
                    continue

                # Calculate how much overlap
                roi = mask[self.y + delta_y:self.y + delta_y + self.size, self.x + delta_x:self.x + delta_x + self.size]
                check = np.count_nonzero(roi[np.where(self.logo_mask)])

                # If perfect fit (no overlap), just return
                if check == 0:
                    self.x += delta_x
                    self.y += delta_y
                    return
                # If a better fit found, save it
                elif check < best_fit:
                    best_fit = check
                    best_delta_x = delta_x
                    best_delta_y = delta_y

            # After for-loop, update to best fit (if any found)
            if best_fit < np.inf:
                self.x += best_delta_x
                self.y += best_delta_y
                return

# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)

# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To detect movement (to get the background)
background_subtractor = cv2.createBackgroundSubtractorMOG2()

# This will create an object
obj = Object()

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()

    # Flip the frame
    frame = cv2.flip(frame, 1)

    # Get the foreground mask (it is gray scale)
    fg_mask = background_subtractor.apply(frame)

    # Convert the gray scale to black and white with a threshold
    # Change the 250 threshold to fit your webcam and needs
    # - Setting it lower will make it more sensitive (also to noise)
    _, fg_mask = cv2.threshold(fg_mask, 250, 255, cv2.THRESH_BINARY)

    # Find a new position for object (logo)
    # - fg_mask contains all moving parts
    # - updated position will be the one with least moving parts
    obj.update_position(fg_mask)

    # Insert the object into the frame
    obj.insert_object(frame)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)

    # To see the fg_mask uncomment the line below
    # cv2.imshow('fg_mask', fg_mask)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
Step 5: Test it
Well, this is the fun part. See a live demo in the video below.
What is next step?
I would be happy to hear any suggestions from you. I see a lot of potential improvements, but the conceptual idea is explained and shown in this tutorial.