OpenCV + Python + Webcam: How to Track and Replace Object

What will we cover in this tutorial?

In this tutorial we will look into how you can track an object with a specific color and replace it with a new object. The inserted new object will be scaled to the size of the object tracked. This will be done on a live stream from the webcam.

Understand the process of reading from the webcam and feeding it to a window

The first thing to understand is that when you process a live stream from a webcam, you are actually processing it frame by frame.

Hence, the base code is as follows.

import cv2

# Get the webcam
cap = cv2.VideoCapture(0)

while True:
    # Step 1: Capture the frame
    _, frame = cap.read()

    # Step 2: Show the frame
    cv2.imshow("Webcam", frame)
    # If q is pressed terminate
    if cv2.waitKey(1) == ord('q'):
        break

# Release and destroy all windows
cap.release()
cv2.destroyAllWindows()

First we import the OpenCV library cv2. If you need help installing it, read this tutorial. Then you open the webcam by calling cv2.VideoCapture(0), where we assume you have one webcam and it is the first one (0).

Then comes the while-loop, where you capture the video stream frame by frame. This is done by calling cap.read(), which returns a return code and the frame (we ignore the return code by assigning it to _).

To show the frame we read from the webcam, we call cv2.imshow("Webcam", frame), which creates a window showing the frame (the image from your webcam).

The final part of the while-loop checks whether the key q has been pressed. If so, we break out of the while-loop, release the webcam, and destroy all windows.

That is how processing works for a webcam stream. The processing will happen between step 1 and step 2 in the above code. Pre-processing and setup are most often done before the while-loop.
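
The base code ignores the return code from the read call. As a minimal sketch, the frame-by-frame idea can be wrapped in a small generator that stops when a read fails (for example, if the webcam is unplugged). The names read_frames and show_webcam, and the process parameter, are made up for this illustration and are not part of OpenCV.

```python
def read_frames(cap):
    """Yield frames from a capture object until a read fails.

    `cap` is assumed to behave like cv2.VideoCapture: read() returns (ok, frame).
    """
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        yield frame


def show_webcam(process=lambda frame: frame):
    """Show the webcam stream; `process` runs between capture and display."""
    import cv2  # imported here so read_frames can be used without OpenCV

    cap = cv2.VideoCapture(0)
    try:
        for frame in read_frames(cap):
            cv2.imshow("Webcam", process(frame))
            # If q is pressed terminate
            if cv2.waitKey(1) == ord('q'):
                break
    finally:
        # Release and destroy all windows, even if processing raised an error
        cap.release()
        cv2.destroyAllWindows()
```

Calling show_webcam() with no arguments should behave like the base code; passing a function as process is one way to slot in the per-frame work described in the next section.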

The process flow to identify and track an object and insert a scaled logo

In the last section we looked at how a webcam stream is processed. In this section we will explain how to identify an object by color, scale the object we want to insert, and insert it into the frame.

The process is depicted in the image below followed by an explanation of all the steps.

The process of finding area to insert logo, masking it out, inserting and showing the frame.

The steps are described here.

  1. This is the step where we capture the raw frame from the webcam.
  2. To more easily identify an object of a specific color in the frame, we convert the image to the HSV color model. It consists of Hue, Saturation, and Value.
  3. Make a mask with all objects of the specific color. This is where the HSV color model makes it easy.
  4. To make it more visible and easier for detection, we dilate the mask.
  5. Then we find all the contours in the mask.
  6. We loop over all the contours found. Ideally we only find one, but there might be small objects, which we will discard.
  7. Based on the contour found, get the size of it, which we use to scale (resize) the logo we want to insert.
  8. Resize the logo to fit the size of the contour.
  9. As the logo is not square (it does not fill its whole image), we need to create a mask of the logo pixels to insert it.
  10. To insert it easily, we create a ROI (region of interest) where the contour is. This is not strictly needed; it just avoids a lot of extra calculations. If you know NumPy, it is a view into the frame.
  11. Then we insert the logo using the mask.
  12. Finally, time to show the frame.

The implementation

The code following the steps described in the previous section is found here.

import cv2
import time
import imutils
import numpy as np

# Get the webcam
cap = cv2.VideoCapture(0)
# Setup the width and the height (your cam might not support these settings)
width = 640
height = 480
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

# Read the logo to use later
logo_org = cv2.imread('logo.png')

# Time is just used to get the Frames Per Second (FPS)
last_time = time.time()
while True:
    # Step 1: Capture the frame
    _, frame = cap.read()

    # Step 2: Convert to the HSV color space
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Step 3: Create a mask based on medium to high Saturation and Value
    # - Hue 8-10 is about orange, which we will use
    # - These values can be changed (the lower ones) to fit your environment
    mask = cv2.inRange(hsv, (8, 180, 180), (10, 255, 255))
    # Step 4: This dilates with two iterations (makes it more visible)
    thresh = cv2.dilate(mask, None, iterations=2)
    # Step 5: Finds contours and converts it to a list
    contours = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = imutils.grab_contours(contours)

    # Step 6: Loops over all objects found
    for contour in contours:
        # Skip if contour is small (can be adjusted)
        if cv2.contourArea(contour) < 750:
            continue

        # Step 7: Get the box boundaries
        (x, y, w, h) = cv2.boundingRect(contour)
        # Compute size
        size = (h + w)//2

        # Check if logo will be inside frame
        if y + size < height and x + size < width:
            # Step 8: Resize logo
            logo = cv2.resize(logo_org, (size, size))
            # Step 9: Create a mask of logo
            img2gray = cv2.cvtColor(logo, cv2.COLOR_BGR2GRAY)
            _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)

            # Step 10: Region of interest (ROI), where we want to insert logo
            roi = frame[y:y+size, x:x+size]

            # Step 11: Mask out logo region and insert
            roi[np.where(logo_mask)] = 0
            roi += logo

    # (Extra) Add a FPS label to image
    text = f"FPS: {int(1 / (time.time() - last_time))}"
    last_time = time.time()
    cv2.putText(frame, text, (10, 20), cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 0), 2)

    # Step 12: Show the frame
    cv2.imshow("Webcam", frame)
    # If q is pressed terminate
    if cv2.waitKey(1) == ord('q'):
        break

# Release and destroy all windows
cap.release()
cv2.destroyAllWindows()

Time to test it.

Testing the code

When using your webcam, you might need to change the colors. I used the following setting for the blue marker in my video.

    mask = cv2.inRange(hsv, (110, 120, 120), (130, 255, 255))

The two 3-tuples are HSV color space representations. The first item of each tuple sets the Hue, here 110 and 130. That means the color range we want to mask out has a Hue from 110 to 130, which you can see is in the blue range (image below). The other two items are Saturation from 120 to 255 and Value from 120 to 255. To fit your camera and light settings, you need to change that range.

You can see the HSV color spectrum here.

HSV color space for OpenCV

You might need to choose different values.

