
    OpenCV: Understand and Implement a Motion Tracker

    What will we cover in this tutorial?

    We will build and explain how a simple motion tracker works using OpenCV.

    The resulting program will be able to track an object you define in the stream from a webcam.

    Step 1: Understand the color histograms

    An image in OpenCV is represented as a NumPy array. If you are new to NumPy arrays, they are basically arrays with fixed dimensions and a fixed element type. You can get a short introduction in this tutorial.
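    The fixed shape and fixed type are easy to see on a small synthetic array (a stand-in for a real image, so no image file is needed):

```python
import numpy as np

# A tiny stand-in for an image: 2 rows, 3 columns, 3 color channels,
# 8-bit unsigned integers - the same layout cv2.imread returns
img = np.zeros((2, 3, 3), dtype=np.uint8)
print(img.shape)  # (2, 3, 3) - rows, columns, channels
print(img.dtype)  # uint8 - every element has the same fixed type
```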

    For simplicity let’s look at an example here.

    import cv2

    img = cv2.imread("pics/smile-00000.jpeg")
    print(img.shape)
    print(img)

    This will print the shape of the NumPy array (its dimensions) as well as the array itself. Some of the output is given below.

    (6000, 4000, 3)
    [[[ 99 102  93]
      [ 87 104  91]
      [ 84 103  82]

    Here we see the picture is 6000×4000 pixels, where each pixel is represented by 3 integers. Each pixel is given in a BGR representation: 3 integers in the range 0-255 giving the intensity of Blue, Green, and Red, respectively.

    A histogram of an image counts how many occurrences there are of each pixel value. A histogram of the image above would normally be represented by three graphs, one for each of the colors Blue, Green, and Red.

    It turns out that such a representation can act much like a fingerprint of the object in the picture. If we see a similar fingerprint, it could be the same object.

    To optimize the process, one can represent colors in other ways. In this tutorial we will use HSV (Hue, Saturation, Value). This has the advantage that the color (hue) information is stored in the first coordinate. This means that we can get a decent fingerprint of an image with only one graph instead of three (as with the BGR representation).

    [Image in the original post: an example histogram of an image.]

    Step 2: How does the motion tracker work?

    On a high level, the process works as follows.

    First notice that there is a pre-processing part and a continuous-processing part. The pre-processing is done once, to capture a histogram of the object we want to track, while the continuous processing is done for each frame coming from the webcam and uses the histogram to find the object in the new frame.

    1. The first thing is to capture the frame from the webcam.
    2. The frame is converted to HSV to give a simple histogram. Also, the object to track should be identified and put into a box (the tracking window).
    3. A histogram based on the framed object is calculated and normalized.
    4. The continuous processing starts the same way as the pre-processing. Capture a frame from the webcam.
    5. Convert that frame into HSV.
    6. Take the HSV-converted frame and use the histogram from the pre-processing to back project (basically the reverse of making a histogram), then find the best match in the neighborhood of the object's former location using mean shift.
    7. Finally, update the box around the current detected position of the object.

    Continue steps 4 to 7 until you get bored.

    Ready to implement it?

    Step 3: Implementation of the motion tracker

    The comments in the code below use the numbering of the steps above.

    import numpy as np
    import cv2
    def main(width=640, height=480):
        cap = cv2.VideoCapture(0)
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
        # Step 1: Capture the first frame of the webcam
        _, frame = cap.read()
        # For ease, let's flip it
        frame = cv2.flip(frame, 1)
        # Step 2: First we frame the object
        x, y, w, h = 300, 200, 100, 50
        track_window = (x, y, w, h)
        # set up the ROI for tracking
        roi = frame[y:y+h, x:x+w]
        # Step 2 (continued): Change the color space to HSV
        # HSV: Hue, Saturation, Value - the hue channel carries the color information
        hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
        # Step 2 (continued): Mask out dark and low-saturation pixels, whose hue is unreliable
        # For HSV, hue range is [0,179], saturation range is [0,255] and value range is [0,255]
        mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
        # Step 3: Calculate the histogram
        roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
        # Arguments to calcHist:
        # - images: source arrays; all the same depth (CV_8U, CV_16U or CV_32F)
        #   and size, each with an arbitrary number of channels
        # - channels: list of the channel indices used to compute the histogram;
        #   the first image's channels are numbered 0 to images[0].channels()-1,
        #   the second image's continue from there, and so on
        # - mask: optional 8-bit array of the same size as the images;
        #   non-zero mask elements mark the pixels counted in the histogram
        # - histSize: array of histogram sizes in each dimension
        # - ranges: array of the histogram bin boundaries in each dimension
        # Normalize the histogram to the range 0 - 255 (needed for calcBackProject)
        cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
        # Setup the termination criteria, either 10 iteration or move by at least 1 pt
        # - Needed for meanShift to know when to terminate
        termination_criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
        while True:
            # Step 4: Capture the next frame
            _, frame = cap.read()
            frame = cv2.flip(frame, 1)
            # Step 5: Change the color space to HSV
            hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
            # Step 6: Basically the reverse process of Histogram
            dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
            # Apply meanShift to get the new location
            _, track_window = cv2.meanShift(dst, track_window, termination_criteria)
            # Step 7: Draw it on image
            x, y, w, h = track_window
            frame = cv2.rectangle(frame, (x, y), (x+w, y+h), 255, 2)
            # Update the frame
            cv2.imshow('Tracking Frame', frame)
            k = cv2.waitKey(30)
            # Exit when the Escape key (27) is pressed
            if k == 27:
                break
        # Release the webcam and destroy the window
        cap.release()
        cv2.destroyAllWindows()

    if __name__ == "__main__":
        main()

    Let’s see how it works (in poor lighting in my living room).
