Using Numba for Efficient Frame Modifications in OpenCV

What will we cover in this tutorial?

We will compare the speed for using Numba optimization when making calculations and modifications on frames from a video stream using OpenCV.

In this tutorial we will divide each frame into boxes of the same size and calculate the average color for each box. Then we make a frame where each box is filled with that color.

See the effect in the video below. These calculations are expensive in pure Python, hence we will compare the performance with and without Numba.

Step 1: Understand the process requirements

Each video frame from OpenCV is an image represented by a NumPy array. In this example we will use the webcam to capture a video stream and do the calculations and modifications live on the stream. This sets high requirements to the processing time of each frame.

To keep the motion fluid we need to show a new frame every 1/25 of a second. That leaves at most 0.04 seconds per frame for capturing it, processing it, and updating the window with the video stream.

Since capturing the frame and updating the window also take time, it is not obvious exactly how fast the frame processing (calculations and modifications) needs to be, but an upper bound is 0.04 seconds per frame.
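If you want a rough check of this budget on your own machine, you can time a plain capture-and-show loop with time.perf_counter and see how much of the 0.04 seconds is left for processing. The snippet below is only a sketch for that purpose (the commented-out process call refers to the function we define in the next step).

import time
import cv2

cap = cv2.VideoCapture(0)

iterations = 100
start = time.perf_counter()
for _ in range(iterations):
    _, frame = cap.read()
    # frame = process(frame)  # the frame processing goes here once it is defined
    cv2.imshow('WebCam', frame)
    cv2.waitKey(1)
elapsed = time.perf_counter() - start

print(f"{elapsed / iterations:.4f} seconds per frame on average "
      f"({iterations / elapsed:.1f} FPS) - the budget is 0.04 seconds per frame")

cap.release()
cv2.destroyAllWindows()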

Step 2: The calculations and modifications on each frame

Let’s have some fun. The calculations and modifications we want to apply to each frame are as follows.

  • Calculations. We divide each frame into small areas of 6×16 pixels and calculate the average color for each area. To get the average color we calculate the average of each channel (BGR).
  • Modification. We fill each area entirely with its average color.

This can be done by adding this function to process each frame.

import numpy as np


def process(frame, box_height=6, box_width=16):
    height, width, _ = frame.shape
    for i in range(0, height, box_height):
        for j in range(0, width, box_width):
            roi = frame[i:i + box_height, j:j + box_width]
            b_mean = np.mean(roi[:, :, 0])
            g_mean = np.mean(roi[:, :, 1])
            r_mean = np.mean(roi[:, :, 2])
            roi[:, :, 0] = b_mean
            roi[:, :, 1] = g_mean
            roi[:, :, 2] = r_mean
    return frame

The frame is divided into areas of the box size (box_height x box_width). For each box (roi: Region of Interest) we calculate the average (mean) value of each of the 3 color channels (b_mean, g_mean, r_mean) and overwrite the area with that average color.

Step 3: Testing performance for this frame process

To get an estimate of the time spent in the process function, the cProfile module is quite good. It gives a profile of the time spent in each function call. This is great, since we can get a measure of how much time is spent in process.

We can accomplish that by running this code.

import cv2
import numpy as np
import cProfile


def process(frame, box_height=6, box_width=16):
    height, width, _ = frame.shape
    for i in range(0, height, box_height):
        for j in range(0, width, box_width):
            roi = frame[i:i + box_height, j:j + box_width]
            b_mean = np.mean(roi[:, :, 0])
            g_mean = np.mean(roi[:, :, 1])
            r_mean = np.mean(roi[:, :, 2])
            roi[:, :, 0] = b_mean
            roi[:, :, 1] = g_mean
            roi[:, :, 2] = r_mean
    return frame


def main(iterations=300):
    # Get the webcam (default webcam is 0)
    cap = cv2.VideoCapture(0)
    # If your webcam does not support 640 x 480, this will find another resolution
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

    for _ in range(iterations):
        # Read a frame from the webcam
        _, frame = cap.read()
        # Flip the frame
        frame = cv2.flip(frame, 1)
        frame = cv2.resize(frame, (640, 480))

        frame = process(frame)

        # Show the frame in a window
        cv2.imshow('WebCam', frame)

        # Check if q has been pressed to quit
        if cv2.waitKey(1) == ord('q'):
            break

    # When everything done, release the capture
    cap.release()
    cv2.destroyAllWindows()

cProfile.run("main()")

The interesting line of the output is shown here.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      300    7.716    0.026   50.184    0.167 TEST2.py:8(process)

This says we spend 0.026 seconds per call in the process function. That is only good enough if the accumulated overhead from the other functions in the main loop is less than 0.014 seconds.

If we investigate the other calls further.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      300    5.132    0.017    5.132    0.017 {method 'read' of 'cv2.VideoCapture' objects}
      300    0.073    0.000    0.073    0.000 {resize}
      300    2.848    0.009    2.848    0.009 {waitKey}
      300    0.120    0.000    0.120    0.000 {flip}
      300    0.724    0.002    0.724    0.002 {imshow}

This gives an overhead of approximately 0.028 seconds (0.017 + 0.009 + 0.002) from the read, resize, flip, imshow and waitKey calls in each iteration. It adds up to a total of 0.054 seconds per frame, or a frame rate of about 18.5 frames per second (FPS).

This is too slow to keep the stream running smoothly.

Please notice that cProfile does add some overhead to measure the time.

Step 4: Introducing Numba to optimize performance

The Numba library is designed to just-in-time compile Python code to make loops over NumPy arrays faster. Wow. That is just what we need here. Let’s jump right into it and see how it does.

import cv2
import numpy as np
from numba import jit
import cProfile


@jit(nopython=True)
def process(frame, box_height=6, box_width=16):
    height, width, _ = frame.shape
    for i in range(0, height, box_height):
        for j in range(0, width, box_width):
            roi = frame[i:i + box_height, j:j + box_width]
            b_mean = np.mean(roi[:, :, 0])
            g_mean = np.mean(roi[:, :, 1])
            r_mean = np.mean(roi[:, :, 2])
            roi[:, :, 0] = b_mean
            roi[:, :, 1] = g_mean
            roi[:, :, 2] = r_mean
    return frame


def main(iterations=300):
    # Get the webcam (default webcam is 0)
    cap = cv2.VideoCapture(0)
    # If your webcam does not support 640 x 480, this will find another resolution
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

    for _ in range(iterations):
        # Read a frame from the webcam
        _, frame = cap.read()
        # Flip the frame
        frame = cv2.flip(frame, 1)
        frame = cv2.resize(frame, (640, 480))

        frame = process(frame)

        # Show the frame in a window
        cv2.imshow('WebCam', frame)

        # Check if q has been pressed to quit
        if cv2.waitKey(1) == ord('q'):
            break

    # When everything done, release the capture
    cap.release()
    cv2.destroyAllWindows()

main(iterations=1)
cProfile.run("main(iterations=300)")

Notice that we first call main with one iteration. This is done to call the process function once before we measure the performance, since Numba compiles the function on the first call and keeps it compiled afterwards.
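If you prefer not to run a full iteration of the main loop just to trigger the compilation, an alternative (just a sketch, not part of the original code) is to warm the function up on a dummy frame with the same dtype and shape as a real frame, placed in the same script right before the profiling call:

# Warm up the JIT compilation on a dummy frame (same dtype and 3-channel shape as a webcam frame)
dummy_frame = np.zeros((480, 640, 3), dtype=np.uint8)
process(dummy_frame)

# The profiling below now only measures the already-compiled function
cProfile.run("main(iterations=300)")

Either way, the timings below are measured after the compilation has already happened.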

The result is as follows.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      300    1.187    0.004    1.187    0.004 TEST2.py:7(pixels)

This estimates 0.004 seconds per call. It results in a total time of 0.032 seconds per iteration (0.028 + 0.004), which is sufficient to keep the performance above 24 frames per second (FPS).

Also, this improves the performance of the frame processing by a factor of roughly 6.5 (7.716 / 1.187).

Conclusion

By using Numba we got the speedup needed to process a live webcam stream frame by frame. The speedup was approximately 6.5 times.

Average vs Weighted Average Effect in Video using OpenCV

What will we cover in this tutorial?

We compare the difference between using a weighted average and a normal average over the last frames of a stream from your webcam, using OpenCV in Python.

The effect can be seen in the video below, and the code used to create it is provided below as well.

Example output Normal Average vs Weighted Average vs One Frame

The code

The code is straightforward and not optimized. The average is calculated by using a deque from Python's collections module to create a circular buffer.
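As a quick illustration of why a deque with a maxlen behaves like a circular buffer (this snippet is only for illustration and not part of the effect code):

from collections import deque

buffer = deque(maxlen=3)
for i in range(5):
    buffer.append(i)

# The oldest entries are dropped automatically when the buffer is full
print(buffer)  # deque([2, 3, 4], maxlen=3)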

The two classes AverageBuffer and WeightedAverageBuffer share the same code for the constructor and apply, but each has its own implementation of get_frame, which calculates the average and the weighted average, respectively.

Please notice that the code is not written for efficiency; AverageBuffer in particular has some easy performance wins if the sum is maintained incrementally instead of being recomputed for every frame.

An important point here is that the frames are stored as float32 in the buffers. This is necessary for the calculations we do on the frames later, where we multiply them by a factor, say 4.

Example: the frames come in as uint8, which are integers from 0 to 255. Say we multiply a pixel value of 128 by 4. This gives 128*4 = 512, which wraps around to 0 as a uint8. Hence, we get an undesirable effect. Therefore we convert the frames to float32 to avoid this.
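Here is a small sketch of that wrap-around effect in NumPy (only to illustrate the point, not part of the effect code):

import numpy as np

a = np.array([128], dtype='uint8')
a *= 4      # 512 does not fit in a uint8, so it wraps around
print(a)    # [0]

b = np.array([128], dtype='float32')
b *= 4      # float32 has no such problem
print(b)    # [512.]

With that in mind, here is the full code.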

import cv2
import numpy as np
from collections import deque


class AverageBuffer:
    def __init__(self, maxlen):
        self.buffer = deque(maxlen=maxlen)
        self.shape = None

    def apply(self, frame):
        self.shape = frame.shape
        self.buffer.append(frame)

    def get_frame(self):
        mean_frame = np.zeros(self.shape, dtype='float32')
        for item in self.buffer:
            mean_frame += item
        mean_frame /= len(self.buffer)
        return mean_frame.astype('uint8')


class WeightedAverageBuffer(AverageBuffer):
    def get_frame(self):
        mean_frame = np.zeros(self.shape, dtype='float32')
        i = 0
        for item in self.buffer:
            # Weights are 4, 8, 12, ..., 4*n - the most recent frame weighs the most
            i += 4
            mean_frame += item*i
        # Normalize by the sum of the weights: 4 + 8 + ... + 4*n = 2*n*(n + 1) = i*(i + 4)/8, since i = 4*n
        mean_frame /= (i*(i + 4))/8.0
        return mean_frame.astype('uint8')

# Setup camera
cap = cv2.VideoCapture(0)
# Set a smaller resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)

average_buffer = AverageBuffer(30)
weighted_buffer = WeightedAverageBuffer(30)

while True:
    # Capture frame-by-frame
    _, frame = cap.read()
    frame = cv2.flip(frame, 1)
    frame = cv2.resize(frame, (320, 240))

    frame_f32 = frame.astype('float32')
    average_buffer.apply(frame_f32)
    weighted_buffer.apply(frame_f32)

    cv2.imshow('WebCam', frame)
    cv2.imshow("Average", average_buffer.get_frame())
    cv2.imshow("Weighted average", weighted_buffer.get_frame())

    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

Create a Line Drawing from Webcam Stream using OpenCV in Python

What will we cover in this tutorial?

How to convert a webcam stream into a black and white line drawing using OpenCV and Python. Also, how to adjust the parameters while running the live stream.

See the result here.

The things you need to use

There are two things you need to use in order to get a good line drawing of your image.

  1. GaussianBlur to smooth out the image, as detecting lines is sensitive to noise.
  2. Canny that detects the lines.

For the Gaussian blur it is advised to use a 5×5 filter. Canny then has two threshold parameters. To find the optimal values for your setup, we have inserted two trackbars where you can set them to any value and see the result.

You can read more about Canny Edge Detection here.

If you need to install OpenCV please read this tutorial.

The code is given below.

import cv2
import numpy as np

# Setup camera
cap = cv2.VideoCapture(0)
# Set a smaller resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)


def nothing(x):
    pass


canny = "Canny"
cv2.namedWindow(canny)
cv2.createTrackbar('Threshold 1', canny, 0, 255, nothing)
cv2.createTrackbar('Threshold 2', canny, 0, 255, nothing)

while True:
    # Capture frame-by-frame
    _, frame = cap.read()
    frame = cv2.flip(frame, 1)

    t1 = cv2.getTrackbarPos('Threshold 1', canny)
    t2 = cv2.getTrackbarPos('Threshold 2', canny)
    gb = cv2.GaussianBlur(frame, (5, 5), 0)
    can = cv2.Canny(gb, t1, t2)

    cv2.imshow(canny, can)

    frame[np.where(can)] = 255
    cv2.imshow('WebCam', frame)
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

Twitter-API + Python: Mapping all Your Followers Locations on a Choropleth Map

What will we cover in this tutorial?

How to find the locations of all your followers on Twitter and create a choropleth map (a map where the color of each shape is based on the value of an associated variable) of all countries. This will all be done using Python.

This is done out of my interest in where the followers of my Twitter account are from. Today my result looks like this.

The Choropleth map of the followers of PythonWithRune on Twitter

Step 1: How to get the followers from your Twitter account

If you are new to the Twitter API you will need to create a developer account to get your keys. You can follow this tutorial to create your developer account and get the needed tokens.

When that is done, you can use the tweepy library to connect to the Twitter API. The library function api.followers_ids(api.me().id) will give you a list of all your followers by user-id.

import tweepy

# Used to connect to the Twitter API
def get_twitter_api():
    # You need your own keys/secret/tokens here
    consumer_key = "--- INSERT YOUR KEY HERE ---"
    consumer_secret = "--- INSERT YOUR SECRET HERE ---"
    access_token = "--- INSERT YOUR TOKEN HERE ---"
    access_token_secret = "--- INSERT YOUR TOKEN SECRET HERE ---"

    # authentication of consumer key and secret
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)

    # authentication of access token and secret
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    return api


# This function is used to process it all
def process():
    # Connecting to the twitter api
    api = get_twitter_api()

    # Get the list of all your followers - it only gives user-id's
    # - we need to gather all user data after
    followers = api.followers_ids(api.me().id)
    print("Followers", len(followers))


if __name__ == "__main__":
    process()

This will print out the number of followers you have on your account.

Step 2: Get the location of your followers

How do we transform the Twitter user-ids into locations?

We need to look them all up. Luckily, not one-by-one. We can do it in chunks of 100 users per call.

The function api.lookup_users(…) can look up 100 users per call by user-id or user-name.

import tweepy

# Used to connect to the Twitter API
def get_twitter_api():
    # You need your own keys/secret/tokens here
    consumer_key = "--- INSERT YOUR KEY HERE ---"
    consumer_secret = "--- INSERT YOUR SECRET HERE ---"
    access_token = "--- INSERT YOUR TOKEN HERE ---"
    access_token_secret = "--- INSERT YOUR TOKEN SECRET HERE ---"

    # authentication of consumer key and secret
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)

    # authentication of access token and secret
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    return api


# This function is used to process it all
def process():
    # Connecting to the twitter api
    api = get_twitter_api()

    # Get the list of all your followers - it only gives user-id's
    # - we need to gather all user data after
    followers = api.followers_ids(api.me().id)
    print("Followers", len(followers))

    # We need to chunk it up in sizes of 100 (max for api.lookup_users)
    followers_chunks = [followers[i:i + 100] for i in range(0, len(followers), 100)]
    # Process each chunk - we can call for 100 users per call
    for follower_chunk in followers_chunks:
        # Get a list of users (with location data)
        users = api.lookup_users(user_ids=follower_chunk)
        # Process each user to get location
        for user in users:
            # Print user location
            print(user.location)


if __name__ == "__main__":
    process()

Before you execute this code, you should know that it will print all the locations your followers have set.

Step 3: Map all user locations to the same format

When users write their location, they do it in various ways, as these examples show.

India
Kenya
Temecula, CA
Atlanta, GA
Florida, United States
Hyderabad, India
Atlanta, GA
Agadir / Khouribga, Morocco
Miami, FL
Republic of the Philippines
Tampa, FL
Sammamish, WA
Coffee-machine

And as the last example shows, it might not be a real location. Hence, we need to see if we can find the location by asking a service. For this purpose, we will use the GeoPy library, which is a client for several popular geocoding web services.

Hence, for each of the user-specified locations (like the examples above) we will call GeoPy and use its result as the location. This brings everything into the same format and tells us whether the location exists at all.
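As a small illustration of what GeoPy gives back for one of the locations above (the exact address string depends on the Nominatim service and may change over time):

from geopy.geocoders import Nominatim

geo_locator = Nominatim(user_agent="LearnPython")
location = geo_locator.geocode("Temecula, CA", language='en')

# The address is a full, comma separated string ending with the country,
# something like: Temecula, Riverside County, California, United States
print(location.address)

The full code doing this for every follower is below.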

import tweepy
from geopy.exc import GeocoderTimedOut
from geopy.geocoders import Nominatim


# Used to connect to the Twitter API
def get_twitter_api():
    # You need your own keys/secret/tokens here
    consumer_key = "--- INSERT YOUR KEY HERE ---"
    consumer_secret = "--- INSERT YOUR SECRET HERE ---"
    access_token = "--- INSERT YOUR TOKEN HERE ---"
    access_token_secret = "--- INSERT YOUR TOKEN SECRET HERE ---"

    # authentication of consumer key and secret
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)

    # authentication of access token and secret
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    return api


# Used to map the twitter user location description to a standard format
def lookup_location(location):
    geo_locator = Nominatim(user_agent="LearnPython")
    try:
        location = geo_locator.geocode(location, language='en')
    except GeocoderTimedOut:
        return None
    return location


# This function is used to process it all
def process():
    # Connecting to the twitter api
    api = get_twitter_api()

    # Get the list of all your followers - it only gives user-id's
    # - we need to gather all user data after
    followers = api.followers_ids(api.me().id)
    print("Followers", len(followers))

    # Used to store all the locations from users
    locations = {}

    # We need to chunk it up in sizes of 100 (max for api.lookup_users)
    followers_chunks = [followers[i:i + 100] for i in range(0, len(followers), 100)]
    # Process each chunk - we can call for 100 users per call
    for follower_chunk in followers_chunks:
        # Get a list of users (with location data)
        users = api.lookup_users(user_ids=follower_chunk)
        # Process each user to get location
        for user in users:
            # Call used to transform users description of location to same format
            location = lookup_location(user.location)
            # Add it to our counter
            if location:
                location = location.address
                location = location.split(',')[-1].strip()
            if location in locations:
                locations[location] += 1
            else:
                locations[location] = 1


if __name__ == "__main__":
    process()

As you see, it counts the occurrences of each location found. The split and strip are used to keep only the country and leave out the rest of the address, if any.

Step 4: Reformat the locations into a Pandas DataFrame

We want to reformat the locations into a DataFrame to be able to join (merge) it with GeoPandas, which contains the choropleth map we want to use.

To convert the locations into a DataFrame we need to restructure them. This also helps us remove duplicates. As an example, United States and United States of America both appear. To handle that we map all country names to a 3-letter code using the pycountry library.
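As a quick illustration of what pycountry does for us (only for illustration):

import pycountry

# Different spellings of the same country map to the same 3 letter code
print(pycountry.countries.lookup('United States').alpha_3)             # USA
print(pycountry.countries.lookup('United States of America').alpha_3)  # USA

The full code with this mapping added is below.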

import tweepy
import pycountry
import pandas as pd
from geopy.exc import GeocoderTimedOut
from geopy.geocoders import Nominatim


# Used to connect to the Twitter API
def get_twitter_api():
    # You need your own keys/secret/tokens here
    consumer_key = "--- INSERT YOUR KEY HERE ---"
    consumer_secret = "--- INSERT YOUR SECRET HERE ---"
    access_token = "--- INSERT YOUR TOKEN HERE ---"
    access_token_secret = "--- INSERT YOUR TOKEN SECRET HERE ---"

    # authentication of consumer key and secret
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)

    # authentication of access token and secret
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    return api


# Helper function to map country names to alpha_3 representation
# Some are not supported - and are hard-coded in
# Function used to map country names from GeoPandas and the country names from geo_locator
def lookup_country_code(country):
    try:
        alpha_3 = pycountry.countries.lookup(country).alpha_3
        return alpha_3
    except LookupError:
        if country == 'The Netherlands':
            country = 'NLD'
        elif country == 'Democratic Republic of the Congo':
            country = 'COG'
        return country


# Used to map the twitter user location description to a standard format
def lookup_location(location):
    geo_locator = Nominatim(user_agent="LearnPython")
    try:
        location = geo_locator.geocode(location, language='en')
    except GeocoderTimedOut:
        return None
    return location


# This function is used to process it all
def process():
    # Connecting to the twitter api
    api = get_twitter_api()

    # Get the list of all your followers - it only gives user-id's
    # - we need to gather all user data after
    followers = api.followers_ids(api.me().id)
    print("Followers", len(followers))

    # Used to store all the locations from users
    locations = {}

    # We need to chunk it up in sizes of 100 (max for api.lookup_users)
    followers_chunks = [followers[i:i + 100] for i in range(0, len(followers), 100)]
    # Process each chunk - we can call for 100 users per call
    for follower_chunk in followers_chunks:
        # Get a list of users (with location data)
        users = api.lookup_users(user_ids=follower_chunk)
        # Process each user to get location
        for user in users:
            # Call used to transform users description of location to same format
            location = lookup_location(user.location)
            # Add it to our counter
            if location:
                location = location.address
                location = location.split(',')[-1].strip()
            if location in locations:
                locations[location] += 1
            else:
                locations[location] = 1

    # We reformat the output of locations
    # Done for two reasons
    # - 1) Some locations have two entries (e.g., United States and United States of America)
    # - 2) To map them into a simple format to join it with GeoPandas
    reformat = {'alpha_3': [], 'followers': []}
    for location in locations:
        print(location, locations[location])
        loc = lookup_country_code(location)
        if loc in reformat['alpha_3']:
            index = reformat['alpha_3'].index(loc)
            reformat['followers'][index] += locations[location]
        else:
            reformat['alpha_3'].append(loc)
            reformat['followers'].append(locations[location])

    # Convert the reformat dictionary into a DataFrame to join (merge) with GeoPandas
    followers = pd.DataFrame.from_dict(reformat)
    pd.set_option('display.max_columns', 50)
    pd.set_option('display.width', 1000)
    pd.set_option('display.max_rows', 300)
    print(followers.sort_values(by=['followers'], ascending=False))

if __name__ == "__main__":
    process()

That makes it ready to join (merge) with GeoPandas.

Step 5: Merge it with GeoPandas and show the choropleth map

Now for the fun part. We only need to load the geo data from GeoPandas and merge our newly created DataFrame with it. Finally, plot and show it using matplotlib.pyplot.

import tweepy
import pycountry
import pandas as pd
import geopandas
import matplotlib.pyplot as plt
from geopy.exc import GeocoderTimedOut
from geopy.geocoders import Nominatim


# Used to connect to the Twitter API
def get_twitter_api():
    # You need your own keys/secret/tokens here
    consumer_key = "--- INSERT YOUR KEY HERE ---"
    consumer_secret = "--- INSERT YOUR SECRET HERE ---"
    access_token = "--- INSERT YOUR TOKEN HERE ---"
    access_token_secret = "--- INSERT YOUR TOKEN SECRET HERE ---"

    # authentication of consumer key and secret
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)

    # authentication of access token and secret
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    return api


# Helper function to map country names to alpha_3 representation
# Some are not supported - and are hard-coded in
# Function used to map country names from GeoPandas and the country names from geo_locator
def lookup_country_code(country):
    try:
        alpha_3 = pycountry.countries.lookup(country).alpha_3
        return alpha_3
    except LookupError:
        if country == 'The Netherlands':
            country = 'NLD'
        elif country == 'Democratic Republic of the Congo':
            country = 'COG'
        return country


# Used to map the twitter user location description to a standard format
def lookup_location(location):
    geo_locator = Nominatim(user_agent="LearnPython")
    try:
        location = geo_locator.geocode(location, language='en')
    except GeocoderTimedOut:
        return None
    return location


# This function is used to process it all
def process():
    # Connecting to the twitter api
    api = get_twitter_api()

    # Get the list of all your followers - it only gives user-id's
    # - we need to gather all user data after
    followers = api.followers_ids(api.me().id)
    print("Followers", len(followers))

    # Used to store all the locations from users
    locations = {}

    # We need to chunk it up in sizes of 100 (max for api.lookup_users)
    followers_chunks = [followers[i:i + 100] for i in range(0, len(followers), 100)]
    # Process each chunk - we can call for 100 users per call
    for follower_chunk in followers_chunks:
        # Get a list of users (with location data)
        users = api.lookup_users(user_ids=follower_chunk)
        # Process each user to get location
        for user in users:
            # Call used to transform users description of location to same format
            location = lookup_location(user.location)
            # Add it to our counter
            if location:
                location = location.address
                location = location.split(',')[-1].strip()
            if location in locations:
                locations[location] += 1
            else:
                locations[location] = 1

    # We reformat the output of locations
    # Done for two reasons
    # - 1) Some locations have two entries (e.g., United States and United States of America)
    # - 2) To map them into a simple format to join it with GeoPandas
    reformat = {'alpha_3': [], 'followers': []}
    for location in locations:
        print(location, locations[location])
        loc = lookup_country_code(location)
        if loc in reformat['alpha_3']:
            index = reformat['alpha_3'].index(loc)
            reformat['followers'][index] += locations[location]
        else:
            reformat['alpha_3'].append(loc)
            reformat['followers'].append(locations[location])

    # Convert the reformat dictionary into a DataFrame to join (merge) with GeoPandas
    followers = pd.DataFrame.from_dict(reformat)
    pd.set_option('display.max_columns', 50)
    pd.set_option('display.width', 1000)
    pd.set_option('display.max_rows', 300)
    print(followers.sort_values(by=['followers'], ascending=False))

    # Read the GeoPandas
    world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
    # Remove the columns not needed
    world = world.drop(['pop_est', 'continent', 'iso_a3', 'gdp_md_est'], axis=1)
    # Map the same naming convention as followers (the above DataFrame)
    # - this step is needed, because the iso_a3 column was missing a few countries 
    world['iso_a3'] = world.apply(lambda row: lookup_country_code(row['name']), axis=1)
    # Merge the tables (DataFrames)
    table = world.merge(followers, how="left", left_on=['iso_a3'], right_on=['alpha_3'])

    # Plot the data in a graph
    table.plot(column='followers', figsize=(8, 6))
    plt.show()


if __name__ == "__main__":
    process()

Resulting in the following output (for the PythonWithRune Twitter account, not yours).

OpenCV + Python: Move Objects Around in a Live Webcam Stream Using Your Hands

What will we cover in this tutorial?

How do you detect movements in a webcam stream? Also, how do you insert objects in a live webcam stream? Further, how do you change the position of the object based on the movements?

We will learn all that in this tutorial. The end result can be seen in the video below.

The end result of this tutorial

Step 1: Understand the flow of webcam processing

A webcam stream is processed frame-by-frame.

Illustration: Webcam processing flow

As the above illustration shows, when the webcam captures the next frame, the actual processing often happens on a copy of the original frame. When all the updates and calculations are done, they are inserted in the original frame.

This is interesting. To extract information from the webcam frame we need to work with the frame and find the features we are looking for.

In our example, we need to detect movement and, based on that, see if the movement touches our object.

A simple flow without any processing would look like this.

import cv2


# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()
    # Flip the frame
    frame = cv2.flip(frame, 1)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

The above code will create a direct stream from your webcam to a window.

Step 2: Insert a logo – do it with a class that we will extend later

Here we want to insert a logo at a fixed position in our webcam stream. This can be achieved by the following code. The main difference is the new class Object that is defined and instantiated.

The object briefly explained

  • The object will represent the logo we want to insert.
  • It keeps the current position (which is static so far).
  • The logo itself.
  • The mask used to insert it later (when insert_object is called).
  • The constructor (__init__(…)) does the work that is only needed once: it reads the logo (it assumes you have a file named logo.png in the same folder), resizes it, creates a mask (by gray-scaling and thresholding), and sets the initial position of the logo.

Before the while-loop the object obj is created. All that is needed at this stage is to insert the logo in each frame.

import cv2
import numpy as np


# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        self.logo_org = cv2.imread('logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        self.x = start_x
        self.y = start_y

    def insert_object(self, frame):
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo


# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# This will create an object
obj = Object()
# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()
    # Flip the frame
    frame = cv2.flip(frame, 1)

    # Insert the object into the frame
    obj.insert_object(frame)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

This will result in the following output (with me in front of the webcam; if you run it yourself, expect to see yourself in the picture instead of me, just to avoid any uncomfortable surprises when you show up in the window).

The logo at a fixed position.

For more details on how to insert a logo in a live webcam stream, you can read this tutorial.

Step 3: Detect movement in the frame

Detecting movement is not a simple task in general, but depending on your needs it can be solved quite simply. In this tutorial we only need to detect simple movement. That is, if you are in the frame and sit still, we do not need to detect you; we only care about the actual movement.

We can solve that problem by using the library function createBackgroundSubtractorMOG2(), which can “remove” the background from your frame. It is far from a perfect solution, but it is sufficient for what we want to achieve.

As we only want to see whether there is movement or not, and not how large the difference from the previously detected background is, we use a threshold function to make the mask black and white. We set the threshold quite high, as that also removes noise from the image.

It might happen that in your setting (lighting etc.) you need to adjust that value. See the comments in the code for how to do that.

import cv2
import numpy as np


# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        self.logo_org = cv2.imread('logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        self.x = start_x
        self.y = start_y

    def insert_object(self, frame):
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo


# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To detect movement (to get the background)
background_subtractor = cv2.createBackgroundSubtractorMOG2()

# This will create an object
obj = Object()
# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()
    # Flip the frame
    frame = cv2.flip(frame, 1)

    # Get the foreground mask (it is gray scale)
    fg_mask = background_subtractor.apply(frame)
    # Convert the gray scale to black and white with a threshold
    # Change the 250 threshold fitting your webcam and needs
    # - Setting it lower will make it more sensitive (also to noise)
    _, fg_mask = cv2.threshold(fg_mask, 250, 255, cv2.THRESH_BINARY)

    # Insert the object into the frame
    obj.insert_object(frame)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)
    # To see the foreground mask
    cv2.imshow('fg_mask', fg_mask)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

This results in the following output.

Output – again, don’t expect to see me when you run this example on your computer

As you see, it does a decent job of detecting movement. Sometimes your movements leave a shadow behind. Hence, it is not perfect.

Step 4: Detecting movement where the object is and move it accordingly

This is the tricky part. But let’s break it down simply.

  • We need to detect if the mask, we created in previous step, is overlapping with the object (logo).
  • If so, we want to move the object (logo).

That is what we want to achieve.

How do we do that?

  • Detect whether there is an overlap by using the mask we created for the logo and checking if it overlaps with any points in the movement mask.
  • If so, we move the object by choosing a random movement and measuring how much overlap there is. Then we choose another random movement and see if the overlap is smaller.
  • We continue this a few times and choose the random movement with the least overlap.

This turns out, by chance, to move the object away from the overlapping areas. That is the power of introducing some randomness: it simplifies the algorithm a lot.

A more precise approach would be to calculate in which direction the movement mask overlaps the object (logo) the least. That becomes quite complicated and needs a lot of calculations. Hence, we chose this simple approach, which gives both a speed and a direction element and works fairly well.

All we need to do is add an update_position function to our class and call it before we insert the logo.

import cv2
import numpy as np


# Object class to insert logo
class Object:
    def __init__(self, start_x=100, start_y=100, size=50):
        self.logo_org = cv2.imread('logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        self.x = start_x
        self.y = start_y
        self.on_mask = False

    def insert_object(self, frame):
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

    def update_position(self, mask):
        height, width = mask.shape

        # Check if object is overlapping with moving parts
        roi = mask[self.y:self.y + self.size, self.x:self.x + self.size]
        check = np.any(roi[np.where(self.logo_mask)])

        # If object has moving parts, then find new position
        if check:
            # To save the best possible movement
            best_delta_x = 0
            best_delta_y = 0
            best_fit = np.inf
            # Try 8 different positions
            for _ in range(8):
                # Pick a random position
                delta_x = np.random.randint(-15, 15)
                delta_y = np.random.randint(-15, 15)

                # Ensure we are inside the frame, if outside, skip and continue
                if self.y + self.size + delta_y > height or self.y + delta_y < 0 or \
                        self.x + self.size + delta_x > width or self.x + delta_x < 0:
                    continue

                # Calculate how much overlap
                roi = mask[self.y + delta_y:self.y + delta_y + self.size, self.x + delta_x:self.x + delta_x + self.size]
                check = np.count_nonzero(roi[np.where(self.logo_mask)])
                # If perfect fit (no overlap), just return
                if check == 0:
                    self.x += delta_x
                    self.y += delta_y
                    return
                # If a better fit found, save it
                elif check < best_fit:
                    best_fit = check
                    best_delta_x = delta_x
                    best_delta_y = delta_y

            # After for-loop, update to best fit (if any found)
            if best_fit < np.inf:
                self.x += best_delta_x
                self.y += best_delta_y
                return


# Get the webcam (default webcam is 0)
cap = cv2.VideoCapture(0)
# If your webcam does not support 640 x 480, this will find another resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To detect movement (to get the background)
background_subtractor = cv2.createBackgroundSubtractorMOG2()

# This will create an object
obj = Object()
# Loop forever (or until break)
while True:
    # Read a frame from the webcam
    _, frame = cap.read()
    # Flip the frame
    frame = cv2.flip(frame, 1)
    # Get the foreground mask (it is gray scale)
    fg_mask = background_subtractor.apply(frame)
    # Convert the gray scale to black and white with a threshold
    # Change the 250 threshold fitting your webcam and needs
    # - Setting it lower will make it more sensitive (also to noise)
    _, fg_mask = cv2.threshold(fg_mask, 250, 255, cv2.THRESH_BINARY)

    # Find a new position for object (logo)
    # - fg_mask contains all moving parts
    # - updated position will be the one with least moving parts
    obj.update_position(fg_mask)
    # Insert the object into the frame
    obj.insert_object(frame)

    # Show the frame in a window
    cv2.imshow('WebCam', frame)
    # To see the fg_mask uncomment the line below
    # cv2.imshow('fg_mask', fg_mask)

    # Check if q has been pressed to quit
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

Step 5: Test it

Well, this is the fun part. See a live demo in the video below.

The final result

What is next step?

I would be happy to hear any suggestions from you. I see a lot of potential improvements, but the conceptual idea is explained and shown in this tutorial.

Create Cartoon Characters in Live Webcam Stream with OpenCV and Python

What will we cover in this tutorial?

How to convert the foreground characters of a live webcam feed into cartoons, while keeping the background as it is.

In this tutorial we will show how this can be done using OpenCV and Python in a few lines of code. The result can be seen in the YouTube video below.

Step 1: Find the moving parts

The big challenge is to identify what is the background and what is the foreground.

This can be done in various ways, but we want to keep it quite accurate and not just identify boxes around moving objects. We actually want the contours of the objects, and we want to fill them out.

While this sounds easy, it is a bit challenging. Still, we will try to do it as simply as possible.

The first step is to keep the last frame and subtract it from the current frame. This gives all the moving parts. It should be done on a gray scale image.

import cv2
import numpy as np

# Setup camera
cap = cv2.VideoCapture(0)
# Set a smaller resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Just a dummy frame, will be overwritten
last_foreground = np.zeros((480, 640), dtype='uint8')
while True:
    # Capture frame-by-frame
    _, frame = cap.read()
    # Only needed if your webcam does not support 640x480
    frame = cv2.resize(frame, (640, 480))
    # Flip it to mirror you
    frame = cv2.flip(frame, 1)
    # Convert to gray scale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Keep the foreground
    foreground = gray
    # Take the absolute difference
    abs_diff = cv2.absdiff(foreground, last_foreground)
    # Update the last foreground image
    last_foreground = foreground

    cv2.imshow('WebCam (Mask)', abs_diff)
    cv2.imshow('WebCam (frame)', frame)
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

This results in the following output, with a gray scale contour of the moving parts of the image. If you need help installing OpenCV read this tutorial.

Step 2: Using a threshold

To make the contour more visible you can use a threshold (cv2.threshold(…)).

import cv2
import numpy as np

# Setup camera
cap = cv2.VideoCapture(0)
# Set a smaller resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Just a dummy frame, will be overwritten
last_foreground = np.zeros((480, 640), dtype='uint8')
while True:
    # Capture frame-by-frame
    _, frame = cap.read()
    # Only needed if your webcam does not support 640x480
    frame = cv2.resize(frame, (640, 480))
    # Flip it to mirror you
    frame = cv2.flip(frame, 1)
    # Convert to gray scale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Keep the foreground
    foreground = gray
    # Take the absolute difference
    abs_diff = cv2.absdiff(foreground, last_foreground)
    # Update the last foreground image
    last_foreground = foreground

    _, mask = cv2.threshold(abs_diff, 20, 255, cv2.THRESH_BINARY)
 
    cv2.imshow('WebCam (Mask)', mask)
    cv2.imshow('WebCam (frame)', frame)
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

Resulting in this output

Using the threshold makes the image black and white. This makes it easier to detect the moving parts.

Step 3: Fill out the enclosed contours

To fill out the enclosed contours you can use morphologyEx. We also use dilate to make the lines thicker and enclose the parts better.

import cv2
import numpy as np

# Setup camera
cap = cv2.VideoCapture(0)
# Set a smaller resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Just a dummy frame, will be overwritten
last_foreground = np.zeros((480, 640), dtype='uint8')
while True:
    # Capture frame-by-frame
    _, frame = cap.read()
    # Only needed if your webcam does not support 640x480
    frame = cv2.resize(frame, (640, 480))
    # Flip it to mirror you
    frame = cv2.flip(frame, 1)
    # Convert to gray scale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Keep the foreground
    foreground = gray
    # Take the absolute difference
    abs_diff = cv2.absdiff(foreground, last_foreground)
    # Update the last foreground image
    last_foreground = foreground

    _, mask = cv2.threshold(abs_diff, 20, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=3)
    se = np.ones((85, 85), dtype='uint8')
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, se)

    cv2.imshow('WebCam (Mask)', mask)
    cv2.imshow('WebCam (frame)', frame)
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

Resulting in the following output.

Me, happy, next to a white shadow ghost of myself

Step 4: Creating cartoon effect and mask it into the foreground

The final step is to create a cartoon version of the frame (cv2.stylization()).

    frame_effect = cv2.stylization(frame, sigma_s=150, sigma_r=0.25)

Then we use the foreground mask to apply the cartoon effect only where there is movement. This results in the following code.

import cv2
import numpy as np

# Setup camera
cap = cv2.VideoCapture(0)
# Set a smaller resolution
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Just a dummy frame, will be overwritten
last_foreground = np.zeros((480, 640), dtype='uint8')
while True:
    # Capture frame-by-frame
    _, frame = cap.read()
    # Only needed if your webcam does not support 640x480
    frame = cv2.resize(frame, (640, 480))
    # Flip it to mirror you
    frame = cv2.flip(frame, 1)
    # Convert to gray scale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Keep the foreground
    foreground = gray
    # Take the absolute difference
    abs_diff = cv2.absdiff(foreground, last_foreground)
    # Update the last foreground image
    last_foreground = foreground

    _, mask = cv2.threshold(abs_diff, 20, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=3)
    se = np.ones((85, 85), dtype='uint8')
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, se)

    frame_effect = cv2.stylization(frame, sigma_s=150, sigma_r=0.25)
    idx = (mask > 1)
    frame[idx] = frame_effect[idx]

    # cv2.imshow('WebCam (Mask)', mask)
    cv2.imshow('WebCam (frame)', frame)
    if cv2.waitKey(1) == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

Step 5: Try it in real life

I must say the cartoon effect is heavy (i.e., slow). But other than that, it works fine.

Understand How Color to Gray Scale Works Using OpenCV

From color to gray scale

The first thing to understand is that when we convert a color image to a gray scale image it will lose information. That means, you cannot convert a color image to gray scale and back to a color image without losing quality.

import cv2

img = cv2.imread("image.jpeg")
img = cv2.resize(img, (200, 300))
cv2.imshow("Original", img)

# OpenCV can convert it to gray scale for you
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("Gray", gray)

# And convert it back to color
color = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
cv2.imshow("Color", color)

Resulting in the following output.

Output

And as you see, the conversion back to color only copies the gray scale value into each of the 3 color channels.

Why do we lose information?

The key to understand is that a color image has three channels for each pixel, while a gray scale image only has one channel.

See the following illustration.

Describing the frame

As the above shows, a gray scale frame only contains one number for each pixel, while the color image contains 3 numbers per pixel.
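You can see this directly on the NumPy array shapes (a small sketch, assuming the same image.jpeg as in the code above):

import cv2

img = cv2.imread("image.jpeg")
img = cv2.resize(img, (200, 300))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

print(img.shape)   # (300, 200, 3) - three numbers per pixel
print(gray.shape)  # (300, 200)    - one number per pixel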

So how does the conversion happen?

How OpenCV converts to gray scale image

If you look in the documentation of cvtColor(…) you can find the conversion calculation. For BGR to gray it is:

gray = 0.299*R + 0.587*G + 0.114*B

Hence, we can make the same calculation ourselves.

import cv2
import numpy as np


img = cv2.imread("image.jpeg")
img = cv2.resize(img, (200, 300))
cv2.imshow("Original", img)

# The channels are BGR, hence the order is opposite
gray = img[:, :, 2]*0.299 + img[:, :, 1]*0.587 + img[:, :, 0]*0.114
gray = gray.astype(np.uint8)
cv2.imshow("Gray", gray)

cvt_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow("CVT_Gray", cvt_gray)

Resulting in the following output.

Output

Which seems to be quite close.

The conversion is not unique

Looking at Wikipedia, there are other common ways to convert to gray scale.

Let’s try and see if we can see the difference.

import cv2
import numpy as np


img = cv2.imread("image.jpeg")
img = cv2.resize(img, (200, 300))

gray = img[:, :, 2]*0.299 + img[:, :, 1]*0.587 + img[:, :, 0]*0.114
gray = gray.astype(np.uint8)
cv2.imshow("Gray", gray)

gray = img[:, :, 2]*0.2126 + img[:, :, 1]*0.7152 + img[:, :, 0]*0.0722
gray = gray.astype(np.uint8)
cv2.imshow("Gray HDTV", gray)

gray = img[:, :, 2]*0.2627 + img[:, :, 1]*0.6780 + img[:, :, 0]*0.0593
gray = gray.astype(np.uint8)
cv2.imshow("Gray HDR", gray)

Resulting in the following.

I think you need higher resolution to really appreciate the difference.

Convert the color channels directly to gray scale

If you look at the conversions, they all favor green as the main value. Let’s see if we can see the difference if we only use one channel at a time as the gray scale image.

import cv2
import numpy as np


img = cv2.imread("image.jpeg")
img = cv2.resize(img, (200, 300))

gray = img[:, :, 2]
gray = gray.astype(np.uint8)
cv2.imshow("Red", gray)

gray = img[:, :, 1]
gray = gray.astype(np.uint8)
cv2.imshow("Green", gray)

gray = img[:, :, 0]
gray = gray.astype(np.uint8)
cv2.imshow("Blue", gray)

The result.

Here it is easier to see the difference.

Create Cartoon Background in Webcam Stream using OpenCV

What will we cover in this tutorial?

How to create this effect.

Create this effect in a few lines of code

The idea behind the code

The idea behind the above effect is simple. We will use a background subtractor, which will get the background of an image and make a mask of the foreground.

Then it simply follows this structure.

  1. Capture a frame from the webcam.
  2. Get the foreground mask fg_mask.
  3. To get greater effect dilate the fg_mask.
  4. From the original frame, create a cartoon frame.
  5. Use the zero entries of fg_mask as an index to copy the cartoon frame into frame. That is, all pixels corresponding to a zero (black) value in fg_mask are overwritten with the values from the cartoon frame. The result is that we only get the cartoon effect in the background and not on the moving objects.
  6. Show the frame with background cartoon effect.

The code you need to create the above effect

This is all done by using OpenCV. If you need help to install OpenCV I suggest you read this tutorial. Otherwise the code follows the above steps.

import cv2

backSub = cv2.createBackgroundSubtractorKNN(history=200)

cap = cv2.VideoCapture(0)

while True:
    _, frame = cap.read()

    fg_mask = backSub.apply(frame)
    fg_mask = cv2.dilate(fg_mask, None, iterations=2)

    _, cartoon = cv2.pencilSketch(frame, sigma_s=50, sigma_r=0.3, shade_factor=0.02)

    idx = (fg_mask < 1)
    frame[idx] = cartoon[idx]
    cv2.imshow('Frame', frame)
    cv2.imshow('FG Mask', fg_mask)

    keyboard = cv2.waitKey(1)
    if keyboard == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()

OpenCV + Python + Webcam: Create a Simple Game (Avoid the falling logo)

What will we cover in this tutorial?

Is it a bird? Is it a plane? No, it is objects falling from the sky.

With OpenCV you can get a stream of frames from your webcam and process the data in Python to create an easy prototype of a simple game, where you should avoid the falling objects. We will cover how to build that in this tutorial.

Step 1: The game explained

The game is quite simple and is built based on a few ideas.

  1. Setup a live stream from your webcam.
  2. Insert falling objects that start at the top of the frame at a random horizontal position.
  3. If an object hits you (the player), you lose one point from your score.
  4. On the other hand, if you avoid the object and it hits the bottom of the frame without hitting you, you gain one point.
  5. Play until bored or tired.

Step 2: How can this game be built easy?

You basically need three components to create this game.

Firstly, we need a way to take and process each frame from the webcam. That is, create a live stream that can show you what is happening in the view of the frame. This will only require a way to read a frame from the webcam and show it on the screen. If this is done repeatedly, you have a live stream.

Secondly, something that can make objects fall down in your frame from a random position. That is, it should remember where the object was in the last frame and insert it at a new, lower position in the new frame.

Thirdly, something that can detect where you are in the frame, so we can tell if the object and you are in the same position. If so, a point is subtracted from your score and a new object is created at the top.

The great news is that we can make all of this quite simply by using Python.

Step 3: Get a live stream from a webcam


A simple stream from the webcam can be created by the following code.

import cv2


# Capture the webcam
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)


while True:
    # Get a frame
    _, frame = cap.read()

    # Update the frame in the window
    cv2.imshow("Webcam", frame)

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) == ord('q'):
        break

# Release the webcam and destroy windows
cap.release()
cv2.destroyAllWindows()

If you have trouble installing OpenCV, please read this tutorial. You might need to change the width and height of the webcam. You can find all the resolutions your webcam supports by following this tutorial. The reason to lower the resolution is to decrease the processing time per frame so the game does not become slow.

Another approach, where you can keep the full resolution, is to only resize the copies of the frames you do the processing on, as sketched below.
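Here is a minimal sketch of that approach, where the full-resolution frame is shown but a smaller copy is used for the processing (the 320x240 size is just an example):

import cv2

cap = cv2.VideoCapture(0)
while True:
    _, frame = cap.read()
    small = cv2.resize(frame, (320, 240))            # process this smaller copy
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)   # the motion detection would work on this
    cv2.imshow("Webcam", frame)                      # show the full-resolution frame
    if cv2.waitKey(1) == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()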

Step 4: Motion detection

The idea behind a simple motion detector is to have a picture of the background. Then for each frame you subtract the background from the frame. This identifies all new objects in the frame.

To get a good picture of the background, it is a good idea to let the webcam film for a few frames first, as it often needs time to adjust.
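One way to do that, sketched here under the assumption that roughly 30 frames are enough for the camera to adjust (the code below instead lets you press q when the view looks right), is to discard the first frames before storing the background:

import cv2

cap = cv2.VideoCapture(0)

# Discard some frames while the webcam adjusts exposure and white balance
for _ in range(30):
    _, bg_frame = cap.read()

# Keep the last frame as the background, in gray and blurred
bg_gray = cv2.cvtColor(bg_frame, cv2.COLOR_BGR2GRAY)
bg_gray = cv2.GaussianBlur(bg_gray, (5, 5), 0)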

The idea is mapped out here.

import cv2
import numpy as np

# Capture the webcam
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To capture the background - take a few iterations to stabilize view
while True:
    # Get the next frame
    _, bg_frame = cap.read()
    bg_frame = cv2.flip(bg_frame, 1)

    # Update the frame in the window
    cv2.imshow("Webcam", bg_frame)

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Processing of frames are done in gray
bg_gray = cv2.cvtColor(bg_frame, cv2.COLOR_BGR2GRAY)
# We blur it to minimize reaction to small details
bg_gray = cv2.GaussianBlur(bg_gray, (5, 5), 0)


# This is where the game loop starts
while True:
    # Get the next frame
    _, frame = cap.read()
    frame = cv2.flip(frame, 1)

    # Processing of frames are done in gray
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # We blur it to minimize reaction to small details
    gray = cv2.GaussianBlur(gray, (5, 5), 0)

    # Get the difference from last_frame
    delta_frame = cv2.absdiff(bg_gray, gray)
    # Have some threshold on what is enough movement
    thresh = cv2.threshold(delta_frame, 100, 255, cv2.THRESH_BINARY)[1]
    # This dilates with two iterations
    thresh = cv2.dilate(thresh, None, iterations=2)
    cv2.imshow("track", thresh)

    # Update the frame in the window
    cv2.imshow("Webcam", frame)

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) == ord('q'):
        break

# Release the webcam and destroy windows
cap.release()
cv2.destroyAllWindows()

First, let the background be as you want it (empty, without you in it). Then press q to capture the background image that is subtracted in the second loop.

The output of the second loop could look similar to this (Maybe with you instead of me).


For a more detailed explanation of a motion tracker, you can read this tutorial on how to make a motion detector.

Step 5: Adding falling objects

This will be done by inserting an object in our frame and simply moving it downwards frame-by-frame.

To keep all functionality related to the object in one place, I made a class Object.

The full code is here.

import cv2
import numpy as np

# Capture the webcam
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# To capture the background - take a few iterations to stabilize view
while True:
    # Get the next frame
    _, bg_frame = cap.read()
    bg_frame = cv2.flip(bg_frame, 1)

    # Update the frame in the window
    cv2.imshow("Webcam", bg_frame)

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Processing of frames are done in gray
bg_gray = cv2.cvtColor(bg_frame, cv2.COLOR_BGR2GRAY)
# We blur it to minimize reaction to small details
bg_gray = cv2.GaussianBlur(bg_gray, (5, 5), 0)


# Read the logo to use later
class Object:
    def __init__(self, size=50):
        self.logo_org = cv2.imread('logo.png')
        self.size = size
        self.logo = cv2.resize(self.logo_org, (size, size))
        img2gray = cv2.cvtColor(self.logo, cv2.COLOR_BGR2GRAY)
        _, logo_mask = cv2.threshold(img2gray, 1, 255, cv2.THRESH_BINARY)
        self.logo_mask = logo_mask
        self.speed = 15
        self.x = 100
        self.y = 0
        self.score = 0

    def insert_object(self, frame):
        roi = frame[self.y:self.y + self.size, self.x:self.x + self.size]
        roi[np.where(self.logo_mask)] = 0
        roi += self.logo

    def update_position(self, thresh):
        height, width = thresh.shape
        self.y += self.speed
        if self.y + self.size > height:
            self.y = 0
            self.x = np.random.randint(0, width - self.size - 1)
            self.score += 1

        # Check for collision
        roi = thresh[self.y:self.y + self.size, self.x:self.x + self.size]
        check = np.any(roi[np.where(self.logo_mask)])
        if check:
            self.score -= 1
            self.y = 0
            self.x = np.random.randint(0, width - self.size - 1)
            # self.speed += 1
        return check


# Let's create the object that will fall from the sky
obj = Object()

# This is where the game loop starts
while True:
    # Get the next frame
    _, frame = cap.read()
    frame = cv2.flip(frame, 1)

    # Processing of frames are done in gray
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # We blur it to minimize reaction to small details
    gray = cv2.GaussianBlur(gray, (5, 5), 0)

    # Get the difference from last_frame
    delta_frame = cv2.absdiff(bg_gray, gray)
    # Have some threshold on what is enough movement
    thresh = cv2.threshold(delta_frame, 100, 255, cv2.THRESH_BINARY)[1]
    # This dilates with two iterations
    thresh = cv2.dilate(thresh, None, iterations=2)
    # cv2.imshow("track", thresh)

    hit = obj.update_position(thresh)
    obj.insert_object(frame)

    # To make the screen white when you get hit
    if hit:
        frame[:, :, :] = 255

    text = f"Score: {obj.score}"
    cv2.putText(frame, text, (10, 20), cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 0), 2)
    # Update the frame in the window
    cv2.imshow("Webcam", frame)

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) == ord('q'):
        break

# Release the webcam and destroy windows
cap.release()
cv2.destroyAllWindows()

But does it work?

Trying the game

No, I am not too old for that stuff. Or maybe I was just having one of those days.

It works, but of course it could be improved in many ways.

OpenCV + Python + Webcam: Create a Ghost Effect

What will we cover in this tutorial?

A ghost effect is when multiple images are combined into one image. In this tutorial we will see how this effect can be made effectively and easily. We will add a trackbar, so the strength of the effect can be adjusted.

Step 1: Understand how to create the ghost effect

We will start simple with only two images. Consider the following two images, which have the same background.

They can be combined using the OpenCV library, as the following code shows.

import cv2


img1 = cv2.imread("image1.png")
img2 = cv2.imread("image2.png")

img3 = cv2.addWeighted(src1=img1, alpha=0.5, src2=img2, beta=0.5, gamma=0.0)

cv2.imwrite("image3.png", img3)

The cv2.imread(…) reads the images into the variables img1 and img2, where the above two images are named image1.png and image2.png, respectively.

Then the cv2.addWeighted(…) is where the magic happens. The src1 and src2 parameters each take an image, while alpha and beta determine how much weight each image should have. Here we have chosen 50% (0.5) each. It is a good rule to let the weights add up to 100% (1.0).

Hence, the resulting image will be a 50% weighted blend of the two input images. The result can be seen here.
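If you want a different balance between the two images, a simple way, sketched below, is to pick alpha and derive beta from it so the weights still sum to 1.0 (the 0.7/0.3 split and the output file name are just examples, not part of the tutorial above):

import cv2


img1 = cv2.imread("image1.png")
img2 = cv2.imread("image2.png")

# Let img1 dominate with 70% weight; beta is derived so the weights sum to 1.0
alpha = 0.7
beta = 1.0 - alpha
img3 = cv2.addWeighted(src1=img1, alpha=alpha, src2=img2, beta=beta, gamma=0.0)

cv2.imwrite("image3_weighted.png", img3)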

The function cv2.addWeighted(…) can be used to create a ghost effect in a live stream from a webcam.

Step 2: Understanding the webcam stream

To understand how the processing flow from a webcam works, it is easiest to illustrate it with some simple code. If you are new to OpenCV and need it installed, please read this tutorial.

import cv2


# Capture the webcam
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Pre-processing should be done here

while True:
    # Capture the frame from the webcam
    _, frame = cap.read()

    # Processing should be done here

    # Show the frame to a window
    cv2.imshow("Webcam", frame)

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) == ord('q'):
        break

# Release the webcam and destroy windows
cap.release()
cv2.destroyAllWindows()

The above code shows the simple flow of capturing a frame from the webcam and showing it in a window. It is important to notice that each image (or frame) from the webcam is handled individually.

This is handy if we want to process it.

Step 3: Adding a ghost effect in the processing pipeline

We know from Step 1 how to make a simple ghost effect with two images. If we use that simple approach, we can do it frame-by-frame by saving the previous frame. This also makes the effect last more than one frame back, since each blended frame already carries traces of the frames before it.

import cv2


# Capture the webcam
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Pre-processing should be done here
_, last_frame = cap.read()
while True:
    # Capture the frame from the webcam
    _, frame = cap.read()

    # Processing
    if frame.shape == last_frame.shape:
        frame = cv2.addWeighted(src1=frame, alpha=0.5, src2=last_frame, beta=0.5, gamma=0.0)

    # Show the frame to a window
    cv2.imshow("Webcam", frame)

    # Update last_frame
    last_frame = frame

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) == ord('q'):
        break

# Release the webcam and destroy windows
cap.release()
cv2.destroyAllWindows()

That creates a simple ghost effect. It is not very strong, but you can change the values of alpha and beta to make it stronger.

But we can actually add a trackbar in order to change the value.

Step 4: Adding a trackbar to the window

This is just an add-on to the above that will enable you to change the ghost effect while you stream your webcam. To add a trackbar we need a few things. First, we need a variable that can be accessed anywhere in the code to keep the state of the ghost effect (ghost_effect). Also, we need a named window (cv2.namedWindow(…)), so the trackbar can be attached to the same window that shows the webcam stream.

Then we have the callback function on_ghost_trackbar(val), which updates both the trackbar position in the named window and the global variable ghost_effect. The call to cv2.createTrackbar(…) registers on_ghost_trackbar as the callback. This ensures that every time you move the trackbar, on_ghost_trackbar is called with the new value, where you update the ghost_effect variable, which in turn sets the weights in the cv2.addWeighted(…) call.

import cv2

# A global variable with the ghost effect
ghost_effect = 0.0
# Setup a window that can be referenced
window = "Webcam"
cv2.namedWindow(window)


# Used by the trackbar to change the ghost effect
def on_ghost_trackbar(val):
    global ghost_effect
    global window

    ghost_effect = val / 100.0
    cv2.setTrackbarPos("Shadow", window, val)


# Capture the webcam
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Create a trackbar
cv2.createTrackbar("Ghost effect", window, 0, 100, on_ghost_trackbar)


# Get the first frame
_, last_frame = cap.read()
while True:
    # Get the next frame
    _, frame = cap.read()

    # Add the ghost effect
    if frame.shape == last_frame.shape:
        frame = cv2.addWeighted(src1=frame, alpha=1 - ghost_effect, src2=last_frame, beta=ghost_effect, gamma=0.0)

    # Update the frame in the window
    cv2.imshow(window, frame)
    
    # Update last_frame
    last_frame = frame

    # Check if q is pressed, terminate if so
    if cv2.waitKey(1) == ord('q'):
        break

# Release the webcam and destroy windows
cap.release()
cv2.destroyAllWindows()

While adding the trackbar to the window makes the code a bit more complex, the functionality is the same.

Step 5: Test the ghost effect

Now we just need to test it to see if we got the desired effect.

Have fun and play with that.