OpenCV and Python: Simple Noise-tolerant Motion Detector

What will we cover in this tutorial?

A simple motion detector can be created by taking the difference between two images of the same scene: a base image containing only the static elements, compared with a new image of the same scene, to identify what has changed.

The challenge with such an approach is that it is not noise-tolerant, and it can be difficult to ensure that the base image stays correct.

In this tutorial we will see how to create a simple noise-tolerant motion detector.

The result of this tutorial

Step 1: Understand how we can make it noise-tolerant

In the previous tutorial we made a simple car counter. The approach was to compare the last two frames and take the difference between them. While it works, it is not very robust against simple noise and creates a lot of false positives.

To deal with that, we modify the approach slightly: the background frame becomes the plain average of the last 30 frames, and the current (foreground) frame becomes a weighted average of the last 10 frames, as sketched below.
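
To give an idea of the difference between the two, here is a minimal sketch of the two averages, assuming the frames are NumPy float32 arrays (the actual implementation follows in Step 2; the helper names here are just for illustration):

import numpy as np
from collections import deque

fg_frames = deque(maxlen=10)   # short history -> weighted average (foreground)
bg_frames = deque(maxlen=30)   # long history -> plain average (background)


def weighted_average(frames):
    # Weights 1, 2, ..., N give newer frames more influence
    weights = np.arange(1, len(frames) + 1, dtype='float32')
    return np.average(np.stack(list(frames)), axis=0, weights=weights)


def plain_average(frames):
    return np.mean(np.stack(list(frames)), axis=0)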

The process is as the illustration shows.

Process
  1. The first step is to resize the image, for two reasons. First, we don’t need the full detail level further down the pipeline. Second, it makes processing faster.
  2. Then we blur the image to minimize the impact of small details.
  3. This step adds the image to a foreground list, which is used to calculate the foreground image. This list can hold, say, 10 frames. We calculate the foreground image from it using a weighted average, giving more weight to newer frames.
  4. The frame is also added to the longer background list. The output of that list is a plain average.
  5. The difference is calculated between the output of the foreground list and the output of the background list.
  6. After that, movement is detected in the difference image.
  7. The areas with movement are enclosed in boxes.
  8. Finally, the boxes are scaled back to the original frame and drawn on it.

That is the process.

Step 2: The implementation of the above algorithm

The code is commented with the numbering from above to make it easier to follow. The focus has been to keep it simple. If you need to install OpenCV, you can read this tutorial.

import cv2
import numpy as np
import imutils
import time
from collections import deque


# Input to Step 5: Helper function
# Calculate the foreground frame based on frames
def get_movement(frames, shape):
    movement_frame = np.zeros(shape, dtype='float32')
    i = 0
    for f in frames:
        i += 1
        movement_frame += f * i
    # The divisor is the sum of the weights: 1 + 2 + ... + i = (1 + i) / 2 * i
    movement_frame = movement_frame / ((1 + i) / 2 * i)
    movement_frame[movement_frame > 254] = 255
    return movement_frame


# Input to Step 5: Helper function
# Calculate the background frame based on frames
# This function has obvious improvement potential
# - Could avoid recalculating the full list every time
def get_background(frames, shape):
    bg = np.zeros(shape, dtype='float32')
    for frame in frames:
        bg += frame
    bg /= len(frames)
    bg[bg > 254] = 255
    return bg


# Detect and return boxes of moving parts
def detect(frame, bg_frames, fg_frames, threshold=20, min_box=200):
    # Step 3-4: Add the frame to our lists of foreground and background frames
    fg_frames.append(frame)
    bg_frames.append(frame)

    # Input to Step 5: Calculate the foreground and background frame based on the lists
    fg_frame = get_movement(list(fg_frames), frame.shape)
    bg_frame = get_background(list(bg_frames), frame.shape)

    # Step 5: Calculate the difference to detect movement
    movement = cv2.absdiff(fg_frame, bg_frame)
    movement[movement < threshold] = 0
    movement[movement > 0] = 254
    movement = movement.astype('uint8')
    movement = cv2.cvtColor(movement, cv2.COLOR_BGR2GRAY)
    movement[movement > 0] = 254
    # As we don't return the movement frame, we show it here for debug purposes
    # Should be removed before release
    cv2.imshow('Movement', movement)

    # Step 6: Find the list of contours
    contours = cv2.findContours(movement, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = imutils.grab_contours(contours)

    # Step 7: Convert them to boxes
    boxes = []
    for contour in contours:
        # Ignore small boxes
        if cv2.contourArea(contour) < min_box:
            continue
        # Convert the contour to a box and append it to the list
        box = cv2.boundingRect(contour)
        boxes.append(box)

    return boxes


def main(width=640, height=480, scale_factor=2):
    # Create the buffer of our lists
    bg_frames = deque(maxlen=30)
    fg_frames = deque(maxlen=10)

    # Get the webcam
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

    # We want to see how many frames per second we process
    last_time = time.time()
    while True:
        # Step 0: Read the webcam frame (ignore return code)
        _, frame = cap.read()

        # Resize the frame
        frame = cv2.resize(frame, (width, height))
        # Step 1: Scale down to improve speed (only takes integer scale factors)
        work_frame = cv2.resize(frame, (width // scale_factor, height // scale_factor))
        # Step 2: Blur it and convert the frame to float32
        work_frame = cv2.GaussianBlur(work_frame, (5, 5), 0)
        work_frame_f32 = work_frame.astype('float32')

        # Step 3-7 (steps in function): Detect all the boxes around the moving parts
        boxes = detect(work_frame_f32, bg_frames, fg_frames)

        # Step 8: Draw all boxes (remember to scale back up)
        for x, y, w, h in boxes:
            cv2.rectangle(frame, (x * scale_factor, y * scale_factor), ((x + w) * scale_factor, (y + h) * scale_factor),
                          (0, 255, 0), 2)

        # Add the Frames Per Second (FPS) and show frame
        text = "FPS:" + str(int(1 / (time.time() - last_time)))
        last_time = time.time()
        cv2.putText(frame, text, (10, 20), cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 0), 2)
        cv2.imshow('Webcam', frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()

Step 3: Test it

Let’s try it out in real life.

The above trial was done while recording from my laptop, which makes it quite slow (need a new one). The frame rate (FPS) without recording was about 30 FPS.

The detection can be tuned to the environment you are in. You have the following parameters to adjust.

  • width (default: 640): The width of the webcam frame. It only has a small impact on performance. Notice that if you change it, it also affects the size of the processed image through scale_factor.
  • height (default: 480): The same as for width.
  • scale_factor (default: 2): Scales the processed image down. As implemented, it only works with integer scale factors.
  • threshold (default: 20): Adjusts how much change is needed before it counts as movement. The lower the value, the more it will detect.
  • min_box (default: 200): The area of the smallest box it will detect.
  • bg_frames = deque(maxlen=30): The number of frames to keep in the background list. It should be larger than fg_frames.
  • fg_frames = deque(maxlen=10): The number of frames to keep in the foreground list.

Those are the most obvious parameters you can adjust in your program, as illustrated below. Notice that some of them affect each other. It is kept that way to keep the code simple.
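
As an illustration, a hypothetical tuning with a longer background history and stricter detection could look like this (the values are made up, not recommendations):

bg_frames = deque(maxlen=60)   # longer background history
fg_frames = deque(maxlen=10)

# ... inside the while loop ...
boxes = detect(work_frame_f32, bg_frames, fg_frames, threshold=30, min_box=400)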

Next steps

If you increase bg_frames to, say, 120, you will see that it slows down the processing. This is due to the long calculation done in the get_background function. It can obviously be improved, as it is a simple average we calculate. The reason I did not do it is to keep the code simple and understandable.
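
One way to improve it, sketched here as an assumption about how the code could be restructured (the helper below is hypothetical, not part of the tutorial code): keep a running sum and update it incrementally instead of re-summing all frames.

import numpy as np
from collections import deque

bg_frames = deque(maxlen=30)
bg_sum = None  # running sum of the frames currently in bg_frames


def update_background(frame):
    global bg_sum
    if bg_sum is None:
        bg_sum = np.zeros(frame.shape, dtype='float32')
    if len(bg_frames) == bg_frames.maxlen:
        bg_sum -= bg_frames[0]  # subtract the frame about to be evicted
    bg_frames.append(frame)
    bg_sum += frame
    bg = bg_sum / len(bg_frames)
    bg[bg > 254] = 255
    return bg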

Also, I would like to try it out in real life to see how to fine-tune it at home, and see how to set the parameters to avoid false positives due to changes in the lighting (if the sun suddenly shines into the room, or a cloud covers it).

How to Get Started with Yolo in Python

What will we cover in this tutorial?

How do you get started with YOLO in Python? What do you need to download? This tutorial also gives a simple guide to using it from Python. The code is kept as simple as possible and explained along the way.

Step 1: Download the Yolo stuff

The easy way to get things working is to just download the repository from GitHub as a zip file. You find the darknet repository here.

You can also download it as a zip directly from here. The zip file should be unpacked in the folder where you develop your code. I renamed the resulting folder to yolo.

The next thing you need is the trained model, which you find on https://pjreddie.com/darknet/yolo/. Look for the following on the page and click on the weights.

We will use the YOLOv3-tiny, which you also can get directly from here.

The downloaded file should be placed in the folder where you develop your code.

Step 2: Load the network and apply it on an image

The code below is structured as follows. First you configure the location of the downloaded repository. Remember, I put it in the folder where I run my program and renamed it to yolo.

It then loads the labels of the possible objects, which are located in a file called coco.names. This is needed because the labels the network returns are indices into the names in coco.names. Further, it assigns some random colors to the labels, such that different labels get different colors.

After that it reads the network and determines the output layers. It is a bit unintuitive, but in the case of yolov3-tiny.cfg there are only two output layers, which is what it extracts there.
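
A small caveat, depending on your OpenCV version (this is a note on newer releases, not part of the original setup): getUnconnectedOutLayers() may return plain integers instead of one-element arrays. In that case the list comprehension needs to drop the [0] indexing:

ln = [ln[i - 1] for i in net.getUnconnectedOutLayers()]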

It loads the image (from the repository), transforms it into a blob that the network understands, and runs the network on it.

import numpy as np
import time
import cv2
import os


DARKNET_PATH = 'yolo'

# Read labels that are used on object
labels = open(os.path.join(DARKNET_PATH, "data", "coco.names")).read().splitlines()
# Make random colors with a seed, such that they are the same next time
np.random.seed(0)
colors = np.random.randint(0, 255, size=(len(labels), 3)).tolist()

# Give the configuration and weight files for the model and load the network.
net = cv2.dnn.readNetFromDarknet(os.path.join(DARKNET_PATH, "cfg", "yolov3-tiny.cfg"), "yolov3-tiny.weights")
# Determine the output layer, now this piece is not intuitive
ln = net.getLayerNames()
ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]

# Load the image
image = cv2.imread(os.path.join(DARKNET_PATH, "data", "dog.jpg"))
# Get the shape
h, w = image.shape[:2]
# Load it as a blob and feed it to the network
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
start = time.time()
# Get the output
layer_outputs = net.forward(ln)
end = time.time()

Then we need to parse the result in layer_outputs.

Step 3: Parse the result from layer_outputs (Yolo output)

This is a bit tricky at first. You first need to understand the overall flow.

First, you will run through all the results in the layers (we have two layers). Second, you will remove overlapping results, as there might be multiple boxes that identify the same object, just with slightly different bounding boxes. Third, and finally, you need to draw the remaining boxes with labels (and colors) on the image.

To go through that process we need three lists to keep track of it. One for the actual boxes that enclose the identified objects (boxes). Then the corresponding confidences (confidences), that is, how sure the algorithm is. Finally, the class ids, which are used to look up the names we have in labels (class_ids).

Each detection is a result vector: the first 4 entries hold the position and size of the identified object, and the following entries contain the confidence scores for all the possible object classes in the network.
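
As a quick reference (for a COCO model with 80 classes, each detection vector has 85 entries):

# detection[0:4] -> center_x, center_y, width, height (relative values, 0.0 to 1.0)
# detection[4]   -> objectness score (not used in the code below)
# detection[5:]  -> one confidence score per class in coco.names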

# Initialize the lists we need to interpret the results
boxes = []
confidences = []
class_ids = []

# Loop over the layers
for output in layer_outputs:
    # For the layer loop over all detections
    for detection in output:
        # The detection's first 4 entries contain the object position and size
        scores = detection[5:]
        # Then it has detection scores - it takes the one with maximal score
        class_id = np.argmax(scores).item()
        # The maximal score is the confidence
        confidence = scores[class_id].item()

        # Ensure we have some reasonable confidence, else ignore
        if confidence > 0.3:
            # The first four entries have the location and size (center, size)
            # It needs to be scaled up as the result is given in relative size (0.0 to 1.0)
            box = detection[0:4] * np.array([w, h, w, h])
            center_x, center_y, width, height = box.astype(int).tolist()

            # Calculate the upper corner
            x = center_x - width//2
            y = center_y - height//2

            # Add our findings to the lists
            boxes.append([x, y, width, height])
            confidences.append(confidence)
            class_ids.append(class_id)

# Only keep the best boxes of the overlapping ones
idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.3, 0.3)

# Ensure at least one detection exists - needed otherwise flatten will fail
if len(idxs) > 0:
    # Loop over the indexes we are keeping
    for i in idxs.flatten():
        # Get the box information
        x, y, w, h = boxes[i]

        # Make a rectangle
        cv2.rectangle(image, (x, y), (x + w, y + h), colors[class_ids[i]], 2)
        # Make and add text
        text = "{}: {:.4f}".format(labels[class_ids[i]], confidences[i])
        cv2.putText(image, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, colors[class_ids[i]], 2)

# Write the image with boxes and text
cv2.imwrite("example.png", image)

Resulting image

The full code together

The full source code put together.

import numpy as np
import time
import cv2
import os


DARKNET_PATH = 'yolo'

# Read labels that are used on object
labels = open(os.path.join(DARKNET_PATH, "data", "coco.names")).read().splitlines()
# Make random colors with a seed, such that they are the same next time
np.random.seed(0)
colors = np.random.randint(0, 255, size=(len(labels), 3)).tolist()

# Give the configuration and weight files for the model and load the network.
net = cv2.dnn.readNetFromDarknet(os.path.join(DARKNET_PATH, "cfg", "yolov3-tiny.cfg"), "yolov3-tiny.weights")
# Determine the output layer, now this piece is not intuitive
ln = net.getLayerNames()
ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]

# Load the image
image = cv2.imread(os.path.join(DARKNET_PATH, "data", "dog.jpg"))
# Get the shape
h, w = image.shape[:2]
# Load it as a blob and feed it to the network
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
start = time.time()
# Get the output
layer_outputs = net.forward(ln)
end = time.time()


# Initialize the lists we need to interpret the results
boxes = []
confidences = []
class_ids = []

# Loop over the layers
for output in layer_outputs:
    # For the layer loop over all detections
    for detection in output:
        # The detection's first 4 entries contain the object position and size
        scores = detection[5:]
        # Then it has detection scores - it takes the one with maximal score
        class_id = np.argmax(scores).item()
        # The maximal score is the confidence
        confidence = scores[class_id].item()

        # Ensure we have some reasonable confidence, else ignore
        if confidence > 0.3:
            # The first four entries have the location and size (center, size)
            # It needs to be scaled up as the result is given in relative size (0.0 to 1.0)
            box = detection[0:4] * np.array([w, h, w, h])
            center_x, center_y, width, height = box.astype(int).tolist()

            # Calculate the upper corner
            x = center_x - width//2
            y = center_y - height//2

            # Add our findings to the lists
            boxes.append([x, y, width, height])
            confidences.append(confidence)
            class_ids.append(class_id)

# Only keep the best boxes of the overlapping ones
idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.3, 0.3)

# Ensure at least one detection exists - needed otherwise flatten will fail
if len(idxs) > 0:
    # Loop over the indexes we are keeping
    for i in idxs.flatten():
        # Get the box information
        x, y, w, h = boxes[i]

        # Make a rectangle
        cv2.rectangle(image, (x, y), (x + w, y + h), colors[class_ids[i]], 2)
        # Make and add text
        text = "{}: {:.4f}".format(labels[class_ids[i]], confidences[i])
        cv2.putText(image, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, colors[class_ids[i]], 2)

# Write the image with boxes and text
cv2.imwrite("example.png", image)

OpenCV: A Simple Approach to Counting Cars

KISS – Keep it simple s…

In this tutorial we will make a simple car counter using OpenCV from Python. It will not be a perfect solution, but it will be easy to understand and in some cases better.

The counter takes advantage of the simple assumption that objects moving through a defined box on the right side of the road are cars driving in one direction, and objects moving through a defined box on the left side of the road are cars driving in the other direction.

This is of course not a perfect assumption, but it makes things easier. There is no need to identify whether it is a car or not. This is actually an advantage, since the default car cascade classifiers might not recognize cars from the angle your camera is set at. At least, I had problems with that. I could train my own cascade classifier, but why not try to do something smarter.

Step 1: Get a live feed from the webcam in OpenCV

First you need to ensure you have installed OpenCV. If you use PyCharm we can recommend you read this tutorial on how to set it up.

Getting a live feed from your webcam can be achieved with the following lines of code.

import cv2


cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

while cap.isOpened():
    _, frame = cap.read()

    cv2.imshow("Car counter", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

The cv2.VideoCapture(0) assumes that you only have one webcam. If you have more, you might need to change 0 to something else.

The cap.set(…) calls set the width and height of the camera frames. To get good performance it is a good idea to scale down. This can also be achieved by scaling down the picture you do the processing on afterwards, as sketched below.
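
A minimal sketch of that alternative, assuming a scale factor of 2 (the displayed frame stays at full size; only the copy used for processing is smaller):

_, frame = cap.read()
# Process a half-size copy; keep the original frame for display
work_frame = cv2.resize(frame, (frame.shape[1] // 2, frame.shape[0] // 2))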

Then cap.read() reads the next frame. It also returns a return code, but we ignore that value with the underscore (_). The cv2.imshow(…) creates a window showing the frame. Finally, cv2.waitKey(1) waits 1 millisecond and checks if q was pressed. If so, it breaks out of the loop, releases the camera, and destroys the window.

Step 2: Identify moving objects with OpenCV

The simple idea is to compare each frame with the previous one. If there is a difference, we have a moving object. Of course, it is a bit more complex than that, as we also want to identify where the objects are and avoid flagging differences that are only due to noise in the picture.

As with most processing of moving images, we start by converting them to gray tones (cv2.cvtColor(…)). Then we use blurring to minimize small details in the picture (cv2.GaussianBlur(…)). This helps us avoid falsely identifying movement that is only due to noise and minor changes.

When that is done, we compare the converted frame with the previous one (cv2.absdiff(…)). This gives you an idea of what has changed. We apply a threshold (cv2.threshold(…)) to it and then dilate (cv2.dilate(…)) the changes to make them easier to identify with cv2.findContours(…).

It boils down to the following code.

import cv2
import imutils


cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# We will keep the last frame in order to see if there has been any movement
last_frame = None

while cap.isOpened():
    _, frame = cap.read()

    # Processing of frames is done in gray
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # We blur it to minimize reaction to small details
    gray = cv2.GaussianBlur(gray, (21, 21), 0)

    # Need to check if we have a last_frame, if not get it
    if last_frame is None:
        last_frame = gray
        continue

    # Get the difference from last_frame
    delta_frame = cv2.absdiff(last_frame, gray)
    last_frame = gray
    # Have some threshold on what is enough movement
    thresh = cv2.threshold(delta_frame, 25, 255, cv2.THRESH_BINARY)[1]
    # This dilates with two iterations
    thresh = cv2.dilate(thresh, None, iterations=2)
    # Returns a list of objects
    contours = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Converts it
    contours = imutils.grab_contours(contours)

    # Loops over all objects found
    for contour in contours:
        # Gets a bounding box and puts it on the frame
        (x, y, w, h) = cv2.boundingRect(contour)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Let's show the frame in our window
    cv2.imshow("Car counter", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

If you don’t watch out for what is happening, you might end up with a picture like this one (I am sure you take more care than me).

Example frame of moving objects.

One thing to notice is that we can set a lower limit on the size of the moving objects. This can be achieved by inserting a check before we draw the green boxes, as sketched below.
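
A minimal sketch of such a check (the limit of 500 is just an example; the full code in Step 4 uses the same idea):

    for contour in contours:
        # Skip contours that are too small to be interesting
        if cv2.contourArea(contour) < 500:
            continue
        (x, y, w, h) = cv2.boundingRect(contour)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)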

Step 3: Creating a helper class to track counts

To make our life easier, we introduce a helper class representing a box on the screen that keeps track of how many objects have moved through it.

class Box:
    def __init__(self, start_point, width_height):
        self.start_point = start_point
        self.end_point = (start_point[0] + width_height[0], start_point[1] + width_height[1])
        self.counter = 0
        self.frame_countdown = 0

    def overlap(self, start_point, end_point):
        if self.start_point[0] >= end_point[0] or self.end_point[0] <= start_point[0] or \
                self.start_point[1] >= end_point[1] or self.end_point[1] <= start_point[1]:
            return False
        else:
            return True

The class takes the starting point (start_point) and the width and height (width_height) in the constructor. As we need start_point and end_point when drawing the box in the frame, we calculate the end point immediately in the constructor (__init__(…)).

Further, we have a counter to keep track of how many objects have passed through the box. There is also a frame_countdown, which is used to minimize multiple counts of the same moving object. What can happen is that in one frame the moving object is identified, in the next it is not, and then it is identified again. If that all happens inside the box, the object would be counted twice. Hence, we have a countdown that requires a minimum number of frames between identified moving objects before we assume it is a new one.
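
A quick usage sketch of the class (the coordinates are made up for illustration):

box = Box((100, 200), (10, 80))              # spans from (100, 200) to (110, 280)
print(box.overlap((90, 190), (105, 210)))    # True - the rectangles intersect
print(box.overlap((200, 200), (220, 240)))   # False - entirely to the right of the box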

Step 4: Using the helper class and start the counting

Now we need to put all the code together.

It requires a few things. Before we enter the main while loop, we need to set up the boxes we want to count moving objects in. Here we set up two, one for each direction the cars can drive. Inside the contours loop, we set a lower limit on the contour sizes. Then we go through all the boxes, update the appropriate variables, and build the text string. After that, the text is printed on the frame and all the boxes are drawn on it.

import cv2
import imutils


class Box:
    def __init__(self, start_point, width_height):
        self.start_point = start_point
        self.end_point = (start_point[0] + width_height[0], start_point[1] + width_height[1])
        self.counter = 0
        self.frame_countdown = 0

    def overlap(self, start_point, end_point):
        if self.start_point[0] >= end_point[0] or self.end_point[0] <= start_point[0] or \
                self.start_point[1] >= end_point[1] or self.end_point[1] <= start_point[1]:
            return False
        else:
            return True


cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# We will keep the last frame in order to see if there has been any movement
last_frame = None

# To build a text string with counting status
text = ""

# The boxes we want to count moving objects in
boxes = []
boxes.append(Box((100, 200), (10, 80)))
boxes.append(Box((300, 350), (10, 80)))

while cap.isOpened():
    _, frame = cap.read()

    # Processing of frames is done in gray
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # We blur it to minimize reaction to small details
    gray = cv2.GaussianBlur(gray, (5, 5), 0)

    # Need to check if we have a last_frame, if not get it
    if last_frame is None or last_frame.shape != gray.shape:
        last_frame = gray
        continue

    # Get the difference from last_frame
    delta_frame = cv2.absdiff(last_frame, gray)
    last_frame = gray
    # Have some threshold on what is enough movement
    thresh = cv2.threshold(delta_frame, 25, 255, cv2.THRESH_BINARY)[1]
    # This dilates with two iterations
    thresh = cv2.dilate(thresh, None, iterations=2)
    # Returns a list of objects
    contours = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Converts it
    contours = imutils.grab_contours(contours)

    # Loops over all objects found
    for contour in contours:
        # Skip if contour is small (can be adjusted)
        if cv2.contourArea(contour) < 500:
            continue

        # Gets a bounding box and puts it on the frame
        (x, y, w, h) = cv2.boundingRect(contour)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

        # The text string we will build up
        text = "Cars:"
        # Go through all the boxes
        for box in boxes:
            box.frame_countdown -= 1
            if box.overlap((x, y), (x + w, y + h)):
                if box.frame_countdown <= 0:
                    box.counter += 1
                # The number might be adjusted, it is just set based on my settings
                box.frame_countdown = 20
            text += " (" + str(box.counter) + " ," + str(box.frame_countdown) + ")"

    # Set the text string we build up
    cv2.putText(frame, text, (10, 20), cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 0), 2)

    # Let's also insert the boxes
    for box in boxes:
        cv2.rectangle(frame, box.start_point, box.end_point, (255, 255, 255), 2)

    # Let's show the frame in our window
    cv2.imshow("Car counter", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Step 5: Real life test on counting cars (not just moving objects)

The real question is: does it work, or did we oversimplify the problem? If it works, we have created a very small piece of code (compared to other implementations) that can count cars all day long.

I adjusted the parameters a bit and got the following with my first real trial.

Counting correctly.

Please notice that it does not start from zero in the video, but it counts the number of cars in each direction correctly. As expected, it counts each car when it reaches the white bar.

The number of cars is the first number, while the second is just shown so I can see whether my guess for skipping frames was usable.

Are we done?

Not at all. This was just to see if we could make something simple and fast to count cars. My first problem was that the pre-trained car recognition sets (car cascade classifiers) were not happy with the angle of the cars seen from my window. I first thought of training my own cascade classifier, but I thought it would be fun to try something simpler.

There are a lot of parameters that can be tuned to make it more reliable, but the main point is that it counted correctly in the given test. I can see one challenge: if a big truck drives by from left to right, it might get in the way of the other counter. This is a potential weakness of this simple approach.

Install OpenCV 4 in PyCharm

What will we cover?

You want to start your first OpenCV project in PyCharm.

import cv2

And you get.

From PyCharm

You press Install package cv2, but you get.

Error message from PyCharm (lower right corner).

What to do? No worries. We will cover that in this survival guide and it is not complex.

If you only want the solution go to Step 3.

Or see the tutorial on YouTube.

Step 1: Understand how PyCharm works with a Virtual Environment

When you create a new project in PyCharm, you are prompted with this screen (PyCharm 2020.2).

Creating a project OpenCV in PyCharm

It says Python interpreter: New Virtualenv environment. What does that mean?

Well, it creates an isolated environment to keep your project in. That way, each project can have its own dependencies and libraries without impacting other projects.

Remember kindergarten? There was only one sandbox, and there was not enough room for multiple projects in it: building a sand castle, making a river, and whatever else you did as a kid. The problem was, if you wanted to build a castle while your kindergarten friends wanted to play mountain collapse (you know, when a mountain collapses), their game would destroy your well-engineered 5-foot-tall castle. It was the beginning of a riot.

Now think of a kindergarten where there is one sandbox for each project you could imagine. One for castle building. One for mountain collapse. You see? Now everyone can play in their own world or environment.

The virtual environment is like that. You can go crazy in it without destroying the other awesome projects you work on. Hence, if you feel like making a mountain collapse project, you need not fear that it will destroy your well-engineered castle project.

Step 2: How does this virtual environment work, and why does it matter for OpenCV?

Good question.

If you follow some guide online, you might end up installing OpenCV on your base system and not in the virtual environment of your project.

But where is the virtual environment located? It depends on two things. First, where your projects are located. Second, the name of your project.

I used the default location when I installed PyCharm, which is PyCharmProjects in my home folder. Further, in this case I called the project OpenCV.

If I open a command line I can type the following to get to the location.

Command line terminal

Then you will see a folder called venv, which is short for virtual environment. Go into that folder and continue down into the bin (binary) folder.

Command line terminal

Now you are located where you can install the OpenCV library.
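
As a small sanity check (an extra step, not required by the guide), you can run a one-liner inside the project to see which interpreter is active; if the virtual environment is used, the path should point into the project's venv folder.

import sys
# Expected to end somewhere inside .../PyCharmProjects/OpenCV/venv/ if the venv is active
print(sys.executable)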

Step 3: Method 1 – Install OpenCV library in your virtual environment

Go to PyCharm menu and choose Preferences…

On the left side find Project (with the name of the project you are working on) and choose subitem Python Interpreter.

Press the little plus-sign at the bottom of the window, and an install window will appear.

Write opencv-python in the window that opens and press Install.

And you are ready.

If it worked (no red line under cv2), then skip ahead to Step 5 to try it out.

Step 4: Method 2 (if Method 1 fails) – Install the OpenCV library in your virtual environment

We use pip, the package manager for Python. You want to ensure you use the pip from the venv location above.

./pip install opencv-python

From command line terminal

You might get a bit different output, as I already had the library cached.

Back in PyCharm it will update and look like this.

Back in PyCharm the red line disappeared

Now you are ready for your first test.

Step 5: Testing that OpenCV works

Let’s find a picture.

Castle

Download the above image and save it as Castle.png in your project folder.

import cv2

img = cv2.imread("Castle.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cv2.imshow("Over the Clouds", img)
cv2.imshow("Over the Clouds - gray", gray)

cv2.waitKey(0)
cv2.destroyAllWindows()

Which should result in something like this.

The end result