What will we cover in this tutorial?
A simple motion detector can be built by taking the difference of two images of the same scene: a base image containing all the static elements is compared with a new image to identify changes.
The challenge with this approach is that it is not noise-tolerant, and it can be difficult to obtain a correct base image.
In this tutorial we will see how to create a simple noise-tolerant motion detector.
Step 1: Understand how we can make it noise-tolerant
In the previous tutorial we made a simple car counter. The approach was to compare the last two frames and take the difference between them. While it works, it is not very robust to simple noise and will create a lot of false positives.
To deal with that, we modify the approach: the "last frame" becomes the plain average of the last 30 frames (the background), and the "current frame" becomes a weighted average of the last 10 frames (the foreground).
The process, shown in the illustration, consists of the following steps.
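To make the difference between the two averages concrete, here is a minimal sketch for a single pixel value. It assumes linearly increasing weights 1, 2, ..., n (oldest to newest), which is what the `get_movement` function in Step 2 uses; the helper names `linear_weights`, `weighted_average` and `plain_average` are made up for this illustration.

```python
def linear_weights(n):
    """Return n weights 1, 2, ..., n normalized to sum to 1 (oldest first)."""
    total = n * (n + 1) / 2  # 1 + 2 + ... + n
    return [i / total for i in range(1, n + 1)]


def weighted_average(values):
    """Weighted average of values (oldest first), favoring the newest ones."""
    weights = linear_weights(len(values))
    return sum(w * v for w, v in zip(weights, values))


def plain_average(values):
    """Ordinary average where every frame counts equally."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # One pixel over 10 frames: dark in the 7 oldest, bright in the 3 newest.
    pixel = [0] * 7 + [100] * 3
    print(plain_average(pixel))     # every frame has the same influence
    print(weighted_average(pixel))  # the newest frames dominate
```

With 10 frames the newest frame gets weight 10/55 and the oldest 1/55, so the weighted average reacts faster to a recent change than the plain average does.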
1. Resize the image, for two reasons. First, we don’t need that level of detail in the following steps. Second, it makes processing faster.
2. Blur the image to minimize the impact of small details.
3. Add the frame to a foreground list, which is used to calculate the foreground image. This list can hold 10 frames. The foreground image is a weighted average of the list that gives more weight to newer frames.
4. Add the frame to the longer background list as well. The output of that list is a normal average.
5. Calculate the difference between the output of the foreground list and the output of the background list.
6. Detect the movements on the difference image.
7. Enclose the areas with movement in boxes.
8. Finally, scale the boxes back up to the original frame size and draw them on it.
That is the process.
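As a rough illustration of why this helps against noise, the sketch below follows one pixel through steps 3 to 5 using plain numbers instead of images. The list lengths (10 and 30) and the threshold (20) are the defaults from this tutorial; the function `simulate` and the two test signals are invented for the example and are not part of the detector.

```python
from collections import deque


def weighted_average(values):
    """Weighted average with weights 1..n, newest value weighted most."""
    total = len(values) * (len(values) + 1) / 2
    return sum(v * i for i, v in enumerate(values, start=1)) / total


def simulate(pixel_values, fg_len=10, bg_len=30, threshold=20):
    """Return the frame indices at which the pixel would count as moving."""
    fg = deque(maxlen=fg_len)  # Step 3: short foreground list
    bg = deque(maxlen=bg_len)  # Step 4: long background list
    hits = []
    for index, value in enumerate(pixel_values):
        fg.append(value)
        bg.append(value)
        foreground = weighted_average(fg)
        background = sum(bg) / len(bg)
        # Step 5: keep only differences of at least the threshold
        if abs(foreground - background) >= threshold:
            hits.append(index)
    return hits


if __name__ == "__main__":
    flicker = [0] * 30 + [100] + [0] * 10  # one noisy frame
    moving = [0] * 30 + [100] * 20         # a change that stays
    print(simulate(flicker))  # the single noisy frame is averaged away
    print(simulate(moving))   # the lasting change is detected
```

A single-frame flicker of this size never pushes the difference past the threshold, while a change that persists for two frames or more does. A much brighter flicker could still trigger a detection, so the threshold remains something to tune.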
Step 2: The implementation of the above algorithm
The code is commented with the step numbers above to make it easier to follow. The focus has been on keeping it simple. If you need to install OpenCV, you can read this tutorial.
```python
import cv2
import numpy as np
import imutils
import time
from collections import deque


# Input to Step 5: Helper function
# Calculate the foreground frame based on frames
def get_movement(frames, shape):
    movement_frame = np.zeros(shape, dtype='float32')
    i = 0
    for f in frames:
        i += 1
        # Newer frames get a higher weight (1, 2, ..., i)
        movement_frame += f * i
    # Divide by the sum of the weights: 1 + 2 + ... + i
    movement_frame = movement_frame / ((1 + i) / 2 * i)
    movement_frame[movement_frame > 254] = 255
    return movement_frame


# Input to Step 5: Helper function
# Calculate the background frame based on frames
# This function has obvious improvement potential
# - Could avoid recalculating the full list every time
def get_background(frames, shape):
    bg = np.zeros(shape, dtype='float32')
    for frame in frames:
        bg += frame
    bg /= len(frames)
    bg[bg > 254] = 255
    return bg


# Detect and return boxes of moving parts
def detect(frame, bg_frames, fg_frames, threshold=20, min_box=200):
    # Step 3-4: Add the frame to our lists of foreground and background frames
    fg_frames.append(frame)
    bg_frames.append(frame)

    # Input to Step 5: Calculate the foreground and background frame based on the lists
    fg_frame = get_movement(list(fg_frames), frame.shape)
    bg_frame = get_background(list(bg_frames), frame.shape)

    # Step 5: Calculate the difference to detect movement
    movement = cv2.absdiff(fg_frame, bg_frame)
    movement[movement < threshold] = 0
    movement[movement > 0] = 254
    movement = movement.astype('uint8')
    movement = cv2.cvtColor(movement, cv2.COLOR_BGR2GRAY)
    movement[movement > 0] = 254
    # As we don't return the movement frame, we show it here for debug purposes
    # Should be removed before release
    cv2.imshow('Movement', movement)

    # Step 6: Find the list of contours
    contours = cv2.findContours(movement, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = imutils.grab_contours(contours)

    # Step 7: Convert them to boxes
    boxes = []
    for contour in contours:
        # Ignore small boxes
        if cv2.contourArea(contour) < min_box:
            continue
        # Convert the contour to a box and append it to the list
        box = cv2.boundingRect(contour)
        boxes.append(box)

    return boxes


def main(width=640, height=480, scale_factor=2):
    # Create the buffer of our lists
    bg_frames = deque(maxlen=30)
    fg_frames = deque(maxlen=10)

    # Get the webcam
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

    # We want to see how many frames per second we process
    last_time = time.time()
    while True:
        # Step 0: Read the webcam frame and stop if no frame was delivered
        ret, frame = cap.read()
        if not ret:
            break

        # Resize the frame
        frame = cv2.resize(frame, (width, height))

        # Step 1: Scale down to improve speed (only takes integer scale factors)
        work_frame = cv2.resize(frame, (width // scale_factor, height // scale_factor))

        # Step 2: Blur it and convert the frame to float32
        work_frame = cv2.GaussianBlur(work_frame, (5, 5), 0)
        work_frame_f32 = work_frame.astype('float32')

        # Step 3-7 (steps in function): Detect all the boxes around the moving parts
        boxes = detect(work_frame_f32, bg_frames, fg_frames)

        # Step 8: Draw all boxes (remember to scale back up)
        for x, y, w, h in boxes:
            cv2.rectangle(frame, (x * scale_factor, y * scale_factor),
                          ((x + w) * scale_factor, (y + h) * scale_factor),
                          (0, 255, 0), 2)

        # Add the Frames Per Second (FPS) and show frame
        text = "FPS:" + str(int(1 / (time.time() - last_time)))
        last_time = time.time()
        cv2.putText(frame, text, (10, 20), cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 0), 2)

        cv2.imshow('Webcam', frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```
Step 3: Test it
Let’s try it out in real life.
The above trial was done while recording from my laptop, which makes it quite slow (need a new one). The frame rate (FPS) without recording was about 30 FPS.
The detection can be tuned to the environment you like. You have the following parameters to adjust.
- width (default: 640): The width of the webcam frame. It only has a small impact on performance. Notice that if you change it, it also changes the size of the processed image through scale_factor.
- height (default: 480): The same as for width.
- scale_factor (default: 2): Scales the processed image down. As implemented, it only takes integer scale factors.
- threshold (default: 20): How large the difference between the foreground and the background must be before it counts as movement. The lower the value, the more it will detect.
- min_box (default: 200): The smallest contour area, measured on the scaled-down image, that will get a box.
- bg_frames = deque(maxlen=30): The number of frames to keep in the background list. Should be larger than the length of fg_frames.
- fg_frames = deque(maxlen=10): The number of frames to keep in the foreground list.
Those are the most obvious parameters you can adjust in your program. Notice that some of them affect each other. It is kept that way to keep the code simple.
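If you want to experiment with these values without editing the source each time, one option is to read them from the command line. The following is only a sketch using Python's standard `argparse` module; the option names are my own choice, and passing the values on to `main` and `detect` is left out, since in the code above only `width`, `height` and `scale_factor` are arguments of `main`.

```python
import argparse


def parse_args(argv=None):
    """Parse detector settings; the defaults mirror the values in the tutorial."""
    parser = argparse.ArgumentParser(description="Noise-tolerant motion detector")
    parser.add_argument("--width", type=int, default=640, help="webcam frame width")
    parser.add_argument("--height", type=int, default=480, help="webcam frame height")
    parser.add_argument("--scale-factor", type=int, default=2,
                        help="integer factor used to scale the frame down")
    parser.add_argument("--threshold", type=int, default=20,
                        help="minimum foreground/background difference")
    parser.add_argument("--min-box", type=int, default=200,
                        help="smallest contour area that gets a box")
    parser.add_argument("--bg-frames", type=int, default=30,
                        help="length of the background list")
    parser.add_argument("--fg-frames", type=int, default=10,
                        help="length of the foreground list")
    args = parser.parse_args(argv)
    # The background list should be the longer of the two
    if args.bg_frames <= args.fg_frames:
        parser.error("--bg-frames should be larger than --fg-frames")
    return args


if __name__ == "__main__":
    # Example: a less sensitive detector with a longer background list
    print(parse_args(["--threshold", "35", "--bg-frames", "60"]))
```

The parsed values are available as attributes such as `args.threshold` and `args.scale_factor`.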
If you increase the length of bg_frames to, say, 120, you will see that processing slows down. This is because the get_background function recalculates the average over the full list for every frame. Since it is a simple average, this can clearly be improved. I did not do so in order to keep the code simple and understandable.
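One possible way to avoid the full recalculation is to keep a running sum: add the new frame, subtract the frame that drops out, and divide by the number of stored frames. The class below is a sketch of that idea, not a drop-in replacement from the original code; the name `RunningAverage` is made up. It is written so that it works with plain numbers as well as NumPy float arrays, because it only uses `+`, `-`, `*` and `/`.

```python
from collections import deque


class RunningAverage:
    """Average of the last maxlen items, updated in constant time per item."""

    def __init__(self, maxlen):
        self.maxlen = maxlen
        self.items = deque()
        self.total = None

    def add(self, item):
        """Add an item and return the average of the stored items."""
        if self.total is None:
            self.total = item * 1.0  # start the sum from a copy of the item
        else:
            self.total = self.total + item
        self.items.append(item)
        # Remove the oldest item from the sum once the window is full
        if len(self.items) > self.maxlen:
            self.total = self.total - self.items.popleft()
        return self.total / len(self.items)


if __name__ == "__main__":
    background = RunningAverage(maxlen=3)
    for value in [3.0, 6.0, 9.0, 12.0]:
        print(background.add(value))
```

With this approach the cost per frame no longer depends on the length of the background list. Be aware that repeatedly adding and subtracting floating-point values can accumulate small rounding errors over a long run, so recomputing the sum from the stored frames once in a while may be worthwhile.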
Also, I would like to try it out in real life to see how to fine-tune it at home, for example how to set the parameters to avoid false positives caused by changes in the lighting (if the sun suddenly shines into the room, or a cloud covers it).
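One heuristic that might help here, and that I have not tested with this detector, is to look at how much of the image changed at once: a moving object usually affects a limited area, while a lighting change tends to affect most of the frame. The sketch below assumes a 2-D movement mask where changed pixels are non-zero, like the `movement` image in `detect`. The function names and the `max_fraction` value of 0.5 are assumptions made for this example.

```python
def changed_fraction(mask_rows):
    """Fraction of non-zero pixels in a 2-D mask given as rows of numbers."""
    total = sum(len(row) for row in mask_rows)
    changed = sum(1 for row in mask_rows for value in row if value)
    return changed / total if total else 0.0


def looks_like_lighting_change(mask_rows, max_fraction=0.5):
    """Guess that a frame is a global lighting change, not a moving object."""
    return changed_fraction(mask_rows) > max_fraction


if __name__ == "__main__":
    small_object = [[0, 0, 0, 0],
                    [0, 254, 0, 0],
                    [0, 0, 0, 0],
                    [0, 0, 0, 0]]
    whole_room = [[254] * 4 for _ in range(4)]
    print(looks_like_lighting_change(small_object))
    print(looks_like_lighting_change(whole_room))
```

Frames flagged this way could simply be skipped when drawing boxes. The pure Python loop is only meant to show the idea; on real frames, counting the non-zero pixels with `cv2.countNonZero` on the single-channel mask would be the faster choice.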