Batch Process Face Detection in 3 Steps with OpenCV

What will you learn?

You want to extract or identify faces in a batch of images, but how do you do that without becoming a Machine Learning expert?

Here you will learn how to do it without any Machine Learning skills.

Many Machine Learning tasks are so common that you can simply use pre-built models. Here you will learn how to find faces in images and extract their locations.

Step 1: Pre-built OpenCV models to detect faces

When you think of detecting faces in images, you might get scared. I’ve been there, but there is nothing to be scared of, because some awesome people already did all the hard work for you.

They built a model that can detect faces in images.

All you need to do is feed it images and let it do all the work.

This boils down to the following.

  1. Which model to use.
  2. How to feed it images.
  3. How to turn its results into something useful.

This is what the rest of this tutorial will teach you.

We will use OpenCV and its pre-built Haar cascade detection model.

First you should download and install the requirements.

You can either clone this repository or download the files as a zip file and unpack them.

You also need to install the opencv-python library. This can be done as follows.

pip install opencv-python

You can also use the requirements.txt file to install it.

pip install -r requirements.txt

Step 2: Detect a face

We will use this image to start with.

The picture is part of the repository from step 1.

Now let’s explore the code in face_detection.py.

# importing opencv
import cv2
# using cv2.CascadeClassifier
# See https://docs.opencv.org/3.4/db/d28/tutorial_cascade_classifier.html
# See more Cascade Classifiers https://github.com/opencv/opencv/tree/4.x/data/haarcascades
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
img = cv2.imread("sample_images/sample-00.jpg")
# changing the image to gray scale for better face detection
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=2,  # Big reduction
    minNeighbors=5  # 4-6 range
)
# drawing a rectangle to the image.
# for loop is used to access all the coordinates of the rectangle.
for x, y, w, h in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 5)
# showing the detected face followed by the waitKey method.
cv2.imshow("image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

First, notice that the opencv-python package is imported with import cv2.

Also notice that you need to run this code from the folder where the file haarcascade_frontalface_default.xml is located.

After that, the image is read into the variable img. This assumes the files are structured as in the GitHub repository (downloaded in step 1).

When you work with images, you often do not need the full level of detail they contain. Therefore, the first thing we do is convert the image to grayscale.

After converting the image to grayscale, we run the face detection model (face_cascade.detectMultiScale).

This gives the result faces, an iterable with one (x, y, w, h) rectangle per detected face.

We want to draw the rectangles on the original image (not the grayscale one).

Finally, we show the image and wait until someone hits a key.
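Since the goal also includes extracting the face locations, here is a minimal sketch of turning the detections into something reusable. It assumes each detection is an (x, y, w, h) tuple, as in the loop above. The helper name face_boxes is made up for illustration and is not part of OpenCV.

```python
def face_boxes(faces):
    """Convert (x, y, w, h) detections into corner-based boxes."""
    boxes = []
    for x, y, w, h in faces:
        boxes.append({
            "top_left": (int(x), int(y)),
            "bottom_right": (int(x + w), int(y + h)),
            "area": int(w * h),
        })
    return boxes


# Two made-up detections, not real model output.
example_faces = [(10, 20, 100, 100), (200, 50, 80, 80)]
for box in face_boxes(example_faces):
    print(box)
```

With a real image, a face could then be cropped with img[y:y+h, x:x+w], since OpenCV images are NumPy arrays.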

Step 3: Batch process face detection

To batch process face detection, a good idea is to build a class that does the face detection. It could be designed in many ways, but the idea is to decouple the filename processing from the actual face detection.

One way to do it could be as follows.

import os
import cv2

class FaceDetector:
    def __init__(self, scale_factor=2, min_neighbors=5):
        self.face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
        self.scale_factor = scale_factor
        self.min_neighbors = min_neighbors
        self.img = None
    def read_image(self, filename):
        self.img = cv2.imread(filename)
    def detect_faces(self):
        gray = cv2.cvtColor(self.img, cv2.COLOR_BGR2GRAY)
        faces = self.face_cascade.detectMultiScale(
            gray,
            scaleFactor=self.scale_factor,
            minNeighbors=self.min_neighbors
        )
        # drawing a rectangle to the image.
        # for loop is used to access all the coordinates of the rectangle.
        for x, y, w, h in faces:
            cv2.rectangle(self.img, (x, y), (x + w, y + h), (0, 255, 0), 5)
        return self.img

face_detector = FaceDetector()
for filename in os.listdir('sample_images/'):
    print(filename)
    face_detector.read_image(f'sample_images/{filename}')
    img = face_detector.detect_faces()
    cv2.imshow("image", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

If you want to save the images with the detected faces to storage, replace the cv2.imshow line with the following. Note that filename is only the name of the file, so the image is written to the current working folder.

    cv2.imwrite(filename, img)
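One thing to watch in a batch run: os.listdir returns every entry in the folder, including files that are not images, and cv2.imread returns None for files it cannot read. The sketch below filters the names first and builds a separate output name so the originals are not overwritten. The helper names, the extension set and the _faces suffix are assumptions made for illustration.

```python
import os

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set, extend as needed


def filter_image_names(names):
    """Keep only names whose extension looks like an image."""
    return [name for name in names
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS]


def output_name(filename, suffix="_faces"):
    """Build a new name so the original image is not overwritten."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}{suffix}{ext}"


names = ["sample-00.jpg", "notes.txt", "Photo.PNG", ".DS_Store"]
for name in filter_image_names(names):
    print(name, "->", output_name(name))
```

In the loop above, you could iterate over filter_image_names(os.listdir('sample_images/')) and pass output_name(filename) to cv2.imwrite.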

Want to learn more Machine Learning?

You will be surprised how easy Machine Learning has become. There are many great and easy-to-use libraries. All you need to learn is how to train them and use them to predict.

Do you want to learn more?

Then I created this free 10-hour Machine Learning course, which will cover all you need.

  • 15 video lessons – which explain Machine Learning concepts, demonstrate models on real data, introduce projects and show a solution (YouTube playlist).
  • 30 Jupyter Notebooks – with the full code and explanation from the lectures and projects (GitHub).
  • 15 projects – with step guides to help you structure your solutions, and solutions explained at the end of the video lessons (GitHub).

The Ultimate Pie Chart Guide for Matplotlib

What will you learn?

Pie charts are among the most powerful visualizations for presenting data. With a few tricks you can make them look professional with a free tool like Matplotlib.

At the end of this tutorial you will know how to make pie charts and customize them even further.

Basic Pie Chart

First you need to make a basic Pie chart with matplotlib.

import matplotlib.pyplot as plt
v = [2, 5, 3, 1, 4]
labels = ["A", "B", "C", "D", "E"]
plt.pie(v, labels=labels)
plt.show()

This will create a chart based on the values in v with the labels in labels.
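To see how the values turn into slices, here is a small sketch of the arithmetic. Matplotlib sizes each wedge by its value divided by the sum of all values. The variable names angles and percentages are only used here for illustration.

```python
v = [2, 5, 3, 1, 4]
labels = ["A", "B", "C", "D", "E"]

total = sum(v)
# Share of the full circle (360 degrees) that each value gets.
angles = [value * 360 / total for value in v]
# The same share expressed as a percentage of the pie.
percentages = [value * 100 / total for value in v]

for label, angle, pct in zip(labels, angles, percentages):
    print(f"{label}: {pct:.1f}% of the pie, {angle:.0f} degrees")
```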

Based on the above Pie Chart we can continue to build further understanding of how to create more advanced charts.

Exploding Segment in Pie Charts

Exploding a segment in a pie chart simply means moving it out from the center.

The following example will demonstrate it.

import matplotlib.pyplot as plt
v = [2, 5, 3, 1, 4]
labels = ["A", "B", "C", "D", "E"]
explode = [0, 0.1, 0, 0.2, 0]
plt.pie(v, labels=labels, explode=explode)
plt.show()

Though not very pretty, it shows you how to control each segment.

Now let’s learn a bit more about how to style it.

Styling Pie Charts

The following list shows the most commonly used parameters for the pie chart.

  • labels The labels.
  • colors The colors.
  • explode Indicates offset of each segment.
  • startangle Angle to start from.
  • counterclock Default True and sets direction.
  • shadow Enables shadow effect.
  • wedgeprops Example {"edgecolor":"k",'linewidth': 1}.
  • autopct Format indicating percentage labels "%1.1f%%".
  • pctdistance Controls the position of percentage labels.

We already know the labels from above. But let’s add some more to see the effect.

import matplotlib.pyplot as plt
v = [2, 5, 3, 1, 4]
labels = ["A", "B", "C", "D", "E"]
colors = ["blue", "red", "orange", "purple", "brown"]
explode = [0, 0, 0.1, 0, 0]
wedge_properties = {"edgecolor":"k",'linewidth': 1}
plt.pie(v, labels=labels, explode=explode, colors=colors, startangle=30,
           counterclock=False, shadow=True, wedgeprops=wedge_properties,
           autopct="%1.1f%%", pctdistance=0.7)
plt.title("Color pie chart")
plt.show()

This does a decent job.
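Besides a format string, autopct also accepts a function that receives the percentage of each slice. The sketch below uses that to print the absolute value under the percentage. The names make_autopct and draw_pie are made up for this example, and the rounding assumes the values are whole numbers.

```python
def make_autopct(values):
    """Build a formatter that shows the percentage and the absolute value."""
    total = sum(values)

    def autopct(pct):
        # Matplotlib passes the slice's percentage; recover the original value.
        absolute = round(pct * total / 100)
        return f"{pct:.1f}%\n({absolute})"

    return autopct


def draw_pie(values, labels):
    # Imported here so the formatter above can be tried without Matplotlib.
    import matplotlib.pyplot as plt

    plt.pie(values, labels=labels, autopct=make_autopct(values))
    plt.show()


formatter = make_autopct([2, 5, 3, 1, 4])
print(formatter(20.0))
```

Calling draw_pie([2, 5, 3, 1, 4], ["A", "B", "C", "D", "E"]) would then open the chart with both numbers on each slice.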

Donut Chart

A great chart to play with is the Donut chart.

It is actually pretty simple: set the width in wedgeprops, as this example shows.

import matplotlib.pyplot as plt
v1 = [2, 5, 3, 1, 4]
labels1 = ["A", "B", "C", "D", "E"]
width = 0.3
wedge_properties = {"width":width}
plt.pie(v1, labels=labels1, wedgeprops=wedge_properties)
plt.show()

The width is measured from the outer edge inward, so with the default radius of 1 and a width of 0.3, the hole has a radius of 0.7.

Legends on Pie Chart

You can add a legend, which uses the labels. Also, notice that you can set the placement (loc) of the legend.

import matplotlib.pyplot as plt
labels = 'Dalmatians', 'Beagles', 'Labradors', 'German Shepherds'
sizes = [6, 5, 20, 9]
fig, ax = plt.subplots()
ax.pie(sizes, labels=labels, autopct='%.1f%%')
ax.legend(labels, loc='lower left')
plt.show()

Nested Donut Pie Chart

This one comes in handy whenever you want to show off a bit.

import matplotlib.pyplot as plt
v1 = [2, 5, 3, 1, 4]
labels1 = ["A", "B", "C", "D", "E"]
v2 = [4, 1, 3, 4, 1]
labels2 = ["V", "W", "X", "Y", "Z"]
width = 0.3
wedge_properties = {"width":width, "edgecolor":"w",'linewidth': 2}
plt.pie(v1, labels=labels1, labeldistance=0.85,
        wedgeprops=wedge_properties)
plt.pie(v2, labels=labels2, labeldistance=0.75,
        radius=1-width, wedgeprops=wedge_properties)
plt.show()
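The inner ring uses radius=1-width so that it starts exactly where the outer ring ends. A hedged sketch of the same arithmetic for any number of rings follows. The function name ring_radii is hypothetical, and it assumes the outermost ring has the default radius of 1.

```python
def ring_radii(n_rings, width):
    """Outer radius of each ring, from the outermost ring inward."""
    # Rounding hides tiny floating-point errors such as 0.7000000000000001.
    radii = [round(1 - i * width, 10) for i in range(n_rings)]
    if radii[-1] - width < 0:
        raise ValueError("too many rings for this width")
    return radii


print(ring_radii(2, 0.3))
print(ring_radii(3, 0.3))
```

Each value can be passed as radius to one plt.pie call, together with the same width in wedgeprops.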

Want to learn more?

Data Visualization is an important skill for understanding and presenting data.

It is a key skill in Data Science. If you would like to learn more, check out my free Expert Data Science Blueprint course with the following resources.

  • 15 video lessons – covering the Data Science Workflow and concepts, demonstrating everything on real data, introducing projects and showing a solution (YouTube video).
  • 30 Jupyter Notebooks – with the full code and explanation from the lectures and projects (GitHub).
  • 15 projects – structured with the Data Science Workflow and a solution explained at the end of the video lessons (GitHub).

11 Useful pandas Charts with One Line of Code

What will you learn?

When trying to understand data, visualization is the key to understanding it fast!

Data visualization has 3 purposes.

  1. Data Quality: Finding outliers and missing data.
  2. Data Exploration: Understand the data.
  3. Data Presentation: Present the result.

Here you will learn 11 useful charts for understanding your data, each done in one line of code.

The data we will work with

We need some data to work with.

You can either download the Notebook and csv-file (GitHub repo) or read it directly from the repository as follows.

import pandas as pd
import matplotlib.pyplot as plt
file_url = 'https://raw.githubusercontent.com/LearnPythonWithRune/pandas_charts/main/air_quality.csv'
data = pd.read_csv(file_url, index_col=0, parse_dates=True)
print(data.head())

This will output the first 5 lines of the data.

Now let’s use the data we have in the DataFrame data.

If you are new to pandas, I suggest you get an understanding of it from this guide.

#1 Simple plot

A simple plot is the default to use unless you know what you want. It will demonstrate the nature of the data.

Let’s try to do it here.

data.plot()

As you can see, there are three columns of data, one for each of the 3 stations: Antwerp, Paris, and London.

The data is a time series, meaning that each data point is tied to a point in time (the x-axis).

It is a bit difficult to see if station Antwerp has a full dataset.

Let’s try to figure that out.

#2 Isolated plot

This leads us to making an isolated plot of only one column. This is handy to understand each individual column of data better.

Here we were curious whether data for station Antwerp is available for all dates.

data['station_antwerp'].plot()

This shows that our suspicion was correct. The time series does not cover the full range for station Antwerp.

This tells us about the data quality, which might be crucial for further analysis.
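The same check can be done with numbers by counting the missing values per column. The sketch below uses a small made-up DataFrame, not the real air quality file, so the counts are only illustrative.

```python
import pandas as pd

# Small made-up frame standing in for the air quality data.
example = pd.DataFrame({
    "station_antwerp": [None, None, 25.0, 30.5],
    "station_paris": [20.0, 21.5, 19.0, 22.0],
})

# isna() marks missing cells, sum() counts them per column.
missing_per_column = example.isna().sum()
print(missing_per_column)
```

On the tutorial's data, the same call would be data.isna().sum().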

You can do the same for the other two columns.

#3 Scatter Plot

A great way to see if the data is correlated is to make a scatter plot.

Let’s demonstrate what that looks like.

data.plot.scatter(x='station_london', y='station_paris', alpha=.25)

You see that the data is not totally scattered all over, but it is not fully correlated either. This means that there is some weak correlation in the data and that the two columns are not fully independent of each other.
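To put a number on what a scatter plot suggests, you can compute the Pearson correlation coefficient. pandas offers Series.corr for this. The hand-written version below only spells out the formula, and the readings are made up rather than taken from the air quality file.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up readings for illustration only.
london = [20, 25, 30, 35, 40]
paris = [22, 24, 33, 31, 45]
print(round(pearson(london, paris), 2))
```

A value close to 1 or -1 means strong correlation, while a value close to 0 means little linear relationship.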

#4 Box Plot

One way to understand data better is with a box plot. It requires a bit of basic statistics.

Let’s first take a look at it.

data.plot.box()

The box plot shows the following.

To understand what outliers, min, median, max, and so forth mean, I suggest you read this simple statistics guide.
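As a rough illustration of those terms, the sketch below computes the quartiles and flags outliers with the common 1.5 times interquartile range rule, which is also Matplotlib's default for whiskers. The function name box_plot_summary and the sample values are assumptions made for this example.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Quartiles and outliers using the 1.5 * IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

Here the box spans from 3 to 7 with the median at 5, and the value 30 is drawn as an outlier.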

#5 Area Plot

An area plot is a visually easy way to see how the values follow each other and to get an understanding of values, correlation, and missing data.

data.plot.area(figsize=(12,4), subplots=True)

#6 Bar plots

Bar plots can be useful, but mostly when the amount of data is limited.

Here you see a bar plot of the first 15 rows of data.

data.iloc[:15].plot.bar()

#7 Histograms for single column

A histogram shows you which values are most common. It shows the frequencies of the data divided into bins. By default there are 10 bins.

It is an amazing tool to get a fast view of the number of occurrences in each data range.

First, here it is for a single station.

data['station_paris'].plot.hist()
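To make the idea of bins concrete, here is a minimal sketch that counts values into equal-width bins between the minimum and the maximum. It is a simplified stand-in for what the plotting library does, and the function name histogram_counts is made up.

```python
def histogram_counts(values, bins=10):
    """Count how many values fall into each of the equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            index = 0  # all values are identical
        else:
            # The maximum value belongs to the last bin.
            index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


print(histogram_counts([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], bins=5))
```

Each count corresponds to the height of one bar in the histogram.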

#8 Histograms for multiple columns

Then for all three stations, drawn with transparency (alpha) so that overlapping bars remain visible.

data.plot.hist(alpha=.5)

#9 Pie

Pie charts are very powerful when you want to show how data is divided, that is, what percentage belongs to each category.

Here you see the mean value of each station.

data.mean().plot.pie()

#10 Scatter Matrix Plot

This is a great tool for showing the data combined in all possible ways. It will show you correlations and how the data is distributed.

You need an additional import, but it gives you a fast understanding of the data.

from pandas.plotting import scatter_matrix
scatter_matrix(data, alpha=0.2, figsize=(6, 6))

#11 Secondary y-axis

Finally, sometimes you want two plots on the same chart. The problem can be that the two plots have very different ranges. Hence, you would like to have two different y-axes with different ranges.

This will enable you to have plots on the same chart with different ranges.

data['station_london'].plot()
data['station_paris'].cumsum().plot(secondary_y=True)

Want to learn more?

Want to learn more about Data Science to become a successful Data Scientist?

Then check out my free Expert Data Science Blueprint course with the following resources.

  • 15 video lessons – covering the Data Science Workflow and concepts, demonstrating everything on real data, introducing projects and showing a solution (YouTube video).
  • 30 Jupyter Notebooks – with the full code and explanation from the lectures and projects (GitHub).
  • 15 projects – structured with the Data Science Workflow and a solution explained at the end of the video lessons (GitHub).

15 Useful Things You Can Do in One Line of Python Code

What will we cover?

In this tutorial you will learn why one-liners are something beginners focus on and senior developers don’t waste their time on. But you will also learn some useful things you can do in one line of Python code.

Why are One-Liners not (always) good?

Why are one-liners bad?

If I could say it in one word: Readability.

Most beginners, and I was the same, focus on solving the problem and later, often, on solving it in some impressive way.

Why do senior developers not do that? Well, they spend hours debugging code – code should be easy to understand and maintain.

Yes, senior developers know that code written in one line is often difficult to get useful stack traces from when it fails. They know it is better to break the code up into multiple lines. This makes it easier to locate the error from the stack trace. It also enhances the readability of the code, which makes it easier to understand and therefore to maintain.
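As a small illustration of that trade-off, the sketch below parses a made-up settings string twice: once as a dense one-liner and once in separate steps. The input string and the function name parse_settings are invented for this example.

```python
raw = "name=Ada; role=engineer; team=compilers"

# One-liner: works, but a failure points at this single dense line.
settings = dict(pair.strip().split("=") for pair in raw.split(";"))


# Same logic in steps: each intermediate value can be inspected or logged.
def parse_settings(text):
    result = {}
    for pair in text.split(";"):
        key, value = pair.strip().split("=")
        result[key] = value
    return result


print(settings == parse_settings(raw))
```

Both versions give the same dictionary, but if one pair is malformed, the stack trace of the second version points at the exact step that failed.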

Now let’s dive into 15 useful things you still can do in one line of Python code without compromising readability.

#1 Swap Two Variables

This one might not be fully appreciated if you have not been coding in another language.

Take a moment and think about the following problem.

You have two drinks.

Your job is to switch the contents of the glasses. That is, the blue drink should be in the glass of the red drink, and vice versa.

Obviously, to do this, you need a third glass or something similar.

This is the same problem as when you need to swap (switch) the “content” of two variables.

As you need to do this often as a programmer, Python has made this easy for you.

a = 10
b = 20
print(a, b)
# Swap the two variables
a, b = b, a
print(a, b)

This will swap the content. First print will output 10 20 and second 20 10.

#2 Reverse a List

First of all, Python lists are amazing. Again, if you worked with other programming languages, you will fall in love with how easy it is to work with Python lists.

Anyhow, sometimes you need to get the content from a list in reversed order.

This can be done easily as follows.

l = [1, 2, 3, 4, 5]
print(l[::-1])

This will output [5, 4, 3, 2, 1].

Check out the bonus trick at the end to see how this can be useful to know for a job interview.

#3 Calculate the mode of a list

First of all, what is the mode of a list?

Good question my friend. It is the most common element in a list. This is often useful to know.

Let’s see how this can be done.

l = [1,3,2,5,2,2,5,4]
mode = max(set(l), key=l.count)
print(mode)

This will output 2, as it is the most common element.

How to understand the code?

I am happy you asked.

The set(l) gives a set of the list, which is all the unique elements in the list. Here it gives {1, 2, 3, 4, 5}.

Then max(set(l), key=l.count) returns the element of the set for which l.count is highest, that is, the value that occurs most often in the list.
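Note that l.count scans the whole list once for every unique element, which is fine for short lists. For longer lists, a possible alternative is to count everything in one pass with the standard library, as sketched here.

```python
from collections import Counter
from statistics import mode

l = [1, 3, 2, 5, 2, 2, 5, 4]

# Counter tallies all elements in a single pass over the list.
most_common_value = Counter(l).most_common(1)[0][0]
print(most_common_value)

# The statistics module also has a ready-made function for this.
print(mode(l))
```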

#4 Strip leading and trailing spaces from lines and remove new lines

When you read lines from a text file, they can have leading and trailing spaces, and some lines can have no content.

This is an example of lines read from a text file.

lines = ['                       THE ADVENTURE OF THE NOBLE BACHELOR\n',
 '\n',
 '     The Lord St. Simon marriage, and its curious termination, have long\n']

To remove (or strip) leading and trailing spaces, you can do the following.

lines = [line.strip() for line in lines]
print(lines)

Which will result in.

['THE ADVENTURE OF THE NOBLE BACHELOR',
 '',
 'The Lord St. Simon marriage, and its curious termination, have long']

Notice that it also removes the new lines.

If you want to remove empty lines, this can be done as follows.

lines = [line for line in lines if len(line) > 0]
print(lines)

This will result in.

['THE ADVENTURE OF THE NOBLE BACHELOR',
 'The Lord St. Simon marriage, and its curious termination, have long']
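If you prefer, both steps can be combined into a single list comprehension. This is just one possible way to write it, using shortened sample lines. An empty string counts as false in Python, so the condition drops the empty lines.

```python
lines = ['   THE ADVENTURE OF THE NOBLE BACHELOR\n',
         '\n',
         '     The Lord St. Simon marriage\n']

# Strip each line and keep it only if something is left.
cleaned = [line.strip() for line in lines if line.strip()]
print(cleaned)
```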

#5 Multiple variable assignment

Sometimes code can become really long if you have a lot of variables you need to assign to specific values.

This can be done in one line.

a, b, c = 4.4, 'Awesome', 7
print(a)
print(b)
print(c)

This will output.

4.4
Awesome
7

Notice the different types of the variables.

#6 Convert a string into a number

I actually love this one. Why? Because in Python there is just a built-in function to convert a string to a number.

my_str = '27'
my_int = int(my_str)
print(my_int, type(my_int))
my_str = '3.14'
my_float = float(my_str)
print(my_float, type(my_float))

It will print the values and the type of the variables, int and float, respectively.
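Both int() and float() raise a ValueError when the string is not a valid number. A small sketch of handling that follows. The helper name to_number and the choice to return None are assumptions made for this example.

```python
def to_number(text):
    """Return an int or float parsed from text, or None if it is not a number."""
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return None


print(to_number('27'))
print(to_number('3.14'))
print(to_number('abc'))
```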

#7 Type casting a list of items

This is a great use of List Comprehension.

Say you have a list of strings containing integers. This can happen when you read a text file where each line has an integer. Then you need to convert them to integers to use the values.

This can be done as follows.

l = ['12', '23', '34']
items = [int(i) for i in l]

Wow. Did you see that? We just used what we learned in the last step and combined it with List Comprehensions.

#8 Find the square root of a number

It is quite handy to know how to take the square root of a number without using math libraries.

print(16**.5)

Well, the 16**.5 syntax (the double **) puts the value (here 16) to the power of the exponent (here .5). This lifts 16 to the power of a half (.5). This is the same as taking the square root.

Hence, it will print 4.0.

#9 How to get the cube root of a number

This one is almost the same. But remember, you get a bonus one at the end, so you will still get 15 useful one-liners if you feel cheated by this one.

The cube root means, given a number x, find a number y such that y*y*y equals x.

How do you do that?

I actually expect that many do not know that. I didn’t before I studied high-level math in college.

Here we go.

print(27**(1/3))

Ah, you see. You raise to the power of one third. It will print 3.0, as 3*3*3 is 27.
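Be aware that fractional powers are computed with floating point numbers, so the result is not always an exact whole number. The sketch below shows one way to deal with that. It also shows the math module equivalents, in case using the library is acceptable after all.

```python
import math

cube_root = 64 ** (1 / 3)
print(cube_root)          # may not be exactly 4.0 because of floating point
print(round(cube_root))   # rounding recovers the integer root

print(math.sqrt(16))      # same result as 16 ** .5
print(math.isqrt(17))     # integer square root, rounded down
```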

#10 Get the absolute value of a number

Again, a great built-in function to know.

I often need the absolute value of a number. This can be achieved by using abs().

a = -27
print(abs(a))

This will print 27.

#11 Round a number to n digits

If you’ve been working with floats, you know the pain of endlessly long decimals.

Another great built-in function in Python will help you here.

pi = 3.1415
pi_two_digits = round(pi, 2)
print(pi_two_digits)

This will print 3.14.
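A few details of round() are worth knowing. Without a second argument it returns an integer, and values exactly halfway between two integers are rounded to the nearest even one. Some decimals also cannot be stored exactly as floats, which can look surprising. A short sketch:

```python
print(round(3.14159, 3))   # 3.142
print(round(2.5))          # 2, halfway cases go to the even neighbour
print(round(3.5))          # 4
print(round(2.675, 2))     # 2.67, because 2.675 is stored slightly below 2.675

# If you only need the text for display, formatting is an alternative.
print(f"{3.1415:.2f}")     # 3.14
```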

#12 Create a list of numbers in specific range

I actually used this one all the time before. And I loved to use it with Python loops.

Let’s see what it is.

my_list = list(range(7, 23))
print(my_list)

This will generate the following list.

[7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]

If you need to iterate a specific number of times, you can use range directly in a loop.

for i in range(100):
    print(i)

This will iterate from 0 to 99 and print the values.

#13 Calculate the average of a list

Does the use of this need to be explained?

You have a list of numbers and you need the average of the numbers in the list.

One line of Python will do it for you.

l = [1, 2, 3, 4, 5]
average = sum(l)/len(l)
print(average)

It will print 3.0, as it is the average of the list.
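One caveat is that the one-liner fails with a ZeroDivisionError on an empty list. Below is a hedged sketch of a small wrapper for that case. The function name average and the choice to return None are assumptions made for this example. The standard library also offers statistics.mean, which raises its own error on empty input.

```python
from statistics import mean


def average(numbers):
    """Average of a list, or None when the list is empty."""
    if not numbers:
        return None
    return sum(numbers) / len(numbers)


print(average([1, 2, 3, 4, 5]))   # 3.0
print(average([]))                # None
print(mean([1, 2, 3, 4, 5]))
```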

#14 If-else assignment

At first I wasn’t really a fan of this one. But when you keep it simple, it is useful. The key is to avoid complex checks, to keep the readability good.

x = 19
v = 42 if x < 10 else 27
print(v)

This will output 27, as x is not less than 10. Try it with x = 5 and see that it prints 42.

#15 Flatten a list of lists

First of all, what does it mean to flatten a list of lists?

Given a list of lists.

l = [[1,2], [4, 6], [8, 10]]

How do you get a list of all the numbers, like this one?

[1, 2, 4, 6, 8, 10]

You do that as follows.

flat = [i for j in l for i in j]
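The order of the two for parts can be confusing at first. The comprehension reads left to right exactly like nested loops, as this sketch spells out. It also shows itertools.chain.from_iterable from the standard library as a possible alternative.

```python
from itertools import chain

l = [[1, 2], [4, 6], [8, 10]]

# The comprehension corresponds to these nested loops.
flat_loops = []
for j in l:          # each inner list
    for i in j:      # each item in that inner list
        flat_loops.append(i)

# Alternative from the standard library.
flat_chain = list(chain.from_iterable(l))

print(flat_loops)
print(flat_chain)
```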

(Bonus) Check if word is a palindrome

What does it mean that a word is a palindrome?

That it is spelled the same backwards and forwards.

The typical example is racecar.

See, it is spelled identically backwards and forwards.

How can you check that a word is a palindrome? Obviously in one line of code? Yes, you should be able to do that now after this list.

Extra bonus: This is a typical job interview question. Be sure to nail this one and impress them.

Here we go.

s = 'racecar'
print(s == s[::-1])

This will print True as it is a palindrome.

s = 'palindrome'
print(s == s[::-1])

This will print False, as it is not a palindrome (even though the string is the word palindrome).

How does it work?

Well, s[::-1] is the reverse of s. If the reverse of s equals s, then it must be a palindrome.
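In an interview, a common follow-up is to handle whole sentences, where capital letters, spaces and punctuation should be ignored. A minimal sketch under that assumption follows. The function name is_palindrome is chosen for this example.

```python
def is_palindrome(text):
    """Check for a palindrome, ignoring case, spaces and punctuation."""
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome("racecar"))
print(is_palindrome("A man, a plan, a canal: Panama"))
print(is_palindrome("palindrome"))
```

The first two calls print True and the last prints False.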

Want to learn more?

If you are hooked on learning Python, I suggest you follow my beginners course on Python.

It is well structured and focuses on you as a learner.

I suggest you break it down as explained in #1.

  • Day 1: See lesson with new concepts – take notes.
  • Day 2: Recap lesson (either from notes or video) – then see introduction to project. Try to solve the project and stop when you get stuck.
  • Day 3: Possibly recap lesson again, then continue with project. If you are really stuck – see solution.

Then continue that pattern for each lesson.

The course is structured with the following resources to improve your learning experience.

  • 17 video lessons teaching you everything you need to know to get started with Python.
  • 34 Jupyter Notebooks with lesson code and projects.
  • A FREE 70+ page eBook with all the learnings from the lessons.