From Zero to Creating Photo Mosaic using Faces with OpenCV

What will we cover in this tutorial?

  1. Where and how to get images you can use without copyright issues.
  2. How to extract the faces from the images.
  3. Building a Photo Mosaic using the extracted images of faces.

Step 1: Where and how to get images

There are many datasets of faces, but most come with usage restrictions. A great place to find images is Pexels, as they are free to use (see license here).

The Python library pexels-api makes it easy to download a lot of images. Install it with the following command.

pip install pexels-api

To use the Pexels API you need to register.

  1. Sign up as a user at Pexels.
  2. Confirm the email sent to the address you provided.
  3. Request your API key here.

Then you can download images matching a search query with this Python program.

from pexels_api import API
import requests
import os.path
from pathlib import Path

path = 'pics'
Path(path).mkdir(parents=True, exist_ok=True)

# To get key: sign up for pexels
# Request key :
# - No need to set URL
# - Confirm the email sent to you
# - Refresh API or see key here:

# Placeholder: insert your own API key here
PEXELS_API_KEY = '--- INSERT YOUR API KEY HERE ---'
api = API(PEXELS_API_KEY)

query = 'person'
# Run the search; results are fetched one page at a time
api.search(query)
print("Search: ", query)
print("Total results: ", api.total_results)
MAX_PICS = 1000
print("Fetching max: ", MAX_PICS)

count = 0
while True:
    # Get photo entries of the current page
    photos = api.get_entries()
    if len(photos) == 0:
        break
    for photo in photos:
        # Print photographer
        print('Photographer: ', photo.photographer)
        # Print original size url
        print('Photo original size: ', photo.original)

        file = os.path.join(path, query + '-' + str(count).zfill(5) + '.' + photo.original.split('.')[-1])
        count += 1
        picture_request = requests.get(photo.original)
        if picture_request.status_code == 200:
            with open(file, 'wb') as f:
                f.write(picture_request.content)

        # This should be a function call to make a return
        if count >= MAX_PICS:
            break

    if count >= MAX_PICS:
        break

    if not api.has_next_page:
        print("Last page: ", api.page)
        break
    # Search next page
    api.search_next_page()

The program above has an upper limit of 1,000 photos (MAX_PICS), which you can change if you like. It downloads the photos returned by the query person. Feel free to change that as well.
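The file name in the program above takes its extension from whatever follows the last dot in the photo URL. If a URL ever carried a query string, that suffix would end up in the file name. The sketch below is one possible way to build the name from the URL path only. Note that build_filename is a hypothetical helper, not part of pexels-api, and the example URL is made up for illustration.

```python
import os.path
from urllib.parse import urlparse


def build_filename(folder, query, count, url, default_ext="jpg"):
    """Hypothetical helper: build 'folder/query-00000.ext' from a photo URL."""
    # Use only the path part of the URL so a query string is ignored
    url_path = urlparse(url).path
    # Fall back to default_ext when the URL path has no extension
    ext = os.path.splitext(url_path)[1].lstrip(".").lower() or default_ext
    return os.path.join(folder, f"{query}-{str(count).zfill(5)}.{ext}")


# Made-up URL, for illustration only
print(build_filename("pics", "person", 7, "https://example.com/photos/1/a.jpeg?w=500"))
```

In the download loop this would replace the os.path.join(...) expression, assuming you keep the same query and count variables.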

Downloading all the images takes some time, and they will take up some disk space.
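If you want to keep an eye on how much space the download uses, a small check like the following can help. It is a minimal sketch using only the standard library; folder_size_mb and count_files are hypothetical helper names, and it assumes the images are in the pics folder used above.

```python
from pathlib import Path


def folder_size_mb(folder):
    """Return the total size of all files under 'folder' in megabytes."""
    total_bytes = sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())
    return total_bytes / (1024 * 1024)


def count_files(folder):
    """Return the number of regular files under 'folder'."""
    return sum(1 for p in Path(folder).rglob("*") if p.is_file())


if __name__ == "__main__":
    # Only report when the download folder actually exists
    if Path("pics").is_dir():
        print(f"{count_files('pics')} files, {folder_size_mb('pics'):.1f} MB")
```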

Step 2: Extract the faces from the photos

This is where OpenCV comes in. It provides a trained model that uses the Haar Cascade Classifier. Install the OpenCV library with the following command.

pip install opencv-python

The trained model we use is part of the library, but it is not easy to load from its install location. Therefore we suggest you download it from here (it should be named: haarcascade_frontalface_default.xml) and add it to the folder you work from.

We want to use it to identify faces, extract them, and save them in a folder for later use.

import cv2
import numpy as np
import glob
import os
from pathlib import Path

def preprocess(box_width=12, box_height=16):
    path = "pics"
    output = "small-faces"
    Path(output).mkdir(parents=True, exist_ok=True)
    files = glob.glob(os.path.join(path, "*"))

    face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

    images = []
    cnt = 0
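    for filename in files:
        print("Processing...", filename)
        frame = cv2.imread(filename)
        # Skip files that OpenCV cannot read as images
        if frame is None:
            continue
        frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frame_gray = cv2.equalizeHist(frame_gray)
        faces = face_cascade.detectMultiScale(frame_gray, scaleFactor=1.3, minNeighbors=10, minSize=(350, 350), flags=cv2.CASCADE_SCALE_IMAGE)
        for (x, y, w, h) in faces:
            # Crop the detected face and shrink it to the box size
            roi = frame[y:y+h, x:x+w]

            img = cv2.resize(roi, (box_width, box_height))
            images.append(img)

            output_file_name = "face-" + str(cnt).zfill(5) + ".jpg"
            output_file_name = os.path.join(output, output_file_name)
            cv2.imwrite(output_file_name, img)
            # Advance the counter so each face gets its own file
            cnt += 1

    # np.stack fails on an empty list, so return an empty array if no faces were found
    return np.stack(images) if images else np.empty((0, box_height, box_width, 3), dtype=np.uint8)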

preprocess(box_width=12, box_height=16)

It will create a folder called small-faces with small images of the identified faces.

Note that the Haar Cascade Classifier is not perfect. It will miss a lot of faces and produce false positives. It is a good idea to look through all the images manually and delete the false positives (images that do not contain a face).

Step 3: Building our first mosaic photo

The approach is to divide the photo into equal-sized boxes and, for each box, find the image (one of our faces) that fits best as a replacement.
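The matching step can be tried on tiny arrays before running it on real photos. The sketch below is a minimal illustration, assuming the fit is measured by the sum of absolute pixel differences between a box and each candidate image. The name best_tile_index is hypothetical and the arrays are made-up data.

```python
import numpy as np


def best_tile_index(box, tiles):
    """Return the index of the tile with the smallest sum of absolute differences."""
    # Cast to a signed type first so the subtraction cannot wrap around on uint8
    diff = np.abs(tiles.astype(np.int32) - box.astype(np.int32))
    # Sum over height, width and channels, leaving one score per tile
    scores = diff.reshape(len(tiles), -1).sum(axis=1)
    return int(np.argmin(scores))


# Made-up data: a uniform gray box and three uniform candidate tiles (2x2 pixels, 3 channels)
box = np.full((2, 2, 3), 100, dtype=np.uint8)
tiles = np.stack([
    np.full((2, 2, 3), 10, dtype=np.uint8),
    np.full((2, 2, 3), 110, dtype=np.uint8),
    np.full((2, 2, 3), 250, dtype=np.uint8),
])
print(best_tile_index(box, tiles))
```

This sketch casts to a signed type instead of using np.where on unsigned values, which is a different way to reach the same absolute difference.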

To improve the performance of the process function we use Numba, a just-in-time compiler designed to optimize NumPy code in for-loops.

import cv2
import numpy as np
import glob
import os
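from numba import jit

def load_images(box_width=12, box_height=16):
    # Load the face images extracted in Step 2, resized to the box size
    path = "small-faces"
    files = glob.glob(os.path.join(path, "*"))

    images = []
    for filename in files:
        frame = cv2.imread(filename)
        img = cv2.resize(frame, (box_width, box_height))
        images.append(img)

    return np.stack(images)

# Compile with Numba to speed up the nested loops
@jit(nopython=True)
def process(photo, images, box_width=24, box_height=32):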
    height, width, _ = photo.shape
    for i in range(0, height, box_height):
        for j in range(0, width, box_width):
            roi = photo[i:i + box_height, j:j + box_width]
            best_match = np.inf
            best_match_index = 0
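            # Compare against every face image, including the first one
            for k in range(images.shape[0]):
                # Pixels are unsigned, so np.where keeps the subtraction that does not wrap around
                total_sum = np.sum(np.where(roi > images[k], roi - images[k], images[k] - roi))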
                if total_sum < best_match:
                    best_match = total_sum
                    best_match_index = k
            photo[i:i + box_height, j:j + box_width] = images[best_match_index]
    return photo

def main():
    photo = cv2.imread("rune.jpg")

    box_width = 12
    box_height = 16
    height, width, _ = photo.shape
    # Make sure that we can slice the photo into whole boxes
    width = (width//box_width) * box_width
    height = (height//box_height) * box_height
    photo = cv2.resize(photo, (width, height))

    # Load all the images of the faces
    images = load_images(box_width, box_height)

    # Create the mosaic
    mosaic = process(photo.copy(), images, box_width, box_height)

    cv2.imshow("Original", photo)
    cv2.imshow("Result", mosaic)
    # Keep the windows open until a key is pressed
    cv2.waitKey(0)
    cv2.destroyAllWindows()

main()

To test it we have used the photo of Rune.

This version reuses the same face images as often as needed. That gives a decent result, but if you want to avoid the obvious patterns of repeated images, you can change the code to prevent reuse.

The above example has 606 small images. If you avoid reuse, it quickly runs out of available images. This would require a bigger set of faces, or the result becomes questionable.
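One way to avoid reuse is to mark each face image as used once it has been placed. The sketch below is a simple greedy variant under that assumption, not the code used for the pictures in this tutorial. The name assign_without_reuse and the sample arrays are made up for illustration.

```python
import numpy as np


def assign_without_reuse(boxes, tiles):
    """Greedily give each box its best still-unused tile; -1 when none are left."""
    used = np.zeros(len(tiles), dtype=bool)
    assignment = []
    for box in boxes:
        if used.all():
            assignment.append(-1)  # ran out of face images
            continue
        # Sum of absolute differences between this box and every tile
        diff = np.abs(tiles.astype(np.int32) - box.astype(np.int32))
        scores = diff.reshape(len(tiles), -1).sum(axis=1)
        # Exclude tiles that were already placed by giving them the worst possible score
        scores[used] = np.iinfo(scores.dtype).max
        best = int(np.argmin(scores))
        used[best] = True
        assignment.append(best)
    return assignment


# Made-up data: four uniform boxes but only three uniform tiles
boxes = np.stack([np.full((2, 2, 3), v, dtype=np.uint8) for v in (100, 105, 100, 100)])
tiles = np.stack([np.full((2, 2, 3), v, dtype=np.uint8) for v in (10, 110, 250)])
print(assign_without_reuse(boxes, tiles))
```

Because the choice is greedy, boxes processed early get the best matches and later boxes get whatever is left, which is one reason the result degrades when there are too few faces.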

No reuse of face images to create the Photo Mosaic

The above photo mosaic is created from a downscaled photo, but it still does not give a good result if you do not reuse images. That would require a considerably larger set of images to work from.
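To get a feel for how many face images a mosaic without reuse would need, you can count the boxes in the photo. The sketch below does that arithmetic. The photo size of 960 by 1280 pixels is a hypothetical example, while the box size of 12 by 16 and the 606 faces come from this tutorial.

```python
def boxes_needed(width, height, box_width=12, box_height=16):
    """Number of boxes in a photo after trimming it to whole boxes."""
    return (width // box_width) * (height // box_height)


# Hypothetical photo size, for illustration only
needed = boxes_needed(960, 1280)
faces_available = 606  # number of small images mentioned in this tutorial
print(needed, "boxes needed,", faces_available, "faces available")
```

Without reuse, every box needs its own face, so the number of faces has to be at least the number of boxes.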


