What will we cover in this tutorial?
- Where and how to get images you can use without copyright issues.
- How to extract the faces from the images.
- Building a Photo Mosaic using the extracted images of faces.
Step 1: Where and how to get images
The Python library pexels-api makes it easy to download a lot of images from Pexels. It can be installed with the following command.
pip install pexels-api
To use the Pexels API you need to register.
- Sign up as a user at Pexels.
- Confirm the email sent to the address you provided.
- Request your API key at https://www.pexels.com/api/.
Then you can download images matching a search query with this Python program.
    from pexels_api import API
    import requests
    import os.path
    from pathlib import Path

    path = 'pics'
    Path(path).mkdir(parents=True, exist_ok=True)

    # To get a key: sign up for Pexels at https://www.pexels.com/join/
    # Request the key: https://www.pexels.com/api/
    #  - No need to set a URL
    #  - Confirm the email sent to you
    #  - Refresh the API page or see the key here: https://www.pexels.com/api/new/
    PEXELS_API_KEY = '--- INSERT YOUR API KEY HERE ---'

    api = API(PEXELS_API_KEY)
    query = 'person'
    api.search(query)

    print("Search: ", query)
    print("Total results: ", api.total_results)
    MAX_PICS = 1000
    print("Fetching max: ", MAX_PICS)

    count = 0
    while True:
        # Get the photo entries of the current result page
        photos = api.get_entries()
        print(len(photos))
        if len(photos) == 0:
            break
        for photo in photos:
            # Print photographer
            print('Photographer: ', photo.photographer)
            # Print original size url
            print('Photo original size: ', photo.original)

            # File name: <query>-<5-digit counter>.<extension taken from the URL>
            file = os.path.join(path, query + '-' + str(count).zfill(5) + '.' + photo.original.split('.')[-1])
            count += 1
            print(file)

            # Download the photo and save it only if the request succeeded
            picture_request = requests.get(photo.original)
            if picture_request.status_code == 200:
                with open(file, 'wb') as f:
                    f.write(picture_request.content)

            # Ideally this would be a function, so we could simply return here
            if count >= MAX_PICS:
                break

        if count >= MAX_PICS:
            break

        if not api.has_next_page:
            print("Last page: ", api.page)
            break
        # Search next page
        api.search_next_page()
The above Python program has an upper limit of 1,000 photos, which you can change if you like. It downloads the photos returned by the query person. Feel free to change that as well.
Downloading all the images takes some time, and they will take up some disk space.
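The program above takes the file extension from whatever follows the last dot in the photo URL. If a URL carries a query string, that text would end up in the file name. The sketch below shows one possible way to build the name from the URL path only. The helper name build_filename, the example URL and the '.jpg' fallback are assumptions made for illustration; they are not part of pexels-api.

```python
import os.path
from urllib.parse import urlparse


def build_filename(folder, query, count, url):
    """Build a name like pics/person-00007.jpeg from a photo URL."""
    # Use only the path part of the URL, so a query string such as
    # "?auto=compress" does not leak into the extension.
    extension = os.path.splitext(urlparse(url).path)[1]
    if not extension:
        extension = '.jpg'  # assumed fallback when the URL has no extension
    return os.path.join(folder, f"{query}-{count:05d}{extension}")


# Hypothetical URL, used only to show the behaviour
example_url = 'https://example.com/photos/123/photo.jpeg?auto=compress'
print(build_filename('pics', 'person', 7, example_url))
```

In the download loop, the result of such a helper could replace the string concatenation that builds file.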
Step 2: Extract the faces from the photos
pip install opencv-python
The trained model we use is part of the library, but it is not easy to load from where the library is installed. Therefore we suggest you download it from the OpenCV GitHub repository (the file should be named haarcascade_frontalface_default.xml) and add it to the folder you work from.
We want to use it to identify faces, extract them, and save them in a folder for later use.
    import cv2
    import numpy as np
    import glob
    import os
    from pathlib import Path


    def preprocess(box_width=12, box_height=16):
        path = "pics"
        output = "small-faces"
        Path(output).mkdir(parents=True, exist_ok=True)
        files = glob.glob(os.path.join(path, "*"))
        files.sort()

        face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

        images = []
        cnt = 0
        for filename in files:
            print("Processing...", filename)
            frame = cv2.imread(filename)
            if frame is None:
                # Skip files that OpenCV cannot read as an image
                continue
            frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            frame_gray = cv2.equalizeHist(frame_gray)
            faces = face_cascade.detectMultiScale(frame_gray, scaleFactor=1.3, minNeighbors=10,
                                                  minSize=(350, 350), flags=cv2.CASCADE_SCALE_IMAGE)
            for (x, y, w, h) in faces:
                # Cut out the face and shrink it to the box size used in the mosaic
                roi = frame[y:y+h, x:x+w]
                img = cv2.resize(roi, (box_width, box_height))
                images.append(img)
                output_file_name = "face-" + str(cnt).zfill(5) + ".jpg"
                # Increase the counter so each face gets its own file
                cnt += 1
                output_file_name = os.path.join(output, output_file_name)
                cv2.imwrite(output_file_name, img)

        return np.stack(images)


    preprocess(box_width=12, box_height=16)
It will create a folder called small-faces with small images of the identified faces.
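One detail in the extraction that is easy to get wrong is the index order when cutting out a face. The detector reports a box as (x, y, w, h), while the image array is indexed by row first and column second. A minimal sketch with a blank NumPy array standing in for a photo (the helper name crop_box is made up for this illustration):

```python
import numpy as np


def crop_box(frame, x, y, w, h):
    """Cut the box (x, y, w, h) out of an image stored as rows x columns."""
    # Rows follow the y-axis and columns the x-axis, so the y-range comes first.
    return frame[y:y + h, x:x + w]


# A blank stand-in image: 100 rows (height), 200 columns (width), 3 channels
frame = np.zeros((100, 200, 3), dtype=np.uint8)
roi = crop_box(frame, x=50, y=10, w=40, h=30)
print(roi.shape)  # (30, 40, 3): height 30, width 40
```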
Notice that the Haar Cascade Classifier is not perfect. It will miss a lot of faces and also produce false positives. It is a good idea to look through all the images manually and delete the false positives (images that do not contain a face).
Step 3: Building our first mosaic photo
The approach is to divide the photo into equal-sized boxes. For each box, we find the image (one of our faces) that fits best as a replacement.
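What "fits best" means can be illustrated on tiny arrays. The sketch below scores each candidate by the sum of absolute pixel differences and picks the lowest score. The function name best_tile_index and the toy data are assumptions for illustration. It casts to a signed type before subtracting, which is a different way of avoiding uint8 wrap-around than the np.where construction used in the tutorial code.

```python
import numpy as np


def best_tile_index(box, tiles):
    """Return the index of the tile with the smallest sum of absolute differences."""
    # Subtracting uint8 arrays directly wraps around, so cast to int32 first.
    diff = np.abs(tiles.astype(np.int32) - box.astype(np.int32))
    # One score per tile: add up the differences over all pixels and channels.
    scores = diff.reshape(len(tiles), -1).sum(axis=1)
    return int(np.argmin(scores))


# Toy data: one 2x2 box and three candidate tiles with uniform colour
box = np.full((2, 2, 3), 100, dtype=np.uint8)
tiles = np.stack([
    np.full((2, 2, 3), 10, dtype=np.uint8),
    np.full((2, 2, 3), 90, dtype=np.uint8),
    np.full((2, 2, 3), 250, dtype=np.uint8),
])
print(best_tile_index(box, tiles))  # 1, the tile with value 90 is closest to 100
```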
To improve the performance of the process function we use Numba, a just-in-time compiler designed to optimize NumPy code in for-loops.
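    import cv2
    import numpy as np
    import glob
    import os
    from numba import jit


    def load_images(box_width=12, box_height=16):
        # Not shown in the original listing. Assumed here: read the faces saved
        # in Step 2 from the small-faces folder and stack them into one array.
        files = glob.glob(os.path.join("small-faces", "*"))
        files.sort()
        images = []
        for filename in files:
            image = cv2.imread(filename)
            if image is not None:
                # Make sure every face has exactly the box size
                images.append(cv2.resize(image, (box_width, box_height)))
        return np.stack(images)


    @jit(nopython=True)
    def process(photo, images, box_width=24, box_height=32):
        height, width, _ = photo.shape
        for i in range(0, height, box_height):
            for j in range(0, width, box_width):
                roi = photo[i:i + box_height, j:j + box_width]
                best_match = np.inf
                best_match_index = 0
                # Compare the box with every face image (starting at index 0)
                for k in range(images.shape[0]):
                    # Sum of absolute differences, written to avoid uint8 wrap-around
                    total_sum = np.sum(np.where(roi > images[k], roi - images[k], images[k] - roi))
                    if total_sum < best_match:
                        best_match = total_sum
                        best_match_index = k
                photo[i:i + box_height, j:j + box_width] = images[best_match_index]
        return photo


    def main():
        photo = cv2.imread("rune.jpg")

        box_width = 12
        box_height = 16
        height, width, _ = photo.shape
        # Make sure that we can slice the photo into whole boxes
        width = (width//box_width) * box_width
        height = (height//box_height) * box_height
        photo = cv2.resize(photo, (width, height))

        # Load all the images of the faces
        images = load_images(box_width, box_height)

        # Create the mosaic
        mosaic = process(photo.copy(), images, box_width, box_height)

        cv2.imshow("Original", photo)
        cv2.imshow("Result", mosaic)
        cv2.waitKey(0)


    main()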
To test it we have used a photo of Rune (rune.jpg).
The code above allows the same image to be reused many times. This gives a decent result, but if you want to avoid the extreme patterns of reused images, you can change the code to limit reuse.
The above example has 606 small images. If you avoid reuse, it quickly runs out of available images. You would need a bigger collection of images, or the result becomes questionable.
The above photo mosaic is created from a downscaled photo, but it still does not give a good result if you do not reuse images. That would require a considerably larger set of images to work from.
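One possible way to limit reuse is to count how often each face has been placed and skip faces that have reached a limit. The sketch below works on a plain table of difference scores rather than on images, so it runs without any image files. The function name assign_tiles, the max_uses parameter and the toy scores are assumptions for illustration; this is a simple greedy strategy, not the method used for the mosaic above.

```python
def assign_tiles(scores, max_uses=1):
    """Greedily pick a tile for each box, using each tile at most max_uses times.

    scores[b][t] is the difference between box b and tile t (lower is better).
    Returns one tile index per box, or None where no tile is left.
    """
    uses = {}
    chosen = []
    for row in scores:
        # Tiles that have not reached their usage limit yet
        candidates = [t for t in range(len(row)) if uses.get(t, 0) < max_uses]
        if not candidates:
            chosen.append(None)  # ran out of tiles, as described above
            continue
        best = min(candidates, key=lambda t: row[t])
        uses[best] = uses.get(best, 0) + 1
        chosen.append(best)
    return chosen


# Toy scores for 4 boxes and 3 tiles
scores = [
    [5, 9, 7],  # box 0 prefers tile 0
    [4, 8, 6],  # box 1 also prefers tile 0
    [3, 2, 9],  # box 2 prefers tile 1
    [1, 1, 1],  # box 3 has no preference
]
print(assign_tiles(scores))              # [0, 2, 1, None]
print(assign_tiles(scores, max_uses=2))  # [0, 0, 1, 1]
```

With only three tiles and no reuse, the fourth box is left empty, which mirrors the problem of running out of faces. Raising max_uses trades repeated patterns against empty or poorly matched boxes.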