I saw this idea on Shark Tank India and thought, why not build it myself? I started building algorithms that can cluster images and also speed up clustering and searching. I have divided this project into two parts. In the first part, I will discuss image clustering without face classification (classification is what actually speeds up searching).
First, I will walk through the flowchart to give you a high-level view of what I am going to build.
In this first part, I only detect faces in the images and create their encodings, without classifying them into categories such as gender, age, and race. So why would I use a classifier before encoding at all, as I plan to in part two?
The answer is simple: it speeds up searching the encoding list or database. The classifier creates a tree of clusters, so a search can first narrow down to one branch (a binary-search-style step) and then scan linearly inside it. This reduces the number of comparisons per search and also lowers peak memory use, because the clusters are loaded and compared one at a time.
For now, I am using the face_recognition Python library, which handles face detection, face encoding, and face comparison. There are other libraries as well, such as MediaPipe, which estimates 468 3D face landmarks. I will integrate it in the second part.
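To make the idea concrete, here is a minimal sketch of searching inside one attribute bucket instead of the whole list. It is not the part-two implementation: `BucketedIndex`, the attribute tuples, and the labels are hypothetical, and it uses a plain dictionary lookup to pick the bucket rather than a binary search over a tree. The 0.6 tolerance is the default used by `face_recognition.compare_faces`.

```python
import math
from collections import defaultdict


class BucketedIndex:
    """Hypothetical index: encodings grouped by predicted attributes."""

    def __init__(self, tolerance=0.6):
        self.tolerance = tolerance
        # attributes (e.g. ("female", "adult")) -> list of (label, encoding)
        self.buckets = defaultdict(list)

    def add(self, attributes, label, encoding):
        self.buckets[attributes].append((label, encoding))

    def search(self, attributes, encoding):
        """Scan only the matching bucket; return (labels, comparisons made)."""
        candidates = self.buckets.get(attributes, [])
        matches = [
            label
            for label, known in candidates
            if math.dist(known, encoding) <= self.tolerance
        ]
        return matches, len(candidates)


index = BucketedIndex()
index.add(("female", "adult"), "person_a", [0.0] * 128)
index.add(("female", "adult"), "person_b", [1.0] * 128)
index.add(("male", "child"), "person_c", [0.5] * 128)

# Only the single ("male", "child") encoding is compared, not all three.
matches, comparisons = index.search(("male", "child"), [0.5] * 128)
```

The saving depends entirely on how evenly the classifier splits the faces. A misclassified face also lands in the wrong bucket and is never found, which is the trade-off for the speed-up.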
Let's discuss the code.
Import the libraries that are used throughout the project.
import os
import config
from face_loading import loading_face
from face_encoding import get_face_encoding
from face_detection import get_face
from face_comparision import compare
from tqdm import tqdm
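Get the next cluster index from the saved clusters of encodings.
# Assumes each saved cluster is named by its integer index, with or without a file extension.
cluster_count = sorted(int(os.path.splitext(name)[0]) for name in os.listdir(config.cluster_path))
if len(cluster_count) > 0:
    # Continue numbering after the highest existing cluster index
    count = cluster_count[-1] + 1
else:
    count = 0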
I am not going to load all the clusters of encodings at once. Instead, I will compare each face encoding against the clusters one at a time. I have split image loading, face detection, face encoding, and face comparison into separate files, alongside utils and config files, so that I can optimize each function later and keep the code maintainable and scalable.
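Loading clusters one at a time can be sketched with a generator. The storage format is an assumption on my part: one pickle file per cluster, named by its integer index. The `iter_clusters` helper is hypothetical, not part of the repo.

```python
import os
import pickle


def cluster_index(file_name):
    """Integer index encoded in a cluster file name such as '3.pkl'."""
    return int(os.path.splitext(file_name)[0])


def iter_clusters(cluster_path):
    """Yield (cluster_id, encodings) lazily, one cluster file at a time."""
    # Sort numerically so that cluster 10 comes after cluster 2.
    for file_name in sorted(os.listdir(cluster_path), key=cluster_index):
        with open(os.path.join(cluster_path, file_name), "rb") as handle:
            yield cluster_index(file_name), pickle.load(handle)
```

Because it is a generator, a loop like `for cluster_id, encodings in iter_clusters(config.cluster_path)` keeps only the current cluster in memory and can stop early once a match is found.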
First, load the image.
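The helper itself is not reproduced here, so this is only a minimal sketch of what `loading_face` might look like, assuming it wraps `face_recognition.load_image_file`. The extension filter and the `None` return value are my own assumptions.

```python
import os

# Extensions treated as images; this set is an assumption for the sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def loading_face(image_path):
    """Return the image as an RGB array, or None for unsupported files."""
    extension = os.path.splitext(image_path)[1].lower()
    if extension not in IMAGE_EXTENSIONS:
        return None
    # Imported lazily so this module can be imported without dlib installed.
    import face_recognition

    return face_recognition.load_image_file(image_path)
```

Skipping non-image files up front means a gallery folder with stray documents or videos does not break the loop.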
Detect the faces in the image.
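For detection, a sketch of `get_face` could be a thin wrapper around `face_recognition.face_locations`. The wrapper body and the choice to expose the `model` argument are assumptions, not the repo's code.

```python
def get_face(image, model="hog"):
    """Return one (top, right, bottom, left) box per face found in the image."""
    # Imported lazily so this module can be imported without dlib installed.
    import face_recognition

    # "hog" is the faster CPU detector; "cnn" is more accurate but far
    # slower without a GPU.
    return face_recognition.face_locations(image, model=model)
```

An empty list means no face was detected, so the image can be skipped before any encoding work is done.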
Generate the 128-dimension encoding of the face.
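The encoding step might look like the sketch below, assuming `get_face_encoding` passes the boxes from the detection step to `face_recognition.face_encodings`. The function body is a guess at the helper, not the repo's code.

```python
def get_face_encoding(image, face_locations):
    """Return one 128-dimension encoding per face box, in the same order."""
    # Imported lazily so this module can be imported without dlib installed.
    import face_recognition

    # Passing the known boxes avoids running face detection a second time.
    return face_recognition.face_encodings(
        image, known_face_locations=face_locations
    )
```

Each returned encoding lines up with the box at the same position in `face_locations`, which makes it easy to crop or label the matching face later.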
The main file then compares each new face encoding against the encodings already stored in the clusters.
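Here is a minimal sketch of that comparison step, kept in memory and in pure Python for illustration. The names `compare` and `assign_to_cluster`, the dictionary of clusters, and the "first matching cluster wins" rule are assumptions. The distance check mirrors what `face_recognition.compare_faces` does: Euclidean distance against a tolerance that defaults to 0.6.

```python
import math

TOLERANCE = 0.6  # default tolerance of face_recognition.compare_faces


def compare(cluster_encodings, encoding, tolerance=TOLERANCE):
    """True if the encoding is within tolerance of any encoding in the cluster."""
    return any(
        math.dist(known, encoding) <= tolerance for known in cluster_encodings
    )


def assign_to_cluster(clusters, encoding, tolerance=TOLERANCE):
    """Add the encoding to the first matching cluster, else open a new one.

    `clusters` maps an integer cluster id to a list of encodings.
    Returns the id of the cluster the encoding ended up in.
    """
    for cluster_id, cluster_encodings in clusters.items():
        if compare(cluster_encodings, encoding, tolerance):
            cluster_encodings.append(encoding)
            return cluster_id
    new_id = max(clusters, default=-1) + 1
    clusters[new_id] = [encoding]
    return new_id
```

Lowering the tolerance makes the clusters stricter (fewer different people merged together) at the cost of splitting one person across several clusters.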
Finally, it's done. Now I can organize my image gallery 😅😅. You can check out my GitHub repo for the complete code. I will publish part 2 soon, in which I will integrate MediaPipe and implement face classification to speed up searching.