I'm Mark and this is my portfolio.
Hope you like it! :)
IDeepify
IDeepify is a full end-to-end, deep learning-based KYC (Know Your Customer) product. It consists of
multiple stages:
Face detection: faces are detected in the verification document (government ID) and
in a live selfie, cropped, and then sent to the face matching component.
Face matching: the two detected faces are passed through a set of fine-tuned,
pretrained facial embedding networks. Distance metrics computed between the resulting embeddings
are then fed into a classifier. This approach achieved 97% accuracy on the FaceScrub dataset.
ID localisation and segmentation: an attention-based segmentation network was trained
on a tiny collected dataset (fewer than 10 samples) that was augmented with synthetic data.
The network segments the verification ID from the background. The ID's four
corners are then extracted and used to apply a perspective transform that prepares the ID for
text extraction.
Text segmentation and extraction: the rectified ID is then passed into another attention-based
segmentation network that was trained entirely on synthetic data. The network segments the text
from the background to simplify the OCR step.
Arabic OCR: when this model was trained, no publicly available
Arabic OCR solution was robust enough for production use. This network was trained
entirely on synthetic data to extract Arabic letters and numbers. A final validity and
correction check was implemented to ensure the OCR results are consistent.
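To make the face matching stage more concrete, here is a minimal sketch of how distances between embeddings could be turned into classifier features. It is an illustration only, not the production code: the choice of metrics (cosine, Euclidean, Manhattan) and the function names are assumptions, and the embeddings are plain lists standing in for the outputs of the embedding networks.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_features(
    id_embeddings: Sequence[Sequence[float]],
    selfie_embeddings: Sequence[Sequence[float]],
) -> List[float]:
    """Build one flat feature vector for a (document face, selfie face) pair.

    Each argument holds one embedding per network (hypothetical layout).
    For every network, three distances are appended to the feature vector.
    """
    if len(id_embeddings) != len(selfie_embeddings):
        raise ValueError("expected one embedding per network for both faces")
    features: List[float] = []
    for emb_id, emb_selfie in zip(id_embeddings, selfie_embeddings):
        if len(emb_id) != len(emb_selfie):
            raise ValueError("embeddings from the same network must match in size")
        features.append(cosine_distance(emb_id, emb_selfie))
        features.append(euclidean_distance(emb_id, emb_selfie))
        features.append(manhattan_distance(emb_id, emb_selfie))
    return features
```

The resulting vector would then be passed to a binary "same person / different person" classifier. Letting a classifier weigh several metrics from several networks, rather than thresholding a single distance, is one plausible reason for this design.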
Below are some results from the steps described above.
This video shows real-time ID localisation and text segmentation across different angles,
backgrounds, and positions of the ID in the picture.
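The corner-based perspective transform from the ID localisation stage is what lets the pipeline handle the varied angles seen in the video. A minimal, dependency-free sketch of that step follows. It assumes the four corners are already known and ordered top-left, top-right, bottom-right, bottom-left; the helper names and the 856x540 output size are hypothetical. In practice a library routine such as OpenCV's cv2.getPerspectiveTransform followed by cv2.warpPerspective would normally do this work.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corners: system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def homography_from_corners(
    src: Sequence[Point], dst: Sequence[Point]
) -> List[List[float]]:
    """Return the 3x3 matrix H (with H[2][2] = 1) that maps src onto dst."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four point pairs are required")
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), and likewise for v.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear_system(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], point: Point) -> Point:
    """Map a single image point through H (projective division included)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical corners of a tilted ID, ordered TL, TR, BR, BL.
detected_corners = [(120.0, 80.0), (520.0, 110.0), (540.0, 400.0), (100.0, 360.0)]
# Assumed output rectangle, roughly the aspect ratio of a standard ID card.
target_corners = [(0.0, 0.0), (856.0, 0.0), (856.0, 540.0), (0.0, 540.0)]
H = homography_from_corners(detected_corners, target_corners)
```

Mapping every pixel of the photo through H (or, more commonly, mapping output pixels back through its inverse) produces the flat, front-facing ID image that the text segmentation network receives.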
Implementation details can be discussed upon request.