Superintelligent AI: Why Incode’s Proprietary AI Surpasses Humans in Detecting Identity Document Fraud.
- Why ML-based document verification is better
- Examples of fraud Incode catches with its Document Verification solution
- Summary
Why ML-based document verification is better
Detection of fraudulent identity documents is an extremely difficult task for human beings for several reasons:
- There are over 4,600 different document types globally, each with its own layout, security features, and level of protection. No human can memorize them all well enough to conduct fraud analysis effectively, especially in real time when document throughput is high.
- Many documents carry machine-readable elements, such as NFC chips, MRZs, QR codes, and barcodes, that a human cannot read or verify without specialized interpreter software.
- Customers photograph their documents under varied environmental conditions that degrade image quality, shift the document’s colors, and make it harder to read. The document itself may be damaged or dirty, captured from an extreme angle, or arbitrarily rotated in the image. All of this makes fraud detection harder and blurs the line between authentic and counterfeit documents. Humans are prone to mistakes when objectives are loosely defined.
- In our experiments, we showed a human expert the exact same set of identity document images twice, one month apart, and asked them to identify the counterfeits. The expert’s answers changed significantly between the two attempts. Even human experts are inconsistent in their decisions and cannot be fully trusted.
ML models, on the other hand, give the same result for the same image regardless of when it is requested, and they are specifically trained to recognize fraud even when the capture conditions or the condition of the document itself are far from ideal.
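To illustrate why machine-readable fields such as the MRZ call for software rather than human inspection, here is a minimal sketch of the check-digit rule that ICAO Doc 9303 defines for MRZ fields (weights 7, 3, 1 repeating; digits keep their value, A to Z map to 10 to 35, the filler `<` counts as 0; the sum is taken modulo 10). The function names are hypothetical and chosen for illustration only. This is not a description of Incode’s implementation.

```python
# Sketch of the ICAO Doc 9303 MRZ check-digit rule.
# Function names are illustrative assumptions, not part of any product API.

_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
_WEIGHTS = (7, 3, 1)


def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value."""
    if ch == "<":  # filler character counts as zero
        return 0
    value = _ALPHABET.find(ch)
    if value < 0:
        raise ValueError(f"invalid MRZ character: {ch!r}")
    return value


def mrz_check_digit(field: str) -> int:
    """Compute the check digit for one MRZ field."""
    total = sum(
        mrz_char_value(ch) * _WEIGHTS[i % 3] for i, ch in enumerate(field)
    )
    return total % 10


def field_is_consistent(field: str, check_digit: str) -> bool:
    """Return True if the printed check digit matches the computed one."""
    return check_digit.isascii() and check_digit.isdigit() and (
        int(check_digit) == mrz_check_digit(field)
    )


if __name__ == "__main__":
    # Document number from the ICAO specimen passport, printed check digit 6.
    print(field_is_consistent("L898902C3", "6"))
    # One altered character breaks the consistency check.
    print(field_is_consistent("L898902C8", "6"))
```

A mismatch between a field and its check digit is a cheap signal that the field was altered or misread. A matching check digit alone does not prove that a document is authentic.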
- Humans get tired and distracted. Research has shown that the average adult’s attention span is only around 67 seconds (Quantifying attention span across the lifespan).
Document verification is even more tiring because it requires multitasking: dozens of different security features must be checked simultaneously for fraud signals. And people are generally not good at multitasking (Multicosts of Multitasking).
ML models, by contrast, never get tired or distracted and never lose focus. They work accurately 24/7, and an ML model takes less than a second to analyze a whole document.
- Many signs of identity document fraud are subtle and easily overlooked by a human. When we train our models, we apply consensus labelling to our datasets: each document is reviewed by several fraud experts instead of a single one, so models trained on these datasets draw on the collective knowledge of all the experts and are less likely to miss signs of fraud. This strategy makes the models better than any individual human expert, as our internal benchmarks of our models against our own fraud experts show:
For the screen spoof detection task:
For the paper spoof detection task:
- Other types of fraud, such as digital tampering using face swapping or face morphing, are nearly impossible for a human expert to detect, while ML models can detect them.
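The consensus labelling mentioned above can be sketched as follows. The text does not say how the individual expert reviews are combined, so this example assumes a simple majority vote with a configurable agreement threshold. The function name, the label strings, and the threshold are all hypothetical.

```python
# Sketch of consensus labelling by majority vote.
# The aggregation rule and all names here are assumptions for illustration.
from collections import Counter
from typing import Optional, Sequence


def consensus_label(
    votes: Sequence[str], min_agreement: float = 0.5
) -> Optional[str]:
    """Return the label chosen by more than `min_agreement` of the reviewers.

    Returns None when no label reaches that share, so the document can be
    sent back for another review instead of entering the training set.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_agreement else None


if __name__ == "__main__":
    reviews = {
        "doc_001": ["fraud", "fraud", "genuine"],
        "doc_002": ["genuine", "genuine", "genuine"],
        "doc_003": ["fraud", "genuine"],  # tie, no consensus
    }
    for doc_id, votes in reviews.items():
        print(doc_id, consensus_label(votes))
```

Returning `None` on a tie keeps ambiguous documents out of the dataset until more reviewers weigh in, which is one plausible way to avoid training on a single expert’s mistake.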
Examples of fraud Incode catches with its Document Verification solution
Try these pictures with your current document verification provider and see if it can catch them.
1. Screen spoofs
Easy level:
The entire device and screen displaying the ID are fully visible.
Medium level:
The frame of the device screen is partially visible.
Hard level:
The borders of the screen are not visible at all.
2. Paper spoofs
Easy level:
The spoofed ID is on paper and has not been cut out.
Medium level:
The ID is poorly cut out, often with unusual borders.
Hard level:
The ID is cut out cleanly to resemble a real ID.
3. Document tampering
Easy level:
Physical alterations like a photo or text cutout placed directly on the document.
Medium level:
Digital cutouts of text are used, with the tampering being clearly visible.
Hard level:
Edits blend seamlessly with the ID, often made with advanced tools like Photoshop.
4. Fake documents
Easy level:
Obvious fakes, such as incorrect portrait features.
Medium level:
Indicators like non-handwritten signatures or unprofessional portrait images.
Hard level:
Subtle discrepancies are only detectable through the barcode structure.
5. Invalidated documents & documents with obstruction
Easy level:
Documents with punched holes or data obscured by objects like fingers.
Medium level:
IDs with “VOID” punched into the barcode area, blending into the document.
Hard level:
“VOID” text is even more subtle, resembling the surface of the document.
Summary
When it comes to ensuring robust and reliable document verification, the stakes are high. Fraud comes in countless forms, demanding a solution that can deliver top-tier accuracy 24/7, at scale, and in real time. Traditional methods simply can’t keep up. That’s where our cutting-edge ML-based solution shines: detecting fraud with precision, adapting to new challenges, and providing the performance your business can rely on without compromise.