Incode’s Technology
Frontier AI Lab for Fraud Prevention
Incode is advancing the state of the art in AI models that identify and combat fraud. By leveraging foundational models trained on unique global fraud datasets and capable of continuous learning, Incode not only stops today’s fraud but also evolves at the speed of emerging gen-AI fraud threats.
Challenges
Fraud is accelerating with the rise of generative AI: deepfakes, synthetic identities, and automated impersonation agents can now be created and deployed at unprecedented scale, producing adversarial attacks that evolve as fast as the models that generate them.
Solution
Incode develops foundational models for identity, document, and behavior analysis. These models are trained on proprietary datasets and designed to adapt in real time to adversarial fraud attempts, staying ahead of evolving AI fraud.
VLM, LLM, Agent
Multimodal and Agentic Models for Real-Time Fraud Adaptation
To interpret complex identity, document, and behavioral signals, Incode develops Vision-Language Models, Large Language Models, and reasoning agents that work across modalities to evaluate fraud patterns and support adaptive detection as new attacks emerge.
Identity Vision-Language Model (VLM)
Incode’s VLM is trained on global identity data, documents, templates, and synthetic fraud samples across 200+ regions. With few-shot learning, it adapts quickly to new attack types and unseen document formats. It analyzes visual and textual signals to detect tampering, synthetics, deepfakes, and altered documents with high precision.
What it powers:
Tamper & Synthetic Detection: Identifies deepfakes, swaps, edits, and synthetic visual content.
Document Intelligence: Classifies document types, performs OCR, and extracts structured fields.
Visual–Text Consistency Checking: Cross-verifies that images, text, and metadata align and are authentic.
Performance: Superior accuracy vs. traditional ML classifiers in production benchmarks and fraud simulations.
Key performance improvements:
- 67% fewer errors in fake ID detection
- +4.0% gain in document tampering accuracy
- 25% fewer errors in signature fraud detection
- ~80% faster training cycles with optimized VLM architecture
- 0% APCER on challenging liveness dataset
Fraud Large Language Model (LLM)
Incode’s Fraud LLM is trained on proprietary fraud datasets, identity metadata, behavioral sequences, device signals, and transactional flows. Through few-shot and transfer learning, it adapts rapidly to emerging fraud tactics and contextual manipulation attempts. It interprets complex, multi-source patterns in real time to uncover hidden anomalies and intent.
What it powers:
Anomaly & Pattern Detection: Detects unusual sequences, behaviors, and fraud tactics.
Risk Signal Extraction: Derives structured insights from messy, multi-source data.
Fraud-Intent Classification: Distinguishes benign user mistakes from coordinated fraud attempts.
Performance: Early internal testing shows significant gains over traditional rule-based and ML classifiers.
Reasoning Agents for Risk Intelligence
Reasoning Agents are trained on hundreds of signals and millions of labeled identity and fraud outcomes. Using reinforcement learning and client-specific histories, they refine decision boundaries over time. They orchestrate outputs from VLMs, LLMs, device telemetry, and behavioral insights into a unified, context-aware risk assessment.
What it powers:
Holistic Risk Decisions: Combine multi-modal signals to deliver real-time, context-aware outcomes.
Signal Coordination & Conflict Resolution: Weigh and balance signals when they disagree.
Adaptive Verification: Tailor decision logic to each client’s risk profile and tolerance.
Performance: Designed to minimize human intervention while reducing error rates. Early Risk AI Agent results show reduced fraud and higher approval rates for genuine users.
Mexico:
- FAR improves by ≈47.6%
- FRR improves by ≈60.3%
USA:
- FAR improves by ≈12.1%
- FRR improves by ≈19.1%
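The signal-fusion step described above can be sketched as a weighted logistic combination of per-model scores with a review band for uncertain sessions. The signal names, weights, and thresholds below are illustrative assumptions, not Incode’s actual agent logic:

```python
import math

def fuse_risk_signals(signals: dict[str, float], weights: dict[str, float],
                      bias: float = -2.0) -> float:
    """Combine per-model scores (each in 0..1) into one calibrated risk
    probability via a weighted logistic model. Names and weights are
    hypothetical placeholders."""
    z = bias + sum(weights[name] * score for name, score in signals.items())
    return 1.0 / (1.0 + math.exp(-z))

def decide(risk: float, approve_below: float = 0.2, reject_above: float = 0.8) -> str:
    # Route mid-band sessions to review instead of forcing a hard call.
    if risk < approve_below:
        return "approve"
    if risk > reject_above:
        return "reject"
    return "review"

signals = {"face_match": 0.05, "liveness": 0.10, "doc_tamper": 0.92, "device_risk": 0.60}
weights = {"face_match": 1.5, "liveness": 2.0, "doc_tamper": 3.0, "device_risk": 1.0}
risk = fuse_risk_signals(signals, weights)
decision = decide(risk)
```

In a real deployment the weights would be learned (e.g., via reinforcement learning on labeled outcomes, as described above) and the approve/reject thresholds tuned per client to trade FAR against FRR.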
Training data
Comprehensive Training Data
Incode’s multimodal and agentic models rely on a strong data foundation: global coverage, structured training pipelines, early exposure to high-fraud environments, and real-time network intelligence. The following sections outline how these data layers support accurate, secure, and adaptive AI model performance.
Data access
Extensive Data Coverage for Model Enhancement
Incode trains resilient models with data from government databases, enterprise integrations, and billions of global verifications. This scale provides the diversity and depth required to train models with broad regional coverage and resilience against a wide range of fraud tactics.
Global Coverage
200+
Countries
4,600+
Document types
Data at Scale
400M+
Unique Identities
4.1B+
Identity checks
Enterprise Coverage
700+
Enterprise clients
20+
Industries served
Source of Truth Data
670+
Connections to verified identity databases
15+
Biometric government source-of-truth connections
Fraud Data Infrastructure
Turning global data signals into real‑time fraud defense intelligence
Building on this global data access foundation, Incode uses structured labeling, synthetic data generation, and continuous stress-testing to turn raw signals into high-quality training data for fraud-resilient AI models.
Labeling Pipelines
Human and automated review across millions of records.
200+ human labelers creating training data and measuring AI performance for each customer.
Synthetic data
120+ synthetic data generation tools.
Generation of tampered documents, presentation attacks, and deepfakes to train models on rare threats.
Fraud Lab
Red-team environment to simulate and replay real-world attack scenarios.
Continuous stress-testing of models to strengthen defenses against new fraud tactics.
Access to the best training data for fraud prevention
Models trained with high-fraud regions data
We started in Latin America, one of the world’s highest-fraud regions, covering 66% of the adult population. This early exposure to complex, large-scale fraud hardened our models from the start.
Building on this foundation, Incode now serves 700+ enterprises worldwide, including 8 of the top 10 banks in North America and reaching about 65% of U.S. adults. The same models hardened in high-fraud markets are now deployed globally, supporting customers across industries and regions.
Because new fraud techniques often appear first in LATAM, our strong presence there gives us an early-warning advantage, helping models adapt faster and deliver stronger fraud prevention before threats spread to other regions.
Cross Organization Fraud Detection
Trust Graph
Trust Graph is Incode’s privacy-preserving identity intelligence network. By connecting and comparing billions of biometric, data, and document signals, it flags fraudulent anomaly patterns, first-party fraud, and previously identified fraudsters without exposing any PII.
This network effect not only improves fraud detection in real time but also enriches model training data, improving the density, diversity, and adaptiveness of Incode’s AI models and making them more accurate and resilient over time.
Incode’s Vector Face Database
This in-house technology powers Trust Graph by enabling instant search across hundreds of millions of identity embeddings generated by our recognition models. Optimized for speed, it delivers sub-20 ms response times with full recall while maintaining distributed reliability and in-memory performance.
Performance
Millisecond search at massive scale.
Reliability
Distributed with zone-level resilience and replication.
Flexibility
Customizable index structures, replication factors, and comparison functions.
Security
Encryption, retention controls, and full auditability.
Vector Face Database
Incode’s Proprietary Vector Face Database
Designed for storing and searching facial embeddings at scale, it supports both 1:1 verification (matching a probe to a claimed identity) and 1:N identification (comparing a probe against large galleries to find the best matches).
Performance
20–40 ms search times across hundreds of millions of vectors
Elastic Resilience
Vector indexing, sharding, parallel querying, and autoscaling
Efficiency
Caching reduces tail latency for frequent queries
Security
Built-in encryption, retention controls, and auditability
Identity Density
How Incode Measures Identity Confidence
Identity density expresses how confidently a user’s identity can be confirmed. Incode measures it by combining deterministic records with probabilistic AI signals, powered by our global data foundation, multimodal models, and Trust Graph intelligence.
Rich Identity Density Through Deterministic and Probabilistic Mapping
Layered Identity Verification
Deterministic mapping anchors identities with hard, verified facts such as biometric government sources or previously verified checks.
Probabilistic mapping uses AI-driven ML models that analyze patterns across face, document, device, and behavior to expand coverage and detect anomalies when direct records are limited.
Deterministic Sources
Incode’s Network
400M+ identities confirmed by Incode.
Biometric SOTs
15+ connections to biometric government sources of truth.
Probabilistic Sources
VLMs, LLMs, Intelligent Agents, and Incode’s deep learning face, document, and fraud-detection models expand coverage to identities not captured by deterministic data sources.
Together, deterministic and probabilistic sources help Incode create denser identity coverage by adding more datapoints, more signals, and greater certainty when verifying an identity.
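As a toy illustration of this layering, deterministic anchors could set a high confidence floor while each probabilistic model signal closes part of the remaining uncertainty. The formula and weights here are hypothetical, not Incode’s actual density metric:

```python
def identity_density(deterministic_hits: int, probabilistic_scores: list[float]) -> float:
    """Toy identity-density score in 0..1. A deterministic anchor (a verified
    government or network record) dominates; each probabilistic signal then
    closes part of the remaining uncertainty. Values are illustrative."""
    base = 0.9 if deterministic_hits > 0 else 0.0  # hard records anchor the identity
    remaining = 1.0 - base
    for s in probabilistic_scores:
        s = max(0.0, min(1.0, s))       # clamp model scores to valid range
        remaining *= (1.0 - s)          # each signal shrinks residual doubt
    return 1.0 - remaining
```

Under this sketch, a verified government match plus one strong model signal yields near-total confidence, while probabilistic signals alone still accumulate meaningful (but lower) density — matching the “more datapoints, more signals, greater certainty” framing above.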
ML Models
Deep Learning Models
Face Intelligence and Core Models
End-to-end face perception that detects faces, creates robust embeddings, and matches identities at scale via a vector engine, continuously improving through calibration and hard‑case mining.
3rd Party Validation
- NIST: #1-ranked technology for facial recognition
- 1:1 NIST Certified
- 1:N NIST Certified
- FIDO Face certification
- DHS RIVTD: Incode was one of only 3 vendors to meet all key benchmarks (FTXR <1%, FNMR <1%, FMR below 1:10,000).
Generic Face Detector
Detects and localizes faces in selfies and ID images, serving as the foundation for downstream tasks such as recognition, liveness, and document validation. The model is trained to handle varied image conditions, including rotation, occlusions, and non-human distractors. Evaluated on datasets covering selfies, IDs, rotated samples, negatives, and non-human inputs.
Face Recognition 1:1
Performs one-to-one biometric matching by comparing a live selfie against the portrait extracted from a government ID. The model is optimized to minimize both false accepts and false rejects under strict thresholds. Evaluated on a dataset of 5.8M+ selfie–ID pairs.
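The 1:1 comparison step can be sketched as cosine similarity between L2-normalized embeddings against a fixed operating threshold. The threshold value below is an illustrative assumption; in practice it is tuned on evaluation data to balance false accepts against false rejects:

```python
import numpy as np

def match_1to1(selfie_emb: np.ndarray, id_portrait_emb: np.ndarray,
               threshold: float = 0.62) -> bool:
    """1:1 verification sketch: cosine similarity of L2-normalized embeddings
    compared to an operating threshold (0.62 is illustrative, not tuned)."""
    a = selfie_emb / np.linalg.norm(selfie_emb)
    b = id_portrait_emb / np.linalg.norm(id_portrait_emb)
    return float(a @ b) >= threshold
```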
Face Recognition 1:N
Performs one-to-many biometric search by embedding a live selfie into a high-dimensional feature space and comparing it against a gallery of enrolled identities. The model is designed for scalability and efficiency, supporting large databases while maintaining strict accuracy thresholds. It minimizes false accepts and false rejects through optimized indexing and similarity scoring. Evaluated on a dataset of 5.8M+ selfies.
FaceDB (Vector Matching Engine)
A vector database and matching engine for facial templates. It enables 1:N identification and 1:1 authentication by converting faces into embeddings (high-dimensional vectors) and comparing them efficiently. Built as a C++ binary with an Elixir orchestration layer, FaceDB uses HNSW indexing for similarity search and supports both standalone and clustered deployments with autoscaling and index migration.
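To illustrate the embed-and-compare idea, the sketch below uses a brute-force gallery; an engine like the FaceDB described above replaces the exhaustive scan with an approximate-nearest-neighbor index (HNSW) to keep search fast at hundreds of millions of templates. All names here are hypothetical:

```python
import numpy as np

class TinyFaceGallery:
    """Brute-force stand-in for a vector face database, for illustration only.
    Production engines use ANN indexing (e.g., HNSW) instead of a full scan."""
    def __init__(self, dim: int):
        self.dim = dim
        self.ids: list[str] = []
        self.embs = np.empty((0, dim), dtype=np.float32)

    def enroll(self, identity_id: str, emb: np.ndarray) -> None:
        emb = emb / np.linalg.norm(emb)  # store L2-normalized templates
        self.ids.append(identity_id)
        self.embs = np.vstack([self.embs, emb.astype(np.float32)])

    def identify(self, probe: np.ndarray, k: int = 5) -> list[tuple[str, float]]:
        """1:N identification: return the top-k gallery matches by cosine
        similarity (a dot product, since all vectors are normalized)."""
        probe = probe / np.linalg.norm(probe)
        sims = self.embs @ probe
        top = np.argsort(-sims)[:k]
        return [(self.ids[i], float(sims[i])) for i in top]
```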
Liveness and Core Models
Multi-modal defenses that distinguish real users and physical IDs from spoofs and deepfakes using spatial, temporal, and device-aware signals with continual hard‑negative training.
3rd Party Validation
- First passive liveness technology to be certified in the market
- iBeta ISO/IEC 30107‑3 Presentation Attack Detection (PAD) Level 2 confirmation for passive liveness.
Face Liveness
Detects whether a selfie comes from a live human rather than a spoof (photo, screen replay, mask, or deepfake). Incode’s default passive liveness has been evaluated on a dataset of 150,000+ spoof attempts, covering replays, paper copies, 2D masks, and 3D masks.
Document Liveness
Determines whether an identity document presented to the camera is a genuine physical ID or a spoof (printed copy, photo, or screen replay). Paper ID Liveness v5.0 has been evaluated on a dataset of 34,000+ spoof attempts, covering paper copies and screen replays across diverse document types.
Deepfake Defense and Core Models
Multi-modal models that detect and block AI-generated fraud (deepfakes, face swaps, document injections, and synthetic identities) by combining pixel-level artifact analysis with generative-pattern inconsistency detection and cross-signal checks across face, liveness, document, barcode, and metadata inputs.
3rd Party Validation
- #2 in the ICCV 2025 DeepID Challenge, a benchmark focused on detecting Gen-AI generated identity documents.
- Ranked #1 in deepfake attack detection by Hochschule Darmstadt, outperforming commercial vendors and research labs
Deepfakes
Digital Liveness (Deepfake & Injection Detection): Detects whether a selfie has been synthetically generated, altered, or injected (e.g., face morphs, swaps, or AI-generated deepfakes). Evaluated on a dataset of 40,000+ digital spoof attempts.
Gen-AI Documents
Age Assurance and Core Models
Policy-ready age estimation that provides calibrated predictions with uncertainty bounds and fairness constraints, routing edge cases to secondary verification.
3rd Party Validation
- NIST: Each model ranked among the top 3 in the market for the lowest average MAE across all ages
- NIST: Fastest Response Time among age verification vendors for Age estimation
- ACCS accreditation under PAS 1296 (Age Check Certification Scheme, UKAS‑accredited)
Age Estimation
AI model that estimates a user’s age from a selfie, designed to enforce age-based compliance while minimizing bias across demographics, trained on a dataset of 200,000+ images.
Document Intelligence and Core Models
Document understanding that classifies type, extracts and validates OCR, MRZ, and barcodes, and detects tampering, fusing signals into a document authenticity score that adapts with active learning.
Document Type Classification
Determines the category and regional origin of an identity document by analyzing its visual layout, textual content, and structural patterns. The model leverages multimodal machine learning techniques to distinguish between document types and issuing authorities, even under varied formats and scan conditions. Evaluated on a large-scale dataset of diverse global identity documents.
Tamper Detection
Detects whether the portrait or key fields on an identity document have been digitally manipulated (e.g., replacement of the main photo, text-field alterations, or other digital edits). The model leverages pixel-level anomaly detection and cross-template consistency checks to identify tampering. Evaluated on a dataset of 8,200+ tampered ID samples.
ID Text Readability
Determines whether the text fields on an identity document are readable for automated data extraction. The model processes cropped ID images with a binary mask over text zones and classifies them into three categories: unreadable, no text fields of interest, readable. Evaluated on both test and production datasets, it shows significant improvements over earlier segmentation-based approaches.
ID Cropping (Web + Mobile)
Detects and crops identity documents from camera frames while also estimating text and barcode readability, as well as image quality factors such as blur and glare. Both models are designed to ensure that captured IDs meet readability standards for automated processing, across mobile and web environments.
Barcode Validation
Ensures that barcode data extracted from identity documents is correct, complete, and compliant before use downstream. The model performs multiple layers of validation:
- Symbology Checks: Confirm the barcode type (e.g., PDF417) and enforce structural rules.
- Decoding Integrity: Verify error-correction and checksums; perform re-encoding round-trip checks.
- Spec Compliance: Validate against standards (e.g., AAMVA), required fields, delimiters, and length constraints.
- Data Consistency: Cross-check fields against OCR/MRZ and validate logical values (DOB, expiration, issue date).
- Security & Anti-Tamper: Detect truncation, padding abuse, malformed segments, and flag anomalies or signature/hash mismatches.
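The spec-compliance and data-consistency layers above can be sketched as checks over decoded barcode fields. The element IDs follow the AAMVA convention for US driver’s licenses, but the required set and date logic below are a simplified, hypothetical subset:

```python
from datetime import datetime

# Illustrative subset of AAMVA DL/ID element IDs; a real validator covers many more.
REQUIRED_ELEMENTS = {"DAQ", "DBA", "DBB", "DBD"}  # ID number, expiry, DOB, issue date

def validate_aamva_fields(fields: dict[str, str]) -> list[str]:
    """Sketch of spec-compliance and consistency checks on decoded PDF417
    fields. Returns a list of issues (empty means the checks passed)."""
    issues: list[str] = []
    missing = REQUIRED_ELEMENTS - fields.keys()
    if missing:
        issues.append(f"missing required elements: {sorted(missing)}")
        return issues
    def parse(eid: str) -> datetime:
        # US-jurisdiction AAMVA dates use MMDDCCYY.
        return datetime.strptime(fields[eid], "%m%d%Y")
    dob, issued, expires = parse("DBB"), parse("DBD"), parse("DBA")
    if not (dob < issued < expires):
        issues.append("date fields are not logically ordered (DOB < issue < expiry)")
    return issues
```

A production validator would add the remaining layers from the list above: symbology and checksum verification on the raw barcode, full AAMVA field/length rules, and cross-checks against OCR and MRZ values.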
Fraud & Risk Defense
A real-time orchestration layer that combines model outputs with a trust graph to score and route risk decisions, optimizing thresholds through continuous feedback and counterfactual evaluation.
Risk AI agent
Integrates 100+ tabular features from face, document, liveness, and event signals to estimate the probability that a session is fraudulent. The model reduces the need for manual reviews while improving fraud detection and approval rates for legitimate users.
Trust Graph
A fraud defense layer that links users, devices, sessions, and documents to uncover coordinated attacks. It maintains a global fraud list and connects traces like faces, device hashes, and document numbers, flagging anomalies such as shared devices, repeated IDs, or deceptive personal details.
- Trace Types: Face, Face-on-ID, Device Hash, Document Number, Personal Number, Voter Number
- Fraud Reasons: Liveness Attacks, Document Tampering, Deceptive Info, Loan/Credit Abuse, Account Selling, Rewards Abuse, Circular Trading, False Invoicing
- Global Fraud List: Continuously updated across clients
Evasion Fraud
A model designed to block advanced fraud attempts that try to bypass liveness and face-recognition systems. It targets adversarial behaviors such as extreme expressions, partial or half-masks, occlusions, and other attempts to manipulate on-device capture. The model was evaluated on a dataset of 54,000 samples collected from production environments, representing real-world evasion scenarios.
Device and Behavior Intelligence
Machine learning models that analyze device, network, and user-interaction signals to assess session integrity and risk. Inputs include hardware and OS characteristics, network attributes, emulator or automation indicators, and behavioral telemetry such as typing cadence, swipe velocity, or cursor movement. The models identify anomalies, automation, and high-risk usage patterns.
Behavioral Model
Machine learning models that analyze patterns of human interaction to assess authenticity and risk. Inputs include typing cadence, keystroke dynamics, swipe velocity, scrolling behavior, and cursor movement. These models detect anomalies such as scripted activity, replayed interactions, or unusual usage patterns that may indicate fraud or automation.
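One simple heuristic of this kind, sketched with illustrative thresholds: scripted input tends to show near-uniform inter-keystroke timing, while human typing has natural jitter. Real behavioral models learn such patterns from data rather than using a fixed rule:

```python
from statistics import mean, pstdev

def looks_scripted(inter_key_ms: list[float], min_cv: float = 0.15) -> bool:
    """Flag keystroke sequences whose timing is suspiciously uniform.
    Uses the coefficient of variation (stdev/mean) of inter-key intervals;
    the 0.15 threshold is an illustrative assumption."""
    if len(inter_key_ms) < 5:
        return False  # too little data to judge
    m = mean(inter_key_ms)
    if m == 0:
        return True  # zero-delay input is a strong automation signal
    return pstdev(inter_key_ms) / m < min_cv
```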
Device Signal Model
Machine learning models that evaluate the integrity and risk profile of the device and network used in a session. Inputs include hardware and OS characteristics, browser and app attributes, IP and network indicators, emulator or automation flags, and sensor data. These models identify signs of compromised, emulated, or suspicious devices to prevent unauthorized access and fraud.
Governance
Comprehensive governance framework covering data practices, security, model development, fairness, and compliance to ensure responsible AI.
Data Practices
Purpose limited, minimized, encrypted data with regional compliance options.
Access & Security
Role-based controls, secure SDLC, HSM key management, and full audit logging.
Dataset Quality
Curated, balanced datasets with pseudonymization and continuous QA.
Model Development
Reproducible pipelines, versioned training, and performance-driven tuning.
Fairness & Bias
Bias testing across demographics with remediation and ongoing monitoring.
Deployment Controls
Staged rollouts, canary checks, kill-switches, and secure microservice deploys.
Monitoring & Feedback
Real-time dashboards, drift detection, and fraud-focused production alerts.
Retention & Deletion
Configurable retention, verified deletion, and GDPR/CCPA-aligned policies.
Incident & Continuity
24/7 monitoring, DR readiness, and fast response to emerging fraud vectors.
Compliance & Transparency
SOC2/ISO-certified, GDPR/CCPA/LGPD compliant, with a public Trust Center.
Data Practices
- Purpose-limited collection strictly for fraud prevention and security, with explicit user consent where required.
- Data minimization combined with configurable regional residency and jurisdictional compliance options.
- Encryption in transit and at rest using AES-256, with secure key management.
- Rigorous data curation: controlled collection timelines, balanced datasets across age, skin tone, device, spoof type, and environmental conditions.
Access & Security
- Role-based access with least privilege, supported by continuous monitoring and full audit logging.
- Secrets and key management via HSMs or customer-managed keys, with rotation policies.
- Secure SDLC practices: containerized environments, dependency scanning, peer reviews, and formal change controls.
- Encrypted model storage with obfuscated decryption keys embedded in codebase.
Dataset Quality
- Human-plus-automated labeling with three-way consensus protocols and QA sampling.
- Pseudonymization wherever possible, and version-controlled datasets with full lineage.
- Continuous inflow of “in-the-wild” production data to minimize bias and drift.
Model Development
- Isolated training environments with fully reproducible pipelines.
- Explicit versioning of data, code, packages, hyperparameters, and artifacts for reproducibility.
- Structured training protocols: algorithm selection, hyperparameter tuning, loss function design, and model optimization.
- Iterative performance evaluation with ROC curve analysis, ensuring thresholds tuned for zero-tolerance false positives in sensitive tasks.
Fairness & Bias
- Pre- and post-training bias analysis across demographics (age, sex, ethnicity) and document types.
- Slice testing to detect uneven error rates across cohorts, with remediation steps.
- Continuous re-evaluation of fairness as models evolve, supported by statistical testing and stakeholder feedback loops.
Deployment Controls
- Staged rollouts with canary testing, kill-switches, rollback options, and policy-driven promotion to production.
- Integration validation with statistical and functional tests prior to deployment.
- Secure containerized deployment across microservices and SDK environments.
Monitoring & Feedback
- Real-time dashboards for fraud-capture, accuracy, latency, and inference times.
- Automated drift detection for data and models, combined with analyst investigation workflows.
- Production monitoring tuned to flag anomalies, spoofing attempts, and unsupported documents.
Retention & Deletion
- Configurable retention windows defined by customer or regulation.
- Verified deletion workflows, with immutable audit trails confirming compliance.
- Automated enforcement of retention policies aligned to GDPR/CCPA.
Incident & Continuity
- 24/7 monitoring, severity-based SLAs, and structured post-incident reviews.
- Regional redundancy, regular backup validation, and disaster recovery drills.
- Fraud Lab integration ensures quick identification of new attack vectors and response playbooks.
Compliance & Transparency
- SOC 2 Type II, ISO/IEC 27001, and ISO/IEC 30107 certifications; participation in NIST FRTE/FATE and iBeta PAD testing.
- GDPR, CCPA, LGPD compliance with full DPA/DPIA processes.
- Subprocessor due diligence with a publicly available and regularly updated list.
- Trust Center with transparency resources: Privacy Policy, Security Whitepaper, certifications, and audit reports.
Our Governance Documentation