Document Verification

Document verification is a core component of identity verification: it extracts and validates information from government-issued identity documents. TrustGate combines OCR with AI-powered document classification to do this.

How It Works

1. User uploads document image/PDF

2. AI classifies document type

3. OCR extracts text and data fields

4. MRZ is parsed (for travel documents)

5. Extracted data is validated and image quality is checked

6. Results returned with confidence scores

Supported Document Types

Identity Documents

| Document Type | Countries | Data Extracted |
| --- | --- | --- |
| Passport | 190+ countries | Name, DOB, nationality, passport number, MRZ, expiry |
| National ID Card | 150+ countries | Name, DOB, ID number, address, expiry |
| Driver's License | USA, UK, EU, AU, CA | Name, DOB, license number, address, class, expiry |
| Residence Permit | EU, UK, US | Name, permit type, validity, restrictions |

Supporting Documents

| Document Type | Purpose | Data Extracted |
| --- | --- | --- |
| Utility Bill | Proof of address | Name, address, date |
| Bank Statement | Proof of address/funds | Name, address, account (partial) |
| Tax Document | Identity/address | Name, tax ID, address |

Uploading Documents

Via Dashboard

  1. Open an applicant's profile
  2. Click the Documents tab
  3. Click Upload Document
  4. Select document type
  5. Drag and drop or browse for file
  6. Wait for processing (typically 2-5 seconds)

Via API

# -F makes curl send multipart/form-data and set the Content-Type (with boundary) itself
curl -X POST https://api.bytrustgate.com/v1/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "applicant_id=550e8400-e29b-41d4-a716-446655440000" \
  -F "type=passport" \
  -F "file=@/path/to/passport.jpg"

Python Example

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.bytrustgate.com/v1"

# Open in binary mode; requests builds the multipart body from `files`
with open("passport.jpg", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/documents",
        headers={"Authorization": f"Bearer {API_KEY}"},
        data={
            "applicant_id": "550e8400-e29b-41d4-a716-446655440000",
            "type": "passport",
        },
        files={"file": f},
    )

response.raise_for_status()  # Surface HTTP errors before parsing the body
document = response.json()
print(f"Document ID: {document['id']}")
print(f"Status: {document['status']}")

JavaScript Example

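// Placeholders: replace with your own key and file input element
const API_KEY = "YOUR_API_KEY";
const BASE_URL = "https://api.bytrustgate.com/v1";
const fileInput = document.querySelector('input[type="file"]');

const formData = new FormData();
formData.append("applicant_id", "550e8400-e29b-41d4-a716-446655440000");
formData.append("type", "passport");
formData.append("file", fileInput.files[0]);

// Do not set Content-Type here; fetch adds the multipart boundary itself
const response = await fetch(`${BASE_URL}/documents`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${API_KEY}`,
  },
  body: formData,
});

// Named `doc` to avoid shadowing the browser's global `document`
const doc = await response.json();
console.log(`Document status: ${doc.status}`);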

OCR Extraction Results

Passport Example

{
  "id": "doc_123456",
  "type": "passport",
  "status": "verified",
  "extracted_data": {
    "document_type": "passport",
    "document_number": "AB1234567",
    "first_name": "JOHN",
    "last_name": "DOE",
    "full_name": "DOE, JOHN MICHAEL",
    "date_of_birth": "1990-05-15",
    "nationality": "USA",
    "sex": "M",
    "issue_date": "2020-03-01",
    "expiry_date": "2030-03-01",
    "issuing_country": "USA",
    "issuing_authority": "U.S. Department of State",
    "mrz": {
      "line1": "P<USADOE<<JOHN<MICHAEL<<<<<<<<<<<<<<<<<<<<<<<",
      "line2": "AB12345678USA9005157M3003017<<<<<<<<<<<<<<06",
      "valid": true,
      "check_digits_valid": true
    }
  },
  "confidence_scores": {
    "overall": 0.95,
    "name": 0.98,
    "date_of_birth": 0.97,
    "document_number": 0.99,
    "mrz": 0.96
  },
  "quality_checks": {
    "image_quality": "good",
    "blur_detected": false,
    "glare_detected": false,
    "all_fields_visible": true
  }
}

Driver's License Example

{
  "id": "doc_789012",
  "type": "drivers_license",
  "status": "verified",
  "extracted_data": {
    "document_type": "drivers_license",
    "document_number": "D1234567890",
    "first_name": "JANE",
    "last_name": "SMITH",
    "full_name": "JANE MARIE SMITH",
    "date_of_birth": "1985-08-22",
    "address": {
      "street": "123 Main Street",
      "city": "Los Angeles",
      "state": "CA",
      "postal_code": "90210"
    },
    "issue_date": "2022-01-15",
    "expiry_date": "2027-08-22",
    "class": "C",
    "restrictions": "CORRECTIVE LENSES",
    "issuing_state": "CA",
    "issuing_country": "USA"
  },
  "confidence_scores": {
    "overall": 0.92,
    "name": 0.97,
    "date_of_birth": 0.95,
    "address": 0.88
  }
}
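
One way to use the confidence scores in these responses is to send low-confidence fields to manual review. The sketch below assumes the response shape shown in the examples above; the function name and the 0.90 threshold are illustrative choices, not TrustGate defaults.

```python
def fields_needing_review(document: dict, threshold: float = 0.90) -> list[str]:
    """Return the extracted fields whose confidence falls below the threshold."""
    scores = document.get("confidence_scores", {})
    return sorted(
        field
        for field, score in scores.items()
        if field != "overall" and score < threshold  # "overall" is not a field
    )


# Confidence scores copied from the driver's license example above.
license_doc = {
    "confidence_scores": {
        "overall": 0.92,
        "name": 0.97,
        "date_of_birth": 0.95,
        "address": 0.88,
    }
}

print(fields_needing_review(license_doc))
```

With the sample scores, only the address falls below 0.90, so a reviewer would need to confirm just that field.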

MRZ Parsing

For travel documents (passports, visas), TrustGate parses the Machine Readable Zone (MRZ) to verify data integrity.

MRZ Structure

Line 1: P<USADOE<<JOHN<MICHAEL<<<<<<<<<<<<<<<<<<<<<<<
        │ │  │    │
        │ │  │    └── Given names (< separated)
        │ │  └── Surname
        │ └── Issuing country (ISO 3166-1 alpha-3)
        └── Document type (P = Passport)

Line 2: AB12345678USA9005157M3003017<<<<<<<<<<<<<<06
        │         │  │      ││      │             │
        │         │  │      ││      │             └── Check digits (personal number, composite)
        │         │  │      ││      └── Personal number
        │         │  │      │└── Expiry date (YYMMDD + check digit)
        │         │  │      └── Sex (M, F, or < for unspecified)
        │         │  └── DOB (YYMMDD + check digit)
        │         └── Nationality
        └── Document number (9 characters + check digit)
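
The field layout above can be sketched as fixed-position slicing. This is a minimal illustration that assumes the ICAO Doc 9303 TD3 passport format (two lines of 44 characters); the `PassportMrz` class and `parse_td3` function are hypothetical names, not part of the TrustGate API, and the sample lines are the ICAO specimen rather than real data.

```python
from dataclasses import dataclass


@dataclass
class PassportMrz:
    document_type: str
    issuing_country: str
    surname: str
    given_names: list[str]
    document_number: str
    nationality: str
    date_of_birth: str  # YYMMDD
    sex: str
    expiry_date: str  # YYMMDD
    personal_number: str


def parse_td3(line1: str, line2: str) -> PassportMrz:
    """Split a two-line, 44-character passport MRZ into its fields."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("TD3 MRZ lines must be exactly 44 characters")

    # Names start at position 6 of line 1: SURNAME<<GIVEN<NAMES, padded with '<'.
    surname, _, given = line1[5:].partition("<<")
    return PassportMrz(
        document_type=line1[0],
        issuing_country=line1[2:5],
        surname=surname.replace("<", " ").strip(),
        given_names=[name for name in given.split("<") if name],
        document_number=line2[0:9].rstrip("<"),  # position 10 is its check digit
        nationality=line2[10:13],
        date_of_birth=line2[13:19],
        sex=line2[20],
        expiry_date=line2[21:27],
        personal_number=line2[28:42].rstrip("<"),
    )


# ICAO Doc 9303 specimen data; line 1 is padded to 44 characters with '<'.
line1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
line2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"

mrz = parse_td3(line1, line2)
print(mrz.surname, mrz.given_names, mrz.document_number, mrz.expiry_date)
```

Dates stay in their raw YYMMDD form here because expanding the two-digit year needs a century rule, which is a separate design decision.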

MRZ Validation

We validate:

  • Check digits for document number, DOB, expiry
  • Composite check digit
  • Character format compliance
  • Data consistency with visual zone
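
For reference, the check-digit portion of this validation can be sketched with the standard ICAO Doc 9303 algorithm: each character is mapped to a value (digits as-is, A to Z as 10 to 35, `<` as 0), multiplied by the repeating weights 7, 3, 1, and summed modulo 10. The function names are illustrative and the sample line is the ICAO specimen; this is not a description of TrustGate's internal implementation.

```python
def mrz_check_digit(field: str) -> int:
    """Compute an ICAO 9303 check digit: weights 7, 3, 1 repeating, sum mod 10."""
    weights = (7, 3, 1)
    total = 0
    for index, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10  # A=10 ... Z=35
        elif char == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"Invalid MRZ character: {char!r}")
        total += value * weights[index % 3]
    return total % 10


def td3_line2_checks(line2: str) -> dict[str, bool]:
    """Verify the check digits on line 2 of a 44-character passport MRZ."""
    if len(line2) != 44:
        raise ValueError("TD3 MRZ line 2 must be exactly 44 characters")

    def matches(field: str, digit: str) -> bool:
        return "0" <= digit <= "9" and mrz_check_digit(field) == int(digit)

    # The composite digit covers positions 1-10, 14-20, and 22-43.
    composite_input = line2[0:10] + line2[13:20] + line2[21:43]
    return {
        "document_number": matches(line2[0:9], line2[9]),
        "date_of_birth": matches(line2[13:19], line2[19]),
        "expiry_date": matches(line2[21:27], line2[27]),
        "composite": matches(composite_input, line2[43]),
    }


# ICAO Doc 9303 specimen line 2.
specimen = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
print(td3_line2_checks(specimen))
```

Changing any single character in a protected field changes that field's check digit and the composite, which is what makes OCR misreads and simple edits detectable.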

AI Document Classification

If you don't know the document type in advance, set type to auto and TrustGate classifies the document automatically:

curl -X POST https://api.bytrustgate.com/v1/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "applicant_id=550e8400-e29b-41d4-a716-446655440000" \
  -F "type=auto" \
  -F "file=@/path/to/unknown_document.jpg"

The response includes the detected type:

{
  "id": "doc_345678",
  "detected_type": "national_id",
  "detection_confidence": 0.94,
  "status": "processing"
}
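
A caller will usually want to decide whether to trust the detected type. The sketch below accepts it only when it is one of the types the application allows and the confidence clears a threshold; the `accepted_type` helper and the 0.90 cutoff are assumptions for illustration, not documented TrustGate behavior.

```python
from typing import Optional


def accepted_type(
    response: dict,
    allowed_types: set[str],
    min_confidence: float = 0.90,
) -> Optional[str]:
    """Return the detected type if it is allowed and confident enough, else None."""
    detected = response.get("detected_type")
    confidence = response.get("detection_confidence", 0.0)
    if detected in allowed_types and confidence >= min_confidence:
        return detected
    return None


# Response copied from the classification example above.
response = {
    "id": "doc_345678",
    "detected_type": "national_id",
    "detection_confidence": 0.94,
    "status": "processing",
}

print(accepted_type(response, {"passport", "national_id"}))
print(accepted_type(response, {"passport"}))
```

When the helper returns None, a reasonable fallback is to ask the user to select the document type manually.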

Document Verification Statuses

| Status | Description |
| --- | --- |
| pending | Document uploaded, awaiting processing |
| processing | OCR/verification in progress |
| verified | Successfully verified, data extracted |
| rejected | Verification failed (see rejection reasons) |
| expired | Document has passed expiry date |
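
Because an upload can return while the document is still pending or processing, a client may need to wait for a final status. The sketch below polls through a caller-supplied function, since this page does not show a retrieval endpoint; treating verified, rejected, and expired as final is an assumption based on the status descriptions, and `wait_for_result` is a hypothetical helper.

```python
import time
from typing import Callable

# Assumption: these three statuses are final; "pending" and "processing" are not.
TERMINAL_STATUSES = {"verified", "rejected", "expired"}


def wait_for_result(
    fetch_document: Callable[[], dict],
    timeout: float = 30.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_document until it returns a final status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        document = fetch_document()
        if document["status"] in TERMINAL_STATUSES:
            return document
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Document still {document['status']} after {timeout}s")
        sleep(interval)


# Stand-in for a real API call: replays a fixed sequence of statuses.
replay = iter(["pending", "processing", "verified"])
result = wait_for_result(lambda: {"status": next(replay)}, sleep=lambda _: None)
print(result["status"])
```

Passing the fetcher and the sleep function as arguments keeps the loop testable without network access or real delays.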

Rejection Reasons

| Reason Code | Description | User Action |
| --- | --- | --- |
| document_expired | Document is past expiry date | Upload valid document |
| image_quality_low | Image too blurry or low resolution | Retake photo in better lighting |
| glare_detected | Light reflection obscuring text | Retake without flash/glare |
| partial_document | Document edges cut off | Capture entire document |
| wrong_document_type | Document doesn't match expected type | Upload correct document |
| tampering_suspected | Signs of digital manipulation | Upload unaltered document |
| unreadable | OCR couldn't extract required data | Upload clearer image |
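
The reason codes above lend themselves to a lookup table in client code. This sketch copies the user actions from the table; the `user_action_for` helper and its fallback message for unrecognized codes are assumptions added for illustration.

```python
# User actions copied from the rejection reasons table.
USER_ACTIONS = {
    "document_expired": "Upload valid document",
    "image_quality_low": "Retake photo in better lighting",
    "glare_detected": "Retake without flash/glare",
    "partial_document": "Capture entire document",
    "wrong_document_type": "Upload correct document",
    "tampering_suspected": "Upload unaltered document",
    "unreadable": "Upload clearer image",
}

# Assumed fallback so that a new or unknown code never produces an empty prompt.
DEFAULT_ACTION = "Upload the document again"


def user_action_for(reason_code: str) -> str:
    """Return the suggested user action for a rejection reason code."""
    return USER_ACTIONS.get(reason_code, DEFAULT_ACTION)


print(user_action_for("glare_detected"))
print(user_action_for("some_future_code"))
```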

Best Practices

Image Capture Guidelines

Provide these guidelines to users:

  1. Lighting: Use natural, even lighting
  2. Background: Place document on dark, contrasting surface
  3. Focus: Ensure document is in sharp focus
  4. Angle: Capture straight-on, not at an angle
  5. Completeness: Include all four corners
  6. Resolution: Minimum 300 DPI / 1000x1000 pixels
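
The resolution guideline can be checked before upload to save a round trip. The sketch below reads the pixel dimensions from a PNG header using only the standard library and interprets the guideline as "each side at least 1000 pixels", which is an assumption; JPEG and PDF inputs would need a different parser or an imaging library.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read width and height from the IHDR chunk at the start of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("Not a PNG file")
    # IHDR stores width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_minimum_size(data: bytes, min_side: int = 1000) -> bool:
    """Return True if both sides of the PNG are at least min_side pixels."""
    width, height = png_dimensions(data)
    return width >= min_side and height >= min_side


# Build just enough of a PNG header (1600x1200) to demonstrate the check.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1600, 1200)
print(meets_minimum_size(header))
```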

Handling Rejections

# create_document and notify_user are application-defined helpers
document = create_document(applicant_id, "passport", file)

if document["status"] == "rejected":
    reason = document["rejection_reason"]

    if reason == "image_quality_low":
        notify_user("Please upload a clearer photo of your passport")
    elif reason == "document_expired":
        notify_user("Your passport has expired. Please upload a valid passport")
    elif reason == "partial_document":
        notify_user("Please ensure all edges of the document are visible")
    else:
        # Generic prompt for the remaining reason codes
        notify_user("We couldn't verify your passport. Please upload it again")

Data Validation

Cross-check extracted data against applicant-provided data:

from datetime import date


def validate_document_data(applicant, document):
    """Return a list of mismatches between extracted and applicant-provided data."""
    extracted = document["extracted_data"]
    issues = []

    # Check name match (case-insensitive)
    if extracted["last_name"].upper() != applicant["last_name"].upper():
        issues.append("Name mismatch between document and application")

    # Check DOB match (assumes both values are ISO 8601 date strings)
    if extracted["date_of_birth"] != applicant["date_of_birth"]:
        issues.append("Date of birth mismatch")

    # Check document not expired; parse the ISO date before comparing
    if date.fromisoformat(extracted["expiry_date"]) < date.today():
        issues.append("Document has expired")

    return issues
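
A plain uppercase comparison will flag legitimate applicants whose names contain accents or hyphens, because documents often print names in unaccented capitals. The sketch below shows one simple normalization that could be applied before comparing; it is an approximation (official MRZ transliteration has its own rules, for example for Ü), and `normalize_name` and `names_match` are hypothetical helpers.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Uppercase, strip diacritics, and replace punctuation with single spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in without_marks.upper())
    return " ".join(cleaned.split())


def names_match(document_name: str, applicant_name: str) -> bool:
    """Compare two names after normalization."""
    return normalize_name(document_name) == normalize_name(applicant_name)


print(names_match("GARCIA LOPEZ", "García-López"))
print(names_match("DOE", "Smith"))
```

Mismatches that survive normalization are better routed to manual review than rejected outright, since transliteration varies between issuing countries.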

Security Considerations

Fraud Detection

TrustGate includes fraud detection signals:

  • Digital tampering: Detect edited/photoshopped documents
  • Copy detection: Identify photocopies vs. originals
  • Screen capture: Detect photos of screens
  • Template matching: Compare against known fraudulent templates

Data Protection

  • Documents are encrypted at rest (AES-256)
  • Transmitted over TLS 1.3
  • Original files stored in isolated, encrypted storage
  • Configurable retention periods
  • GDPR-compliant deletion

Next Steps