FORM EXTRACTION

XTRACT Powered by iCONECT

Automatically extract metadata from structured or unstructured files.

XTRACT facilitates auto-classification of inbound individual streamed or batched forms and documents, and automated extraction of selected metadata from structured or unstructured files. Optimize and streamline the business process for paper-to-meta-data capture, structured form templates or conversion to an export stream of your sales orders, intake forms, insurance claims, mortgages, invoices, or any other business-critical form data.

Intelligent Document Classification

iCONECT-XTRACT is an intelligent document classification and data capture platform built with the flexibility to extract meaningful data from forms and documents, no matter the format or how you receive them. iCONECT-XTRACT technology, architecture, features, and functions are based on award-winning iCONECT technology, used to manage evidence and documents in some of the world’s largest, most sensitive, and most complex legal cases.

    • Create customized templates for any form layout
    • Define metadata types, formats, and structure to minimize eyes-on-documents
    • Improve turnaround time and reduce errors
    • Use extracted data for customized feedback into applications and line-of-business processes
    • Identify the information you need without manual data entry
    • Eliminate time-consuming document sorting or scanning separator sheets with barcodes

iCONECT-XTRACT

A zero-footprint, browser-based application that can run on your servers behind your firewall, or in the cloud as a subscription-based service. In either model, you can be in production in days, rather than the weeks or months of setup required with outmoded or customized systems.

API Connectivity
Interaction with third-party systems can also be optimized using iCONECT’s RESTful APIs, which enable document capture for virtually any ERP, CRM, document management, and other back-office and personal productivity programs.

Metrics & Reporting
All activity and object progress are monitored and displayed in the intuitive XTRACT dashboard. Further, built-in metrics can track your team’s productivity, throughput, errors, changes, time, and even keystrokes.


Form Extraction

Ideal for known forms with known values and value locations. Each extracted value is added to the meta-data and linked with a ‘key-value pair’. Export meta-data as part of a workflow stream.

 

 

Unstructured Documents
Correspondence, receipts, invoices, email, statements, certificates, reports, etc.

Administrators can decide which fields to extract, including PII, phone numbers, addresses, dollar values, reference numbers, etc. Each extracted value is added to the meta-data and linked with a ‘key-value pair’. Export meta-data as part of a workflow stream.
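
As an illustration only (not XTRACT's actual implementation), field extraction from unstructured text into key-value pairs can be sketched with regular expressions. The field names, patterns, and sample text below are assumptions chosen for the example.

```python
import re

# Hypothetical field patterns; a real deployment would tune these per document type.
FIELD_PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "amount": re.compile(r"\$\d[\d,]*(?:\.\d{2})?"),
    "reference": re.compile(r"\bINV-\d+\b"),
}


def extract_fields(text):
    """Return {field_name: [matched strings]} for every pattern that matches."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            result[name] = matches
    return result


sample = "Invoice INV-20417: total due $1,250.00. Questions? Call 555-867-5309."
metadata = extract_fields(sample)
```

Each key in `metadata` is a field name and each value is the list of strings found for that field, which is one simple way to represent key-value pairs for export.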

 

Form ID Matching

Determines form and version based on form ID characteristics and known form identifiers.

 

 

 

Pixel Level Matching

Determines form and version based on matching to known forms at the pixel level. Offers multiple alternative form matches with confidence values.
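
The idea of ranking alternative form matches by confidence can be sketched as follows. This is a simplified illustration, assuming same-sized binary images and a plain pixel-agreement score; the template names and the scoring rule are hypothetical, not the product's algorithm.

```python
def match_confidence(page, template):
    """Fraction of pixels that agree between two same-sized binary images."""
    total = 0
    same = 0
    for page_row, tmpl_row in zip(page, template):
        for p, t in zip(page_row, tmpl_row):
            total += 1
            same += (p == t)
    return same / total if total else 0.0


def rank_templates(page, templates):
    """Return (name, confidence) pairs, best match first."""
    scored = [(name, match_confidence(page, tmpl)) for name, tmpl in templates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Tiny 3x4 "forms": 1 = dark pixel, 0 = light pixel.
TEMPLATES = {
    "claim_v1": [[1, 1, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1]],
    "claim_v2": [[1, 1, 1, 1], [1, 0, 0, 0], [1, 1, 1, 1]],
    "invoice": [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]],
}
scanned = [[1, 1, 1, 1], [1, 0, 0, 1], [1, 1, 0, 1]]
ranked = rank_templates(scanned, TEMPLATES)
```

The first entry of `ranked` is the most likely form and version; the remaining entries are the alternative matches with their confidence values.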

 

 

Template Builder

Create custom templates for any form with field-type variables and zonal matching to metadata fields. Post-OCR processes and confidence scores help determine when human review is required.
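
A minimal sketch of how confidence scores might drive human review, assuming each extracted field carries a value and a confidence between 0 and 1. The field names and the 0.85 threshold are assumptions for the example.

```python
def needs_review(fields, threshold=0.85):
    """Return the names of fields whose OCR confidence is below the threshold."""
    return [name for name, (_value, conf) in fields.items() if conf < threshold]


# Hypothetical OCR output: field name -> (value, confidence)
ocr_fields = {
    "policy_number": ("PN-48213", 0.97),
    "claimant_name": ("J. Srnith", 0.62),
    "claim_amount": ("$4,300.00", 0.91),
}
review_queue = needs_review(ocr_fields)
```

Fields that land in `review_queue` would be routed to a person; the rest can flow straight through.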

 

 

Automated De-speckle

Remove noise from the background to create cleaner text for OCR.
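
One simple way to picture de-speckling, assuming a binary image where 1 is a dark pixel, is to clear any dark pixel that has no dark neighbor. This is an illustrative sketch, not the filter XTRACT uses.

```python
def despeckle(image):
    """Clear dark pixels (1) that have no dark pixel among their 8 neighbors."""
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # copy so the input is not modified
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbor:
                cleaned[y][x] = 0
    return cleaned


noisy = [
    [1, 0, 0, 0, 0],  # isolated speck in the top-left corner
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
clean = despeckle(noisy)
```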

 

 

 

Automated Contrast/Illumination

Increase image contrast and even out illumination to remove dark areas of the image, creating cleaner text for OCR.

 

 

 

Automated Binarize

Convert a multicolor image (RGB or other) to a black-and-white monochrome image to create cleaner text for OCR.
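
Binarization can be sketched as a fixed-threshold conversion from RGB to black and white. The example assumes pixels are (r, g, b) tuples in the 0 to 255 range and uses a global threshold of 128; production systems often use adaptive thresholds instead.

```python
def binarize(rgb_image, threshold=128):
    """Map each (r, g, b) pixel to 1 (dark, likely ink) or 0 (light, background)."""
    binary = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            # ITU-R BT.601 luma weights approximate perceived brightness.
            luminance = 0.299 * r + 0.587 * g + 0.114 * b
            out_row.append(1 if luminance < threshold else 0)
        binary.append(out_row)
    return binary


page = [
    [(255, 255, 255), (20, 20, 20)],
    [(200, 180, 40), (90, 0, 0)],
]
mono = binarize(page)
```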

 

 

 

Resize/Rotate

Zoom images and normalize their orientation, including correcting mirrored images.

 

 

 

Automated Inversion

Invert text and remove any residual non-text objects (such as hole punches) to create cleaner text for OCR.

 

 

Dots-Per-Inch

Adjusts DPI settings and image type for optimization.

 

 

 

Automated De-skew

Right the form, including 90-degree rotations, so that OCR is performed on horizontal lines of text.

 

 

 

Blank Page Detection

Auto-detection of blank and minimal-pixel pages with one-click or automated removal.
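
Blank and minimal-pixel page detection can be illustrated with an ink-ratio check on a binary page. The 1% cutoff below is an assumed value for the example, not a documented XTRACT setting.

```python
def is_blank(binary_page, max_ink_ratio=0.01):
    """Treat a page as blank when dark pixels (1) are at most max_ink_ratio of all pixels."""
    total = sum(len(row) for row in binary_page)
    if total == 0:
        return True
    ink = sum(sum(row) for row in binary_page)
    return ink / total <= max_ink_ratio


# A 10 x 20 page with a single stray dark pixel (0.5% ink).
nearly_blank = [[0] * 20 for _ in range(10)]
nearly_blank[3][7] = 1

# The same size page with one fully dark row (10% ink).
written = [[0] * 20 for _ in range(10)]
written[5] = [1] * 20
```

Pages for which `is_blank` returns `True` are candidates for one-click or automated removal.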

 

 

 

Re-Pagination

Cross-reference the known page count and flag documents that need to be merged, split, or repaginated.

 

 

 

Automated Background Suppression

Identify the form and match it to a sample form to drop out the form background so that only user-input text is OCR’d. Handles black or grey boxes and lines around text, including automatic repair of characters that intersect with removed lines or text.

 

 

Multi-Engine Support

Select the appropriate OCR engine based on your document type, input stream, data complexity and budget. Create intake workflows for forms or documents that may include handwriting and standard fonts (such as a handwritten cheque) or identified forms. Settings can route images to one or several OCR engines.

 

Handwriting Support

XTRACT uses the latest in Handwriting Optical Character Recognition (OCR) technology to convert both cursive and printed block lettering to text characters complete with confidence scores at the word level. The resulting text characters then become a part of the XTRACT workstream.

 

 

Field-Level AI Modeling

Restrict, modify, or score word selection based on an AI model that helps normalize the output of a record. (NOTE: applies to free-text entry, descriptions, and sentences, not form data.)

 

 

Capitalization

Auto capitalization clean-up of OCR’d content to match pre-determined field values.

 

 

Character Type Governor

Identify numeric, currency, or month fields; OCR automatically adjusts the output of each field to suit its type.
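
A sketch of type-aware clean-up, assuming the common OCR confusions between look-alike letters and digits. The look-alike tables and the `govern` function are hypothetical and only cover numeric and month fields.

```python
# Hypothetical look-alike tables for common OCR confusions.
LETTER_TO_DIGIT = str.maketrans("OoIlSB", "001158")
DIGIT_TO_LETTER = str.maketrans("0158", "olsb")

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def govern(value, field_type):
    """Adjust an OCR'd value according to the declared type of its field."""
    if field_type == "numeric":
        # Letters that look like digits are read as digits.
        return value.translate(LETTER_TO_DIGIT)
    if field_type == "month":
        # Digits that look like letters are read as letters, then matched to a month.
        candidate = value.translate(DIGIT_TO_LETTER).lower()
        for month in MONTHS:
            if month.lower() == candidate:
                return month
    return value  # unknown type or no match: leave the value unchanged
```

For example, `govern("2O19", "numeric")` reads the letter O as a zero, while `govern("Ju1y", "month")` reads the digit 1 as a lowercase L and resolves the value to a month name.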

 

 

 

Optical Mark Recognition

Includes OMR, ideal for checkboxes, radio buttons and signature blocks. Links selections with pre-determined list values and auto-populates meta-data fields.

 

 

 

Text Validation To Known Data

Cross-reference OCR’d text to known lists (area codes, ZIP codes, city names, states, etc.) and quickly identify input errors.
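
Validation against a known list can be sketched with Python's standard `difflib` module, which suggests the closest known value when there is no exact match. The city list, status labels, and 0.8 similarity cutoff are assumptions for the example.

```python
import difflib

# Hypothetical reference list; in practice this could be ZIP codes, states, etc.
KNOWN_CITIES = ["Boston", "Chicago", "Houston", "Toronto"]


def validate(value, known_values, cutoff=0.8):
    """Return (status, suggestion) for an OCR'd value checked against a known list."""
    if value in known_values:
        return "valid", value
    close = difflib.get_close_matches(value, known_values, n=1, cutoff=cutoff)
    if close:
        return "suspect", close[0]  # likely OCR or input error, with a suggested fix
    return "unknown", None


exact = validate("Chicago", KNOWN_CITIES)
typo = validate("Chicaqo", KNOWN_CITIES)
missing = validate("Zzzz", KNOWN_CITIES)
```

A "suspect" result flags a probable input error and offers a correction, while "unknown" values can be sent for manual review.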

 

 

 

Text Validation To Known or Third-Party Data

Cross-reference OCR’d text against known lists, such as area codes or city names, or against secondary validation tables, such as a list of personal ID numbers, to quickly identify input errors.