Extraction Tables
An Extraction Table stores uploaded documents alongside the structured data extracted from them. Each row corresponds to one file.
When to use: You have PDFs, images, spreadsheets, audio, or video files and want specific fields pulled out into structured columns.
Key API operations:
- Upload a file → POST /projects/{name}/tables/{id}/files → returns row_id
- Start an AI run → POST /projects/{name}/tables/{id}/run → returns run_id
- Poll for results → GET /projects/{name}/tables/{id}/run/{run_id}
- Extract synchronously → POST /projects/{name}/tables/{id}/extract
Key settings: active_pipeline (flash vs pro), bounding_boxes (source-location highlighting), review_mode (human approval gate).
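The polling step of the asynchronous flow can be sketched as below. This is a minimal illustration, not the official client: the base URL, the `get_json` callable, and the `status` field with its `pending`/`running` values are all assumptions, since the response shape is not documented here. Only the endpoint paths come from the list above.

```python
import time
from typing import Callable

# Hypothetical host; the real base URL is not stated in this document.
BASE_URL = "https://api.example.com"


def table_url(project: str, table_id: str, suffix: str) -> str:
    """Build /projects/{name}/tables/{id}/<suffix> from the documented paths."""
    return f"{BASE_URL}/projects/{project}/tables/{table_id}/{suffix}"


def poll_run(
    get_json: Callable[[str], dict],
    project: str,
    table_id: str,
    run_id: str,
    interval: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Poll GET .../run/{run_id} until the run is no longer in progress.

    Assumption: the response is a JSON object with a "status" field, and
    "pending" / "running" mean the run has not finished yet.
    `get_json` is any function that performs the GET and returns parsed JSON.
    """
    url = table_url(project, table_id, f"run/{run_id}")
    for _ in range(max_attempts):
        body = get_json(url)
        if body.get("status") not in ("pending", "running"):
            return body
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish after {max_attempts} polls")
```

Passing the HTTP call in as `get_json` keeps the loop independent of any particular HTTP library and makes it easy to exercise with a stub before pointing it at a real table.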
Reconcile Tables
A Reconcile Table validates or matches rows of data using an AI agent. Each row is one reconciliation task: the agent compares your input against reference data and returns a structured result.
When to use: You’ve extracted invoice line items and want to match them against purchase orders, or you want to validate extracted fields against a known reference dataset.
Key API operations:
- Create a task (without running it) → POST /projects/{name}/tables/{id}/tasks → returns task_id
- Run reconciliation synchronously → POST /projects/{name}/tables/{id}/reconcile
- Use the async run pattern with row_id from a task
Request input (AgentJobInput): pass files (references to extraction table rows by UUID) and/or data (arbitrary JSON payload).
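As a rough illustration of the input described above, the helper below assembles a payload with `files` and/or `data`. The function name `build_agent_job_input` is hypothetical, and treating `files` as a flat list of UUID strings is an assumption; the exact AgentJobInput schema should be checked against the API reference.

```python
import uuid
from typing import Any, Optional, Sequence


def build_agent_job_input(
    files: Optional[Sequence[str]] = None,
    data: Optional[Any] = None,
) -> dict:
    """Assemble a payload with `files` and/or `data`.

    Assumption: `files` is a list of extraction-table row UUIDs given as
    strings, and `data` is any JSON-serializable value.
    """
    if not files and data is None:
        raise ValueError("provide files, data, or both")

    payload: dict = {}
    if files:
        # uuid.UUID raises ValueError on malformed input, so bad row
        # references are caught before any request is sent.
        payload["files"] = [str(uuid.UUID(f)) for f in files]
    if data is not None:
        payload["data"] = data
    return payload
```

For example, `build_agent_job_input(files=[row_id], data={"po_number": "PO-1001"})` would produce a body that could be sent to the tasks or reconcile endpoint, assuming those endpoints accept this shape.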
Storage Tables
A Storage Table holds reference data: CSVs or row-by-row JSON inserts. It acts as the lookup source for reconciliation agents.
When to use: You need to maintain a table of vendors, product codes, exchange rates, or any reference dataset that reconciliation agents query against.
Key API operations:
- Upload or overwrite a CSV → PUT /projects/{name}/tables/{id} (mode: overwrite or append)
- Insert rows as JSON → POST /projects/{name}/tables/{id}/data
- Read rows → GET /projects/{name}/tables/{id}/data
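The sketch below shows one way to turn CSV text into row objects and prepare the JSON insert request without sending it. Several details are assumptions: the base URL is a placeholder, the request body is taken to be a plain JSON array of row objects, and authentication is omitted because this document does not describe it.

```python
import csv
import io
import json
import urllib.request
from typing import Dict, List

# Hypothetical host; the real base URL is not stated in this document.
BASE_URL = "https://api.example.com"


def csv_to_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text (header row first) into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_insert_request(
    project: str, table_id: str, rows: List[Dict[str, str]]
) -> urllib.request.Request:
    """Prepare POST /projects/{name}/tables/{id}/data with the rows as JSON.

    Assumption: the endpoint accepts a JSON array of row objects.
    The request is only constructed here, never sent.
    """
    url = f"{BASE_URL}/projects/{project}/tables/{table_id}/data"
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Building the request separately from sending it makes the payload easy to inspect; once the real host and credentials are known, the request could be passed to `urllib.request.urlopen`.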
How they work together
The canonical pipeline uses all three table types in sequence:
- Storage Table: load your reference data (vendor list, product catalog, etc.)
- Extraction Table: upload documents and extract structured fields
- Reconcile Table: pass extracted rows into reconciliation, matched against the storage table
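To make the ordering concrete, the sequence above can be written out as a plan of endpoint calls. This is only a sketch: `plan_pipeline` and `Step` are hypothetical names, the table IDs are supplied by the caller, and the choice of the asynchronous upload, run, and poll route for extraction is one option among those listed earlier. The paths themselves are the ones documented in the sections above.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    description: str
    method: str
    path: str


def plan_pipeline(
    project: str, storage_id: str, extraction_id: str, reconcile_id: str
) -> List[Step]:
    """List the documented calls in pipeline order, without sending anything."""

    def path(table_id: str, suffix: str = "") -> str:
        base = f"/projects/{project}/tables/{table_id}"
        return f"{base}/{suffix}" if suffix else base

    return [
        # 1. Storage Table: load reference data
        Step("Upload reference CSV", "PUT", path(storage_id)),
        # 2. Extraction Table: upload, start a run, poll for results
        Step("Upload a document", "POST", path(extraction_id, "files")),
        Step("Start an AI run", "POST", path(extraction_id, "run")),
        # {run_id} stays a literal placeholder until a run has been started
        Step("Poll for results", "GET", path(extraction_id, "run/{run_id}")),
        # 3. Reconcile Table: match extracted rows against the reference data
        Step("Run reconciliation", "POST", path(reconcile_id, "reconcile")),
    ]
```

Printing each step of `plan_pipeline("demo", "vendors", "invoices", "matching")` gives a quick checklist of the calls a full run would make.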
Pipelines
Choose the right AI model for your extraction use case.
Async Run Pattern
The three-step upload → start → poll flow for extraction at scale.
