Every Cloudsquid project contains tables. Tables are typed — the type determines what operations are available and how data flows through them.

Extraction Tables

An Extraction Table stores uploaded documents alongside the structured data extracted from them. Each row corresponds to one file.

When to use: You have PDFs, images, spreadsheets, audio, or video files and want specific fields pulled out into structured columns.

Key API operations:
  • Upload a file → POST /projects/{name}/tables/{id}/files → returns row_id
  • Start an AI run → POST /projects/{name}/tables/{id}/run → returns run_id
  • Poll for results → GET /projects/{name}/tables/{id}/run/{run_id}
  • Extract synchronously → POST /projects/{name}/tables/{id}/extract
Settings: active_pipeline (flash vs pro), bounding_boxes (source-location highlighting), review_mode (human approval gate).
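As a minimal sketch, the upload and run calls above can be expressed as request builders. The base URL, project name, table ID, and filename below are placeholders for illustration, not real values:

```python
# Hypothetical base URL -- substitute your actual API host.
BASE = "https://api.example.invalid"

def upload_file_request(project: str, table_id: str, filename: str) -> dict:
    """Build the file-upload request; the response would carry a row_id."""
    return {
        "method": "POST",
        "url": f"{BASE}/projects/{project}/tables/{table_id}/files",
        "file": filename,
    }

def start_run_request(project: str, table_id: str) -> dict:
    """Build the run-start request; the response would carry a run_id."""
    return {
        "method": "POST",
        "url": f"{BASE}/projects/{project}/tables/{table_id}/run",
    }

req = upload_file_request("demo-project", "tbl-1", "invoice.pdf")
print(req["url"])
```

The same path templating applies to the poll and synchronous extract endpoints; only the final path segment changes.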

Reconcile Tables

A Reconcile Table validates or matches rows of data using an AI agent. Each row is one reconciliation task — the agent compares your input against reference data and returns a structured result.

When to use: You’ve extracted invoice line items and want to match them against purchase orders, or you want to validate extracted fields against a known reference dataset.

Key API operations:
  • Create a task (without running it) → POST /projects/{name}/tables/{id}/tasks → returns task_id
  • Run reconciliation synchronously → POST /projects/{name}/tables/{id}/reconcile
  • Use the async run pattern with row_id from a task
Input format: AgentJobInput — pass files (references to extraction table rows by UUID) and/or data (arbitrary JSON payload).
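As a sketch, an AgentJobInput body can be assembled from plain values. The field names `files` and `data` follow the description above; the UUID and payload here are invented for illustration:

```python
import json

def agent_job_input(file_row_ids=None, data=None) -> dict:
    """Assemble an AgentJobInput body: `files` holds extraction-table
    row UUIDs, `data` holds an arbitrary JSON payload."""
    body = {}
    if file_row_ids:
        body["files"] = list(file_row_ids)
    if data is not None:
        body["data"] = data
    return body

# Hypothetical row UUID and payload:
task_body = agent_job_input(
    file_row_ids=["3f1c2a9e-0000-4000-8000-000000000001"],
    data={"invoice_total": 1249.99, "po_number": "PO-7781"},
)
print(json.dumps(task_body, indent=2))
```

Either key may be omitted — a task can reference files only, pass data only, or combine both.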

Storage Tables

A Storage Table holds reference data — CSVs or row-by-row JSON inserts. It acts as the lookup source for reconciliation agents.

When to use: You need to maintain a table of vendors, product codes, exchange rates, or any reference dataset that reconciliation agents query against.

Key API operations:
  • Upload or overwrite a CSV → PUT /projects/{name}/tables/{id} (mode: overwrite or append)
  • Insert rows as JSON → POST /projects/{name}/tables/{id}/data
  • Read rows → GET /projects/{name}/tables/{id}/data
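For the CSV route, the request body can be prepared from plain dicts. A minimal sketch — the vendor rows are made up, and `mode` is passed as described above:

```python
import csv
import io

def csv_overwrite_body(rows: list[dict]) -> str:
    """Serialize rows to CSV, suitable as the body of the PUT call
    above (with mode set to overwrite or append)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Invented reference data for illustration:
vendors = [
    {"vendor_id": "V-001", "name": "Acme Corp"},
    {"vendor_id": "V-002", "name": "Globex"},
]
body = csv_overwrite_body(vendors)
print(body)
```

For small incremental updates, the JSON insert endpoint avoids re-serializing the whole table.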

How they work together

The canonical pipeline uses all three table types in sequence:
  1. Storage Table — load your reference data (vendor list, product catalog, etc.)
  2. Extraction Table — upload documents and extract structured fields
  3. Reconcile Table — pass extracted rows into reconciliation, matched against the storage table
This pattern covers the full lifecycle: ingest → extract → validate.
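The sequence above can be expressed as a small driver with each stage injected as a callable, which shows the flow without any real API calls — every stage function here is a stand-in:

```python
def run_pipeline(load_reference, upload_docs, extract, reconcile):
    """Drive the ingest → extract → validate lifecycle. Each argument
    stands in for the corresponding table's API calls."""
    load_reference()              # 1. Storage Table: load reference data
    row_ids = upload_docs()       # 2. Extraction Table: upload documents
    extracted = extract(row_ids)  #    ...and extract structured fields
    return reconcile(extracted)   # 3. Reconcile Table: match against storage

# Stand-in stages for illustration:
result = run_pipeline(
    load_reference=lambda: None,
    upload_docs=lambda: ["row-1", "row-2"],
    extract=lambda rows: [{"row": r, "total": 10.0} for r in rows],
    reconcile=lambda items: {"matched": len(items)},
)
print(result)  # {'matched': 2}
```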

Pipelines

Choose the right AI model for your extraction use case.

Async Run Pattern

The three-step upload → start → poll flow for extraction at scale.
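A generic sketch of that three-step flow, with the start and poll calls injected so the loop itself runs standalone — the status names and callables here are assumptions for illustration:

```python
import time

def run_until_complete(start, poll, interval=0.0, max_attempts=50):
    """Start a run, then poll until it reaches a terminal status.
    `start` returns a run_id; `poll(run_id)` returns a status payload.
    The terminal statuses assumed here are "completed" and "failed"."""
    run_id = start()
    for _ in range(max_attempts):
        payload = poll(run_id)
        if payload.get("status") in ("completed", "failed"):
            return payload
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish in time")

# Fake endpoints standing in for POST .../run and GET .../run/{run_id}:
states = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
outcome = run_until_complete(lambda: "run-1", lambda run_id: next(states))
print(outcome["status"])  # completed
```

In production, set a non-zero `interval` (ideally with backoff) so the poll loop does not hammer the API.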