Every Cloudsquid project contains tables. Tables are typed — the type determines what operations are available and how data flows through them.

Extraction Tables

An Extraction Table stores uploaded documents alongside the structured data extracted from them. Each row corresponds to one file.

When to use: You have PDFs, images, spreadsheets, audio, or video files and want specific fields pulled out into structured columns.

Key API operations:
  • Upload a file → POST /projects/{name}/tables/{id}/files → returns row_id
  • Start an AI run → POST /projects/{name}/tables/{id}/run → returns run_id
  • Poll for results → GET /projects/{name}/tables/{id}/run/{run_id}
  • Extract synchronously → POST /projects/{name}/tables/{id}/extract
Settings: active_pipeline (flash vs pro), bounding_boxes (source-location highlighting), review_mode (human approval gate).
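As a minimal sketch, the upload and run calls above can be expressed as request builders. The base URL, project name, table ID, and filename below are placeholders for illustration, not real values:

```python
# Hypothetical base URL -- substitute your actual API host.
BASE = "https://api.example.invalid"

def upload_file_request(project: str, table_id: str, filename: str) -> dict:
    """Build the file-upload request; the response would carry a row_id."""
    return {
        "method": "POST",
        "url": f"{BASE}/projects/{project}/tables/{table_id}/files",
        "file": filename,
    }

def start_run_request(project: str, table_id: str) -> dict:
    """Build the run-start request; the response would carry a run_id."""
    return {
        "method": "POST",
        "url": f"{BASE}/projects/{project}/tables/{table_id}/run",
    }

req = upload_file_request("demo-project", "tbl-1", "invoice.pdf")
print(req["url"])
```

The same path templating applies to the poll and synchronous extract endpoints; only the final path segment changes.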

Reconcile Tables

A Reconcile Table validates or matches rows of data using an AI agent. Each row is one reconciliation task — the agent compares your input against reference data and returns a structured result.

When to use: You’ve extracted invoice line items and want to match them against purchase orders, or you want to validate extracted fields against a known reference dataset.

Key API operations:
  • Create a task (without running it) → POST /projects/{name}/tables/{id}/tasks → returns task_id
  • Run reconciliation synchronously → POST /projects/{name}/tables/{id}/reconcile
  • Use the async run pattern with row_id from a task
Input format: AgentJobInput — pass files (references to extraction table rows by UUID) and/or data (arbitrary JSON payload).
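As a sketch, an AgentJobInput body can be assembled from plain values. The field names `files` and `data` follow the description above; the UUID and payload here are invented for illustration:

```python
import json

def agent_job_input(file_row_ids=None, data=None) -> dict:
    """Assemble an AgentJobInput body: `files` holds extraction-table
    row UUIDs, `data` holds an arbitrary JSON payload."""
    body = {}
    if file_row_ids:
        body["files"] = list(file_row_ids)
    if data is not None:
        body["data"] = data
    return body

# Hypothetical row UUID and payload:
task_body = agent_job_input(
    file_row_ids=["3f1c2a9e-0000-4000-8000-000000000001"],
    data={"invoice_total": 1249.99, "po_number": "PO-7781"},
)
print(json.dumps(task_body, indent=2))
```

Either key may be omitted — a task can reference files only, pass data only, or combine both.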

Storage Tables

A Storage Table holds reference data — CSVs or row-by-row JSON inserts. It acts as the lookup source for reconciliation agents.

When to use: You need to maintain a table of vendors, product codes, exchange rates, or any reference dataset that reconciliation agents query against.

Key API operations:
  • Upload or overwrite a CSV → PUT /projects/{name}/tables/{id} (mode: overwrite or append)
  • Insert rows as JSON → POST /projects/{name}/tables/{id}/data
  • Read rows → GET /projects/{name}/tables/{id}/data
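For the CSV route, the request body can be prepared from plain dicts. A minimal sketch — the vendor rows are made up, and `mode` is passed as described above:

```python
import csv
import io

def csv_overwrite_body(rows: list[dict]) -> str:
    """Serialize rows to CSV, suitable as the body of the PUT call
    above (with mode set to overwrite or append)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Invented reference data for illustration:
vendors = [
    {"vendor_id": "V-001", "name": "Acme Corp"},
    {"vendor_id": "V-002", "name": "Globex"},
]
body = csv_overwrite_body(vendors)
print(body)
```

For small incremental updates, the JSON insert endpoint avoids re-serializing the whole table.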

How they work together

The canonical pipeline uses all three table types in sequence:
  1. Storage Table — load your reference data (vendor list, product catalog, etc.)
  2. Extraction Table — upload documents and extract structured fields
  3. Reconcile Table — pass extracted rows into reconciliation, matched against the storage table
This pattern covers the full lifecycle: ingest → extract → validate.
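The sequence above can be expressed as a small driver with each stage injected as a callable, which shows the flow without any real API calls — every stage function here is a stand-in:

```python
def run_pipeline(load_reference, upload_docs, extract, reconcile):
    """Drive the ingest → extract → validate lifecycle. Each argument
    stands in for the corresponding table's API calls."""
    load_reference()              # 1. Storage Table: load reference data
    row_ids = upload_docs()       # 2. Extraction Table: upload documents
    extracted = extract(row_ids)  #    ...and extract structured fields
    return reconcile(extracted)   # 3. Reconcile Table: match against storage

# Stand-in stages for illustration:
result = run_pipeline(
    load_reference=lambda: None,
    upload_docs=lambda: ["row-1", "row-2"],
    extract=lambda rows: [{"row": r, "total": 10.0} for r in rows],
    reconcile=lambda items: {"matched": len(items)},
)
print(result)  # {'matched': 2}
```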

Pipelines

Choose the right AI model for your extraction use case.

Async Run Pattern

The three-step upload → start → poll flow for extraction at scale.
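A generic sketch of that three-step flow, with the start and poll calls injected so the loop itself runs standalone — the status names and callables here are assumptions for illustration:

```python
import time

def run_until_complete(start, poll, interval=0.0, max_attempts=50):
    """Start a run, then poll until it reaches a terminal status.
    `start` returns a run_id; `poll(run_id)` returns a status payload.
    The terminal statuses assumed here are "completed" and "failed"."""
    run_id = start()
    for _ in range(max_attempts):
        payload = poll(run_id)
        if payload.get("status") in ("completed", "failed"):
            return payload
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish in time")

# Fake endpoints standing in for POST .../run and GET .../run/{run_id}:
states = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
outcome = run_until_complete(lambda: "run-1", lambda run_id: next(states))
print(outcome["status"])  # completed
```

In production, set a non-zero `interval` (ideally with backoff) so the poll loop does not hammer the API.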