Schema Extraction
The extraction phase connects to your PostgreSQL database and exports its complete schema as a JSON representation. This JSON serves as the “current state” for schema comparison.How It Works
pgtofu queries PostgreSQL’s system catalogs directly to gather comprehensive schema information:Extraction Order
Objects are extracted in dependency order:- Schemas - Database namespaces
- Extensions - Installed extensions
- Custom Types - Enums, composites, domains
- Sequences - Serial/identity sources
- Tables - With columns, constraints, indexes
- Views - Regular and materialized
- Functions - All languages
- Triggers - Table triggers
- Hypertables - TimescaleDB (if installed)
- Continuous Aggregates - TimescaleDB (if installed)
What Gets Extracted
Tables
For each table, pgtofu extracts:Column Details
| Property | Description |
|---|---|
name | Column name |
data_type | PostgreSQL type name |
position | Ordinal position (1-based) |
is_nullable | Whether NULL is allowed |
default | Default value expression |
max_length | For VARCHAR/CHAR types |
precision, scale | For NUMERIC types |
is_array | Whether it’s an array type |
is_identity | GENERATED AS IDENTITY column |
identity_generation | ALWAYS or BY DEFAULT |
is_generated | Computed/generated column |
generation_expression | Expression for generated columns |
comment | Column comment |
Constraint Details
| Constraint Type | Extracted Properties |
|---|---|
| PRIMARY KEY | Columns |
| FOREIGN KEY | Columns, referenced table, ON DELETE/UPDATE |
| UNIQUE | Columns, deferrable settings |
| CHECK | Check expression |
| EXCLUDE | Columns, operators, where clause |
Index Details
| Property | Description |
|---|---|
name | Index name |
columns | Indexed columns |
include_columns | INCLUDE columns (covering index) |
type | btree, hash, gin, gist, etc. |
is_unique | Unique index |
where | Partial index condition |
definition | Full CREATE INDEX statement |
Functions
Views
TimescaleDB Objects
If TimescaleDB is installed, pgtofu also extracts:Schema Filtering
Automatically Excluded
pgtofu excludes system and internal schemas by default:| Category | Schemas |
|---|---|
| PostgreSQL | pg_catalog, information_schema, pg_toast |
| TimescaleDB | _timescaledb_*, timescaledb_information, timescaledb_internal |
| Hasura | hdb_catalog |
Manual Exclusion
Exclude additional schemas using--exclude-schema:
| Tool | Schemas |
|---|---|
| Prisma | _prisma, _prisma_migrations |
| Supabase | auth, storage, graphql_public, supabase_* |
| PostgREST | postgrest |
| pgAdmin | pgagent |
Type Normalization
pgtofu normalizes type names for consistent comparison:| PostgreSQL Representation | Normalized Form |
|---|---|
int4 | integer |
int8 | bigint |
bool | boolean |
varchar | character varying |
timestamp | timestamp without time zone |
timestamptz | timestamp with time zone |
Performance Considerations
Query Efficiency
- All extraction queries use indexed system catalog columns
- Single-pass extraction minimizes database round trips
- Typical extraction takes 5-30 seconds for most databases
Large Databases
For databases with 1000+ tables:- Consider extracting specific schemas
- Use output streaming (
--output -) - Extraction time may reach 60+ seconds
Troubleshooting
Permission errors
Permission errors
Ensure the user has SELECT access to system catalogs:
Missing objects
Missing objects
Objects in excluded schemas won’t appear. Check the exclusion list:
- System schemas (automatic)
--exclude-schemaflags
TimescaleDB not detected
TimescaleDB not detected
Ensure TimescaleDB extension is installed and accessible:
Output Verification
After extraction, verify the output:See Also
extractcommand - CLI reference- Declarative Migrations - Schema file format
- Schema Diffing - How schemas are compared