TheDocumentation Index
Fetch the complete documentation index at: https://pgtofu.com/llms.txt
Use this file to discover all available pages before exploring further.
extract command connects to a PostgreSQL database and exports its complete schema to a JSON file. This JSON representation serves as the “current state” for schema comparison.
Usage
Flags
| Flag | Short | Description | Default |
|---|---|---|---|
--database-url | PostgreSQL connection URL | $DATABASE_URL | |
--output | -o | Output file path (- for stdout) | schema.json |
--exclude-schema | Additional schemas to exclude (repeatable) | ||
--timeout | Maximum time allowed for database connection and schema extraction (0 disables) | 5m | |
--help | -h | Help for extract |
Examples
Basic Extraction
Output to stdout
Increase Extraction Timeout
extract has a 5 minute default timeout for its database operations. Increase it for large schemas:
--timeout 0 to disable pgtofu’s command timeout for extract.
Exclude Additional Schemas
Docker
Connection URL Format
The PostgreSQL connection URL follows this format:Examples
connect_timeout only controls connection establishment at the PostgreSQL driver level. Use --timeout to control pgtofu’s overall deadline for the database operations in extract, including connection checks and schema extraction.
SSL Modes
| Mode | Description |
|---|---|
disable | No SSL |
allow | Try non-SSL, fall back to SSL |
prefer | Try SSL, fall back to non-SSL (default) |
require | SSL required, no certificate verification |
verify-ca | SSL required, verify CA |
verify-full | SSL required, verify CA and hostname |
Schema Exclusion
Automatically Excluded Schemas
pgtofu automatically excludes system and internal schemas:| Category | Schemas |
|---|---|
| PostgreSQL | pg_catalog, information_schema, pg_toast |
| TimescaleDB | timescaledb_information, timescaledb_internal, _timescaledb_* |
| Hasura | hdb_catalog |
Excluding Third-Party Schemas
Use--exclude-schema to exclude schemas from third-party tools:
Output Format
The extracted schema is a JSON object containing all database objects:What Gets Extracted
Tables
Tables
- Column definitions (types, nullability, defaults)
- Primary keys and foreign keys
- Unique constraints and check constraints
- Exclusion constraints
- Indexes (including partial and covering indexes)
- Table comments
- Partition information (for partitioned tables)
Views
Views
- Regular views with definitions
- Materialized views with refresh settings
- View indexes (for materialized views)
- View comments
Functions & Triggers
Functions & Triggers
- Function signatures (name, arguments, return type)
- Function bodies (preserving original language)
- Volatility settings (STABLE, VOLATILE, IMMUTABLE)
- Security settings (DEFINER vs INVOKER)
- Triggers with timing and events
Types & Sequences
Types & Sequences
- Enum types with values
- Composite types with fields
- Domain types with constraints
- Sequences with start/increment values
TimescaleDB Objects
TimescaleDB Objects
- Hypertables with dimensions
- Compression policies and settings
- Retention policies
- Continuous aggregates
- Refresh policies
Performance Considerations
- Extraction queries the PostgreSQL system catalogs directly
- Large databases (1000+ tables) may take 30-60 seconds or longer, depending on catalog size, indexes, partitions, functions, and extensions
- pgtofu enriches each table with columns, constraints, indexes, and partition metadata, so large schemas may need a longer
--timeout - Consider using
--output -with streaming for very large schemas
Troubleshooting
Connection refused
Connection refused
Ensure the database is running and accessible:
Permission denied
Permission denied
The user needs read access to system catalogs. Grant if needed:
SSL errors
SSL errors
Try different SSL modes:
context deadline exceeded
context deadline exceeded
The default extraction timeout is 5 minutes. Increase it when extracting large schemas:If the error names a table or extraction step, check whether that object has unusually large catalog metadata, many indexes, many partitions, or blocking catalog locks. You can also run
psql "$DATABASE_URL" -c "SELECT * FROM pg_stat_activity WHERE state <> 'idle';" while extraction is running to look for waits.