parser Package

The parser package parses SQL files containing schema definitions into schema.Database objects.
import "github.com/accented-ai/pgtofu/internal/parser"

Parser

Main type for SQL parsing.
type Parser struct {
    options Options
}

func New(options Options) *Parser

Options

type Options struct {
    // Currently no public options
    // Reserved for future configuration
}

func DefaultOptions() Options

Parse Methods

Parse String

Parse a SQL string:
func (p *Parser) Parse(sql string) (*ParseResult, error)

Parse File

Parse a single SQL file:
func (p *Parser) ParseFile(path string) (*ParseResult, error)

Parse Directory

Parse all .sql files in a directory:
func (p *Parser) ParseDirectory(path string) (*ParseResult, error)

ParseResult

type ParseResult struct {
    Database *schema.Database
    Errors   []ParseError
    Warnings []string
}

Example Usage

import "github.com/accented-ai/pgtofu/internal/parser"

func parseSchemaFiles(directory string) (*schema.Database, error) {
    p := parser.New(parser.DefaultOptions())

    result, err := p.ParseDirectory(directory)
    if err != nil {
        return nil, fmt.Errorf("parsing directory: %w", err)
    }

    // Check for parse errors
    if len(result.Errors) > 0 {
        for _, e := range result.Errors {
            fmt.Printf("Error in %s: %s\n", e.File, e.Message)
        }
        return nil, fmt.Errorf("found %d parse errors", len(result.Errors))
    }

    // Check for warnings
    for _, w := range result.Warnings {
        fmt.Printf("Warning: %s\n", w)
    }

    return result.Database, nil
}

Supported Statements

CREATE TABLE

CREATE TABLE [IF NOT EXISTS] [schema.]name (
    column_definitions,
    table_constraints
) [PARTITION BY strategy (columns)];
Parsed elements:
  • Columns with types, defaults, constraints
  • Primary keys, foreign keys, unique, check constraints
  • Table partitioning (HASH, RANGE, LIST)

CREATE INDEX

CREATE [UNIQUE] INDEX [IF NOT EXISTS] name
ON [schema.]table [USING method] (columns)
[INCLUDE (columns)]
[WHERE condition];
Parsed elements:
  • Index name and table
  • Columns and expressions
  • Index type (btree, hash, gin, etc.)
  • INCLUDE columns
  • WHERE clause for partial indexes

CREATE VIEW

CREATE [OR REPLACE] VIEW [schema.]name AS
query
[WITH CHECK OPTION];

CREATE MATERIALIZED VIEW [IF NOT EXISTS] [schema.]name AS
query
[WITH [NO] DATA];

CREATE FUNCTION

CREATE [OR REPLACE] FUNCTION [schema.]name(args)
RETURNS return_type
[LANGUAGE lang]
AS $$
body
$$;
Supports:
  • All PostgreSQL languages (plpgsql, sql, c, etc.)
  • Dollar-quoted bodies
  • Function attributes (VOLATILE, STABLE, IMMUTABLE, etc.)

CREATE TRIGGER

CREATE TRIGGER name
{BEFORE | AFTER | INSTEAD OF} {INSERT | UPDATE | DELETE | TRUNCATE}
ON table
[FOR EACH ROW]
[WHEN (condition)]
EXECUTE FUNCTION function_name();

CREATE TYPE

-- Enum
CREATE TYPE name AS ENUM ('value1', 'value2', ...);

-- Composite
CREATE TYPE name AS (
    field1 type1,
    field2 type2
);

-- Domain
CREATE DOMAIN name AS base_type [constraint];

CREATE EXTENSION

CREATE EXTENSION [IF NOT EXISTS] name [WITH SCHEMA schema];

TimescaleDB Functions

SELECT create_hypertable('table', 'time_column', ...);
SELECT add_compression_policy('table', interval);
SELECT add_retention_policy('table', interval);

Lexer

The lexer converts SQL text into a stream of tokens:
type Token struct {
    Type  TokenType
    Value string
    Line  int
    Col   int
}

type TokenType int

const (
    TokenKeyword    TokenType = iota
    TokenIdentifier
    TokenNumber
    TokenString
    TokenOperator
    TokenPunctuation
    TokenComment
    TokenEOF
)

Lexer Features

  • PostgreSQL keyword recognition
  • Dollar-quoted strings ($$, $body$, etc.)
  • Quoted identifiers ("TableName")
  • Single and double-quoted strings
  • Line and block comments
  • Numeric literals

Statement Splitter

Splits a SQL string into its individual statements:
func Split(sql string) ([]string, error)
Handles:
  • Semicolon-separated statements
  • Dollar-quoted function bodies
  • Nested parentheses
  • Comments

Identifier Normalization

func NormalizeIdentifier(id string) string
PostgreSQL rules:
  • Unquoted identifiers are lowercased
  • Quoted identifiers preserve case
  • Schema-qualified names are split and normalized
Examples:
  • Users → users
  • "Users" → Users
  • public.users → public.users

Error Handling

ParseError

type ParseError struct {
    File    string
    Line    int
    Column  int
    Message string
    SQL     string // Problematic SQL fragment
}

func (e *ParseError) Error() string

Common Errors

Error                           Cause
"unexpected end of input"       Unclosed parenthesis or quote
"unknown statement type"        Unrecognized SQL statement
"invalid column definition"     Malformed column syntax
"duplicate constraint name"     Same constraint name used twice

See Also