Validation & Dry Run

Clinker provides two levels of pre-flight validation so you can catch problems before committing to a full run.

Config-only validation

clinker run pipeline.yaml --dry-run

This validates everything that can be checked without reading data:

YAML structure and required fields
CXL syntax and compile-time type checking
Schema compatibility between connected nodes
DAG wiring (no cycles, no dangling inputs, no missing nodes)
File path resolution (existence checks for inputs)

No records are read. No output files are created. The command exits with code 0 on success or code 1 with a diagnostic message on failure.

Use this after every YAML edit. It runs in milliseconds and catches the majority of configuration mistakes.

Record preview

clinker run pipeline.yaml --dry-run -n 10

This reads the first 10 records from each source and processes them through the full pipeline – transforms, aggregations, routing, and output formatting. Results are printed to stdout.

The record preview exercises the runtime evaluation path, catching issues that config-only validation cannot:

CXL expressions that are syntactically valid but fail at runtime (e.g., calling a string method on an integer)
Data format mismatches between the declared schema and actual file contents
Unexpected null values in required fields

Save preview to file

clinker run pipeline.yaml --dry-run -n 100 --dry-run-output preview.csv

The output format matches what the pipeline’s output node would produce, so preview.csv shows you exactly what the full run will write.

Recommended workflow

Use both validation levels in sequence before every production run:

--dry-run – catch configuration and type errors instantly.
--dry-run -n 10 – verify output shape and values against real data.
Full run – execute with confidence.

This three-step pattern is especially valuable when:

Editing CXL expressions in transform or aggregate nodes
Changing source schemas or swapping input files
Adding or removing nodes from the pipeline DAG
Modifying route conditions

Combining with explain

You can also inspect the execution plan before running:

clinker run pipeline.yaml --explain

This shows the DAG structure, parallelism strategy, and node ordering without reading any data. See Explain Plans for details.

The typical full pre-flight sequence is:

clinker run pipeline.yaml --explain          # inspect the DAG
clinker run pipeline.yaml --dry-run          # validate config
clinker run pipeline.yaml --dry-run -n 10    # preview with data
clinker run pipeline.yaml --force            # run for real