Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Pipeline Variables

Pipeline variables are constants defined in the YAML pipeline: header and accessible in CXL expressions. They provide a way to parameterize pipeline behavior without modifying CXL logic.

Defining variables

Variables are declared in the vars: block of the pipeline: section:

pipeline:
  name: order_processing
  vars:
    high_value_threshold: 500
    express_surcharge: 5.99
    report_title: "Monthly Orders"
    apply_discount: true
    tax_rate: 0.085

Variables are scalar values only: strings, numbers (integers and floats), and booleans. Complex types (arrays, objects) are not supported.

Accessing variables in CXL

Use the $vars.* namespace to reference variables in CXL expressions:

- type: transform
  name: classify_orders
  input: orders
  config:
    cxl: |
      emit order_id = order_id
      emit amount = amount
      emit is_high_value = amount > $vars.high_value_threshold
      emit surcharge = if shipping == "express" then $vars.express_surcharge else 0
      emit total = amount + surcharge

The $pipeline.* namespace also provides access to pipeline-level metadata.

Variable semantics

Variables are frozen at pipeline start and constant across all records. They cannot be modified during execution. This makes them suitable for:

  • Thresholds and cutoff values
  • Fixed surcharges, tax rates, and multipliers
  • Labels and titles for report headers
  • Feature flags that control conditional logic

Using variables with channels

Variables defined in pipeline.vars serve as defaults. The channel system can override these values per client or environment without modifying the pipeline YAML:

# Pipeline: pipeline.yaml
pipeline:
  name: invoice_processing
  vars:
    express_surcharge: 5.99
    late_fee: 25.00

# Channel: channels/acme-corp/channel.yaml
_channel:
  id: acme-corp
  name: "Acme Corp"
  active: true
variables:
  EXPRESS_SURCHARGE: "8.99"

When run with --channel acme-corp, the channel variable overrides the pipeline default.

Complete example

pipeline:
  name: sales_report
  vars:
    min_amount: 100
    commission_rate: 0.12
    region_label: "North America"
    include_pending: false

nodes:
  - type: source
    name: sales
    config:
      name: sales
      type: csv
      path: "./data/sales.csv"
      schema:
        - { name: sale_id, type: int }
        - { name: rep_name, type: string }
        - { name: amount, type: float }
        - { name: status, type: string }

  - type: transform
    name: compute
    input: sales
    config:
      cxl: |
        filter amount >= $vars.min_amount
        filter if $vars.include_pending then true else status == "closed"
        emit sale_id = sale_id
        emit rep_name = rep_name
        emit amount = amount
        emit commission = amount * $vars.commission_rate
        emit region = $vars.region_label

  - type: aggregate
    name: rep_totals
    input: compute
    config:
      group_by: [rep_name]
      cxl: |
        emit total_sales = sum(amount)
        emit total_commission = sum(commission)
        emit deal_count = count(*)

  - type: output
    name: report
    input: rep_totals
    config:
      name: report
      type: csv
      path: "./output/sales_report.csv"
      sort_order:
        - { field: "total_sales", order: desc }