Routing to Multiple Outputs
This recipe splits a stream of order records into separate output files based on business rules. High-value orders go to one file, standard orders to another.
Input data
orders.csv:
order_id,customer,amount,region
ORD-001,Acme Corp,15000,US
ORD-002,Globex,450,EU
ORD-003,Initech,8500,US
ORD-004,Umbrella,22000,APAC
ORD-005,Stark Ind,950,US
ORD-006,Wayne Ent,3200,EU
Pipeline
order_routing.yaml:
pipeline:
  name: order_routing
  vars:
    high_value_threshold: 5000
nodes:
  - type: source
    name: orders
    config:
      name: orders
      type: csv
      path: "./orders.csv"
      schema:
        - { name: order_id, type: string }
        - { name: customer, type: string }
        - { name: amount, type: float }
        - { name: region, type: string }
  - type: route
    name: split_by_value
    input: orders
    config:
      mode: exclusive
      conditions:
        high: "amount >= $vars.high_value_threshold"
      default: standard
  - type: output
    name: high_value_output
    input: split_by_value.high
    config:
      name: high_value_orders
      type: csv
      path: "./output/high_value.csv"
  - type: output
    name: standard_output
    input: split_by_value.standard
    config:
      name: standard_orders
      type: csv
      path: "./output/standard.csv"
Run it
clinker run order_routing.yaml --dry-run
clinker run order_routing.yaml
Expected output
output/high_value.csv:
order_id,customer,amount,region
ORD-001,Acme Corp,15000,US
ORD-003,Initech,8500,US
ORD-004,Umbrella,22000,APAC
output/standard.csv:
order_id,customer,amount,region
ORD-002,Globex,450,EU
ORD-005,Stark Ind,950,US
ORD-006,Wayne Ent,3200,EU
How routing works
Port syntax
Route nodes produce named output ports. Downstream nodes reference these ports using dot syntax: split_by_value.high and split_by_value.standard.
The port names come from two places:
- Condition names in the conditions map (here, high)
- The default field (here, standard)
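The naming rule above can be modeled in a few lines of Python. This is only an illustrative sketch: the `port_refs` helper and the dict layout are hypothetical stand-ins for the route config, not Clinker's internal types.

```python
# Hypothetical model of the route node's config from order_routing.yaml.
route_name = "split_by_value"
config = {
    "conditions": {"high": "amount >= $vars.high_value_threshold"},
    "default": "standard",
}


def port_refs(node_name, config):
    """Return the dotted references a downstream node could use as `input`."""
    # Condition names come first, in declaration order, then the default port.
    names = list(config["conditions"]) + [config["default"]]
    return [f"{node_name}.{name}" for name in names]


refs = port_refs(route_name, config)
print(refs)  # ['split_by_value.high', 'split_by_value.standard']
```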
Exclusive mode
With mode: exclusive, each record goes to exactly one branch. Conditions are evaluated top to bottom – the first matching condition wins, and the record is sent to that port. Records that match no condition go to the default port.
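To see why condition order matters, here is a minimal Python sketch of first-match-wins routing. It models the behavior described above rather than Clinker's implementation; the `route_exclusive` function and the `very_high` branch are hypothetical and exist only to show two overlapping conditions.

```python
def route_exclusive(record, conditions, default):
    """Return the name of the single port this record is sent to."""
    # Dicts preserve insertion order, so this walks conditions top to bottom.
    for port, predicate in conditions.items():
        if predicate(record):
            return port  # first match wins; later conditions are not checked
    return default


# Two overlapping conditions: every `very_high` order also satisfies `high`.
conditions = {
    "very_high": lambda r: r["amount"] >= 20000,
    "high": lambda r: r["amount"] >= 5000,
}

ports = {
    amount: route_exclusive({"amount": amount}, conditions, "standard")
    for amount in (22000, 8500, 450)
}
print(ports)  # {22000: 'very_high', 8500: 'high', 450: 'standard'}
```

If the two conditions were listed in the opposite order, the 22000 order would match `high` first and never reach `very_high`, so the more specific condition should come first.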
Pipeline variables
The threshold is defined in pipeline.vars and referenced in the CXL expression as $vars.high_value_threshold. This makes it easy to adjust the threshold without editing the route condition, and channel overrides can change it per environment.
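The effect of changing the variable can be illustrated with a small Python sketch. It treats the reference as plain text substitution, which is an assumption made for illustration: Clinker may bind variables differently internally. The `resolve_vars` helper and the override dict are hypothetical.

```python
import re


def resolve_vars(expression, variables):
    """Replace each `$vars.<name>` reference with the variable's value."""
    def substitute(match):
        return str(variables[match.group(1)])

    return re.sub(r"\$vars\.([A-Za-z_][A-Za-z0-9_]*)", substitute, expression)


condition = "amount >= $vars.high_value_threshold"
base_vars = {"high_value_threshold": 5000}
# Hypothetical per-environment value, standing in for a channel override.
override = {"high_value_threshold": 10000}

default_expr = resolve_vars(condition, base_vars)
overridden_expr = resolve_vars(condition, {**base_vars, **override})
print(default_expr)     # amount >= 5000
print(overridden_expr)  # amount >= 10000
```

The condition string itself never changes; only the value bound to the variable does.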
Variations
Multiple branches
Route nodes can have any number of named branches:
- type: route
  name: split_by_region
  input: orders
  config:
    mode: exclusive
    conditions:
      us: "region == \"US\""
      eu: "region == \"EU\""
      apac: "region == \"APAC\""
    default: other
- type: output
  name: us_output
  input: split_by_region.us
  config:
    name: us_orders
    type: csv
    path: "./output/us_orders.csv"
- type: output
  name: eu_output
  input: split_by_region.eu
  config:
    name: eu_orders
    type: csv
    path: "./output/eu_orders.csv"
# ... additional outputs for apac, other
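As a sanity check on which sample orders each branch would receive, the region split can be simulated in Python. This is a sketch of the routing logic only, not of Clinker itself; because every condition here is a plain equality test, the first-match rule reduces to a dictionary lookup. The `REGION_TO_PORT` mapping is a hypothetical restatement of the conditions.

```python
import csv
import io
from collections import defaultdict

# The sample data from orders.csv.
ORDERS_CSV = """order_id,customer,amount,region
ORD-001,Acme Corp,15000,US
ORD-002,Globex,450,EU
ORD-003,Initech,8500,US
ORD-004,Umbrella,22000,APAC
ORD-005,Stark Ind,950,US
ORD-006,Wayne Ent,3200,EU
"""

# Each condition is an equality test on region, so routing is a lookup.
REGION_TO_PORT = {"US": "us", "EU": "eu", "APAC": "apac"}

branches = defaultdict(list)
for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
    # Regions with no matching condition fall through to the default port.
    port = REGION_TO_PORT.get(row["region"], "other")
    branches[port].append(row["order_id"])

for port in ("us", "eu", "apac", "other"):
    print(port, branches.get(port, []))
```

With this sample data no record reaches the `other` port, since every order's region matches one of the three conditions.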
Transform before output
Insert a transform between the route and output to shape the data differently per branch:
- type: transform
  name: enrich_high_value
  input: split_by_value.high
  config:
    cxl: |
      emit order_id = order_id
      emit customer = customer
      emit amount = amount
      emit priority = "URGENT"
      emit review_required = true
- type: output
  name: high_value_output
  input: enrich_high_value
  config:
    name: high_value_orders
    type: csv
    path: "./output/high_value.csv"
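For readers who think in code, the per-record effect of the five emit statements can be sketched in Python. The sketch builds a record from the emitted fields only; whether Clinker also carries unlisted input fields such as region through to the output is not stated in this recipe, so treat that as an assumption. The `enrich_high_value` function is a hypothetical stand-in for the transform node.

```python
def enrich_high_value(record):
    """Build an output record from the five emitted fields."""
    return {
        "order_id": record["order_id"],
        "customer": record["customer"],
        "amount": record["amount"],
        "priority": "URGENT",     # constant added to every high-value order
        "review_required": True,  # constant flag
    }


order = {
    "order_id": "ORD-001",
    "customer": "Acme Corp",
    "amount": 15000.0,
    "region": "US",
}
enriched = enrich_high_value(order)
print(enriched)
```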
Combining routing with aggregation
Route first, then aggregate each branch independently:
- type: aggregate
  name: high_value_summary
  input: split_by_value.high
  config:
    group_by: [region]
    cxl: |
      emit total = sum(amount)
      emit count = count(*)
This produces a per-region summary of high-value orders only.
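To make the result concrete, the same grouping can be worked through in Python using the three high-value orders from the expected output above. This is a sketch of the arithmetic only; the exact layout of the file Clinker writes is not shown in this recipe.

```python
from collections import defaultdict

# The records that reach split_by_value.high with the 5000 threshold.
high_value_orders = [
    {"order_id": "ORD-001", "amount": 15000.0, "region": "US"},
    {"order_id": "ORD-003", "amount": 8500.0, "region": "US"},
    {"order_id": "ORD-004", "amount": 22000.0, "region": "APAC"},
]

# One accumulator per region, mirroring group_by: [region].
summary = defaultdict(lambda: {"total": 0.0, "count": 0})
for order in high_value_orders:
    group = summary[order["region"]]
    group["total"] += order["amount"]  # sum(amount)
    group["count"] += 1                # count(*)

for region, stats in summary.items():
    print(region, stats)
# US {'total': 23500.0, 'count': 2}
# APAC {'total': 22000.0, 'count': 1}
```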