Reconciliation Failure
Reconciliation failure occurs when two systems that should agree do not. It signals that data was lost, duplicated, or transformed incorrectly between stages, even when pipelines report success.
Common failure signals
- Row counts or totals differ between source and destination
- Aggregates reconcile at a high level but diverge by segment (region, product, customer)
- Late-arriving data causes "eventual" reconciliation that never fully converges
- Backfills correct one window but introduce drift in another
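The first two signals can be illustrated with a small check. This is a minimal sketch, not a prescribed implementation: the function names (totals_by_segment, reconcile), the column names (region, amount), and the sample rows are all hypothetical, and amounts are assumed to be integers (for example, cents) so that equality comparisons are exact. It shows a case where overall row counts and totals match while individual segments diverge.

```python
from collections import defaultdict


def totals_by_segment(rows, segment_key, amount_key):
    """Return {segment: (row_count, amount_sum)} for a list of dict rows."""
    acc = defaultdict(lambda: [0, 0])
    for row in rows:
        bucket = acc[row[segment_key]]
        bucket[0] += 1
        bucket[1] += row[amount_key]
    return {seg: (count, total) for seg, (count, total) in acc.items()}


def reconcile(source_rows, dest_rows, segment_key="region", amount_key="amount"):
    """Compare source and destination per segment; return only mismatched segments."""
    src = totals_by_segment(source_rows, segment_key, amount_key)
    dst = totals_by_segment(dest_rows, segment_key, amount_key)
    mismatches = {}
    for seg in sorted(set(src) | set(dst)):
        s = src.get(seg, (0, 0))  # a segment missing on one side counts as empty
        d = dst.get(seg, (0, 0))
        if s != d:
            mismatches[seg] = {"source": s, "destination": d}
    return mismatches


# Hypothetical sample data: grand totals agree, segments do not.
source = [
    {"region": "EU", "amount": 100},
    {"region": "EU", "amount": 50},
    {"region": "US", "amount": 150},
]
destination = [
    {"region": "EU", "amount": 100},
    {"region": "US", "amount": 150},
    {"region": "US", "amount": 50},  # row landed in the wrong segment
]

if __name__ == "__main__":
    for segment, sides in reconcile(source, destination).items():
        print(segment, sides)
```

Both sides here have 3 rows totaling 300, so a top-level check passes; only the per-segment comparison flags EU and US. Real pipelines with floating-point amounts would likely need a tolerance rather than strict equality.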
Often confused with
- Expected timing differences (freshness lag) -- reconciliation failure persists beyond normal lag
- Metric definition drift -- reconciliation compares like-for-like; drift changes "what is being measured"
- Sampling artifacts -- differences caused by comparing different samples, filters, or scopes; reconciliation checks should apply the same logic and scope to both sides
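One way to separate freshness lag from a genuine reconciliation failure is to compare only records older than the expected lag. The sketch below assumes a hypothetical helper (missing_beyond_lag), a source represented as a mapping of record id to creation time, and an illustrative 30-minute lag; none of these come from a specific tool.

```python
from datetime import datetime, timedelta


def missing_beyond_lag(source_events, dest_ids, now, expected_lag):
    """Return source ids absent from the destination and older than the expected lag."""
    cutoff = now - expected_lag
    return sorted(
        event_id
        for event_id, created_at in source_events.items()
        # Records newer than the cutoff may simply not have arrived yet.
        if created_at <= cutoff and event_id not in dest_ids
    )


# Hypothetical sample data.
now = datetime(2024, 1, 1, 12, 0)
source_events = {
    "a": datetime(2024, 1, 1, 9, 0),
    "b": datetime(2024, 1, 1, 10, 0),
    "c": datetime(2024, 1, 1, 11, 55),  # still inside the lag window
}
dest_ids = {"a"}

late = missing_beyond_lag(source_events, dest_ids, now, timedelta(minutes=30))
```

In this example only "b" is reported: "c" is also missing from the destination, but it is recent enough to be explained by normal lag.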
Where it shows up in Analytical Reliability
- Data Movement Reliability: gaps between stages (ingest -> transform -> serve) reveal silent loss/duplication
- Semantic Reliability: models compute correctly over data that no longer matches upstream truth
- Change Reliability: "safe" refactors or mapping edits break parity between systems