WhyDidItFail Base concepts

Freshness Failure

A freshness failure occurs when data arrives late, is missing, or is not updated at the expected cadence. Systems may appear healthy while users unknowingly make decisions based on stale or partially refreshed information.

Common failure signals

  • Reporting window is current but underlying partitions are lagging
  • Late-arriving data causes backfill churn and inconsistent totals
  • Upstream delays create partial refreshes that look "successful"
  • Timezone or watermark logic shifts the effective "as of" time
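
The first signal in the list above (a current reporting window over lagging partitions) can be checked mechanically. The following is a minimal sketch in Python, assuming each partition records the latest event timestamp it contains. The names `PartitionStatus` and `find_stale_partitions`, and the cadence and grace values, are hypothetical illustrations rather than part of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PartitionStatus:
    """Hypothetical record describing one partition of a dataset."""
    name: str
    max_event_time: datetime  # latest event in the partition, timezone-aware


def find_stale_partitions(partitions, as_of, expected_cadence, grace=timedelta(0)):
    """Return (name, lag) pairs for partitions lagging beyond cadence + grace.

    All timestamps must be timezone-aware so that the "as of" time does not
    silently shift with the server's local timezone.
    """
    if as_of.tzinfo is None:
        raise ValueError("as_of must be timezone-aware")
    threshold = expected_cadence + grace
    stale = []
    for partition in partitions:
        lag = as_of - partition.max_event_time
        if lag > threshold:
            stale.append((partition.name, lag))
    return stale


# Example: an hourly cadence with a 15-minute grace period (assumed values).
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
partitions = [
    PartitionStatus("orders", now - timedelta(minutes=30)),
    PartitionStatus("payments", now - timedelta(hours=5)),
]
stale = find_stale_partitions(
    partitions, now, timedelta(hours=1), grace=timedelta(minutes=15)
)
for name, lag in stale:
    print(f"{name} is stale, lag = {lag}")
```

Requiring timezone-aware timestamps is a deliberate choice here: it turns the timezone problem in the last signal into an explicit error instead of a quiet shift in the effective "as of" time.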

Often confused with

  • Pipeline failure (freshness failures can occur even when jobs succeed)
  • Performance issues (freshness is about timeliness, not the speed of a single query)
  • Data completeness (freshness failures may appear as completeness gaps)
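
The distinction from pipeline failure can be made concrete by evaluating job status and data age separately. This is a rough sketch, assuming a monitoring step has access to the job's exit status and the data's watermark. `RefreshState`, `classify_refresh`, and the `max_lag` threshold are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class RefreshState(Enum):
    PIPELINE_FAILURE = "pipeline_failure"
    FRESHNESS_FAILURE = "freshness_failure"
    HEALTHY = "healthy"


def classify_refresh(job_succeeded, data_watermark, as_of, max_lag):
    """Classify a refresh using job status and data age as separate inputs.

    A job that reports success can still leave stale data behind, so the
    watermark is checked even when job_succeeded is True.
    """
    if not job_succeeded:
        return RefreshState.PIPELINE_FAILURE
    if as_of - data_watermark > max_lag:
        return RefreshState.FRESHNESS_FAILURE
    return RefreshState.HEALTHY


# Example: the job succeeded, but the data is six hours old against a
# two-hour tolerance (assumed values).
as_of = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
state = classify_refresh(
    job_succeeded=True,
    data_watermark=as_of - timedelta(hours=6),
    as_of=as_of,
    max_lag=timedelta(hours=2),
)
print(state.value)
```

Checking only the job's exit status would report this refresh as healthy, which is the confusion the first item in the list above describes.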

Where it shows up in Analytical Reliability

  • Data Movement Reliability: missing partitions, late arrivals, stuck watermarks
  • Semantic Reliability: time-intelligence measures behave "wrong" because data is old
  • Execution Reliability: slow systems change user behavior and perceived freshness

Related concepts