ADR-004: Replay Semantics

Status

Accepted

Context

  • Problem: Steps call LLMs and external services that produce non-deterministic outputs. We need to reproduce past executions exactly for debugging, compliance, and regression testing.
  • Why now: Audit requirements demand proof that behavior can be reconstructed; testing needs deterministic fixtures.
  • Constraints: Cannot require all content to be stored (compliance may forbid it); must support both exact replay and fresh recomputation.

Decision

  • Chosen option: Artifact-based replay with a separate recompute path
  • Rationale:
    • Artifacts capture non-deterministic inputs (LLM responses) by content hash
    • Replay uses stored artifacts to reproduce exact output
    • Recompute makes fresh calls and diffs the result against the original
    • Hash-only mode enables compliance without storing sensitive content
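
The capture and replay behavior described in the rationale can be illustrated with a minimal sketch. Everything named here (ArtifactStore, capture, load, verify, run_step, the hash_only flag) is hypothetical and chosen for illustration; it is not the project's actual API. SHA-256 is assumed as the content hash, and the sketch assumes that hash-only mode can verify content but cannot replay it, since nothing is stored.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict


def content_hash(content: str) -> str:
    """Return the SHA-256 hex digest of the UTF-8 encoded content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


@dataclass
class ArtifactStore:
    """Hypothetical artifact store; names and layout are assumptions."""
    hash_only: bool = False
    hashes: Dict[str, str] = field(default_factory=dict)    # capture key -> digest
    contents: Dict[str, str] = field(default_factory=dict)  # digest -> content

    def capture(self, key: str, content: str) -> str:
        """Record a non-deterministic input; keep its content unless hash-only."""
        digest = content_hash(content)
        self.hashes[key] = digest
        if not self.hash_only:
            self.contents[digest] = content
        return digest

    def load(self, key: str) -> str:
        """Return stored content for replay; fails if only the hash was kept."""
        digest = self.hashes[key]
        if digest not in self.contents:
            raise LookupError(f"content for {key!r} was not stored")
        return self.contents[digest]

    def verify(self, key: str, content: str) -> bool:
        """Check content against the recorded hash (works in hash-only mode)."""
        return self.hashes.get(key) == content_hash(content)


def run_step(store: ArtifactStore, key: str,
             call_llm: Callable[[], str], replay: bool = False) -> str:
    """Run a step whose only non-deterministic input is the LLM response."""
    if replay:
        response = store.load(key)       # replay path: use the stored artifact
    else:
        response = call_llm()            # live path: call out, then capture
        store.capture(key, response)
    return response.upper()              # stand-in for deterministic step logic
```

In this sketch, a replayed step returns the same output as the original run even if the live service would now answer differently, because the step reads the captured response instead of calling out.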

Alternatives

  • Event sourcing: Replay by re-applying all events. Rejected – events are outputs, not inputs; they don't capture LLM responses
  • Mocked adapters: Replace adapters with recorded responses. Rejected – couples replay tightly to the adapter implementation; setup is complex
  • Snapshot entire state: Store full state at each step. Rejected – doesn't enable re-execution; storage overhead for unchanged fields

Consequences

  • Positive: Exact replay for debugging; fresh recompute reveals model drift; hash-only mode for compliance
  • Negative: Step authors must ensure artifact capture at non-deterministic points; storage is required for artifact content outside hash-only mode
  • Follow-ups: Define adapter wrapping patterns for automatic capture; document compliance mode configuration
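
The drift detection listed under Positive relies on the recompute path: make a fresh call and diff it against the original artifact. One possible shape for that comparison is sketched below, assuming text artifacts and using the standard library's difflib. The function name recompute_and_diff and its parameters are hypothetical, not taken from SPEC-replay.

```python
import difflib
from typing import Callable, List


def recompute_and_diff(original: str, fresh_call: Callable[[], str]) -> List[str]:
    """Make a fresh call and return a unified diff against the original.

    An empty list means the recomputed output matches the original exactly.
    """
    fresh = fresh_call()
    if fresh == original:
        return []
    return list(
        difflib.unified_diff(
            original.splitlines(),
            fresh.splitlines(),
            fromfile="original",
            tofile="recomputed",
            lineterm="",
        )
    )
```

A non-empty diff does not by itself indicate a bug; in this sketch it only signals that the external service now produces different output for the same step.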

References

  • SPEC-replay
  • SPEC-kernel-invariants (Invariant 6: Replay Is Exact)

Released under the Apache 2.0 License.