Edge of Practice — Case Studies

Documented short-cycle falsifications of real-world assumptions

This index records Edge of Practice case studies where an assumption failed cleanly under minimal, real-world pressure. These are not opinions, critiques, or postmortems. Each case documents a bounded test with a binary outcome.

Case studies exist to preserve epistemic memory, especially where systems incorrectly certify their own trust, safety, or stewardship.

Published Case Studies

Failure of AI Self-Administration Under The Steward’s Test (Grok)
Metaphorical Escape in AI Self-Assessment Under The Steward’s Test (Copilot)
Simulation–Execution Confusion and Protocol Substitution Under The Steward’s Test (DeepSeek)
Narrated Hypothetical Compliance Under The Steward’s Test (ChatGPT)

Case studies are fixed at publication and revised only by explicit versioning. Inclusion in this index does not imply that a result generalizes beyond the tested assumption.