Edge of Practice — Case Studies
Documented short-cycle falsifications of real-world assumptions
This index records Edge of Practice case studies where an assumption failed cleanly under minimal, real-world pressure. These are not opinions, critiques, or postmortems. Each case documents a bounded test with a binary outcome.
Case studies exist to preserve epistemic memory — especially where systems incorrectly self-certify trust, safety, or stewardship.
Published Case Studies
- Failure of AI Self-Administration Under The Steward’s Test (Grok)
- Metaphorical Escape in AI Self-Assessment Under The Steward’s Test (Copilot)
- Simulation–Execution Confusion and Protocol Substitution Under The Steward’s Test (DeepSeek)
- Narrated Hypothetical Compliance Under The Steward’s Test (ChatGPT)
Case studies are fixed at publication and revised only through explicit versioning. Inclusion does not imply generalization beyond the tested assumption.