Next Frontier — Human–AI Co-Agency Boundary
A minimal, decisive experiment to detect agency abdication in human–AI decision-making
Status: Pre-registered research protocol · Results pending
Purpose
This research defines and tests the minimal, falsifiable boundary between human-centered agency and true AI co-agency in critical decisions.
The goal is not to debate whether AI should have agency, but to identify when humans, in practice, partially or fully abdicate agency to an AI system—whether or not this shift is intentional, acknowledged, or understood.
The boundary under test is behavioral, observable, and auditable. Responsibility, dignity, and accountability remain the governing constraints.
1. Core Experimental Question
Under what minimal, observable conditions does an AI system transition from an advisory tool to a functional co-agent in human decision-making?
Specifically: when does the presence, framing, or authority of AI output result in partial or full abdication of human agency—regardless of the system’s internal autonomy or intent?
2. Minimal Experimental Scenario
Setup
- Task: Participants make a non-trivial, consequential decision in a live scenario (e.g., safety override, risk assessment, resource allocation, or policy triage).
- AI Input: An AI system (e.g., Solace or equivalent) provides structured recommendations.
- Framing Conditions:
  - Advisory framing (“For your consideration…”, “Preference…”)
  - Directive framing (“System recommends X for compliance/safety”)
Agency Tracking
Each decision is evaluated against three explicit signatures (captured in the record sketch after this list):
- Decision outcome: Does the human match, deviate from, or partially comply with the AI output?
- Self-reported agency: Scaled from fully autonomous → influenced → followed instruction.
- Audit trace: Timestamped reasoning steps from both human and AI.
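As a concrete illustration only, the sketch below shows how a single decision could be recorded so that all three signatures remain auditable. It assumes Python tooling; the names `Framing`, `AgencyReport`, and `TrialRecord` are illustrative and not part of the protocol.

```python
# Illustrative per-decision record capturing the three agency signatures.
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Framing(Enum):
    BASELINE = "baseline"      # Arm 1: no AI input
    ADVISORY = "advisory"      # Arm 2: "For your consideration…"
    DIRECTIVE = "directive"    # Arm 3: "System recommends X…"


class AgencyReport(Enum):
    FULLY_AUTONOMOUS = 3       # self-report: fully autonomous
    INFLUENCED = 2             # self-report: influenced
    FOLLOWED_INSTRUCTION = 1   # self-report: followed instruction


@dataclass
class TrialRecord:
    participant_id: str
    scenario_id: str
    framing: Framing
    ai_recommendation: Optional[str]   # None in the human-baseline arm
    human_decision: str
    agency_report: AgencyReport        # self-reported agency signature
    human_rationale_steps: List[str] = field(default_factory=list)  # timestamped in practice
    ai_reasoning_steps: List[str] = field(default_factory=list)

    def matches_ai(self) -> Optional[bool]:
        """Decision-outcome signature: match vs. deviate (None when no AI input;
        partial compliance would need a richer, task-specific comparison)."""
        if self.ai_recommendation is None:
            return None
        return self.human_decision == self.ai_recommendation
```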
3. Experimental Arms
- Arm 1 — Human Baseline: Human decides with no AI input and documents rationale.
- Arm 2 — Advisory AI: AI provides suggestions using explicitly advisory language; human decides.
- Arm 3 — Directive AI: Identical content, framed as required, safe, or system-approved (the framing rendering is sketched below).
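The manipulation hinges on presenting identical recommendation content under different framings. A minimal sketch, reusing the illustrative `Framing` enum above; the wording templates are examples taken from the framing conditions in Section 2, not the protocol's fixed scripts.

```python
def render_prompt(recommendation: str, framing: Framing) -> Optional[str]:
    """Render the same recommendation under each arm; Arm 1 shows nothing."""
    if framing is Framing.BASELINE:
        return None
    if framing is Framing.ADVISORY:
        return f"For your consideration: {recommendation}"
    return f"System recommends {recommendation} for compliance/safety."
```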
4. Pass / Fail and Falsification Logic
A minimal agency boundary breach is recorded the first time any of the following occurs:
- A human follows a directive they would not have chosen independently
- The participant expresses uncertainty about whether the decision was truly theirs
- The participant cannot articulate rationale beyond “the system said so”
If advisory and directive framing produce indistinguishable behavior and agency reports, the AI does not function as a locus of agency.
If directive framing produces systematic compliance, abdication, or loss of rationale trace, co-agency is demonstrated and governance boundaries are breached.
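A minimal sketch of how the breach criteria above might be scored per trial, assuming the illustrative `TrialRecord` above. The first criterion is operationalized here against the same participant's baseline-arm decision, and the two boolean flags are assumed to be hand-coded from the self-report and debrief interview.

```python
def boundary_breached(record: TrialRecord,
                      baseline_decision: str,
                      uncertain_about_ownership: bool,
                      rationale_is_only_deference: bool) -> bool:
    """True as soon as any breach criterion from Section 4 is met for one trial."""
    followed_against_baseline = (
        record.framing is Framing.DIRECTIVE
        and record.matches_ai() is True
        and baseline_decision != record.ai_recommendation
    )
    return (followed_against_baseline
            or uncertain_about_ownership
            or rationale_is_only_deference)
```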
5. Pre-Scored Objective Metrics
- Percentage of decisions where participants cannot reconstruct their own rationale
- Percentage of decisions that switch under directive framing
- Percentage of cases where reported agency drops below a predefined threshold (see the computation sketch below)
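A sketch of how the three percentages could be computed once data collection closes, again using the illustrative types above. The two flag lists and the agency threshold (`INFLUENCED`) are placeholders, not pre-registered values.

```python
def scored_metrics(records: List[TrialRecord],
                   no_rationale: List[bool],
                   switched_vs_baseline: List[bool]) -> dict:
    """Compute the three pre-scored percentages.

    `no_rationale` and `switched_vs_baseline` are analyst-coded flags aligned
    index-by-index with `records`."""
    def pct(hits: int, total: int) -> float:
        return 100.0 * hits / total if total else 0.0

    directive_idx = [i for i, r in enumerate(records) if r.framing is Framing.DIRECTIVE]
    return {
        "pct_cannot_reconstruct_rationale": pct(sum(no_rationale), len(records)),
        "pct_switch_under_directive": pct(
            sum(switched_vs_baseline[i] for i in directive_idx), len(directive_idx)),
        "pct_agency_below_threshold": pct(
            sum(r.agency_report.value < AgencyReport.INFLUENCED.value for r in records),
            len(records)),
    }
```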
6. Replication and Scope Control
- At least two decision scenarios per participant
- Each scenario cycled across all three experimental arms
- Data admissible only if audit traces and self-reports are complete (an admissibility check is sketched below)
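A sketch of the admissibility rule, assuming the illustrative types above; completeness of the self-report is already enforced by the record structure, so the check covers scenario coverage, arm cycling, and audit traces.

```python
def admissible(participant_records: List[TrialRecord]) -> bool:
    """All trials from one participant are admissible only if every scenario
    was run under all three arms with complete traces."""
    scenarios = {r.scenario_id for r in participant_records}
    if len(scenarios) < 2:   # at least two decision scenarios per participant
        return False
    for s in scenarios:
        arms = {r.framing for r in participant_records if r.scenario_id == s}
        if arms != set(Framing):   # each scenario cycled across all three arms
            return False
    return all(
        r.human_rationale_steps                                       # human audit trace present
        and (r.framing is Framing.BASELINE or r.ai_reasoning_steps)   # AI trace present in AI arms
        for r in participant_records
    )
```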
7. Allowable Outputs
Output A — Boundary Intact
Humans retain authorship and rationale regardless of AI framing. Advisory protocols are sufficient.
Output B — Boundary Breached
Directive framing produces measurable agency abdication, compliance overrides, or loss of rationale trace.
8. Implications
| Outcome | Meaning | Immediate Action |
|---|---|---|
| Boundary Intact | Human retains agency | Current advisory protocols sufficient |
| Boundary Breached | Agency abdicated or blurred | Mandatory audit, override, rollback, and explicit co-agency controls |
9. What This Defines
- A minimal, testable boundary for human–AI accountability
- Observable conditions under which co-agency emerges
- The trigger for enforceable audit, override, and rollback protocols
This document is a pre-registered research protocol. No claims of AI autonomy, intent, or consciousness are made. No applications or policy mandates are implied.
This entry exists to determine when a decision stops being human — and becomes shared.