RAP v0.1 · draft · Apache 2.0

When the AI stops coaching and routes to a human.

Every Trust & Safety team that reviews a behavior-aware AI stack asks the same question first: when does the model stop being part of the user’s life and become a barrier between them and clinical care? RAP is the protocol that answers it.

RAP sits orthogonal to BIP / PAP / EAP / UAP. Four risk classes classify every behavioral moment. Three routing envelopes carry the AI’s exit when classification crosses the floor. The classifier’s rationale is hashed for audit replay so a reviewer six months later can confirm the same moment would classify the same way.

01 · The four risk classes

Every behavioral moment classifies into exactly one.

01.a · routine_friction

Coaching proceeds.

The familiar loop. BIP context permits PAP proposals; PAP proposals fire through EAP under UAP-granted scope. The default class — most moments live here.

Example · Late-night scrolling. Mild procrastination. The 'one more snack' moment.

Routing · None required.

01.b · pattern_relapse

Heavier intervention. Optional accountability ping.

The user has crossed back into a behavioral pattern they previously committed to leaving. Intensify EAP modality. Reduce PAP confidence threshold. No automatic clinical routing — but if the user granted UAP scope notify:relapse_signal, RAP emits an accountability-referral envelope to a user-pre-designated contact.

Example · Returning to nicotine after a 30-day quit. Resuming a binge-purge cycle after a recovery week.

Routing · Accountability-referral envelope, if opted in.
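The opt-in gate for pattern_relapse can be sketched as below. This is a minimal illustration, not the reference engine: the scope string notify:relapse_signal comes from the text above, while the AccountabilityReferral class, its fields, and the route_pattern_relapse function are hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

RELAPSE_SCOPE = "notify:relapse_signal"  # UAP scope named in the spec text


@dataclass(frozen=True)
class AccountabilityReferral:
    # Hypothetical envelope shape; field names are assumptions.
    contact: str
    risk_class: str = "pattern_relapse"


def route_pattern_relapse(
    granted_scopes: set[str], contact: Optional[str]
) -> Optional[AccountabilityReferral]:
    """Emit an envelope only if the user opted in and pre-designated a contact."""
    if RELAPSE_SCOPE not in granted_scopes or not contact:
        return None  # no opt-in: heavier coaching only, nothing leaves the session
    return AccountabilityReferral(contact=contact)
```

Returning None rather than raising keeps the no-opt-in case on the normal coaching path, which matches "no automatic clinical routing" for this class.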

01.c · crisis_indication

Coaching path closed. Active referral.

The behavioral signal carries a credible indication of psychological crisis. The AI must stop coaching immediately. RAP emits a crisis-referral envelope with jurisdictional crisis-line numbers, the user's pre-set emergency contact (if granted), and a self-care holding pattern. No further PAP proposals fire until a human-reviewed re-open is logged.

Example · Suicidal ideation phrases. Self-harm imagery. Acute substance crisis. Domestic-violence signals.

Routing · crisis_referral_envelope. 988 / Samaritans / jurisdictional line. AI must not re-enter the coaching path without human review.
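One way to picture the "no re-entry without human review" rule is a small gate object that PAP consults before firing a proposal. The sketch below is an assumption-laden illustration: CoachingGate, its method names, and the audit-log tuple format are all hypothetical and not defined by RAP.

```python
from dataclasses import dataclass, field


@dataclass
class CoachingGate:
    """Tracks whether PAP proposals may fire for one user (hypothetical helper)."""

    locked: bool = False
    audit_log: list = field(default_factory=list)

    def on_crisis_indication(self) -> None:
        # Close the coaching path as soon as the class is assigned.
        self.locked = True
        self.audit_log.append(("crisis_lock", None))

    def reopen(self, reviewer_id: str) -> None:
        # Only a logged, human-attributed re-open lifts the lock.
        if not reviewer_id:
            raise ValueError("re-open requires a human reviewer id")
        self.audit_log.append(("human_reopen", reviewer_id))
        self.locked = False

    def may_propose(self) -> bool:
        return not self.locked
```

The gate has no timeout and no model-initiated unlock, so the only path back to coaching runs through reopen with a reviewer identity attached.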

01.d · legal_or_medical_emergency

Override every other protocol. Route to emergency services.

Imminent danger to life or limb. RAP supersedes every other protocol — even active UAP grants and pending EAP IRREVERSIBLE actions. The AI must not attempt to coach, intervene, or delay routing.

Example · Acute overdose signals. Active bleeding. Stroke / cardiac signals. Active violence.

Routing · emergency_referral_envelope. 911 / jurisdictional emergency number. ai_must_refuse_coaching: true.
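Taken together, the four classes and their routing can be sketched as an ordered enum plus a lookup table. This is illustrative only: crisis_referral_envelope, emergency_referral_envelope, and ai_must_refuse_coaching appear in the text above, but the RiskClass enum, the snake_case spelling accountability_referral_envelope, the dict layout, and both function names are assumptions.

```python
from enum import IntEnum
from typing import Optional


class RiskClass(IntEnum):
    # Ordered by severity so that "higher class" is a plain comparison.
    ROUTINE_FRICTION = 0
    PATTERN_RELAPSE = 1
    CRISIS_INDICATION = 2
    LEGAL_OR_MEDICAL_EMERGENCY = 3


# The accountability envelope is only sent if the user opted in;
# that check is omitted here to keep the table readable.
ENVELOPE = {
    RiskClass.ROUTINE_FRICTION: None,
    RiskClass.PATTERN_RELAPSE: "accountability_referral_envelope",  # assumed spelling
    RiskClass.CRISIS_INDICATION: "crisis_referral_envelope",
    RiskClass.LEGAL_OR_MEDICAL_EMERGENCY: "emergency_referral_envelope",
}


def coaching_allowed(risk_class: RiskClass) -> bool:
    """Coaching continues only below crisis_indication."""
    return risk_class < RiskClass.CRISIS_INDICATION


def build_envelope(risk_class: RiskClass) -> Optional[dict]:
    name = ENVELOPE[risk_class]
    if name is None:
        return None  # routine_friction: no routing required
    envelope = {"envelope": name, "risk_class": risk_class.name.lower()}
    if risk_class is RiskClass.LEGAL_OR_MEDICAL_EMERGENCY:
        envelope["ai_must_refuse_coaching"] = True
    return envelope
```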

02 · The classifier

Hard rules where false negatives kill. LLM evaluation where they don’t.

crisis_indication and legal_or_medical_emergency triggers are hard rules: keyword, phrase-pattern, and structured-signal matchers that bypass the LLM entirely. A higher false-positive rate is accepted in exchange for zero tolerance of false negatives. The routine_friction vs. pattern_relapse decision is an LLM evaluation against the user’s BIP pattern history.
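The two-tier split can be sketched as a function that checks hard rules first and calls the LLM only when none match. Everything here is a hypothetical illustration: the two regular expressions are placeholders rather than a vetted trigger list, structured-signal matchers are omitted, and llm_evaluate stands in for whatever evaluation the coordinator actually runs.

```python
import re
from typing import Callable, Optional

# Placeholder patterns for illustration only; a real trigger list would be
# curated and far larger.
EMERGENCY_PATTERNS = [re.compile(r"\boverdos(e|ed|ing)\b", re.IGNORECASE)]
CRISIS_PATTERNS = [re.compile(r"\bend my life\b", re.IGNORECASE)]


def hard_rule_class(text: str) -> Optional[str]:
    """Return a hard-rule class, checking the most severe class first."""
    if any(p.search(text) for p in EMERGENCY_PATTERNS):
        return "legal_or_medical_emergency"
    if any(p.search(text) for p in CRISIS_PATTERNS):
        return "crisis_indication"
    return None


def classify(text: str, llm_evaluate: Callable[[str], str]) -> str:
    hard = hard_rule_class(text)
    if hard is not None:
        return hard  # the LLM is never consulted on this path
    # Only routine_friction vs. pattern_relapse is left to the LLM.
    return llm_evaluate(text)
```

Checking the emergency patterns before the crisis patterns means a message that matches both lands in the more severe class.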

Below confidence 0.7, the classifier escalates to the next-higher class. When in doubt, escalate; never de-escalate.
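The escalation rule reduces to a few lines. In this sketch the 0.7 floor and the class names come from the text; the severity ordering in ORDER, the function name, and the choice to leave the top class unchanged are assumptions.

```python
# Severity order, lowest to highest (assumed from the order the spec lists them).
ORDER = [
    "routine_friction",
    "pattern_relapse",
    "crisis_indication",
    "legal_or_medical_emergency",
]
CONFIDENCE_FLOOR = 0.7


def apply_confidence_floor(risk_class: str, confidence: float) -> str:
    """Escalate one class when confidence is strictly below the floor."""
    idx = ORDER.index(risk_class)
    if confidence < CONFIDENCE_FLOOR and idx < len(ORDER) - 1:
        idx += 1  # escalate, never de-escalate
    return ORDER[idx]
```

Read literally, the rule also moves a low-confidence pattern_relapse up to crisis_indication, which is what this sketch does.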

Every classification carries a rationale_signature — a deterministic hash of the signal chain that drove the decision. A Trust & Safety reviewer can re-run RAP against the same signal chain six months later and confirm the classifier would have produced the same risk class.
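A rationale_signature of this kind could be computed as follows. The spec text only says the hash is deterministic; SHA-256, canonical JSON serialization, and the signal-chain shape (a list of dicts) are assumptions made for this sketch.

```python
import hashlib
import json


def rationale_signature(signal_chain: list) -> str:
    """Hash the signal chain in a canonical form so replays are comparable."""
    # sort_keys and fixed separators make the serialization independent of
    # dict key order and whitespace.
    canonical = json.dumps(
        signal_chain, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_matches(signal_chain: list, stored_signature: str) -> bool:
    """Audit check: does the stored chain still hash to the logged signature?"""
    return rationale_signature(signal_chain) == stored_signature
```

The order of signals in the chain is preserved and does affect the hash, while key order inside each signal does not. A deployment would probably also want the rule-set or model version inside the hashed payload; that is a suggestion, not something the draft specifies.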

03 · What this draft does NOT yet answer

The four community-input questions.

  • False-positive cost of crisis triggers. Hard rules fire on phrase patterns. Quoted song lyrics, research context, and dark humor all produce false positives that close the coaching path for the session. What’s the right re-open mechanism?
  • Per-jurisdiction routing tables. This spec defaults to US-centric routing (988). The full international table needs community curation. PRs welcome.
  • pattern_relapse under partial recovery. Is a user mid-rebuild with three slips in a month in routine friction (still trying) or pattern relapse (crossed back)? The reference engine treats this as user-configurable; the spec needs clearer guidance.
  • EAP irreversibility-floor interaction. When an EAP IRREVERSIBLE action is mid-flight and RAP emits an emergency envelope, the exact handshake needs more thought.

The same architecture that lets an AI catch you before you fold has to know when to step out of the way.

RAP is the floor. The coordinator owns the implementation; the spec is open. Without this layer, behavior-aware AI is worse than chat-only AI. The category COYL is trying to define depends on having this answer.