01 · The four risk classes
Every behavioral moment is classified into exactly one.
01.a · routine_friction
Coaching proceeds.
The familiar loop. BIP context permits PAP proposals; PAP proposals fire through EAP under UAP-granted scope. The default class — most moments live here.
Example · Late-night scrolling. Mild procrastination. The 'one more snack' moment.
Routing · None required.
01.b · pattern_relapse
Heavier intervention. Optional accountability ping.
The user has crossed back into a behavioral pattern they previously committed to leaving. The engine intensifies the EAP modality and lowers the PAP confidence threshold. There is no automatic clinical routing, but if the user has granted the UAP scope notify:relapse_signal, RAP emits an accountability-referral envelope to a contact the user designated in advance.
Example · Returning to nicotine after a 30-day quit. Resuming a binge-purge cycle after a recovery week.
Routing · Accountability-referral envelope, if opted in.
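The opt-in gate described above can be sketched as follows. This is a minimal illustration, not the reference engine: the scope string notify:relapse_signal comes from the text, while the function name, the envelope fields, and the contact format are assumptions made for the example.

```python
from typing import Optional

# Scope name taken from the spec text.
RELAPSE_SCOPE = "notify:relapse_signal"


def relapse_envelope(granted_scopes: set, contact: Optional[str]) -> Optional[dict]:
    """Build an accountability-referral envelope only if the user opted in.

    Hypothetical helper: the field names below are illustrative, not part
    of the spec.
    """
    # No UAP grant, or no pre-designated contact: nothing is emitted.
    if RELAPSE_SCOPE not in granted_scopes or not contact:
        return None
    return {
        "envelope": "accountability_referral",
        "recipient": contact,
        # pattern_relapse never triggers clinical routing on its own.
        "clinical_routing": False,
    }
```

Without the grant the function returns None, which matches "Routing · Accountability-referral envelope, if opted in."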
01.c · crisis_indication
Coaching path closed. Active referral.
The behavioral signal carries a credible indication of psychological crisis. The AI must stop coaching IMMEDIATELY. RAP emits a crisis-referral envelope with jurisdictional crisis-line numbers, the user's pre-set emergency contact (if granted), and a self-care holding pattern. No further PAP proposals fire until a human-reviewed re-open is logged.
Example · Suicidal ideation phrases. Self-harm imagery. Acute substance crisis. Domestic-violence signals.
Routing · crisis_referral_envelope. 988 / Samaritans / jurisdictional line. The AI must not re-enter the coaching path without human review.
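One way to picture the "closed until a human-reviewed re-open is logged" rule is a small gate object. The sketch below is an assumption about shape only: CoachingGate, its method names, and the log tuple format are hypothetical and do not come from the spec.

```python
from dataclasses import dataclass, field


@dataclass
class CoachingGate:
    """Tracks whether PAP proposals may fire for a user (illustrative)."""

    closed: bool = False
    audit_log: list = field(default_factory=list)

    def on_crisis(self, signal_id: str) -> None:
        # A crisis_indication closes the coaching path and records why.
        self.closed = True
        self.audit_log.append(("crisis_closed", signal_id))

    def reopen(self, reviewer_id: str) -> None:
        # Re-opening requires an identified human reviewer; the log entry
        # is written before the gate opens.
        if not reviewer_id:
            raise PermissionError("re-open requires a human reviewer")
        self.audit_log.append(("human_reopen", reviewer_id))
        self.closed = False

    def may_propose(self) -> bool:
        return not self.closed
```

The gate has no timer and no self-service path: the only transition back to open goes through reopen, so the AI cannot re-enter coaching by itself.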
01.d · legal_or_medical_emergency
Override every other protocol. Route to emergency services.
Imminent danger to life or limb. RAP supersedes every other protocol — even active UAP grants and pending EAP IRREVERSIBLE actions. The AI must not attempt to coach, intervene, or delay routing.
Example · Acute overdose signals. Active bleeding. Stroke / cardiac signals. Active violence.
Routing · emergency_referral_envelope. 911 / jurisdictional emergency number. ai_must_refuse_coaching: true.
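The four classes and their routing can be summarized as an ordered enum plus a policy table. This is a sketch under stated assumptions: the class names, crisis_referral_envelope, and emergency_referral_envelope appear in the text above, while accountability_referral_envelope, the Policy type, and policy_for are names invented here for illustration.

```python
from enum import IntEnum
from typing import NamedTuple, Optional


class RiskClass(IntEnum):
    # Ordered by severity so that "next-higher class" is well defined.
    ROUTINE_FRICTION = 0
    PATTERN_RELAPSE = 1
    CRISIS_INDICATION = 2
    LEGAL_OR_MEDICAL_EMERGENCY = 3


class Policy(NamedTuple):
    coaching_allowed: bool
    envelope: Optional[str]


POLICY = {
    RiskClass.ROUTINE_FRICTION: Policy(True, None),
    # Identifier assumed; the text only says "accountability-referral envelope".
    RiskClass.PATTERN_RELAPSE: Policy(True, "accountability_referral_envelope"),
    RiskClass.CRISIS_INDICATION: Policy(False, "crisis_referral_envelope"),
    RiskClass.LEGAL_OR_MEDICAL_EMERGENCY: Policy(False, "emergency_referral_envelope"),
}


def policy_for(risk: RiskClass, relapse_opt_in: bool = False) -> Policy:
    """Return the routing policy; the relapse envelope is opt-in only."""
    policy = POLICY[risk]
    if risk == RiskClass.PATTERN_RELAPSE and not relapse_opt_in:
        return policy._replace(envelope=None)
    return policy
```

Coaching stays open for the two lower classes and closes for the two upper ones, mirroring the one-line summaries under each heading.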
02 · The classifier
Hard rules where false negatives kill. LLM evaluation where they don’t.
crisis_indication and legal_or_medical_emergency triggers are hard-rule: keyword, phrase-pattern, and structured-signal matchers that bypass the LLM entirely. A higher false-positive rate is accepted in exchange for zero tolerance of false negatives. Distinguishing routine_friction from pattern_relapse is an LLM evaluation against the user’s BIP pattern history.
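A minimal sketch of that split: hard rules run first and, on a match, return without ever calling the LLM. The two regular expressions are deliberately tiny placeholders, not a vetted trigger list, and the llm_evaluate callback stands in for whatever evaluation the engine actually performs; all names here are assumptions.

```python
import re
from enum import IntEnum


class RiskClass(IntEnum):
    ROUTINE_FRICTION = 0
    PATTERN_RELAPSE = 1
    CRISIS_INDICATION = 2
    LEGAL_OR_MEDICAL_EMERGENCY = 3


# Placeholder patterns for illustration only. Checked highest severity first.
HARD_RULES = [
    (RiskClass.LEGAL_OR_MEDICAL_EMERGENCY,
     re.compile(r"\b(overdos(e|ed|ing)|chest pain)\b", re.IGNORECASE)),
    (RiskClass.CRISIS_INDICATION,
     re.compile(r"\b(end my life|hurt myself)\b", re.IGNORECASE)),
]


def hard_rule_class(text: str):
    """Return the first hard-rule class that matches, or None."""
    for risk, pattern in HARD_RULES:
        if pattern.search(text):
            return risk
    return None


def classify(text: str, llm_evaluate) -> RiskClass:
    hit = hard_rule_class(text)
    if hit is not None:
        # Hard-rule path: the LLM is never consulted.
        return hit
    # Only routine_friction vs. pattern_relapse is left to the LLM.
    return llm_evaluate(text)
```

Because the matchers are plain pattern checks, they will also fire on quoted lyrics or research text, which is the false-positive cost the draft accepts.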
When classifier confidence falls below 0.7, the classifier escalates to the next-higher class. When in doubt, escalate; never de-escalate.
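Read literally, the escalation rule is a one-step bump in severity. The sketch below assumes that "below" means strictly less than 0.7 and that the top class has nowhere further to go; the function and constant names are hypothetical.

```python
from enum import IntEnum


class RiskClass(IntEnum):
    ROUTINE_FRICTION = 0
    PATTERN_RELAPSE = 1
    CRISIS_INDICATION = 2
    LEGAL_OR_MEDICAL_EMERGENCY = 3


CONFIDENCE_FLOOR = 0.7  # threshold from the spec text


def apply_confidence_floor(risk: RiskClass, confidence: float) -> RiskClass:
    """Escalate one class when confidence is below the floor; never lower."""
    if confidence < CONFIDENCE_FLOOR and risk < RiskClass.LEGAL_OR_MEDICAL_EMERGENCY:
        return RiskClass(risk + 1)
    return risk
```

The function can only return its input or a higher class, so a low-confidence result can never reduce the level of intervention.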
Every classification carries a rationale_signature — a deterministic hash of the signal chain that drove the decision. A Trust & Safety reviewer can re-run RAP against the same signal chain six months later and confirm the classifier would have produced the same risk class.
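A deterministic signature of this kind could be computed by hashing a canonical serialization of the signal chain. The spec does not name a hash function or an encoding, so SHA-256 over sorted-key JSON is an assumption made here, as is the shape of a signal (a dict of JSON-serializable fields).

```python
import hashlib
import json


def rationale_signature(signal_chain: list) -> str:
    """Hash a signal chain so the same chain always yields the same digest.

    Assumptions: signals are JSON-serializable dicts, and chain order is
    meaningful (it is preserved), while key order inside a signal is not.
    """
    canonical = json.dumps(
        signal_chain,
        sort_keys=True,          # key order must not change the digest
        separators=(",", ":"),   # no whitespace variation
        ensure_ascii=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A reviewer replaying a stored chain would recompute the digest and compare it with the one logged at decision time. Reproducing the risk class itself also requires the classifier rules in force at that time to be pinned, which this sketch does not cover.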
03 · What this draft does NOT yet answer
The four community-input questions.
- False-positive cost of crisis triggers. Hard rules fire on phrase patterns. Quoted song lyrics, research context, and dark humor all produce false positives that close the coaching path for the session. What’s the right re-open mechanism?
- Per-jurisdiction routing tables. This spec defaults to US-centric routing (988). The full international table needs community curation. PRs welcome.
- pattern_relapse under partial recovery. Is a user mid-rebuild with three slips in a month in routine friction (still trying) or pattern relapse (crossed back)? The reference engine treats this as user-configurable; the spec needs clearer guidance.
- EAP irreversibility-floor interaction. When an EAP IRREVERSIBLE action is mid-flight and RAP emits an emergency envelope, the exact handshake needs more thought.
The same architecture that lets an AI catch you before you fold has to know when to step out of the way.
RAP is the floor. The coordinator owns the implementation; the spec is open. Without this layer, behavior-aware AI is worse than chat-only AI. The category COYL is trying to define depends on having this answer.