RAP — Risk Assessment Protocol · the safety routing layer · COYL

01 · The four risk classes

Every behavioral moment classifies into exactly one.

01.a · routine_friction

Coaching proceeds.

The familiar loop. BIP context permits PAP proposals; PAP proposals fire through EAP under UAP-granted scope. The default class — most moments live here.

Example · Late-night scrolling. Mild procrastination. The 'one more snack' moment.

Routing · None required.

01.b · pattern_relapse

Heavier intervention. Optional accountability ping.

The user has crossed back into a behavioral pattern they previously committed to leaving. Intensify EAP modality. Reduce PAP confidence threshold. No automatic clinical routing — but if the user granted UAP scope notify:relapse_signal, RAP emits an accountability-referral envelope to a user-pre-designated contact.

Example · Returning to nicotine after a 30-day quit. Resuming a binge-purge cycle after a recovery week.

Routing · Accountability-referral envelope, if opted in.

01.c · crisis_indication

Coaching path closed. Active referral.

The behavioral signal carries a credible indication of psychological crisis. AI must stop coaching IMMEDIATELY. RAP emits a crisis-referral envelope with jurisdictional crisis-line numbers, the user's pre-set emergency contact (if granted), and a self-care holding pattern. No further PAP proposals fire until a human-reviewed re-open is logged.

Example · Suicidal ideation phrases. Self-harm imagery. Acute substance crisis. Domestic-violence signals.

Routing · crisis_referral_envelope. 988 / Samaritans / jurisdictional line. AI must not re-enter the coaching path without human review.

01.d · legal_or_medical_emergency

Override every other protocol. Route to emergency services.

Imminent danger to life or limb. RAP supersedes every other protocol — even active UAP grants and pending EAP IRREVERSIBLE actions. The AI must not attempt to coach, intervene, or delay routing.

Example · Acute overdose signals. Active bleeding. Stroke / cardiac signals. Active violence.

Routing · emergency_referral_envelope. 911 / jurisdictional emergency number. ai_must_refuse_coaching: true.
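
The four classes and their routing can be summarized in a small lookup. This is a minimal sketch, not the shipped implementation. The class names, `crisis_referral_envelope`, `emergency_referral_envelope`, the `notify:relapse_signal` scope, and the `ai_must_refuse_coaching` flag come from the text above; `accountability_referral_envelope`, `RoutingDecision`, `coachingOpen`, and `routeFor` are hypothetical names introduced for illustration.

```typescript
// Sketch only: apart from the risk classes, the two envelope names, the
// scope string, and ai_must_refuse_coaching, every name is an assumption.
type RiskClass =
  | "routine_friction"
  | "pattern_relapse"
  | "crisis_indication"
  | "legal_or_medical_emergency";

interface RoutingDecision {
  envelope:
    | "accountability_referral_envelope" // hypothetical name
    | "crisis_referral_envelope"
    | "emergency_referral_envelope"
    | null;
  coachingOpen: boolean; // hypothetical: may PAP proposals still fire?
  ai_must_refuse_coaching: boolean; // the text sets this only for emergencies
}

function routeFor(
  risk: RiskClass,
  grantedScopes: ReadonlySet<string>,
): RoutingDecision {
  switch (risk) {
    case "routine_friction":
      // No routing required; coaching proceeds.
      return { envelope: null, coachingOpen: true, ai_must_refuse_coaching: false };
    case "pattern_relapse":
      // The accountability ping is emitted only if the user opted in.
      return {
        envelope: grantedScopes.has("notify:relapse_signal")
          ? "accountability_referral_envelope"
          : null,
        coachingOpen: true,
        ai_must_refuse_coaching: false,
      };
    case "crisis_indication":
      // Coaching path closes until a human-reviewed re-open is logged.
      return {
        envelope: "crisis_referral_envelope",
        coachingOpen: false,
        ai_must_refuse_coaching: false,
      };
    case "legal_or_medical_emergency":
      // Overrides every other protocol, including active grants.
      return {
        envelope: "emergency_referral_envelope",
        coachingOpen: false,
        ai_must_refuse_coaching: true,
      };
  }
}
```

Note that granted scopes only matter for `pattern_relapse`; the two upper classes route regardless of what the user has granted.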

02 · The classifier

Hard rules where false negatives kill. LLM evaluation where they don’t.

crisis_indication and legal_or_medical_emergency triggers are hard rules: keyword, phrase-pattern, and structured-signal matchers that bypass the LLM entirely. A higher false-positive rate is accepted in exchange for zero tolerance of false negatives. Separating routine_friction from pattern_relapse is an LLM evaluation against the user’s BIP pattern history.
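
The two-stage shape described above can be sketched as follows: hard-rule matchers run first, and the LLM evaluator is consulted only when none of them fire. This is an illustration under stated assumptions. The patterns are placeholders rather than RAP's actual rule set, and `HardRule`, `HARD_RULES`, `classify`, and the `llmEvaluate` callback are hypothetical names.

```typescript
type RiskClass =
  | "routine_friction"
  | "pattern_relapse"
  | "crisis_indication"
  | "legal_or_medical_emergency";

// The LLM stage only ever chooses between the two lower classes.
type SoftClass = "routine_friction" | "pattern_relapse";

interface HardRule {
  risk: "crisis_indication" | "legal_or_medical_emergency";
  pattern: RegExp;
}

// Placeholder patterns for illustration only; NOT the real rule set.
// Emergency rules are listed first so the most severe match wins.
const HARD_RULES: readonly HardRule[] = [
  { risk: "legal_or_medical_emergency", pattern: /\boverdos(e|ed|ing)\b/i },
  { risk: "crisis_indication", pattern: /\bno reason to go on\b/i },
];

function classify(
  text: string,
  llmEvaluate: (text: string) => SoftClass, // stand-in for the LLM stage
): RiskClass {
  for (const rule of HARD_RULES) {
    // A hard-rule hit returns immediately; the LLM is never called.
    if (rule.pattern.test(text)) return rule.risk;
  }
  return llmEvaluate(text);
}
```

Because the LLM callback is typed to return only the two lower classes, it cannot downgrade or override a hard-rule hit in this sketch.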

Below confidence 0.7, the classifier escalates to the next-higher class. When in doubt, escalate; never de-escalate.
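
The escalation rule can be expressed as a one-step lookup. This sketch assumes that "next-higher" follows the order in which section 01 lists the classes, that a confidence of exactly 0.7 does not escalate, and that the top class has nowhere further to go. `NEXT_HIGHER`, `CONFIDENCE_FLOOR`, and `escalateIfUnsure` are hypothetical names.

```typescript
type RiskClass =
  | "routine_friction"
  | "pattern_relapse"
  | "crisis_indication"
  | "legal_or_medical_emergency";

// Assumed severity order: the order the classes appear in section 01.
const NEXT_HIGHER: Record<RiskClass, RiskClass> = {
  routine_friction: "pattern_relapse",
  pattern_relapse: "crisis_indication",
  crisis_indication: "legal_or_medical_emergency",
  legal_or_medical_emergency: "legal_or_medical_emergency", // already the top
};

const CONFIDENCE_FLOOR = 0.7;

function escalateIfUnsure(risk: RiskClass, confidence: number): RiskClass {
  // Strictly below the floor escalates one step; the function never
  // returns a class lower than the one it was given.
  return confidence < CONFIDENCE_FLOOR ? NEXT_HIGHER[risk] : risk;
}
```

For instance, `escalateIfUnsure("routine_friction", 0.55)` yields `"pattern_relapse"` under these assumptions.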

Every classification carries a rationale_signature — a deterministic hash of the signal chain that drove the decision. A Trust & Safety reviewer can re-run RAP against the same signal chain six months later and confirm the classifier would have produced the same risk class.
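
One way to obtain a deterministic signature is to hash a canonical serialization of the signal chain. The sketch below assumes SHA-256 and a three-field `Signal` shape; the text specifies neither, and `Signal` and `rationaleSignature` are hypothetical names. It uses Node's built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical signal shape; the real signal chain format is not given.
interface Signal {
  source: string;
  kind: string;
  value: string;
}

function rationaleSignature(chain: readonly Signal[]): string {
  // Serialize each signal as a fixed-order tuple so that object key
  // order cannot change the bytes being hashed. Chain order is preserved
  // because it is part of what drove the decision.
  const canonical = JSON.stringify(
    chain.map((s) => [s.source, s.kind, s.value]),
  );
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

Re-running the function on the same chain yields the same hex string, which is the property a reviewer relies on when replaying a past decision.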

02b · What’s actually shipped

The safety floor is live. The question a reviewer asks first, answered in code.

This is the one layer that is not a slide deck or a draft on paper. RAP's classifier, routing envelopes, and store are implemented, and the closed-coaching-path check gates both the agent-authority path and every consumer interrupt in the app today. The broader engine and SDK are alpha; the floor itself is running.

The safety floor · live

Gates both paths. First.

The closed-coaching-path check runs ahead of every other gate on the agent path — a crisis-class assessment denies a proposal or a standing-authority execution before scope or rate limits are even read. The same check runs before any consumer interrupt: a person in a closed path is never nudged. Classifier, routing envelopes, and store are implemented today.
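
The gate ordering described above can be sketched as a short-circuiting check. This is an illustration, not the live route handler: `GateResult`, `ProposalContext`, and `gateProposal` are hypothetical names, and scope and rate-limit lookups are modeled as callbacks so the ordering is visible.

```typescript
type GateResult =
  | { allowed: true }
  | { allowed: false; deniedBy: "rap_closed_path" | "scope" | "rate_limit" };

// Hypothetical context; lookups are lazy so we can see what gets read.
interface ProposalContext {
  coachingPathClosed: boolean;
  hasScope: () => boolean;
  underRateLimit: () => boolean;
}

function gateProposal(ctx: ProposalContext): GateResult {
  // RAP first: a closed coaching path denies before anything else is read.
  if (ctx.coachingPathClosed) {
    return { allowed: false, deniedBy: "rap_closed_path" };
  }
  if (!ctx.hasScope()) {
    return { allowed: false, deniedBy: "scope" };
  }
  if (!ctx.underRateLimit()) {
    return { allowed: false, deniedBy: "rate_limit" };
  }
  return { allowed: true };
}
```

With a closed path, `hasScope` and `underRateLimit` are never invoked, which mirrors the claim that the denial happens before scope or rate limits are read.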

The SDK · alpha

`@coyl/protocol`

A typed, zero-dependency TypeScript client. UAPClient does grant / precheck / execute / revoke / audit / kill-switch / verify-provenance — the same authority path RAP gates ahead of. Typed against the live route handlers; wire shapes can change before 1.0.

The demo · runnable

Grant it, then try to break it.

One script walks the whole trust contract: grant → reversible action allowed and audited → irreversible send denied (it fails closed) → provenance verified → kill switch → the next action is dead. The fail-closed denial is the point.


03 · What this draft does NOT yet answer

The four community-input questions.

· False-positive cost of crisis triggers. Hard rules fire on phrase patterns. Quoting song lyrics, discussing research, and dark humor all produce false positives that close the coaching path for the session. What’s the right re-open mechanism?
· Per-jurisdiction routing tables. This spec defaults to US-centric routing (988). The full international table needs community curation. PRs welcome.
· pattern_relapse under partial recovery. Is a user mid-rebuild with three slips in a month in routine friction (still trying) or pattern relapse (crossed back)? The reference engine treats this as user-configurable; the spec needs clearer guidance.
· EAP irreversibility-floor interaction. The exact handshake when RAP emits an emergency envelope while an EAP IRREVERSIBLE action is mid-flight needs more thought.

The same architecture that lets an AI catch you before you fold has to know when to step out of the way.

RAP is the floor. The coordinator owns the implementation; the spec is open. Without this layer, behavior-aware AI is worse than chat-only AI. The category COYL is trying to define depends on having this answer.
