Research & outcomes

Outcomes we measure.
Things we’ll publish.

Most behavior-change apps measure engagement — app opens, sessions, content consumed. COYL measures the only thing that matters: did the user catch themselves before the loop completed? Here’s what we track today, what we’ll publish next, and how to study it with us.

See the 12-week protocol →
Read the underlying science

Open for partner enrollment
Protocol v0.9 · IRB pathway: minimal-risk expedited

GLP-1 weight-regain study — protocol live.

12 weeks, N=80, randomized 1:1. Tests whether real-time pattern interrupt during GLP-1 maintenance reduces weight regain in the 90 days after discontinuation. The protocol is drafted and partner-ready.

Read the protocol →
Interim data ledger →

What we measure

Five outcome metrics, none of which are “app opens.”

PATTERNS DEFEATED

Slips intercepted before completion

Counted via SLIP_RECOVERED + AUTOPILOT_INTERRUPTED events within 30 minutes of a danger-window firing. Reported per-user weekly on /today and aggregated monthly.
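A minimal sketch of how this count could be computed. The two event names come from the text above. The record shape (a type plus a timestamp), the function name, and the reading of "within 30 minutes" as the 30 minutes at or after a firing are assumptions for illustration, not COYL's actual implementation.

```python
from datetime import datetime, timedelta

# Event names are from the metric description; everything else is hypothetical.
COUNTED_EVENTS = {"SLIP_RECOVERED", "AUTOPILOT_INTERRUPTED"}
WINDOW = timedelta(minutes=30)


def patterns_defeated(events, firings):
    """Count qualifying events that land within 30 minutes after any firing.

    events:  iterable of (event_type, timestamp) tuples (assumed shape)
    firings: iterable of datetimes at which a danger window fired
    """
    firings = list(firings)
    count = 0
    for event_type, at in events:
        if event_type not in COUNTED_EVENTS:
            continue
        # Assumption: the event must occur at or after the firing.
        if any(timedelta(0) <= at - fired <= WINDOW for fired in firings):
            count += 1
    return count


firings = [datetime(2026, 5, 1, 21, 0)]
events = [
    ("AUTOPILOT_INTERRUPTED", datetime(2026, 5, 1, 21, 10)),  # counted
    ("SLIP_RECOVERED", datetime(2026, 5, 1, 21, 45)),         # 45 min: too late
    ("APP_OPENED", datetime(2026, 5, 1, 21, 5)),              # not a counted type
]
print(patterns_defeated(events, firings))  # 1
```

Each event is counted at most once, even if it falls near more than one firing.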

Live metric

SAME-NIGHT RECOVERY

Slips logged + stabilized within 24h

A slip counts as recovered when the user completes the three stabilize actions (water, walk, planned next meal) within 24h of logging. Tracked via the SLIP_RECOVERED event.
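The recovery rule above can be sketched as a simple check. The three actions and the 24h limit are from the text. The action labels, the tuple shape, and the function name are assumptions made for this example.

```python
from datetime import datetime, timedelta

# The three stabilize actions named in the text; the string labels are assumed.
REQUIRED_ACTIONS = {"water", "walk", "planned_next_meal"}
RECOVERY_WINDOW = timedelta(hours=24)


def is_recovered(slip_logged_at, actions):
    """Return True if all three stabilize actions were completed within 24h.

    actions: iterable of (action_name, completed_at) tuples (assumed shape)
    """
    done_in_time = {
        name
        for name, completed_at in actions
        if timedelta(0) <= completed_at - slip_logged_at <= RECOVERY_WINDOW
    }
    # Subset test: every required action must appear among the timely ones.
    return REQUIRED_ACTIONS <= done_in_time


slip_at = datetime(2026, 5, 1, 22, 0)
actions = [
    ("water", datetime(2026, 5, 1, 22, 5)),
    ("walk", datetime(2026, 5, 2, 7, 30)),
    ("planned_next_meal", datetime(2026, 5, 2, 8, 0)),
]
print(is_recovered(slip_at, actions))  # True
```

In this sketch a SLIP_RECOVERED event would be emitted when the check first returns True. That wiring is also an assumption.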

Live metric

STREAK INTEGRITY

Days kept vs. days reset

COYL’s streak survives one missed day (grace window) and resets only on a logged slip without same-night recovery. Designed to avoid the streak-anxiety failure mode that drives users to lie to preserve streaks in legacy habit apps.
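One way to read these streak rules, as a hedged sketch. The status labels and function name are hypothetical. The text does not say what happens on a second consecutive missed day or whether a recovered slip adds to the streak. This sketch assumes a second consecutive miss resets the streak (one reading of "grace window") and that a recovered slip leaves the streak unchanged.

```python
GRACE_DAYS = 1  # one missed day is tolerated, per the text


def streak_length(days):
    """Compute the current streak from a chronological list of day statuses.

    Statuses (hypothetical labels): "kept", "missed",
    "slip_recovered", "slip_unrecovered".
    """
    streak = 0
    consecutive_missed = 0
    for status in days:
        if status == "kept":
            streak += 1
            consecutive_missed = 0
        elif status == "missed":
            consecutive_missed += 1
            # Assumption: exceeding the grace window resets the streak.
            if consecutive_missed > GRACE_DAYS:
                streak = 0
        elif status == "slip_recovered":
            # Assumption: a recovered slip neither adds to nor resets the streak.
            consecutive_missed = 0
        elif status == "slip_unrecovered":
            streak = 0
            consecutive_missed = 0
        else:
            raise ValueError(f"unknown day status: {status!r}")
    return streak


print(streak_length(["kept", "kept", "missed", "kept", "slip_recovered", "kept"]))  # 4
print(streak_length(["kept", "kept", "slip_unrecovered", "kept"]))                  # 1
```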

Live metric

EXCUSE FREQUENCY

Top excuse categories per user, by week

Excuses are detected silently from chat + decision logs and bucketed into 8 categories (DELAY, REWARD, MINIMIZATION, COLLAPSE, EXHAUSTION, EXCEPTION, COMPENSATION, SOCIAL_PRESSURE). The user sees their top 3 each week. We see the full distribution.
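A sketch of the weekly bucketing step, assuming detection has already produced one category label per excuse. The eight category names are from the text. The function name and return shape are assumptions, and the detection step itself is not shown.

```python
from collections import Counter

# The eight categories listed in the text.
CATEGORIES = (
    "DELAY", "REWARD", "MINIMIZATION", "COLLAPSE",
    "EXHAUSTION", "EXCEPTION", "COMPENSATION", "SOCIAL_PRESSURE",
)


def weekly_excuse_summary(labels):
    """Return (top 3 categories, full distribution) for one user-week.

    labels: iterable of category strings produced upstream (assumed input)
    """
    labels = list(labels)
    unknown = set(labels) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    counts = Counter(labels)
    # Full distribution includes zero counts, for the aggregate view.
    distribution = {category: counts.get(category, 0) for category in CATEGORIES}
    # Top 3 is what the user would see; ties keep first-seen order.
    top3 = [category for category, _ in counts.most_common(3)]
    return top3, distribution


week = ["DELAY"] * 3 + ["REWARD"] * 2 + ["EXCEPTION"] * 4 + ["COLLAPSE"]
top3, distribution = weekly_excuse_summary(week)
print(top3)  # ['EXCEPTION', 'DELAY', 'REWARD']
```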

Live metric

DANGER-WINDOW ACCURACY

Predicted vs. actual slip windows

The heuristic learner (daily cron) computes (day-of-week × hour-of-day) histograms from the last 30 days of slips per user and writes the top 3 peak windows back to user.dangerWindows. Accuracy is the percentage of slips in the following 7 days that fell inside a learned window.
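The learner and the accuracy calculation can be sketched as follows. The 30-day lookback, the top 3 windows, and the 7-day evaluation horizon are from the text. The function names, the (weekday, hour) tuple representation of a window, and returning None when there are no slips to score are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def learn_danger_windows(slips, now, lookback_days=30, top_n=3):
    """Return the top (weekday, hour) buckets from recent slip timestamps."""
    cutoff = now - timedelta(days=lookback_days)
    histogram = Counter(
        (t.weekday(), t.hour) for t in slips if cutoff <= t < now
    )
    return [window for window, _ in histogram.most_common(top_n)]


def window_accuracy(windows, slips, learned_at, horizon_days=7):
    """Percent of slips in the next 7 days that fall inside a learned window."""
    end = learned_at + timedelta(days=horizon_days)
    upcoming = [t for t in slips if learned_at <= t < end]
    if not upcoming:
        return None  # assumption: accuracy is undefined with no slips to score
    learned = set(windows)
    hits = sum((t.weekday(), t.hour) in learned for t in upcoming)
    return 100.0 * hits / len(upcoming)


now = datetime(2026, 6, 1)
history = [
    datetime(2026, 5, 8, 21, 15), datetime(2026, 5, 15, 21, 40),
    datetime(2026, 5, 22, 21, 5),                                  # same weekday, 21h
    datetime(2026, 5, 10, 15, 0), datetime(2026, 5, 17, 15, 30),   # same weekday, 15h
    datetime(2026, 5, 12, 10, 0),
]
windows = learn_danger_windows(history, now)
later = [datetime(2026, 6, 5, 21, 30), datetime(2026, 6, 3, 8, 0)]
print(window_accuracy(windows, later, learned_at=now))  # 50.0
```

In this example one of the two later slips lands in the most frequent learned bucket, so accuracy is 50%.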

Reporting from May 2026 onward

On the bench

Three studies we want to run.

We’re looking for telehealth GLP-1 prescribers, university research labs, and obesity-medicine clinics willing to co-author. We provide product access, de-identified outcome data, and engineering support. You provide IRB and patient recruitment.

GLP-1 maintenance study (priority)

N=50–100 GLP-1 patients, 12 weeks, COYL alongside Rx vs. Rx-only. Primary outcome: weight regain at 90 days post-discontinuation. Secondary: adherence-to-program rate, late-night-eating frequency from self-report.

Late-night-eating cohort

N=200, 8 weeks, COYL vs. waitlist control. Primary outcome: % reduction in self-reported post-9pm eating episodes by week 8. Secondary: pattern-defeated count, retention.

Re-entry / shame loop

Qualitative + quantitative. Compare time-to-resume after a logged slip in COYL’s recovery flow vs. a legacy Monday-restart cohort. Outcomes: hours-to-resume and 30-day retention delta.

What COYL is not

Behavioral support, not medical treatment.

Not a medical device, treatment, or therapy.
Not a substitute for professional mental-health care, eating-disorder treatment, or clinical addiction services.
Not making weight-loss claims tied to specific drug interactions.
Not collecting PHI without explicit consent and, where required, a BAA.

For partners

If you run a clinic, a telehealth GLP-1 program, or a research lab and want to study real-time pattern interrupt with a real patient cohort, we want to talk. Co-authored publications welcome. De-identified data sharing under a DUA welcome.

research@coyl.ai
