ANCHOR Validation Trial in High-Risk Multidisciplinary Care

Sponsor
Waymark
Study ID
NCT07597499
Status
Recruiting

Conditions

  • Artificial Intelligence-Assisted Care
  • Clinical Decision Support
  • High-Risk Multidisciplinary Care
  • Telemedicine

Eligibility Criteria

Sex
ALL
Age
18 Years - N/A
Healthy Volunteers
Not accepted

Interventions

  • Gemini 3.1 Pro with Safety Prompt — BEHAVIORAL
    Gemini 3.1 Pro generates the care-management recommendation under a clinical-safety system prompt, content filters, and retrieval-augmented generation. Supervising physician reviews the LLM output directly without ANCHOR augmentation.
  • ANCHOR Clinical AI Verification Layer (with Gemini 3.1 Pro) — BEHAVIORAL
    Same Gemini 3.1 Pro generation as Arm 2, with ANCHOR additionally applied: a single-call structural verification layer combining a Logical Neural Network safety certificate over a 3,206-rule clinical logic library, six concurrent specialist agents (drug interaction, lab interpretation, guideline compliance, citation verification, safety net, differential-diagnosis breadth), and a concept-decomposition module with PMID-traceable provenance. Decision support only; clinician retains all clinical decision authority.

Study Details

This pre-registered, pragmatic, three-arm (1:1:1) patient-level randomized controlled trial with mixed-effects analysis at the encounter level tests two questions in real high-risk multidisciplinary clinical encounters at the Waymark clinically integrated network across three U.S. states (Ohio, Washington, Virginia): (1) does adding ANCHOR - a clinical AI structural verification layer - to a Gemini 3.1 Pro-assisted supervising-physician workflow reduce the rate of clinically meaningful safety failures, compared with the same Gemini 3.1 Pro-assisted workflow without ANCHOR? (2) does the Gemini 3.1 Pro-assisted workflow itself reduce the same safety endpoint compared with unassisted standard care in which the supervising physician writes their own SOAP assessment/plan from a blank template?

ANCHOR is a single-call structural verification layer combining a Logical Neural Network (Riegel et al. 2020) certificate, six specialist agents, and concept-decomposed output with PMID citation provenance. ANCHOR is physician-facing only and is used by supervising physicians, not by the multidisciplinary clinical team they oversee.

The trial randomizes 240 patients 1:1:1 across the Waymark clinically integrated network over a 12-week active-enrollment window (80 per arm). Eligible patients are adults (age 18+) identified as high-risk by combined claims-based and clinical criteria. Eligible encounters span three integrated Waymark service modalities: high-risk primary care, specialty care coordination, and real-time telemedicine urgent care.

The primary endpoint is a per-encounter binary composite: any of (a) failure to mention a do-not-miss diagnosis, (b) under-triage, (c) contraindicated medication recommendation, (d) failure to recommend escalation when clinically warranted; adjudicated by a blinded panel of 3 board-certified physicians with majority-of-three scoring.

The primary contrast is Arm 3 (LLM+ANCHOR) versus Arm 2 (LLM with safety prompt), isolating ANCHOR's marginal contribution over a deployment-equivalent LLM safety stack. The pre-specified secondary contrast is Arm 2 versus Arm 1. The trial is sized to the operational ceiling of the Waymark integrated-network workflow across the three states (240 enrollees over 12 weeks). At realistic effect sizes derived from the retrospective evaluation, the trial is underpowered for definitive efficacy declaration on either pairwise contrast and is reported as an initial deployment-feasibility validation cohort with effect estimates and 95 percent confidence intervals; full power calculations are pre-registered in the Statistical Analysis Plan.

Single-blind outcome adjudication: 3 adjudicators score only the supervising physician's final clinical decision, so all three arms produce adjudication packets in identical format and arm allocation is structurally invisible. Statisticians remain blinded until database lock.

A full waiver of informed consent is requested per 45 CFR 46.116(f)(3) with a companion HIPAA waiver of authorization under 45 CFR 164.512(i)(2)(ii). The study is registered on the Open Science Framework prior to first enrollment and reported under CONSORT-AI 2020.
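
The underpowering statement above can be illustrated with a rough calculation. The sketch below is not the pre-registered power analysis: it uses a simple normal-approximation test for two independent proportions, ignores the encounter-level mixed-effects structure, and plugs in purely hypothetical failure rates (20 percent versus 12 percent) that do not come from the listing. Only the arm size of 80 is taken from the study description.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_control, p_treated, n_per_arm=80, alpha=0.05):
    """Approximate two-sided power for comparing two independent proportions.

    Normal approximation with unpooled variance. Assumes one outcome per
    patient, which is a simplification of the encounter-level analysis.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treated * (1 - p_treated) / n_per_arm
    )
    z_effect = abs(p_control - p_treated) / se
    # Probability of rejecting in either tail under the alternative.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Hypothetical rates chosen only for illustration, not taken from the trial.
power = two_proportion_power(p_control=0.20, p_treated=0.12)
print(f"Approximate power at n=80 per arm: {power:.2f}")
```

Under these illustrative inputs the approximate power falls well short of the conventional 80 percent target, which is consistent with reporting effect estimates and confidence intervals rather than a definitive efficacy claim.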

Key Dates

Start date
May 15, 2026
Status verified
Jun 2026
Primary completion
Nov 30, 2026
Completion
Dec 31, 2026

Study Design

Enrollment
240 participants (estimated)
Allocation
RANDOMIZED
Intervention model
PARALLEL
Primary purpose
HEALTH_SERVICES_RESEARCH

Arms

  • No Intervention: Arm 1 - Unassisted standard care (control)
    n=80. No LLM. No ANCHOR. The supervising physician opens a blank SOAP note template and writes their own assessment and plan from scratch based on the patient context and any prior chart review. Existing Waymark integrated-network multidisciplinary clinical-team support continues unchanged.
  • Active Comparator: Arm 2 - Gemini 3.1 Pro with safety prompt (active comparator)
    n=80. Gemini 3.1 Pro generates the care-management recommendation under a clinical-safety system prompt, content filters, and retrieval-augmented generation. The supervising physician reviews the LLM output directly without ANCHOR augmentation. This stack is operationally equivalent to LLM-assisted clinical-decision-support deployments already in routine use at major U.S. health systems. Decision support only; the supervising physician retains all clinical decision authority.
  • Experimental: Arm 3 - Gemini 3.1 Pro + ANCHOR (intervention)
    n=80. Same Gemini 3.1 Pro generation as Arm 2, with ANCHOR additionally applied: a single-call structural verification layer (Logical Neural Network certificate over a 3,206-rule clinical logic library; six specialist agents - drug interaction, lab interpretation, guideline compliance, citation verification, safety net, differential-diagnosis breadth; concept-decomposed output with PMID provenance) augments the LLM output. Supervising physician reviews the ANCHOR-augmented output. Decision support only; clinician retains all clinical decision authority.
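
A 1:1:1 allocation of 240 patients into three arms of 80 could be generated along the following lines. This is a minimal sketch assuming permuted-block randomization with a block size of 6 and no stratification; the listing does not state the randomization method, so the block size, seed, and function name are illustrative assumptions only.

```python
import random
from collections import Counter

ARMS = ("Arm 1", "Arm 2", "Arm 3")


def permuted_block_allocation(n_patients=240, block_size=6, seed=0):
    """Return a list of arm labels that is balanced within every block.

    Assumption: permuted blocks, no stratification (not stated in the listing).
    """
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    if n_patients % block_size != 0:
        raise ValueError("n_patients must be a multiple of block_size")
    rng = random.Random(seed)
    per_arm_in_block = block_size // len(ARMS)
    schedule = []
    for _ in range(n_patients // block_size):
        block = list(ARMS) * per_arm_in_block
        rng.shuffle(block)  # random order within the block
        schedule.extend(block)
    return schedule


schedule = permuted_block_allocation()
counts = Counter(schedule)
print(counts)
```

Because every block contains each arm equally often, the final counts are exactly 80 per arm regardless of the seed.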

Primary Outcome Measure

Per-encounter clinical safety failure (adjudicated binary composite) [ Time Frame: At the encounter (encounter-level outcome adjudicated within 4 weeks post-encounter) ]
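
The composite endpoint with majority-of-three scoring can be sketched as below. The listing does not say whether the majority rule is applied to each component or to the composite as a whole; this sketch assumes each adjudicator first decides whether any of the four components occurred, and the encounter counts as a failure when at least two of the three adjudicators say so. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjudicatorScore:
    """One adjudicator's findings for one encounter (hypothetical field names)."""
    missed_do_not_miss_dx: bool
    under_triage: bool
    contraindicated_medication: bool
    missed_escalation: bool

    def any_failure(self) -> bool:
        # Composite: any one of the four components counts as a failure.
        return any((
            self.missed_do_not_miss_dx,
            self.under_triage,
            self.contraindicated_medication,
            self.missed_escalation,
        ))


def encounter_safety_failure(scores):
    """Majority-of-three vote on the composite (assumed voting level)."""
    if len(scores) != 3:
        raise ValueError("exactly three adjudicator scores are required")
    votes = sum(score.any_failure() for score in scores)
    return votes >= 2


clean = AdjudicatorScore(False, False, False, False)
flagged = AdjudicatorScore(False, True, False, False)
print(encounter_safety_failure([clean, flagged, flagged]))
print(encounter_safety_failure([clean, clean, flagged]))
```

In this usage example the first encounter is scored as a failure (two of three adjudicators flag it) and the second is not (one of three).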

Locations (1)

Facility
Waymark
City
San Francisco
State
California
ZIP
94115
Site coordinators
Sanjay Basu, MD, PhD
415-212-8993
