Operationalizing Constrained Competence: Architecture for Medical AI

By John Ferguson

Abstract 

Medical AI requires architectural principles that recognize medicine’s unique epistemological characteristics: information validated through survival outcomes, decisions affecting individual patients under irreducible particularity, and human threat-detection refined through evolutionary pressure. We establish the liability rule as a boundary principle, demonstrate why medical knowledge’s binary validation creates different requirements than domains with quantifiable quality metrics, and show how domain-specific contradiction detection maintains knowledge integrity. Recent stress-testing of frontier models validates our predictions: systems trained on unconstrained data exhibit brittleness that information architecture prevents. Medical AI should augment physician practice, not attempt autonomous medical decisions. 

Introduction 

The previously introduced constrained competence framework proposed that reliable AI in high-stakes domains emerges from information quality control rather than processing sophistication. This paper operationalizes that framework specifically for medical AI systems supporting diagnosis and treatment of individual patients. 

Medicine presents unique requirements because medical information quality is validated through the ultimate binary outcome: patient survival. When medical knowledge is wrong, people die. This creates natural selection pressure on information that doesn’t exist for most domains. Bad medical textbooks stop being used because people die. Good medical knowledge persists because it keeps people alive. 

Microsoft researchers recently stress-tested six frontier models across medical benchmarks and found that high scores mask fundamental brittleness: models maintain accuracy when images are removed from visual diagnosis tasks, collapse under trivial perturbations like reordering answer choices, and generate fabricated medical reasoning. These failures stem from training on unconstrained information sources, the architectural choice our framework rejects. 

The Binary Nature of Medical Knowledge 

Medical knowledge differs fundamentally from other information domains because it is validated through survival outcomes. Cardiac arrest protocols exist in their current form because alternative approaches resulted in worse survival rates. Each recommendation was debugged through millions of clinical encounters where wrong answers meant death. 

This evolutionary pressure on medical knowledge creates specific AI requirements: 

Source quality is non-negotiable: Medical AI must restrict its sources to survival-validated material, namely peer-reviewed literature, evidence-based guidelines, and systematically reviewed protocols. Unlike domains where AI can learn from diverse sources and weight them appropriately, medical AI must exclude information that hasn't been validated through clinical outcomes. 

Uncertainty must be preserved: When medical evidence conflicts, systems cannot resolve ambiguity through probabilistic reasoning over unreliable sources. They must present uncertainty explicitly so physicians can exercise judgment. 

Failure modes are catastrophic: Medical AI that generates plausible but incorrect information doesn’t produce “lower quality” output. It creates conditions where people die. 

The Velociraptor Principle 

Human physicians possess threat-detection capabilities refined through millions of years of evolutionary pressure. When clinicians report that “something feels wrong” about a patient despite normal vital signs, they’re accessing sensory systems that have been honed through survival pressure. Organisms that missed threat signals didn’t reproduce. 

Humans possess approximately 10¹¹ sensory cells constantly sampling the environment. A physician's discomfort with a patient presentation, a nurse's intuition about deterioration, a surgeon's tactile sense that tissue "feels wrong": these are sophisticated pattern recognition systems debugged over millennia, not mystical abilities. 

AI systems have never wrestled velociraptors for dinner. They haven’t experienced selection pressure that refined human threat detection. Until AI faces evolutionary consequences for errors, humans must remain between algorithmic recommendations and patient care. This recognition grounds why certain medical judgments cannot be delegated. 

The Liability Rule: Where AI Ends and Medicine Begins 

We propose the liability rule as the boundary between permissible AI applications and those requiring human judgment: 

Any AI output that could result in medical malpractice liability if incorrect must be reviewed and approved by a human physician. 

This rule is both pragmatic (liability follows decision authority) and philosophical (medical decisions require judgment about irreducible particularity, such as this patient’s values, circumstances, and embodied reality). 

Applications 

Drug interaction alerts: “Drug A + Drug B has documented interaction per [citation]” = information retrieval. The physician decides whether to prescribe, given this patient’s situation. No liability transfer. 

Differential diagnosis retrieval: “Symptoms {fever, cough, chest pain} appear in literature associated with pneumonia, pleurisy, pulmonary embolism. See: [citations]” = pattern matching over validated sources. The physician examines the patient and makes a diagnosis. No liability transfer. 

Autonomous diagnosis crosses the boundary: “This patient has pneumonia” = medical determination. If wrong and acted upon, it generates liability. 

The pattern: systems retrieve, synthesize, and present information from validated sources. They do not make medical judgments about individual patients. 
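As a minimal illustration of how the liability rule could be enforced at the output layer, consider the sketch below. The category names, refusal wording, and function are our own illustrative assumptions rather than a prescribed implementation; the only commitment is that outputs constituting a medical determination are never released autonomously.

```python
from enum import Enum, auto

class OutputKind(Enum):
    INFORMATION_RETRIEVAL = auto()   # documented interactions, literature associations, citations
    MEDICAL_DETERMINATION = auto()   # a diagnosis or treatment decision about a specific patient

REFUSAL = ("This requires clinical judgment about a specific patient. "
           "I can provide literature on the topic, but the treating physician "
           "makes the medical decision.")

def liability_gate(kind: OutputKind, draft_output: str) -> str:
    """Enforce the liability rule: any output that would constitute a medical
    determination (and thus generate malpractice liability if wrong) is
    replaced with a refusal; retrieval from validated sources passes through."""
    if kind is OutputKind.MEDICAL_DETERMINATION:
        return REFUSAL
    return draft_output

# A drug-interaction alert passes; an autonomous diagnosis is refused.
print(liability_gate(OutputKind.INFORMATION_RETRIEVAL,
                     "Drug A + Drug B has a documented interaction per [citation]."))
print(liability_gate(OutputKind.MEDICAL_DETERMINATION,
                     "This patient has pneumonia."))
```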

Medicine Is Not Healthcare 

This paper addresses medical AI, not healthcare AI. The distinction matters: 

Medicine: Diagnosing and treating individual patients. Binary outcomes validated through survival. Irreducible particularity. Liability-generating decisions. Often irreversible. Requires embodied judgment debugged through evolution. 

Healthcare: Systems supporting medical practice but not directly determining individual patient outcomes. Resource allocation, scheduling, supply chains, population surveillance, quality metrics. Mistakes affect efficiency rather than mortality. Decisions generally reversible. 

The liability rule applies strictly to medicine. A scheduling AI that makes appointments inconvenient is annoying. Medical AI that makes diagnostic errors is lethal. 

Domain-Specific Knowledge Integrity 

Medical knowledge evolves. Maintaining knowledge base integrity requires detecting when new evidence contradicts existing content. 

Why Domain Specialization Works 

We propose domain-specific small language models (cardiology, pediatrics, oncology, emergency medicine) rather than general contradiction detection: 

Constrained problem space: A cardiology model trained exclusively on cardiovascular literature doesn’t attempt to detect contradictions across all medicine. Training data can be carefully curated from major cardiology journals, guidelines, and landmark trials. 

Narrower task: These agents detect semantic contradiction, not medical truth. “Statement A and Statement B appear to make conflicting claims” ≠ judging which is correct. Human specialists make that determination. 

Feasible training: Agents train on known contradiction patterns. Beta-blocker recommendations for heart failure evolved from “contraindicated in acute HF” (1990s) to “mortality benefit in chronic HF” (2000s). Agents learn to recognize when new evidence challenges established positions. 

Efficient specialist review: Flagged contradictions route to domain experts who resolve them in minutes. A cardiologist reviews cardiology flags, and a pediatrician reviews pediatric flags. 
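The contract between a domain agent and its specialist reviewers can be sketched as follows. The data structures and routing function are illustrative assumptions; the detection step itself, where the domain-specific small language model would sit, is deliberately left abstract because the agent asserts only that two statements appear to conflict.

```python
from dataclasses import dataclass, field

@dataclass
class ContradictionFlag:
    domain: str        # e.g. "cardiology", "pediatrics"
    statement_a: str
    statement_b: str
    rationale: str     # why the agent believes the two claims appear to conflict

@dataclass
class SpecialistQueue:
    specialty: str
    flags: list = field(default_factory=list)

def route_flag(flag: ContradictionFlag, queues: dict[str, SpecialistQueue]) -> None:
    """Route a semantic-contradiction flag to the matching specialist queue.
    The agent never judges which statement is correct; the specialist does."""
    queues[flag.domain].flags.append(flag)

queues = {"cardiology": SpecialistQueue("cardiology")}
route_flag(ContradictionFlag(
    domain="cardiology",
    statement_a="Beta-blockers are contraindicated in acute heart failure.",
    statement_b="Beta-blockers confer a mortality benefit in chronic heart failure.",
    rationale="Opposing recommendations for beta-blockade in heart failure."),
    queues)
```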

Low-Threshold Flagging 

Domain agents deliberately use low thresholds for flagging. If the pediatric agent identifies a possible conflict between the 2019 AAP guidelines and the 2023 Cochrane review on fever management, it flags “human attention needed” without judging which is correct. 

Cost structure justifies this: 

  • False positive: Specialist reviews non-contradiction, spends 2-3 minutes, moves on. Cost: minimal. 
  • False negative: Real contradiction undetected, conflicting information remains, physicians receive inconsistent guidance. Cost: patient harm, eroded trust. 
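This asymmetry can be made explicit as a simple expected-cost inequality: flag whenever the estimated probability of a genuine contradiction exceeds the ratio of the false-positive cost to the false-negative cost. The numeric costs below are illustrative assumptions, not measured figures.

```python
COST_FALSE_POSITIVE = 3        # roughly three minutes of specialist review (illustrative)
COST_FALSE_NEGATIVE = 10_000   # patient harm and eroded trust, orders of magnitude larger (illustrative)

def should_flag(p_contradiction: float) -> bool:
    """Low-threshold flagging: flag when the expected harm of silence exceeds the
    review cost, i.e. p * COST_FALSE_NEGATIVE > COST_FALSE_POSITIVE."""
    return p_contradiction > COST_FALSE_POSITIVE / COST_FALSE_NEGATIVE

print(should_flag(0.01))   # True: even a 1% chance of a real contradiction warrants review
```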

Resolution Workflow 

  1. Specialist receives flagged statements with context 
  2. Confirms whether a genuine contradiction exists 
  3. Determines which source is more current/authoritative 
  4. Updates knowledge base metadata (deprecate outdated content, add qualifiers, flag “evolving evidence”) 
  5. Resolution tracked and versioned 

This mirrors how medical knowledge evolves in practice. Guideline committees notice contradictions, review evidence, update recommendations. Constrained competence automates only the “noticing”. Specialists adjudicate truth. 
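One way to make the final step concrete is to record each adjudication as a versioned entry in the knowledge base. The field names and values below are illustrative assumptions, not a required schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Resolution:
    flag_id: str
    reviewed_by: str               # the adjudicating specialist (step 1)
    genuine_contradiction: bool    # confirmed or dismissed (step 2)
    prevailing_source: str | None  # more current/authoritative source, if any (step 3)
    action: str                    # "deprecate", "add_qualifier", "flag_evolving_evidence", "none" (step 4)
    resolved_on: date              # tracked and versioned (step 5)
    version: int

resolution = Resolution(
    flag_id="cardio-2024-0117",
    reviewed_by="cardiology attending",
    genuine_contradiction=True,
    prevailing_source="2023 guideline update",
    action="deprecate",
    resolved_on=date(2024, 3, 1),
    version=2,
)
```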

Leveraging Existing Hierarchies 

Clinical guidelines already contain credibility scores: 

  • Grade A: Strong recommendation, high-quality evidence (multiple RCTs) 
  • Grade B: Moderate recommendation, moderate evidence 
  • Grade C: Weak recommendation, limited evidence 

These map directly to constrained competence hierarchies. Medical AI preserves and surfaces these distinctions rather than creating them de novo. 
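Preserving rather than flattening these grades can be as simple as carrying the source's own grade with every retrieved statement. A minimal sketch, with illustrative values:

```python
# The guideline's own grading is surfaced verbatim; the system never invents a new hierarchy.
EVIDENCE_GRADES = {
    "A": "Strong recommendation, high-quality evidence (multiple RCTs)",
    "B": "Moderate recommendation, moderate evidence",
    "C": "Weak recommendation, limited evidence",
}

def present(statement: str, grade: str, citation: str) -> str:
    """Surface the evidence grade alongside the statement so the physician sees
    the original uncertainty rather than uniform confidence."""
    return f"{statement} [Grade {grade}: {EVIDENCE_GRADES[grade]}] ({citation})"

print(present("Statement drawn from the knowledge base.", "A", "[citation]"))
```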

Empirical Validation: Why Unconstrained Systems Fail 

Microsoft’s stress-testing revealed systematic brittleness in frontier models: 

Modality shortcuts: Models maintained 60-80% accuracy on visual diagnosis questions, even with images removed, relying on textual patterns and memorized associations rather than understanding. 

Format brittleness: Reordering answer choices caused 4-6 percentage point drops. Models learned positional biases rather than medical content. 

Distractor dependence: Replacing familiar incorrect answers with irrelevant alternatives led to performance approaching random guessing. Models relied on elimination heuristics, not medical reasoning. 

Visual-label shortcuts: Substituting images to align with distractor answers (text unchanged) resulted in drops of more than 30 percentage points. Models learned shallow associations rather than robust integration. 

Fabricated reasoning: When prompted for explanations, models generated plausible but incorrect justifications, hallucinated findings, or paired correct answers with invalid logic. 

Why Constrained Competence Avoids These Failures 

These failures stem from training on unconstrained data containing misinformation alongside validated sources. Constrained competence avoids them architecturally: 

No autonomous diagnosis: System doesn’t answer “What is the diagnosis?” It reformulates: “What does literature say about conditions with these symptoms? [citations]” The physician diagnoses. 

Explicit boundaries: When visual information is required but unavailable: “Visual examination required for diagnosis. Literature on differential: [citations]” rather than guessing. 

Source-grounded: All outputs reference validated sources. Cannot hallucinate treatments or findings absent from the curated knowledge base. 

Format independence: Retrieves information from sources rather than choosing among options. Reordering distractors is irrelevant. 

Uncertainty preservation: Surfaces evidence quality distinctions (Grade A vs. C) rather than uniform confidence. 

Appropriate refusal: When asked questions requiring medical judgment: “This requires clinical judgment about a specific patient. I can provide literature on [topic], but you make the medical decision.” 
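These behaviors compose into a single response path: reformulate, retrieve from the curated knowledge base, state boundaries explicitly when nothing validated is available, and refuse patient-specific judgments. The function below is a sketch under those assumptions, not a specification.

```python
def respond(requires_patient_judgment: bool, topic: str, citations: list[str]) -> str:
    """Reformulate rather than diagnose: return literature with citations,
    state limits explicitly, and refuse patient-specific medical judgments."""
    if requires_patient_judgment:
        return ("This requires clinical judgment about a specific patient. "
                f"I can provide literature on {topic}, but you make the medical decision.")
    if not citations:
        # Nothing in the curated knowledge base: say so rather than guess.
        return f"No validated sources on {topic} are available in the current knowledge base."
    return f"Literature associated with {topic}: " + "; ".join(citations)

print(respond(True, "community-acquired pneumonia", []))
print(respond(False, "fever, cough, chest pain",
              ["pneumonia [citation]", "pleurisy [citation]", "pulmonary embolism [citation]"]))
```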

Implementation Considerations 

Governance 

Source selection: Institutional committees establish which sources enter the knowledge base (peer-reviewed journals, evidence-based guidelines, validated references) with explicit exclusion of preprints (except emergencies), blogs, forums, and non-peer-reviewed content. 

Credibility assignment: Follow established frameworks (ACC/AHA grading, GRADE system, USPSTF ratings). Preserve rather than create hierarchies. 

Update processes: Medical librarians, working with domain specialists, monitor major journals and update the knowledge base on regular schedules. 

Contradiction resolution: Domain agents flag conflicts, route to appropriate specialists with accountability structures. 
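These governance decisions lend themselves to explicit, reviewable configuration rather than choices buried in training data. The entries below are illustrative assumptions, not recommended values.

```python
GOVERNANCE = {
    "included_sources": ["peer-reviewed journals", "evidence-based guidelines", "validated references"],
    "excluded_sources": ["preprints (except declared emergencies)", "blogs", "forums",
                         "non-peer-reviewed content"],
    "credibility_framework": "GRADE",   # alternatively ACC/AHA grading or USPSTF ratings
    "update_cadence_days": 30,          # librarian-plus-specialist review cycle (illustrative)
    "contradiction_routing": {          # domain agent -> accountable specialist group
        "cardiology": "cardiology guideline committee",
        "pediatrics": "pediatrics guideline committee",
    },
}
```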

Integration 

Clinical workflow position: Function as sophisticated reference tools consultable by physicians, not autonomous decision support requiring workflow adaptation. 

Clear boundaries: Interface states: “This system retrieves peer-reviewed literature. You remain responsible for all medical decisions.” Reinforces the liability rule architecturally. 

Scope Transparency 

Domain coverage: Explicitly communicate what the system covers. “Includes: cardiology, pulmonology, emergency medicine. Excludes: sports medicine, occupational health.” 

Query refusal: When asked questions requiring medical judgment: “This requires clinical judgment. I can provide literature on [topic], but the treating physician must make the medical decision.” Refusal is a feature enforcing the liability rule. 

Extending Beyond Medicine 

The architectural principles generalize to domains where decisions carry severe consequences and information is verifiable through rigorous outcomes: 

Aviation safety: Maintenance procedures validated through catastrophic failure or successful operation. Domain agents flag contradictions; licensed mechanics decide. 

Nuclear safety: Operational procedures validated through incidents. Licensed operators between AI recommendations and decisions. 

Legal practice: Hierarchical credibility exists (Supreme Court > appellate > trial), but adversarial interpretation means multiple “correct” answers may coexist. The framework must preserve conflicting authoritative sources. Licensed attorneys bear responsibility. 

Unsuitable domains: creative endeavors (no authoritative hierarchies), exploratory research (benefits from unexpected connections), personal decisions (no binary validation or professional liability). 

Domains suitable for constrained competence share: verifiable information hierarchies, professional liability structures, binary or near-binary validation through severe consequences, and existing documentation practices. 

Conclusion 

Medical AI’s path to reliability runs through information architecture, not processing power. Binary validation through survival outcomes creates requirements distinct from domains with quantifiable but non-fatal metrics. The liability rule establishes that humans make medical decisions; AI retrieves survival-validated knowledge informing those decisions. Domain-specific contradiction detection with specialist review maintains knowledge integrity. Empirical evidence validates these architectural choices: unconstrained systems fail robustness tests despite benchmark success. 

Until AI has wrestled velociraptors for dinner, humans make the medical decisions. AI retrieves the knowledge from sources validated through the ultimate test: patient survival. 

References 

Gu, Y., et al. (2025). The Illusion of Readiness: Stress Testing Large Frontier Models on Multimodal Medical Benchmarks. Microsoft Research, Health & Life Sciences. 

Ferguson, J. (2025). Beyond prediction and explanation: Constrained competence as a third path for artificial intelligence in high-stakes domains. AI Journal. https://aijourn.com/beyond-prediction-and-explanation-constrained-competence-as-a-third-path-for-artificial-intelligence-in-high-stakes-domains/ 

Ferguson, J. (2025). The human element: Ethical guardrails for AI in modern medicine. The American Journal of Cosmetic Surgery, 42(3), 149-154. https://doi.org/10.1177/07488068251359686 
