Quick Take

  • A locally hosted LLaMA 3 (8B) large language model (LLM) extracted presenting symptoms from 93,674 of 94,913 admission history of present illness (HPI) notes (98.7% labeled; 99.3% accuracy on manual review), identifying syndromic associations: skin/soft-tissue → MRSA (adjusted odds ratio [AOR] 1.73), urinary → lower MRSA risk (AOR 0.34), and cardiopulmonary → higher mortality (AOR 1.30).
  • These symptom-derived signals can inform risk models to triage empiric MRSA and multidrug-resistant gram-negative (MDRGN) coverage and flag higher-mortality presentations, supporting pharmacist stewardship and reducing reliance on uniform broad-spectrum defaults.

Why It Matters

  • Empiric antibiotics for suspected sepsis are often started before diagnostics return, and clinicians rely on presenting symptoms to decide MRSA/MDRGN coverage, but those details are buried in unstructured notes and excluded from most large cohort analyses.
  • Without scalable symptom context, teams default to broad-spectrum regimens even though resistant organisms account for a small fraction of culture-positive cases, exposing patients to adverse effects and potentially worse outcomes.
  • Structuring symptoms at scale enables stewardship-informed, targeted empiric guidance, reduces labor-intensive chart review, and creates actionable inputs for risk stratification and pharmacy workqueues.

What They Did

  • Retrospective cohort across 5 hospitals in one health system (June 2015–August 2022) of adults admitted from the emergency department with blood cultures and IV antibiotics; clinical data sourced from Epic Clarity and laboratory systems.
  • Ran a locally hosted LLaMA 3 (8B) LLM to extract up to 10 presenting symptoms from each admission HPI note and mapped the outputs to a 404-term controlled vocabulary.
  • Validated LLM-generated labels against blinded physician review of 303 randomly selected admission notes and compared performance with an ICD-10-CM code–based labeling method.
  • Ran all processing on HIPAA-compliant high-performance computing, grouped symptoms into syndromes with hierarchical clustering, and estimated associations with multivariable regression adjusted for demographics, comorbidity, physiologic markers, and time-to-antibiotics.
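
The vocabulary-mapping step can be sketched as follows. This is a minimal illustration, not the study's pipeline: the four-term vocabulary excerpt, the synonym table, and the function names are all hypothetical stand-ins, and the actual mapping rules are not described in this summary.

```python
import re

# Hypothetical four-term excerpt standing in for the 404-term vocabulary.
VOCABULARY = {"fever", "dysuria", "cough", "shortness of breath"}

# Hypothetical synonym table (assumption; not taken from the study).
SYNONYMS = {
    "sob": "shortness of breath",
    "dyspnea": "shortness of breath",
    "painful urination": "dysuria",
}

MAX_SYMPTOMS = 10  # the summary states up to 10 symptoms per note


def normalize(term: str) -> str:
    """Lowercase, trim, and collapse internal whitespace."""
    return re.sub(r"\s+", " ", term.strip().lower())


def map_to_vocabulary(raw_terms):
    """Return up to MAX_SYMPTOMS unique vocabulary terms, in first-seen order."""
    mapped = []
    for raw in raw_terms:
        term = normalize(raw)
        term = SYNONYMS.get(term, term)
        # Keep only controlled-vocabulary terms and skip duplicates.
        if term in VOCABULARY and term not in mapped:
            mapped.append(term)
        if len(mapped) == MAX_SYMPTOMS:
            break
    return mapped


# Free-text model output is reduced to controlled terms; "rash" is dropped
# here only because it is absent from the toy vocabulary.
labels = map_to_vocabulary(["SOB", " Fever ", "dyspnea", "rash"])
# labels == ['shortness of breath', 'fever']
```

Restricting output to a closed vocabulary is what makes the labels countable across notes, at the cost of discarding anything the vocabulary does not cover.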

What They Found

  • The LLM labeled 93,674 of 94,913 admission HPIs (98.7%) and, on manual review, achieved 99.3% accuracy (95% CI, 99.2%–99.3%), sensitivity 69.7% (95% CI, 67.3%–72.0%), and specificity 99.6% (95% CI, 99.6%–99.6%).
  • Among 93,674 labeled patients, 52,027 (55.5%) had ≥1 positive culture; MRSA was found in 1,903 (2.0%) and MDRGN in 8,617 (9.2%), and 4,464 (4.8%) died in hospital.
  • Hierarchical clustering of the 30 most common symptoms produced 7 syndromes with differential adjusted odds: skin/soft-tissue → MRSA AOR 1.73 (95% CI, 1.49–2.00); urinary → MRSA AOR 0.34 (95% CI, 0.22–0.50); urinary → MDRGN AOR 1.26 (95% CI, 1.13–1.41).
  • Cardiopulmonary syndrome was associated with increased mortality (AOR 1.30; 95% CI, 1.17–1.45). LLM-based labeling outperformed ICD-10-CM–based labels, although the drivers of the improvement were not fully elucidated, suggesting that symptom-derived risk signals can refine empiric antibiotic selection and verification priorities in pharmacy workflows.
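
The validation metrics above can look contradictory: accuracy near 99% alongside sensitivity near 70%. The sketch below shows how the three figures are computed from a confusion matrix and why they can diverge when most note-term pairs are true negatives. The counts are invented for illustration only and are not the study's data; the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all decisions that were right
        "sensitivity": tp / (tp + fn),      # share of true symptoms that were found
        "specificity": tn / (tn + fp),      # share of absent symptoms left unlabeled
    }


# Invented counts (NOT study data): each note is checked against every
# vocabulary term, so absent symptoms vastly outnumber present ones.
metrics = diagnostic_metrics(tp=70, fp=40, fn=30, tn=9860)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With true negatives dominating the denominator, accuracy tracks specificity closely, so sensitivity is the more informative number for judging how many symptoms the extractor misses.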

Takeaways

  • Deploy a local LLaMA 3 (8B) LLM service to process Epic Clarity HPIs, output up to 10 symptom labels mapped to the 404-term vocabulary, and assign one of 7 syndromes for pharmacist stewardship dashboards and workqueues.
  • Use syndromic signals to triage empiric checks: prioritize MRSA review for skin/soft-tissue presentations; scrutinize MDRGN coverage for urinary and gastrointestinal syndromes; and flag cardiopulmonary presentations for rapid pharmacist follow-up given higher mortality risk.
  • Governance should keep the system assistive, not autonomous: require pharmacist confirmation of flags, display the captured-symptom rationale, log overrides, run monthly spot-checks against manual review, and provide brief training on signal interpretation with recalibration under stewardship oversight.
  • Operational framing: treat the HPI extractor as a barcode scanner that sorts notes so pharmacists can address the highest-priority shelves first (MRSA, MDRGN, mortality risk) while still verifying every case.
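
One way to picture the assistive workqueue described in these takeaways is the sketch below. It is an illustration under stated assumptions: the syndrome identifiers, flag names, and the `WorkqueueItem` structure are hypothetical, and nothing here reflects an actual Epic or stewardship-system interface.

```python
from dataclasses import dataclass, field

# Review flags keyed by syndrome, following the associations in this summary.
# Identifiers and flag names are hypothetical.
SYNDROME_FLAGS = {
    "skin_soft_tissue": ["review_mrsa_coverage"],
    "urinary": ["review_mdrgn_coverage"],
    "gastrointestinal": ["review_mdrgn_coverage"],
    "cardiopulmonary": ["rapid_pharmacist_follow_up"],
}


@dataclass
class WorkqueueItem:
    note_id: str
    syndrome: str
    symptoms: list                      # shown to the pharmacist as the rationale
    flags: list = field(default_factory=list)
    pharmacist_confirmed: bool = False  # flags stay advisory until confirmed


def build_item(note_id: str, syndrome: str, symptoms: list) -> WorkqueueItem:
    """Attach advisory flags for a syndrome; unknown syndromes get none."""
    flags = list(SYNDROME_FLAGS.get(syndrome, []))
    return WorkqueueItem(note_id, syndrome, list(symptoms), flags)


def sort_queue(items: list) -> list:
    """Put flagged items first; every item stays in the queue for verification."""
    return sorted(items, key=lambda item: len(item.flags) == 0)


queue = sort_queue([
    build_item("note-1", "other", ["fatigue"]),
    build_item("note-2", "skin_soft_tissue", ["rash", "swelling"]),
])
# queue order: note-2 (flagged for MRSA review), then note-1 (unflagged)
```

The sort only reorders the list and never drops an item, which mirrors the requirement that pharmacists still verify every case.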

Strengths and Limitations

Strengths:

  • Large, multi-hospital cohort with blinded physician validation; the LLM achieved high specificity and agreement comparable to physicians and outperformed ICD-10-CM labeling.
  • Transparent, reproducible pipeline: a 404-term controlled vocabulary, local HIPAA-compliant processing, and robust adjustment for confounders with Bonferroni correction; symptom-based syndromes aligned with ICD-10–derived infection sources.

Limitations:

  • Symptom capture depends on note quality and clinician wording; moderate sensitivity and subjective vocabulary choices introduce labeling variability, particularly for non-communicative or critically ill patients.
  • Single health system and retrospective design permit potential residual confounding; syndrome definitions and the chosen tree cut (k = 7) reflect methodological assumptions that require external validation.

Bottom Line

This LLM-derived syndromic approach is a candidate for pilot integration into pharmacist stewardship workflows, where it could triage empiric MRSA/MDRGN coverage and mortality flags, once local validation and appropriate governance are in place.