We’ve spent the better part of the last decade talking about what artificial intelligence (AI) can do for patients – predicting sepsis, flagging deterioration, and reading imaging studies, for example. And look, that work matters. But there’s a quieter, arguably more important conversation starting to take shape, one that points the lens in a different direction. What if we used AI to study the clinicians themselves?
I’m not talking about surveillance. I’m talking about something far more valuable: making the invisible visible. Because here’s the thing: clinician behavior is one of the biggest drivers of quality, safety, and cost in all of healthcare, and it’s also one of the biggest drivers of compliance risk. And we’re still largely blind to it.
The Blind Spot Nobody Talks About
If you look at where AI investment has concentrated, it’s almost entirely on the patient side of the equation. Predicting inpatient trajectories. Identifying high-risk outpatients. Supporting scheduling. But we’ve been so focused on modeling what happens to patients that we’ve largely ignored modeling what clinicians actually do under real-world conditions.
Traditional quality metrics don’t help much here, either. Guideline adherence rates, lengths of stay, readmission percentages: these compress enormously rich behavioral data into a handful of summary numbers that strip out most of the context. And from a compliance standpoint, they’re even less useful. A strong readmission rate tells you nothing about whether the documentation supporting those admissions would survive a Recovery Audit Contractor (RAC) or Unified Program Integrity Contractor (UPIC) review.
We have electronic health record (EHR) audit logs. We have aggregate utilization reports. We have mountains of clinical notes. But how often do we actually mine any of that with modern machine learning to find the patterns hiding inside it? Rarely. And yet those patterns – the systemic variations in how individual clinicians document, order, and code – are exactly what auditors are trained to look for.
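To make that concrete, here's a minimal sketch of what mining those logs could look like – a hypothetical audit-log export aggregated into per-clinician behavioral features. The file name, column names, and event types are illustrative assumptions, not any vendor's actual schema:

```python
import pandas as pd

# Hypothetical audit-log export: one row per EHR event.
# File name, columns, and event types are illustrative only.
events = pd.read_csv("audit_log_export.csv", parse_dates=["event_time"])

# Flag activity outside a nominal 7am-6pm window as after-hours.
events["after_hours"] = ~events["event_time"].dt.hour.between(7, 18)

# Per-clinician behavioral features: documentation volume,
# ordering volume, and share of after-hours activity.
features = events.groupby("clinician_id").agg(
    total_events=("event_type", "size"),
    note_edits=("event_type", lambda s: (s == "NOTE_EDIT").sum()),
    orders=("event_type", lambda s: (s == "ORDER_PLACED").sum()),
    after_hours_share=("after_hours", "mean"),
)

# Clinicians in the top 5% of after-hours activity -- a crude
# proxy for workload patterns worth a closer, human look.
cutoff = features["after_hours_share"].quantile(0.95)
print(features[features["after_hours_share"] > cutoff])
```

Nothing exotic – but even this crude aggregation surfaces patterns that never show up on a quality dashboard.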
What Fatigue Looks Like in a Progress Note – and Why Auditors Care
Here’s where it gets genuinely interesting. There’s a body of emerging work using machine learning on clinical notes – not to extract clinical findings, but to identify linguistic fingerprints of fatigue and high cognitive load.
Researchers have trained models on notes written by physicians across varying workload conditions, and those models can reliably distinguish notes written under high-workload circumstances from notes written under normal conditions. Those classifications generalize to overnight shifts. They generalize to high-volume settings. And when the model predicts elevated fatigue from the note text, the diagnostic yield drops significantly – even after controlling for time of day, patient demographics, and chief complaint.
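To give a feel for the shape of this work, here's a deliberately simplified sketch of that kind of classifier – TF-IDF features plus logistic regression, standing in for the richer models the actual research uses. The `load_labeled_notes()` helper is a placeholder I'm assuming for illustration; real labels would come from shift schedules or workload records:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.metrics import roc_auc_score

# Placeholder loader: note text paired with a workload label
# (1 = written under high-workload conditions, 0 = baseline).
notes, labels = load_labeled_notes()

X_train, X_test, y_train, y_test = train_test_split(
    notes, labels, test_size=0.2, stratify=labels, random_state=0)

# Word and bigram frequencies are enough to pick up crude
# linguistic fingerprints: more boilerplate, shorter sentences,
# less specificity in the assessment and plan.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=5),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

# How well does note text alone separate the two conditions?
probs = model.predict_proba(X_test)[:, 1]
print("AUROC:", roc_auc_score(y_test, probs))
```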
Now, think about that from a compliance angle. A fatigued clinician isn’t just making suboptimal clinical decisions. He or she is also more likely to be cutting corners on documentation – defaulting to copy-forward, skipping specificity in the assessment and plan, and under-documenting the medical necessity rationale that a payer is going to scrutinize later. The same cognitive drift that degrades clinical quality degrades documentation quality. And degraded documentation, at scale, is a RAC target waiting to happen.
This kind of AI-driven early warning system isn’t just a patient safety tool. It’s a prospective compliance risk signal – the kind that could trigger a documentation improvement intervention before an external auditor finds what you didn’t.
Behavioral Signatures: The Compliance Risk You Can’t See on a Dashboard
Beyond fatigue, there’s the broader question of practice variation – and this is where things get particularly relevant for compliance officers. Every clinician has what I’d call a behavioral signature: a characteristic pattern of ordering, documentation, and decision-making that shapes their care pathways in ways that are often invisible to traditional quality reviews.
AI can model these sequences and identify distinct patterns across individuals, teams, and units. Some of that variation is clinically low-value. But from a compliance perspective, the more pressing concern is variation that creates systemic billing risk – upcoding or under-coding patterns that emerge not from intent, but from habit, workload, or workflow design.
A physician who consistently documents at a higher level of complexity than his or her peers, adjusted for case mix, is a compliance exposure, regardless of whether the documentation feels defensible in isolation. A physician who routinely under-documents the medical necessity of the procedures he or she orders is leaving the organization vulnerable in the other direction. Aggregate dashboards flatten all of this into a single compliance rate, obscuring both ends of the risk spectrum.
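One concrete way to see both ends at once is a peer comparison within case-mix strata: compute each clinician's average billed complexity, compare it to peers treating similar patients, and flag sustained outliers in either direction. A minimal sketch, assuming a hypothetical claims extract in which each encounter carries a numeric evaluation and management (E/M) level and a case-mix group:

```python
import pandas as pd

# Hypothetical claims extract: one row per encounter, with the
# billed E/M level mapped to a number (e.g., 99212-99215 -> 2-5)
# and a case-mix group. Column names are illustrative.
claims = pd.read_csv("em_claims.csv")

# Each clinician's mean billed level within each case-mix group.
per_clin = (claims
            .groupby(["case_mix_group", "clinician_id"])["em_level"]
            .mean()
            .rename("clin_mean")
            .reset_index())

# Peer mean and spread for the same case-mix group.
peer = per_clin.groupby("case_mix_group")["clin_mean"].agg(["mean", "std"])
per_clin = per_clin.join(peer, on="case_mix_group")
per_clin["z"] = (per_clin["clin_mean"] - per_clin["mean"]) / per_clin["std"]

# |z| > 2 in either direction: possible upcoding exposure on the
# high side, possible under-documentation on the low side.
outliers = per_clin[per_clin["z"].abs() > 2]
print(outliers.sort_values("z"))
```

A z-score is not a finding – but a clinician sitting two standard deviations from case-mix-matched peers, month after month, is exactly the profile a RAC data-mining pass will build.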
The good news is that we already have the infrastructure to study this. The same EHR-integrated logs we use for patient trajectory prediction contain rich information about ordering patterns, documentation choices, and coding behavior over time. We just haven’t been using them as compliance surveillance tools.
The Override Problem: AI Liability and the Paper Trail
And then there’s a dimension that doesn’t get nearly enough attention in compliance circles: what happens when a clinician overrides an AI recommendation and something goes wrong?
Research on explainable AI in clinical settings has shown that the design of AI explanations directly alters clinician trust and reliance patterns, which in turn shape downstream outcomes. Systematic reviews have identified metrics for assessing the quality of these interactions: reliance, calibration, and how clinicians resolve conflicts between their own judgment and the model’s recommendation.
From a compliance standpoint, this matters for a reason that’s only going to become more pressing. As AI-assisted decision support becomes a standard of care, the override itself becomes a documented event. Who overrode the recommendation? When? On what basis? Was the override consistent with the clinician’s broader behavioral pattern, or was it an outlier? When a post-payment audit traces a denied claim back to a clinical decision that diverged from the AI’s recommendation, the organization needs to be able to reconstruct the story coherently.
This isn’t hypothetical. The liability architecture around AI-mediated clinical decisions is being built right now, claim by claim, denial by denial. Health systems that pay attention to how their clinicians interact with AI tools – who accepts, who overrides, when, and why – will be in a far better position to defend those decisions than systems that treat the AI as a black box they install and forget about.
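What would owning that interaction data look like in practice? At minimum, an override record captured at the moment of the decision, durable enough to reconstruct years later. The fields below are my own illustration of what such a record might hold, not any product's actual schema:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class OverrideEvent:
    """One clinician override of an AI recommendation,
    captured at decision time for later audit reconstruction."""
    clinician_id: str
    patient_encounter_id: str
    model_name: str          # which tool made the recommendation
    model_version: str       # exact version, for reproducibility
    recommendation: str      # what the model advised
    action_taken: str        # what the clinician actually did
    rationale: str           # free-text basis for the override
    timestamp: str           # UTC, ISO 8601

def log_override(event: OverrideEvent, path: str = "override_log.jsonl"):
    # Append-only JSON Lines file; in production this would be
    # a durable, access-controlled store.
    with open(path, "a") as f:
        f.write(json.dumps(asdict(event)) + "\n")

log_override(OverrideEvent(
    clinician_id="dr_4521",
    patient_encounter_id="enc_88310",
    model_name="sepsis_alert",
    model_version="2.3.1",
    recommendation="initiate sepsis bundle",
    action_taken="deferred; ordered repeat lactate",
    rationale="lactate trending down, patient at baseline",
    timestamp=datetime.now(timezone.utc).isoformat(),
))
```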
From Measurement to Meaningful Compliance Infrastructure
So, what do we actually do with all of this?
- First, use behavioral AI as a prospective compliance screen. If you can identify clinicians whose documentation patterns consistently diverge from their peers’, adjusted for case mix and context, you have an evidence-based rationale for targeted education, focused review, or documentation improvement programs before an external auditor builds that same profile from your claims data.
- Second, build the AI interaction audit trail now. If your organization uses AI-assisted coding, clinical documentation integrity (CDI), or clinical decision support, start treating clinician interaction patterns with those tools as compliance-relevant data. Who’s engaging, who’s ignoring, and who’s overriding – all of that is information you want to own before someone else asks for it (see the sketch after this list).
- Third, recognize that inequitable patterns in ordering and diagnostic intensity, the kind that surface when you analyze clinician behavior across patient populations, carry their own compliance and regulatory exposure, particularly as the Centers for Medicare & Medicaid Services (CMS) and the U.S. Department of Health and Human Services Office of Inspector General (HHS OIG) sharpen their focus on health equity in billing and care delivery.
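For the second point, the aggregation itself is simple once the interaction events exist. A minimal sketch, assuming a hypothetical JSON Lines log in which each AI recommendation carries the clinician's response:

```python
import pandas as pd

# Hypothetical interaction log: one row per AI recommendation,
# with the clinician's response ("accepted", "ignored", "overridden").
log = pd.read_json("ai_interaction_log.jsonl", lines=True)

# Per-clinician response mix -- the accept/ignore/override
# profile you want to own before someone else asks for it.
profile = (log
           .groupby("clinician_id")["response"]
           .value_counts(normalize=True)
           .unstack(fill_value=0))

# Clinicians whose override rate sits far from the peer median
# are where a focused, human review should start.
median = profile["overridden"].median()
print(profile[(profile["overridden"] - median).abs() > 0.25])
```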
The Governance Question You Can’t Skip
Now, I want to be clear-eyed about the risks, because this is the part that determines whether any of this actually helps – or becomes something else entirely.
Continuous behavioral monitoring raises legitimate concerns about professional autonomy and psychological safety. Clinicians who fear that every note and every order is being scored will change their behavior – and not necessarily in the right direction. Governance frameworks have to be explicit: these tools are for safety, fairness, and compliance improvement, with clear guardrails on use in credentialing or disciplinary processes.
The models themselves also carry real limitations. Documentation style is not the same thing as clinical quality or billing accuracy. Case mix differences, resource constraints, and local culture are all potential confounders. Any behavioral insights need to be triangulated with qualitative input and domain expertise before they’re used for anything consequential.
And the design process matters. The clinicians these tools will study need to be involved in building them. That’s not just politically smart; it’s analytically necessary.
The Mirror, Not the Oracle
I’ll add a personal note here. I’m currently involved in early-stage research developing agentic frameworks to measure exactly these kinds of behavioral patterns in a more dynamic, continuous way than current retrospective approaches allow. I can’t fully disclose the details yet, but what I’m seeing in that work only deepens my conviction that this is a direction worth serious attention.
Most of the AI conversation in healthcare (and in compliance) treats AI as an oracle, something that tells us what will happen. That’s valuable. But AI can also be a mirror, showing health systems how their clinicians actually behave under real-world pressures, where that behavior drifts from what they intend, and where that drift creates exposure.
The EHR audit logs, the clinical notes, the utilization records, the AI interaction logs: they’re all sitting there, largely unexamined, full of signal about how care actually gets delivered, by whom, and under what conditions. Auditors are going to keep mining that data whether health systems do or not. The only question is who gets there first.
If we approach behavioral AI ethically, collaboratively, and with genuine transparency about how the findings will and won’t be used, it could become one of the most powerful compliance tools we’ve ever deployed.
And for organizations facing the current audit environment? That’s not a small thing.
And that’s the world according to Frank!