Like the majority of my colleagues, I recited a version of the Hippocratic Oath early in my medical education. Though not all versions of the Oath are the same, some themes are consistent. Primum non nocere – First Do No Harm – is perhaps the best known of these. At the individual patient level, interpretation is fairly straightforward: do not prescribe any treatment that may harm the patient, at least not out of proportion to the benefit of that treatment.
In the modern practice of medicine, many primary care providers are responsible for a panel of patients. A primary care internist, for example, may oversee the care of more than 2,000 lives. Optimally, they not only see these patients one at a time at appointments, but also serve as stewards of their health beyond their face-to-face time. In what is commonly referred to as “panel management,” or a facet of “population health,” physicians may be asked to survey these patients’ records to proactively address gaps in care or poorly controlled chronic disease.
In recent years, our ability to survey the healthcare data on a panel of patients has grown. Put another way, descriptive analytics has matured and is now widely adopted, even if it is not always acted upon perfectly. Finding and addressing patient care gaps is a solvable problem. So much so, in fact, that one may argue that physicians and health systems who are not addressing these care gaps may be engaging in a form of neglect that violates the spirit of primum non nocere.
Now consider some other technologies which have grown in potency in recent years. Thanks to HITECH, we have clinical data by the zettabyte. The unrelenting march of Moore’s Law – and advances in GPUs partly driven by video games – has given us massive parallel computing power. These are the two raw ingredients for machine learning, and the prerequisites for doing robust prediction in medicine.
What if it is possible to survey the data of your 2,000-patient panel and identify those patients with 15 times higher risk of common cancers? Those patients with prediabetes who are more than 10 times more likely than other prediabetics to progress to full-on diabetes this year? Those patients who are at least 20 times more likely than others to suffer a complication like hospitalization or death from influenza this winter? Once we know about these patients, are we doing (or allowing, through neglect) harm by not addressing these risks?
A second theme common to many versions of the Hippocratic Oath is the emphasis on prevention. “I will prevent disease whenever I can, for prevention is preferable to cure.” If you are not convinced that ignoring the situations in the previous paragraph constitutes an ethical lapse, perhaps this clause will convince you. “Prevention is preferable to cure.” It’s right there in the Oath. We have a mandate to prevent – the first logical step in the treatment sequence. Failing that, we are told, Do No Harm.
Throughout the history of medicine, the evolution of a superior diagnostic or treatment approach eventually led to a paradigm shift in the standard of care. Previously, plain films of the knee were scrutinized for indirect and imperfect evidence of ACL rupture; now we just get an MRI. Suspected gonorrhea was historically evaluated with a Gram stain; now we use nucleic acid amplification tests. To cite an extreme example, amputation of an infected limb gave way to antibiotic treatment, which can often save the limb.
One might argue that the time has come to shift the paradigm on population health. Machine learning algorithms that pick out, from among the many, those few at highest risk for targeted intervention are now available. As the technology evolves, cost will come down, and adoption will grow. These tools will become widely available. It is time for us physicians to ask ourselves, “Are we using all of the tools available to us to fulfill the Oath? To first do no harm? To fulfill our mandate to prevent?”
The implications here are profound. We are no longer bound to the face-to-face encounter in the exam room for the practice of our art. Through a survey of data and application of sophisticated algorithms, we can reach into the great wide world and intervene where the likelihood for harm is greatest. Prevention preferable to cure, outreach preferable to treatment after the disease manifests.
Of course, machine learning-based AI care recommendations will extend beyond population health. If you have not seen one already, you are very likely to see AI guidance in both ambulatory and inpatient care settings in the very near future. What will be your reaction to these recommendations? For many, the response will depend on whether the recommendation is consistent with or contrary to their own clinical judgment. When the AI recommendation stands in contrast to our self-derived view, we tend to trust our training and clinical instinct and doubt any recommendation to the contrary. It’s important for us to acknowledge this bias and consider the consequences.
Let’s take a concrete example. Imagine you are discussing colon cancer screening with an asymptomatic 48-year-old male patient. The patient had a complete blood count last month and all of the indices are within the normal range. By conventional clinical wisdom, and by guideline, a typical primary care physician would suggest a colonoscopy at age 50.
Now add another factor to the decision-making process. A machine learning-informed algorithm determines that, based on some subtle changes in the complete blood count indices within the normal ranges, the patient is at high risk for having occult bleeding. Patients who are flagged as high risk by this algorithm are typically about 15 times more likely to have certain occult lower GI bleeding than patients who are flagged as normal. Does this change your mind?
In many ways, this dilemma represents a mismatch between evidence-based medicine (the age-based guideline) and machine learning-informed medicine. This contrast is well explored in a recent opinion piece by Ian Scott in the Annals of Internal Medicine.
On the one hand, we are well versed in EBM guidelines, whose sources are clear. On the other, we have this ML recommendation, and it is sometimes difficult to discern where it is derived from. It may be this “black box” aspect of ML guidelines that makes them so hard to digest. It generates fear and misunderstanding, which make us less likely to trust the AI-derived recommendation, despite a clear benefit in many cases.
Which brings us back to the Hippocratic Oath. In the era of powerful machine learning-derived algorithms, are we in fact doing everything we can to Do No Harm if we adhere strictly to evidence-based guidelines and ignore ML recommendations? How does this calculation change as these algorithms grow more powerful and precise?
Scott argues that machine learning can complement and augment existing evidence-based guidelines. The time to reconcile these approaches and determine a way forward is upon us. Welcome to a new era. It is indeed an interesting time to practice medicine.
 Prediabetes and Primary Prevention of Type 2 Diabetes: A Guide for Pharmacy, Podiatry, Optometry, and Dentistry, U.S. Centers for Disease Control and Prevention, 2014, https://www.cdc.gov/diabetes/ndep/pdfs/ppod-guide-prediabetes.pdf