What the eGFR race correction teaches us about AI

The number was wrong for twenty years. Nobody could see it.

For more than twenty years, a number quietly shaped the care of millions of patients. It told nephrologists when to refer. It told transplant committees who qualified for the waitlist. It told primary care physicians which patients were stable, and which were declining.

For Black patients, the number was wrong. By design.

The number was eGFR, and the formula behind it included a race correction, an upward adjustment for Black patients based on a 1999 study’s assumption about baseline creatinine and muscle mass. The premise that race is a biological variable with consistent physiological implications was never scientifically sound. It reflected ideas about racial biology that medicine had formally discarded generations earlier. But it was embedded in the electronic health record systems of thousands of hospitals, and so it ran, automatically, on every calculation, for over two decades.
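To make the size of that adjustment concrete, here is a minimal sketch using the widely published four-variable MDRD Study equation in its IDMS-traceable form, which multiplied the estimate by 1.212 for patients recorded as Black. The function name, the example patient, and the choice of this particular version of the equation are assumptions for illustration. Which version a given EHR ran is not specified here, and this is not clinical software.

```python
def mdrd_egfr(creatinine_mg_dl: float, age_years: float, female: bool,
              apply_race_coefficient: bool) -> float:
    """Four-variable MDRD estimate in mL/min/1.73 m^2 (illustrative only)."""
    egfr = 175.0 * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if apply_race_coefficient:
        egfr *= 1.212  # the race adjustment: same labs, higher reported number
    return egfr


# Hypothetical patient: 55-year-old man, serum creatinine 3.2 mg/dL.
without_adj = mdrd_egfr(3.2, 55, female=False, apply_race_coefficient=False)
with_adj = mdrd_egfr(3.2, 55, female=False, apply_race_coefficient=True)

print(f"without adjustment: {without_adj:.1f}")
print(f"with adjustment:    {with_adj:.1f}")
print(f"ratio:              {with_adj / without_adj:.3f}")
```

The same lab value produces a reported estimate about 21 percent higher when the coefficient is applied. That is how a patient sitting near a referral or waitlist cutoff could appear to be on the safer side of it.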

The consequences were not abstract. Black patients’ kidney function was systematically overestimated. Specialist referrals came later. Transplant waitlist placements were delayed. Patients who should have been flagged as high priority were assessed as stable for years. When a joint task force of the National Kidney Foundation and the American Society of Nephrology finally removed the race adjustment from clinical guidelines in 2021, EHR vendors took two more years to update their systems. The patients affected during the twenty-year run were never identified or compensated.

I have spent thirty years in health care, most of them as an executive responsible for deploying the systems clinicians work inside. I did not write the eGFR formula. But I have approved hundreds of decisions that put calculations like it in front of clinicians, and the eGFR story keeps me up at night for a reason that has nothing to do with kidneys.

It is this. The error was not hidden by anyone. It was hidden by the interface.

The clinicians who acted on those scores were doing exactly what good clinicians do. They trusted a validated number from a trusted system. The formula’s assumptions were not displayed on the screen. There was no asterisk that said this estimate embeds a racial assumption from a methodologically weak 1999 study. The number arrived with the full authority of the system that delivered it, and the system delivered it millions of times.

Now consider what we are deploying today.

AI clinical systems are arriving in every department, and many of them are genuinely good. I have seen AI flag an atypical presentation that put a patient in the OR hours before a routine workup would have caught it. The early detection work in pancreatic cancer is the kind of thing our field has wanted for fifty years. I am not writing this as a skeptic.

I am writing this because every lesson of the eGFR story applies with more force to AI models. The race correction was inspectable. It was a published formula. Anyone who looked could see the adjustment. Today’s models are not inspectable in that way by anyone, not even their builders. They perform well on average and unevenly in the specific. They were validated on datasets whose gaps are invisible until someone goes looking. The eGFR error took twenty years to correct when the formula was sitting in plain sight. What is the correction timeline for an error nobody can see?

Physicians will carry the weight of this, the way nephrologists and primary care doctors unknowingly carried the eGFR adjustment. The number on the screen will arrive with authority. You will trust it. The pressure of the day will not leave room to interrogate it. And if the system is wrong in some invisible way, the chart will show that the clinician followed the standard of care.

I do not think the answer is refusing the technology. The answer, I think, starts with three questions every clinician is entitled to ask of any system placed in front of them, and every health system executive is obligated to answer. What populations was this validated on, in detail? What is its performance in the specific, not on average, across the patients I actually see? And when it is wrong, how will we find out?
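The second question, performance in the specific rather than on average, can be illustrated with a short sketch. Everything here is made up for illustration: the group labels, the counts, and the helper name `sensitivity_by_group` are hypothetical, and sensitivity is only one of several metrics an institution might stratify.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Share of true cases the system caught, overall and per group.

    records: iterable of (group, predicted_positive, actually_positive).
    The key "overall" is reserved for the pooled figure.
    """
    caught = defaultdict(int)
    true_cases = defaultdict(int)
    for group, predicted, actual in records:
        if not actual:
            continue  # sensitivity only looks at patients who had the condition
        for key in (group, "overall"):
            true_cases[key] += 1
            if predicted:
                caught[key] += 1
    return {key: caught[key] / true_cases[key] for key in true_cases}


# Made-up validation set: 90 true cases in group A (81 caught),
# 10 true cases in group B (4 caught).
records = (
    [("A", True, True)] * 81 + [("A", False, True)] * 9
    + [("B", True, True)] * 4 + [("B", False, True)] * 6
)

result = sensitivity_by_group(records)
for key in ("overall", "A", "B"):
    print(f"{key}: {result[key]:.2f}")
```

In this invented example the pooled figure looks strong because group A dominates the data, while the system misses most true cases in group B. A headline average alone would never show that.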

If your institution cannot answer those questions about a system it has already deployed, that is not a technology problem. That is a governance problem, and it has a twenty-year precedent with a body count we never counted.

The eGFR correction was a quiet victory. A flawed number was fixed. But nobody was identified, nobody was compensated, and almost nobody outside nephrology knows it happened. The next version of this story is being installed now, at scale, with better marketing. The clinicians who will be told to trust the number deserve to know what happened the last time.

Craig Hauben is the CEO of Clutch and has spent thirty years in health care, the last fifteen as an executive in private equity-backed companies. He writes about what AI is doing to work, in health care and beyond, from the operator’s side of the table. He approves the kinds of systems clinicians are asked to trust, and he writes about what that responsibility should mean.

His white paper series, comprising The Meter Is Running, The Token Trap, and The Second Token Trap, covers the economics of AI tokens and why most companies cannot price what they are buying. His first novel, The AI: Migration, publishes in July 2026. Every AI system, study, and clinical event in it is drawn from the documented record.
