Estimated 15 Million Americans — Twice As Many As Thought — Still Battling Effects From Coronavirus Years After Diagnosis

In a Nutshell

  • Standard hospital billing codes have been shown to flag long COVID in fewer than 7% of COVID-19 patients; using an AI algorithm instead, this study found a rate of more than 16%.
  • Nearly 9 in 10 long COVID patients in the study developed chronic conditions requiring ongoing medical management, not temporary symptoms that resolve on their own.
  • Long COVID cases varied significantly by region, with rates ranging from roughly 14% in Western Pennsylvania to nearly 23% in Southern California, and the types of complications differed by location as well.
  • The burden of long COVID continued growing through mid-2024, years into the pandemic, with three of four regions showing statistically significant quarterly increases.

Millions of Americans who developed lasting health problems after COVID-19 infections are essentially invisible to the medical tracking systems designed to count them. A sweeping new study analyzing nearly half a million patient records across 58 hospitals found that official diagnostic coding captures fewer than half of all long COVID cases, and that the true toll of the illness is not fading away, but steadily growing.

Published in JAMA Network Open, the study found that roughly 1 in 6 COVID-19 patients developed long COVID, a rate more than double what standard hospital billing codes would suggest. Even more alarming: nearly 9 in 10 of those patients developed chronic conditions requiring ongoing medical care. These are new, lasting illnesses temporally associated with a prior COVID-19 infection, such as thyroid disorders, blood sugar problems, and metabolic disease. Patients are filling doctors’ waiting rooms without anyone connecting them back to COVID-19 at a population level.

For public health agencies trying to understand the scope of the long COVID crisis, that gap in detection is a serious problem. If tracking systems can’t accurately count who is sick, health systems can’t plan for the care those patients will need, disability programs can’t fully recognize the burden, and researchers can’t efficiently find participants for treatment trials.

The virus may be gone, but the millions of Americans battling long COVID are still struggling months or even years later. (© THP Creative – stock.adobe.com)

How Researchers Counted Long COVID Cases Others Missed

Researchers examined electronic health records, the detailed digital files hospitals and clinics keep on every patient, from 58 hospitals and affiliated clinics across four U.S. regions: New England, Southeast Texas, Southern California, and Western Pennsylvania. In total, the study looked at 457,950 adults with confirmed COVID-19 infections, with records spanning from 2017 through 2025, though the analysis of long COVID trends focused on cases from 2020 through mid-2024. The average patient age was about 52, and roughly 60% were female.

Rather than relying on the standard billing codes that hospitals use to flag long COVID, a system well known to dramatically undercount cases, the research team used a custom artificial intelligence algorithm called P2RC. The AI was designed to comb through a patient’s full medical history and identify patterns of symptoms appearing three or more months after a COVID-19 infection and lasting at least two additional months, while filtering out symptoms explained by pre-existing conditions. The algorithm had been previously validated with about 80% precision in identifying long COVID cases.
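The timing rule described above can be sketched in a few lines of Python. This is only an illustration of the "three months after infection, lasting two more months" criterion as the article states it, not the P2RC algorithm itself. The function name `meets_timing_rule`, the 90-day and 60-day cutoffs, and the `preexisting` flag are all assumptions made for the example.

```python
from datetime import date, timedelta

# Assumptions for illustration: "three months" ~ 90 days, "two months" ~ 60 days.
ONSET_GAP = timedelta(days=90)
MIN_DURATION = timedelta(days=60)


def meets_timing_rule(infection, first_seen, last_seen, preexisting=False):
    """Return True if a symptom fits the timing window described in the article.

    infection   -- date of the confirmed COVID-19 infection
    first_seen  -- first date the symptom appears in the record
    last_seen   -- last date the symptom appears in the record
    preexisting -- True if an earlier condition already explains the symptom
    """
    if preexisting:
        return False  # symptoms explained by earlier conditions are filtered out
    late_onset = (first_seen - infection) >= ONSET_GAP
    persistent = (last_seen - first_seen) >= MIN_DURATION
    return late_onset and persistent


# Toy record (invented dates): onset 104 days after infection, documented for 77 days.
example = meets_timing_rule(date(2022, 1, 1), date(2022, 4, 15), date(2022, 7, 1))
print(example)
```

A real system would also need to decide how to handle gaps between visits and how to judge whether a prior condition "explains" a symptom, which is where most of the difficulty lies.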

Using that approach, the team identified 74,560 long COVID cases, a rate of 16.28% across the full patient group, compared to the fewer-than-7% rate that billing-code-based surveillance would have caught. The gap held across all four regions, though rates varied from about 14% in Western Pennsylvania to nearly 23% in Southern California.

Long COVID Is Largely a Chronic Disease Problem

Just how permanent these conditions appear to be is one of the study’s most significant findings. Among the thousands of medical diagnoses associated with long COVID in the study, more than two-thirds were classified as chronic or potentially chronic, meaning conditions that don’t simply go away. Of the 74,560 patients identified with long COVID, 66,587 (nearly 90%) had developed at least one chronic condition requiring sustained care. Translated to the full COVID-19 patient pool in the study, 14.54% of patients with a confirmed infection went on to develop an unexplained chronic illness temporally associated with it.
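For readers who want to check the percentages, the short Python sketch below recomputes them from the counts reported in the article. The variable names and the `pct` helper are illustrative only.

```python
# Counts as reported in the article.
covid_patients = 457_950    # adults with confirmed COVID-19
long_covid_cases = 74_560   # flagged as long COVID by the algorithm
chronic_cases = 66_587      # long COVID patients with at least one chronic condition


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


long_covid_rate = pct(long_covid_cases, covid_patients)  # share of all COVID-19 patients
chronic_share = pct(chronic_cases, long_covid_cases)     # share of long COVID patients
chronic_rate = pct(chronic_cases, covid_patients)        # share of all COVID-19 patients

print(long_covid_rate, chronic_share, chronic_rate)
```

The three values come out to 16.28, 89.31, and 14.54, matching the figures quoted in the text.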

The types of long COVID complications also varied by region. Patients in New England were more likely to develop thyroid problems, while those in Southeast Texas, Southern California, and Western Pennsylvania more commonly showed metabolic issues like abnormal blood sugar and prediabetes. Across all regions, the most common long COVID symptoms involved general fatigue-type systemic symptoms, followed by breathing problems and digestive issues. The researchers noted that regional differences in diagnoses could reflect true biological variation between populations, or could partly reflect differences in how local doctors document conditions.

The most concerning finding may be the trend line. Looking at data from 2020 through mid-2024, the cumulative rate of long COVID didn’t flatten or decline; it kept creeping upward. Three of the four regions showed statistically significant quarterly increases. This pattern, the authors wrote, “reflect[s] ongoing accrual of incident cases from successive infection waves rather than a fixed cohort progressing toward resolution.” In plain terms: new cases keep adding to the total burden with each wave of COVID infections, rather than the overall burden shrinking.

Why So Many Long COVID Patients Go Uncounted

So why are these patients missing from the broader tracking systems that policymakers rely on? A person with long COVID might visit their primary care doctor complaining of fatigue, see a heart specialist for a racing pulse, and visit an endocrinologist for newly elevated blood sugar, and none of those visits may ever get logged under a long COVID diagnosis code. Each specialist treats a piece of the puzzle without the system ever recognizing that COVID-19 started it.

The AI-driven approach used in this study ran on a network where each hospital processed its own patient data locally, without sharing private records externally, a setup that could in theory be scaled nationally. The authors suggest that expanding this kind of infrastructure is the most viable path toward accurate, ongoing long COVID surveillance.
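The general idea of local processing can be illustrated with a minimal Python sketch: each site reduces its patient-level results to a pair of counts, and only those counts are pooled. This is an assumption about how such a setup could work in principle, not a description of the study's actual infrastructure. The names `SiteSummary`, `summarize_locally`, and `pooled_rate`, and the toy data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate counts only; no patient-level data leaves the site."""
    site: str
    covid_patients: int
    long_covid_cases: int


def summarize_locally(site, flags):
    """Runs inside a hospital: collapse per-patient True/False flags into counts."""
    flags = list(flags)
    return SiteSummary(site, len(flags), sum(flags))


def pooled_rate(summaries):
    """Runs centrally: combine site-level counts into one percentage."""
    total = sum(s.covid_patients for s in summaries)
    cases = sum(s.long_covid_cases for s in summaries)
    return 100 * cases / total


# Toy data, invented for the example (not study figures).
site_a = summarize_locally("Site A", [True, False, False, False])
site_b = summarize_locally("Site B", [True, True, False, False, False, False])
overall = pooled_rate([site_a, site_b])
print(overall)
```

Pooling counts rather than averaging site percentages keeps larger sites from being underweighted.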

The study also extrapolated from its findings to estimate that, based on the roughly 103 million documented U.S. COVID-19 cases on record, approximately 15 million Americans may be living with chronic post-COVID conditions. The authors flag that estimate as one to treat carefully, given the specific type of patient data used.
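The extrapolation is simple proportional scaling, which the Python sketch below reproduces from the article's figures. It assumes the chronic-condition rate seen in the study's hospital records applies unchanged to all documented U.S. cases, which is exactly the assumption the authors say should be treated with care. Variable names are illustrative.

```python
# Figures as reported in the article.
documented_us_cases = 103_000_000  # roughly 103 million documented U.S. cases
study_covid_patients = 457_950
study_chronic_cases = 66_587

# Assumption: the study's chronic-condition rate holds for all documented cases.
chronic_rate = study_chronic_cases / study_covid_patients
estimated_chronic = documented_us_cases * chronic_rate

print(f"{estimated_chronic / 1e6:.1f} million")
```

The result lands at about 15 million, in line with the estimate cited above.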

Five years after the pandemic began, the long COVID problem hasn’t resolved itself. It’s growing, it’s mostly chronic, and the systems built to track it are capturing only a fraction of the picture, leaving millions of sick Americans uncounted, under-resourced, and out of view.

Disclaimer: This article is for general informational purposes only and does not constitute medical advice. The findings described are based on a retrospective analysis of electronic health records from U.S. hospital systems and reflect statistical associations rather than confirmed causal links between COVID-19 infection and subsequent chronic conditions. Individual health outcomes vary. Always consult a qualified healthcare professional before making any changes to your medical care, treatment plan, or health-related decisions.


Paper Notes

Limitations

The study’s AI algorithm depends heavily on the quality and completeness of electronic health records. A filter used to select patients with sufficient documented medical history may have excluded people with limited or fragmented healthcare access, meaning long COVID rates in those populations could be even higher than the study found. Detailed on-site record reviews to validate the AI’s accuracy were only conducted at the development site, not at the hospitals in Southeast Texas, Southern California, or Western Pennsylvania. Without a comparison group of people who never had COVID-19, it is not possible to confirm that every chronic condition identified was truly caused by COVID-19 rather than coincidentally developing around the same time. The researchers also note that regional differences in diagnoses, such as higher rates of blood sugar problems in some areas, could reflect differences in local medical coding habits rather than true biological differences between populations. The study did not directly compare its AI-identified long COVID counts against site-specific counts using the standard long COVID billing code.

Funding and Disclosures

The research was supported by the National Institutes of Health under award number R01AI165535 from the National Institute of Allergy and Infectious Diseases, and under award number U24TR004111 from the National Center for Advancing Translational Sciences. The funders had no role in the design, conduct, analysis, or publication decisions of the study. One author, Dr. Hügel, reported receiving grants from the German Academic Exchange Service and the German Research Foundation during the conduct of the study. No other conflicts of interest were reported.

Publication Details

Authors: Jiazi Tian, Alaleh Azhir, Matthew Decaro, Ngan Chau, Jonas Hügel, Michele Morris, Jingya Cheng, Pedram Fard, Ingrid V. Bassett, Douglas S. Bell, Elmer V. Bernstam, Shyam Visweswaran, Jeffrey G. Klann, Shawn N. Murphy, and Hossein Estiri. Authors are affiliated with Massachusetts General Hospital, Brigham and Women’s Hospital, the University of Texas Health Science Center at Houston, the University of California Los Angeles, University Medical Center Göttingen (Germany), the University of Pittsburgh, and other institutions.

Journal: JAMA Network Open
Paper Title: Long COVID Persistence and Surveillance Gaps Across 58 US Hospitals
Published: May 27, 2026
DOI: 10.1001/jamanetworkopen.2026.14909
