COVID-19 Asymptomatic Rates and Implications

Mike Thurber

April 20, 2020


A key issue for COVID-19 response policies moving forward is the asymptomatic rate: the share of people who have the virus but do not show symptoms.  If everyone who gets the virus shows symptoms, then despite the wide scope of the current outbreak, we have a long way to go before most of the population has been exposed.  In New York state, the virus epicenter in the U.S., there are nearly 220,000 tested and confirmed infections (as of 4/16/20), and the vast majority are symptomatic (it’s very hard to get tested if you're not symptomatic).  Still, this is only about 1% of the total New York state population.  On the other hand, if there are 20 asymptomatic individuals for every confirmed symptomatic case, that would mean roughly 4.5 million cases in New York, probably concentrated in the New York City area.  This would bring the city much closer to “herd immunity” and reduce fear of resurgence or second waves.  So what is the asymptomatic rate?
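The scaling argument above can be sketched in a few lines of Python. This is only an illustration: the confirmed count comes from the text, while the state population figure and the intermediate ratio of 10 are assumptions added here, and the function name is hypothetical.

```python
def implied_infections(confirmed_cases, undetected_per_confirmed):
    """Total infections if each confirmed case is accompanied by N undetected ones."""
    return confirmed_cases * (1 + undetected_per_confirmed)


confirmed = 220_000         # confirmed New York cases as of 4/16/20 (from the text)
ny_population = 19_450_000  # ASSUMPTION: approximate New York state population

# Ratios of 0 and 20 mirror the two scenarios in the text; 10 is illustrative.
for ratio in (0, 10, 20):
    total = implied_infections(confirmed, ratio)
    share = total / ny_population
    print(f"{ratio:>2} undetected per confirmed case: "
          f"{total:,} infections ({share:.1%} of the state)")
```

With a ratio of 20, the total lands in the neighborhood of the 4.5 million figure quoted above. The exact value depends on whether the confirmed cases are counted in the total and on the confirmed count used.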

Nearly all the data on COVID-19 rates is based on individuals who were symptomatic, or at high risk of contracting the virus.  To learn the true infection rate, symptomatic plus asymptomatic, it is necessary to identify a sample of people and test all of them. 

Several recent studies shed light on this question, involving both tests for active infections and tests for prior exposure (antibody tests).  The first study reported on 215 births between March 22 and April 4 at the New York–Presbyterian Allen Hospital and Columbia University Irving Medical Center.  All the expectant mothers were tested, quite unlike usual COVID-19 testing strategies.  Of the 215 mothers, 33 tested positive for COVID-19, an infection rate of 15%.  However, only four (12%) of those who tested positive showed symptoms.  Therefore, for every symptomatic case, there were 7.25 asymptomatic cases.
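The arithmetic behind these figures is simple enough to check directly. The short sketch below uses only the counts reported above; the function and variable names are illustrative.

```python
def asymptomatic_per_symptomatic(positives, symptomatic):
    """Number of asymptomatic positives for each symptomatic positive."""
    asymptomatic = positives - symptomatic
    return asymptomatic / symptomatic


tested, positives, symptomatic = 215, 33, 4  # counts from the hospital study

infection_rate = positives / tested          # share of all mothers testing positive
symptomatic_share = symptomatic / positives  # share of positives showing symptoms
ratio = asymptomatic_per_symptomatic(positives, symptomatic)

print(f"Infection rate:    {infection_rate:.1%}")
print(f"Symptomatic share: {symptomatic_share:.1%}")
print(f"Asymptomatic per symptomatic case: {ratio:.2f}")
```

Note that 215 tests is a small sample from a specific population, so the ratio carries considerable uncertainty.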

The second study was in Gangelt, Germany.  Approximately 800 citizens were tested for current infection as well as for antibodies indicating prior infection.  The results showed an active infection rate of 2% and a prior infection rate of 14% (that is, having antibodies).  While this differs from the NYC expectant mother study, which looked only at positive tests for current infections, this population had approximately 7 “background” cases for each active infection.

The third study tested 3,330 residents of Santa Clara County, CA for COVID-19 antibodies.  Subjects were recruited via Facebook ads that were aimed at yielding a representative sample.  Investigators concluded that, for each confirmed active virus case, there were between 50 and 85 inactive cases that carried antibodies.  (The wide range takes into account possible false negatives and false positives in the test; this is an issue in all COVID tests.  More on this in a later brief.)
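To see why test errors widen the range so much, here is a minimal sketch of the standard Rogan-Gladen correction for an imperfect test. It is not necessarily the adjustment the investigators used, and the raw positive rate, sensitivity, and specificity values below are hypothetical numbers chosen only for illustration.

```python
def rogan_gladen(apparent_rate, sensitivity, specificity):
    """Estimate true prevalence from the raw positive rate of an imperfect test."""
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (apparent_rate + specificity - 1) / denominator
    # Clamp to [0, 1]: sampling noise can push the raw estimate out of range.
    return min(1.0, max(0.0, adjusted))


# HYPOTHETICAL inputs for illustration only (not taken from any study).
raw_positive_rate = 0.015
for sens, spec in [(0.80, 0.995), (0.90, 0.990), (0.80, 0.985)]:
    estimate = rogan_gladen(raw_positive_rate, sens, spec)
    print(f"sensitivity={sens:.2f}, specificity={spec:.3f}: "
          f"adjusted prevalence = {estimate:.2%}")
```

When the raw positive rate is low, small changes in assumed specificity move the adjusted estimate a great deal. In the last hypothetical case, false positives alone could account for every positive result.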

In the opposite direction, we have the Diamond Princess cruise ship, where there was but one asymptomatic case for each confirmed symptomatic case.  In this case, passengers skewed older than the general population, and we would expect a higher proportion of infections to develop symptoms.

The distribution of symptoms by age is of keen interest. A separate study in NYC gives us the age distribution of reported victims of COVID-19 there, as shown in the Table below. Note how the case, hospitalization and death rates all rise strongly with age in NYC (which is also seen in data for other regions):

(Table of NYC case, hospitalization, and death rates by age group not reproduced here.)

For example, in this table the case rate among 18-44 year-olds is 55% of the rate among the 75+ age group, but the death rate is less than 1.5% of the older group’s rate; that is, it’s about 70 times lower.  This demonstrates that COVID-19 effects are much more severe among the elderly than among younger individuals, as has been pointed out repeatedly in the media.  Our chief concern, then, needs to be to protect the vulnerable (the elderly and those with co-morbidities).


These studies suggest the prevalence of COVID-19 in the population is many times higher than confirmed case reporting shows, primarily because so many of those infected do not present with symptoms.  On the positive side, this means that most people will be asymptomatic when they contract the disease.  If they develop immunity (which is widely assumed, but the science is not fully clear yet), society can move quickly to herd immunity.  On the negative side, this means that without massive, continuous testing, we cannot limit the scope of social distancing to the known infected individuals alone.  Current testing practice misses those who are asymptomatic yet contagious, and so does not protect the population.  However, if the individuals vulnerable to severe consequences from COVID-19 are well isolated to avoid infection, the negative consequences of the disease can be greatly mitigated, and a large proportion of the working population can resume normal activity, in managed stages.  If exposure to the virus does confer immunity, a promising strategy would be to sequester the vulnerable, expose the strong, work on therapeutic drugs and a vaccine, and have a chance to kill the virus completely, sparing the economy and taking fewer lives.


About the Author

Mike Thurber is an analysis professional who listens carefully to partners to master an organization’s objectives and challenges, and he has a passion for extracting relevant and valuable insights from available data in a collaborative setting. As a trusted data science consultant, he clearly communicates deep analytical insights to managers and leaders regarding decision alternatives to help them improve key outcomes. He has 20 years of experience modeling causal relationships between potential actions and desired outcomes. He has 30 years of experience procuring and transforming historical data for descriptive analysis, statistical testing, predictive modeling, and deep learning. As a Principal Scientist at Elder Research, a highly regarded data science consultancy, he has delivered a broad range of advanced analytic solutions across many industries, as well as training, mentoring, and leading other data scientists. Mike’s work has ranged from estimating the profitability, risk, and responsiveness of credit card prospects, to identifying which infants will be negatively impacted by a Cesarean delivery. He has gleaned insights on how complex consumer choices impact sales, modeled how individual healthcare providers rank in achieving desired patient outcomes, and calculated fraud risk and identified emerging fraud types. His projects have shown how call center interactions affect customer retention, measured the effect of targeted messages in political campaigns, forecasted debt recoveries at the account-holder level, modeled maintenance events on natural gas wells, and predicted the propensity of past benefactors to make voluntary donations. Finally, especially in the last five years, Mike has been teaching principles and best practices of data science to a broad professional audience of emerging and experienced data scientists, with an emphasis on predictive and prescriptive modeling in an AI setting.