Mathematical constructs can help make predictions about coronavirus’ spread, but not without pitfalls
Our thirst for daily information on the coronavirus pandemic finds us exploring the mountains of articles pushed by our newsfeeds. Inevitably, stories that report projected COVID-19 case counts and death tolls for your local community, the U.S. and the world are everywhere.
These projections are often based on mathematical models of infectious disease, thrusting modeling into the public eye. While our esteemed colleague Dan McQuillan helped us understand the basics of exponential growth and the 1% fallacy, our goal is to shed light on disease modeling, how models are used to make predictions and the pitfalls in trying to predict the future.
Each group in an SIR model, which tracks people who are susceptible to a disease, infected with it and recovered from it, is governed by a rate of change equation that describes how people might move in and out of the group.
In a recent White House briefing, members of the coronavirus task force used several models to lay out the “best-case” scenario of between 100,000 and 240,000 U.S. deaths because of the virus (remarks by President Donald Trump, Vice President Mike Pence and coronavirus task force members in press briefing, April 1, 2020). These projections assume stringent social distancing and community mitigation efforts, while the “worst-case” scenarios predict upward of 2 million deaths in the U.S. This difference is huge and, unfortunately, we won’t know which projections are accurate until the disease no longer dictates our daily routines. Projections also seem to change daily, so why trust any results from models?
Disease modeling can provide significant information during a pandemic. For example, the idea that social distancing can reduce the spread of diseases such as coronavirus is the product of longtime modeling efforts. As governments struggle to manage the outbreak, mathematicians are offering policymakers important strategies for how to handle it.
In various disciplines, including mathematics, researchers interested in modeling physical phenomena often use rate of change equations. In the case of disease modeling, these equations track how people move from one group to another over time. One class of models used to describe disease transmission is called “SIR” models. These models typically include three population groups: susceptible, infected and recovered (although variations exist). The susceptible are not currently infected; the infected have the disease and may be contagious; the recovered no longer have the disease, although in some model variations they can become susceptible again. As happens for COVID-19, individuals who succumb to the disease are removed from the model population.
Each group in an SIR model is governed by a rate of change equation that describes how people might move in and out of the group. When an individual is infected, that individual is moved from the susceptible population into the infected population, changing the proportion of individuals in each group. When individuals recover, it is generally assumed that they are immune to the disease, at least for a while. Although the duration of this immunity is unknown for the novel coronavirus, current models assume it lasts at least one outbreak cycle. Simulations that examine the idea of “flattening the curve” are often based on SIR model constructs and assumptions.
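To make the bookkeeping concrete, here is a minimal sketch of an SIR simulation. It is an illustration, not the model behind any published projection. The names `beta` (transmission rate per day) and `gamma` (recovery rate per day), the starting fractions and the parameter values are all assumptions chosen for demonstration, and the simple Euler time-stepping is just one convenient way to advance the rate of change equations.

```python
def simulate_sir(beta, gamma, s0=0.999, i0=0.001, rec0=0.0, days=300, dt=0.1):
    """Advance the SIR rate of change equations with small Euler steps.

    s, i and r are fractions of the population. beta and gamma are
    hypothetical per-day transmission and recovery rates.
    Returns a list of (s, i, r) tuples, one per simulated day.
    """
    s, i, r = s0, i0, rec0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt  # susceptible -> infected
            new_recoveries = gamma * i * dt     # infected -> recovered
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest fraction of the population infected at any one time."""
    return max(i for _, i, _ in history)


# Illustrative values only: halving beta stands in for social distancing.
baseline = simulate_sir(beta=0.4, gamma=0.1)
distanced = simulate_sir(beta=0.2, gamma=0.1)
print(f"peak infected fraction, baseline:  {peak_infected(baseline):.2f}")
print(f"peak infected fraction, distanced: {peak_infected(distanced):.2f}")
```

Under these assumed values, the run with the lower transmission rate has a lower peak, which is the “flattened” curve.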
Whether individuals go from susceptible to infected depends on whether they come into “contact” with an infected person. For COVID-19, this contact includes person-to-person interaction with an infected individual, as well as contact with a contaminated surface or viral droplets. In a simple model, one often uses a single parameter for this rate of transmission, which is closely related to what is called the “basic reproduction number” for the disease: the average number of people one infected person goes on to infect.
What rate should we use in a model when this is unknown, as is the case for COVID-19? In some European countries, the reproduction number was estimated to be as high as 3.87 in March (meaning that one infected person will infect 3.87 other people on average). However, this number is constantly changing based on numerous factors, including the community’s effectiveness at social distancing and availability of protective equipment. Hence, modelers vary the transmission rate based on available data and run simulations, leading to a wide array of disease projections. One modeling strategy is to determine the conditions that will bring the reproduction number below 1.0, as the outbreak will then eventually die out.
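The threshold behavior around a reproduction number of 1 can be sketched by running a simple SIR model over a range of values. The sketch below takes the transmission rate to be the reproduction number divided by the infectious period, which is the standard relation in the basic SIR model. The 10-day infectious period, the starting infected fraction and every value other than 3.87 are assumptions picked for illustration.

```python
def outbreak_summary(reproduction_number, infectious_days=10.0,
                     i0=0.001, days=1000, dt=0.1):
    """Run a simple SIR model; return (peak infected, total ever infected).

    infectious_days is a hypothetical average infectious period. Both
    return values are fractions of the population.
    """
    gamma = 1.0 / infectious_days       # recovery rate per day
    beta = reproduction_number * gamma  # transmission rate per day
    s, i = 1.0 - i0, i0
    peak = i
    for _ in range(round(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        peak = max(peak, i)
    return peak, 1.0 - s


for rn in (0.8, 1.0, 1.5, 2.5, 3.87):
    peak, total = outbreak_summary(rn)
    print(f"reproduction number {rn:>4}: peak {peak:.3f}, ever infected {total:.3f}")
```

In this simplified setting, the infected fraction never rises above its starting level when the reproduction number is 1 or lower, while larger values produce progressively larger outbreaks.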
The desire for data
To understand how individuals transition from infected to recovered, a modeler needs access to disease data, such as the case-fatality ratio. With COVID-19, recent evidence suggests that subgroups in the population may have different case-fatality ratios, which is difficult to capture with a homogeneous population. Subgroups’ dynamics can be incorporated in a model, but at the cost of complicating simulations and results. Furthermore, individuals with pre-existing conditions may also be at an increased risk, but the data needed to model this accurately are lacking.
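A small arithmetic sketch shows why a single case-fatality ratio can mislead when subgroups differ. Every number below is made up for illustration and is not an estimate for COVID-19. The group names and the two hypothetical communities are likewise assumptions.

```python
# Hypothetical case-fatality ratios by subgroup (illustrative, not real data).
group_cfr = {"younger": 0.002, "middle": 0.01, "older": 0.08}


def overall_cfr(cases_by_group, cfr_by_group):
    """Case-weighted average of the subgroup case-fatality ratios."""
    total_cases = sum(cases_by_group.values())
    expected_deaths = sum(
        cases * cfr_by_group[group] for group, cases in cases_by_group.items()
    )
    return expected_deaths / total_cases


# Two hypothetical communities with 1,000 cases each but different case mixes.
community_a = {"younger": 700, "middle": 250, "older": 50}
community_b = {"younger": 300, "middle": 300, "older": 400}

print(f"community A overall ratio: {overall_cfr(community_a, group_cfr):.4f}")
print(f"community B overall ratio: {overall_cfr(community_b, group_cfr):.4f}")
```

The subgroup ratios are identical in both communities, yet the overall ratio differs because the mix of cases differs. A model that treats the population as homogeneous has to pick one number and loses that distinction.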
To fill gaps in information, models can be expanded to include stochasticity, which addresses random phenomena such as missing statistics or unforeseen events. For example, did the model account for the cruise ship that just released hundreds of potentially infected individuals into the local population? Or did the model account for an unforeseen mutation of the disease?
Because not everything can be predicted in advance, adding a degree of randomness can help account for such scenarios and unlikely events. Unfortunately, stochasticity in a model leads to further uncertainty in model predictions.
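One common way to add randomness, sketched below, is to replace the smooth rate of change equations with daily chance events: each susceptible person has some probability of being infected that day, and each infected person has some probability of recovering. This is an illustrative construction rather than the method used in any particular COVID-19 model, and the population size, `beta`, `gamma` and the number of runs are all assumed values.

```python
import math
import random


def stochastic_sir(population=500, initial_infected=5, beta=0.3, gamma=0.1,
                   days=150, rng=None):
    """Simulate one random outbreak; return (peak infected, total ever infected).

    beta and gamma are hypothetical daily transmission and recovery
    parameters; counts are whole people rather than fractions.
    """
    if rng is None:
        rng = random.Random()
    s, i, r = population - initial_infected, initial_infected, 0
    peak = i
    for _ in range(days):
        # Daily chance that a given susceptible person becomes infected.
        p_infect = 1.0 - math.exp(-beta * i / population)
        new_infections = sum(rng.random() < p_infect for _ in range(s))
        # Each infected person recovers on a given day with probability gamma.
        new_recoveries = sum(rng.random() < gamma for _ in range(i))
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak, population - s


# Same parameters every time; only the random draws differ between runs.
results = [stochastic_sir(rng=random.Random(seed)) for seed in range(20)]
peaks = [peak for peak, _ in results]
totals = [total for _, total in results]
print(f"peak infected across runs: {min(peaks)} to {max(peaks)} people")
print(f"total ever infected across runs: {min(totals)} to {max(totals)} people")
```

Even with identical parameters, the runs do not agree with one another, which is the added uncertainty that stochastic models carry into their projections.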
The aforementioned factors help explain a large part of why we see wide-ranging projections seemingly daily. When modeling any disease, the model’s predictive power is directly linked with the accuracy of disease data used in the model. This highlights the critical need for expanding testing, establishing precise records of infection rates and death counts and supporting the scientific community working on COVID-19.
Whether modeling ocean currents, the impact of Alzheimer’s disease in human cells or infectious diseases, mathematics provides a wealth of information that can advance scientific knowledge. When looking back at model predictions in the aftermath of the current pandemic, we’ll appreciate and understand why some projections were correct and others way off. Attributed to George Box, the aphorism “all models are wrong, but some are useful” reminds us that by design, models are a mere simplification of reality and cannot accurately predict the future. However, models remain critical in our search for optimal strategies as the U.S. government looks to reopen the economy.