Tracking COVID-19 outbreaks with evolution
May 2020, updated August 2020
The map shows the known locations of coronavirus cases by county. Credit: New York Times
Earlier this month, an autopsy revealed that COVID-19 began killing people in the United States weeks before we noticed. The earliest COVID-19 death now appears to be a Californian who died on February 6, 2020, not the Washington State residents who died 20 days later, the previous "first" deaths. The newly discovered coronavirus victims had not traveled recently and probably caught it from someone in the community. This suggests that the virus was spreading person-to-person in the U.S. earlier than previously believed. The revised timeline also fits well with new evidence collected by scientists who use evolutionary techniques to unveil the sources and course of COVID-19 outbreaks.
Where's the evolution?
In real time, to prevent new infections, health professionals must use contact tracing to figure out who passed the virus to whom and who might have been exposed. However, after the fact, scientists can use viral genetics to reconstruct much of this information, revealing the source of outbreaks in different communities and even when they might have started.
This is possible because viruses like the one that causes COVID-19 evolve relatively quickly. As virus particles reproduce, their genomes acquire small copying errors, called mutations. As the virus jumps from one person to the next, it takes its mutations with it and new mutations occur alongside the old ones. This is no different from the evolution of more familiar species (though it happens much more rapidly): just as one ancestral finch species can diversify into a multitude of species over millennia, one ancestral COVID-19 strain diversifies into a multitude of descendant strains over a few weeks of contagion. And scientists use the same techniques to study the two processes. To reconstruct the evolutionary history of finches, scientists collect data on their genetic sequences (and often other traits as well) and then use this information to reconstruct the evolutionary tree, or phylogeny, representing finch history and how all the modern species are related to one another. For viruses, this means sequencing genetic material (coronaviruses carry RNA, not DNA) from viral strains infecting different patients and building a viral family tree.
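The raw material for such a family tree is a count of the differences between sequences. The sketch below illustrates the idea in Python with four short, made-up RNA fragments. The sample names, the sequences, and the helper functions are all hypothetical, and real studies compare full genomes of roughly 30,000 letters using statistical tree-building software rather than simple counts.

```python
from itertools import combinations

# Hypothetical, made-up aligned RNA fragments (NOT real viral data).
samples = {
    "patient_A": "AUGGCUACGUUAGC",
    "patient_B": "AUGGCUACGUUAGU",
    "patient_C": "AUGGCAACGUCAGU",
    "patient_D": "AUGGCAACGUCAGA",
}


def count_differences(seq1, seq2):
    """Count the positions at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


def pairwise_differences(seqs):
    """Return the difference count for every pair of named sequences."""
    return {
        (name1, name2): count_differences(seqs[name1], seqs[name2])
        for name1, name2 in combinations(sorted(seqs), 2)
    }


def closest_relative(name, seqs):
    """Return the other sample with the fewest differences from `name`.

    Ties are broken alphabetically so the result is deterministic.
    """
    others = (other for other in seqs if other != name)
    return min(
        others,
        key=lambda other: (count_differences(seqs[name], seqs[other]), other),
    )


if __name__ == "__main__":
    for pair, diffs in pairwise_differences(samples).items():
        print(pair, diffs)
    for name in sorted(samples):
        print(name, "is closest to", closest_relative(name, samples))
```

In this toy data set, samples that differ by only one letter pair up as close relatives, while pairs separated by more differences sit further apart on the tree. Tree-building methods extend this same logic to thousands of sequences at once.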
When evolving entities like viruses pass heritable information from parent to offspring, we can depict their evolution in a tree shape, called a phylogeny. As variations are introduced and passed to offspring, new branches of the tree form. We can usually reconstruct the evolutionary history of a group by collecting information about the distribution of traits (genetic sequences, anatomical characteristics, behaviors, etc.) among lineages in that group. However, when new variations (e.g., mutations) are introduced in rapid succession, it is harder to reconstruct ancient evolutionary relationships. This is because those variations are evidence, and when a lot of them occur, new changes are likely to overwrite older ones, destroying evidence of older relationships. Since viruses mutate quickly, it is very difficult to reconstruct their deep evolutionary history. Nevertheless, scientists have made a lot of progress on this challenge, especially by including information about the structure of viral proteins (and not just genetic sequences) in their analyses.
Biologists have studied COVID-19 outbreaks in several communities this way. For example, scientists collected virus samples from patients in New York City, sequenced their genetic material, and combined the sequences with data on viruses from around the world to build an evolutionary tree. They announced their results this past month: New York City viruses occur in small clusters spread out over the tree. This means that the New York outbreak is the result of many different introductions of the virus, not a single Patient Zero. However, most of the New York viruses are not very closely related to viruses from patients in China, where COVID-19 first arose. Instead, most of the introductions to NYC seem to have come from Europe and other places in North America. We can't blame New York's overflowing emergency rooms on travelers from China.
The same is true in California: viruses from California are all over the evolutionary (and geographic!) map. Some are closely related to European strains, some to strains from China, and some to virus samples from elsewhere in the U.S. In Connecticut, coronavirus sequences suggest most cases there stem from other parts of the U.S., rather than from China or Europe. In fact, the latest data suggest that most of the coronavirus outbreaks in smaller communities in the U.S. were set off by travelers from New York City, not overseas.
However, the infections in Washington State showed a very different pattern. Most of the Washington cases (85%) formed a tight-knit clade, that is, a group of all the lineages descended from a single shared ancestral viral strain. Furthermore, the closest viral relatives of this clade were from patients in China. This suggests that most of the infections in Washington State can be traced to a single strain (perhaps even carried by a single person), likely acquired through travel to China, which was then passed on locally. Because mutations occur at a relatively predictable rate, biologists can use the number of mutations that distinguish two viruses to estimate how long they've been evolving as distinct lineages: more differences mean they've been evolving separately for a longer time. In the case of the Washington State clade, the diversity of the viral strains suggested that they had been diversifying locally since late January or early February 2020. This means that in Washington, the virus was probably passed from person to person for weeks, without anyone realizing it, before the first case of local transmission was finally identified on February 28.
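The dating logic described above comes down to simple arithmetic, sketched here in Python. The mutation rate of about two mutations per month per lineage is an assumption made for illustration and is not taken from this article; the sample date, the difference count, and the function names are likewise hypothetical. Real analyses fit statistical models to entire trees rather than dividing one number by another.

```python
from datetime import date, timedelta

# ASSUMPTION (not from the article): each lineage picks up about
# two mutations per month across its whole genome.
ASSUMED_MUTATIONS_PER_MONTH = 2.0
DAYS_PER_MONTH = 30  # rough month length, good enough for a sketch


def days_since_common_ancestor(
    differences, mutations_per_month=ASSUMED_MUTATIONS_PER_MONTH
):
    """Estimate how many days two lineages have been evolving separately."""
    if mutations_per_month <= 0:
        raise ValueError("mutation rate must be positive")
    # Both lineages accumulate mutations independently, so differences
    # between them build up at twice the per-lineage rate.
    return differences * DAYS_PER_MONTH / (2 * mutations_per_month)


def estimate_ancestor_date(
    sample_date, differences, mutations_per_month=ASSUMED_MUTATIONS_PER_MONTH
):
    """Estimate when two samples collected on the same day shared an ancestor."""
    elapsed = days_since_common_ancestor(differences, mutations_per_month)
    return sample_date - timedelta(days=elapsed)


if __name__ == "__main__":
    # Hypothetical example: two samples collected on the same day
    # whose genomes differ at four positions.
    hypothetical_sample_date = date(2020, 3, 1)
    hypothetical_differences = 4
    print(days_since_common_ancestor(hypothetical_differences))
    print(estimate_ancestor_date(hypothetical_sample_date, hypothetical_differences))
```

Under these assumed numbers, a handful of differences between two samples translates into weeks of separate evolution, which is how a diverse local clade can point to transmission that began well before the first case was detected.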
Evolutionary tree of COVID-19 from Washington State and around the world. Image loosely based on Bedford et al., 2020, but significantly simplified from the original for clarity.
Studies like these use COVID-19's own evolution to illuminate the sources, pattern, and timing of transmission in local communities. In many cases, they reveal things that we couldn't figure out in any other way, like how outbreaks started and where the virus might have passed undetected under our radar. Mutations are key to these insights. For fast-evolving entities like viruses, mutations work like breadcrumbs, allowing us to work backwards to reconstruct the path of evolutionary history.
Understanding Evolution © 2020 by The University of California Museum of Paleontology, Berkeley, and the Regents of the University of California