Header image by Josh Appel on Unsplash

As the alpha variant has been replaced by the delta variant in the UK, I’ve been trying to think how this fits with an idea that periodically comes up, seems important, and never really seems integrated into mainstream thinking: that Covid is an overdispersed virus, where a small number of current infections are responsible for a majority of future infections.

In my head, I could imagine how a virus that behaved like this could generate successful variants that replaced the original strain, without that new strain being functionally different. But I wasn’t sure how overdispersed a virus needed to be before this was the case. As described a few paragraphs down in this study, different research has come to very different conclusions on just how overdispersed Covid is, and it turns out the pattern I imagined is much stronger at unreasonable levels of overdispersion. In general, I ended up not being convinced this was an entirely satisfying explanation for what is happening in the real world. But the maths is fun and led to some good graphs, so read on if you like those.

R and K

Starting with the basics, R0 is the average number of new people that one person with the virus will infect. This number is important mostly because of its implications for exponential growth. If R0 is above 1, the number of people infected will increase exponentially over time; if R0 is below 1, the number of people infected will decline; and if R0 is exactly 1, the number of people infected at any time should remain the same. R0 is an average across all people infected, so an R0 of 1 does not necessarily mean that every single person will infect exactly one person. If one-third of infected people infect nobody, one-third infect one person, and one-third infect two people, the average R0 is also 1. Different patterns of spread can explain how and in what situations a virus can successfully infect new people.
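As a rough illustration of that arithmetic (not the code behind the charts in this post), the Python sketch below checks that three quite different patterns of spread all average out to an R0 of 1. The third pattern, where 90% of people infect nobody and 10% infect ten people, is a made-up example of an overdispersed spread.

```python
from fractions import Fraction

def mean_infections(distribution):
    """Average onward infections for a {number_infected: probability} table."""
    return sum(count * prob for count, prob in distribution.items())

# Everyone infects exactly one other person.
everyone_infects_one = {1: Fraction(1)}
# The example from the text: thirds infect zero, one and two people.
even_thirds = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
# Hypothetical overdispersed pattern: 90% infect nobody, 10% infect ten people.
rare_superspreaders = {0: Fraction(9, 10), 10: Fraction(1, 10)}

for name, dist in [
    ("everyone infects one", everyone_infects_one),
    ("even thirds", even_thirds),
    ("rare superspreaders", rare_superspreaders),
]:
    print(f"{name}: R0 = {mean_infections(dist)}")
```

All three have the same average, so R0 on its own cannot tell them apart.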

The division of probability does not have to be even above and below the average. In an overdispersed distribution, a large number of infected people that pass the virus on to few or no people can be balanced out by a small number of infected people who pass the virus on to many people. In equations that describe the different distributions, this is controlled by a variable labelled k. Where a virus is very overdispersed (k is small), a tiny proportion of currently infected pass on the infection to the next wave (and almost all do not pass it on at all). As k rises, the distribution becomes more like a bell curve and the next wave of infections is caused by most people with the virus infecting small numbers of people.
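One way to get a feel for k is to treat each infected person as having their own expected number of onward infections, drawn from a gamma distribution with shape k and mean R0. This gamma mixture is a standard way of building a negative binomial, but using it here is my assumption rather than something taken from a particular study. The sketch below estimates how much of the expected onward transmission comes from the most infectious 20% of people; the function name, sample size and seed are arbitrary choices.

```python
import random

def top_share(k, r0=1.0, top_fraction=0.2, n=20_000, seed=0):
    """Share of expected transmission due to the most infectious people.

    Assumes individual infectiousness ~ Gamma(shape=k, scale=r0/k), mean r0.
    """
    rng = random.Random(seed)
    rates = sorted((rng.gammavariate(k, r0 / k) for _ in range(n)), reverse=True)
    top = rates[: int(n * top_fraction)]
    return sum(top) / sum(rates)

for k in (0.1, 1.0, 100.0):
    print(f"k = {k}: top 20% account for {top_share(k):.0%} of expected spread")
```

At small k the top fifth of people account for almost all expected spread, while at large k they account for only a little more than a fifth.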

The value of k is not as easy to interpret on its own as R0, but one way to think about it in practice is to convert k to the percentage of currently infected people who do not infect the next wave. The following chart shows this for a negative binomial distribution (other models are possible) where the mean R0 is ‘1’ and the overall number of infected should remain steady over time.

When k is very small, almost all samples from the distribution are zero. Because the mean (R0) is still one, the small number of non-zero samples must be large enough to roughly sum to the original population of the virus. As k rises, the share of zeros drops off until it approaches the roughly one-third expected by a bell curve.
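For a negative binomial with mean R0 and dispersion k, the probability of infecting nobody is (1 + R0/k) raised to the power -k. The sketch below is a minimal Python version of that conversion, assuming this standard mean-and-dispersion parameterisation; the function name is mine. As k grows, the value tends to e^-1 (about 37%), the zero probability of a Poisson distribution with mean 1.

```python
import math

def share_infecting_nobody(k, r0=1.0):
    """P(X = 0) for a negative binomial with mean r0 and dispersion k."""
    return (1.0 + r0 / k) ** (-k)

for k in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"k = {k:>7}: {share_infecting_nobody(k):.1%} infect nobody")

# Limit as k grows: the Poisson zero probability for mean 1.
print(f"Poisson limit: {math.exp(-1.0):.1%}")
```

With k = 1 and R0 = 1 the formula gives exactly one half, which is a handy sanity check.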

The practical implication of identifying that a virus is overdispersed is that different kinds of responses will be more effective. For instance, bans on mass gatherings will have a greater impact on overdispersed viruses, as they reduce the opportunity for one person to infect many. One way this could come about in practice is a virus that is only infectious for a small window: if no one is present in that window the virus is not passed on, but if many people are present all will be infected. Another would be if how infectious a person is results from a quirk of the part of the body that was infected, with very few people infected in such a way that they can spread to many people. I’m mostly dealing with a toy virus in this blog post, where we just accept that overdispersion affects random sampling without wondering too much about the mechanisms behind it.

K and variant switchover

The area I’m especially interested in understanding is how this idea of overdispersion interacts with mutation and variants. Over time, viruses accumulate mutations, which can be used to identify new strains or variants of the virus. These mutations may or may not give the new variant functional advantages over the original form, making it easier to spread or more harmful. If a variant has mutated to a form that makes it easier to spread, it will grow as a relative proportion of the current infected population. This is either because it is spreading through a separate population faster, or is directly reducing the spread of older variants by infecting people faster and first and leaving immunity that makes the person resistant to the original variant. Given this, if a variant’s population is rising relative to older variants, this is a possible sign that it has become more infectious.

Overdispersion changes the dynamic of how a virus evolves over time. For an overdispersed virus, most viruses in wave 1 have no descendants in wave 2. Because most people do not pass on an infection, more variants die out before infecting a second person. As a result of this, lower values of k make it more likely that one variant can be replaced by another as the dominant variant of the virus without the new variant being any more infectious than the original.

To see how this works, I used a very basic simulation of a virus with an R0 of 1, a starting infected population of 200,000, and a 0.1% chance that each offspring of a virus in the next wave is an identifiable variant that is no more or less contagious than the original. Starting with a virus that isn’t overdispersed, the initial variant would fragment over time into many smaller variants, none of which has a competitive advantage, and lucky variants would randomly rise to a low percentage of the overall population (10-20%). When variants emerge they have only a 1/3 probability of growing in size; some get lucky several times in a row and stabilize at a higher value, while most stay small or disappear. Technically speaking, if a variant’s luck holds out long enough and it maintains an average R0 above 1, it could become dominant, but the chances of this are incredibly small.
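The sketch below shows one way a toy simulation of this kind could be put together in Python. It is not the code behind this post, and several details are assumptions: offspring counts are negative binomial (drawn as a gamma-Poisson mixture), the total population is allowed to drift rather than being held fixed, mutants are counted with a Poisson approximation, and large Poisson draws use a normal approximation. All function names are hypothetical.

```python
import math
import random

def poisson(lam, rng):
    """Poisson draw: Knuth's method for small means, normal approximation above 30."""
    if lam <= 0:
        return 0
    if lam > 30:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold, count, product = math.exp(-lam), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def next_wave(counts, next_id, k, r0, mutation_rate, rng):
    """Advance one wave. counts maps variant id -> number currently infected."""
    new_counts = {}
    for variant, n in counts.items():
        # The sum of n negative binomial draws (mean r0, dispersion k) is a
        # gamma-Poisson mixture with gamma shape n * k and scale r0 / k.
        rate = rng.gammavariate(n * k, r0 / k)
        offspring = poisson(rate, rng)
        # Poisson approximation to "each offspring mutates with small probability".
        mutants = min(offspring, poisson(offspring * mutation_rate, rng))
        if offspring > mutants:
            new_counts[variant] = offspring - mutants
        for _ in range(mutants):
            new_counts[next_id] = 1  # each mutant founds a new, equally fit variant
            next_id += 1
    return new_counts, next_id

def waves_until_takeover(k, population=2000, r0=1.0, mutation_rate=0.001,
                         threshold=0.66, max_waves=30_000, seed=0):
    """Waves until a non-original variant holds `threshold` of all infections.

    Returns None if the outbreak dies out or the cap is reached first.
    """
    rng = random.Random(seed)
    counts, next_id = {0: population}, 1
    for wave in range(1, max_waves + 1):
        counts, next_id = next_wave(counts, next_id, k, r0, mutation_rate, rng)
        total = sum(counts.values())
        if total == 0:
            return None
        if any(v != 0 and n / total >= threshold for v, n in counts.items()):
            return wave
    return None

if __name__ == "__main__":
    print(waves_until_takeover(k=0.01, population=500, max_waves=200))
```

The default population here is much smaller than 200,000 to keep the sketch quick to run, so the number of waves it reports is not comparable with the figures in this post.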

But the population bottleneck in an overdispersed virus can also cut the other way. When a new variant inevitably gets lucky and super-spreads several times in a row, it can within a few waves represent a sizeable proportion of the overall population, making it more likely to get lucky in future rounds and replace the original variant as the dominant variant in the population. To put it another way, imagine you have a bag with 99 black balls and 1 white ball, and that picking a ball out of that bag and putting it in another bag produced a bag that was mostly that colour. 99% of the time you would end up with an identical 99 black, 1 white situation, so the black ball population is very stable. But in the 1% of the time you pick the white ball, you suddenly have an almost entirely white bag, and the population of white balls will be dominant for a while.

The time for this switchover to happen depends on how overdispersed the virus is. The more overdispersed the virus is, the less time it takes for a single variant to replace the original variant. As k gets higher, a switchover becomes increasingly improbable. This chart shows the average number of generations required in the simulations for a new variant to represent 66% of the virus population using the above model and different values of k:

This represents an average of 50 simulations. In some simulations the successful variant would directly overtake the original; in others the original variant would disappear a long time before another variant appeared to be more successful. The exact way this dynamic plays out is a matter of chance, but generally, as k gets lower, the chance of a variant getting lucky enough times to become the dominant variant increases. As k gets higher, the amount of time required for this to happen passes beyond the arbitrary cap I put on the length of the simulation (30,000 waves). The number of waves means little in itself, as it results from all the properties of the model, but the difference caused just by varying k demonstrates the expected dynamic. However, it also demonstrates that the measurable effect is limited in this case to quite low values of k.

If you were a scientist living in this model with a very small k, looking at the different variants, and tracking them through the population, you would be very worried about this variant that is coming from nowhere to dominate the population. The table below shows the effective R0 history of the successful variant in simulations where k = 0.001. It is consistently much higher than 1, with early waves suggesting rates much higher than that.

simulation  average_r0       1      2      3      4      5      6      7      8      9     10
         2      104.94  4259.0   1.57   1.16   1.40   0.78   0.97   0.80   0.47   2.02   2.01
         1       12.58  1279.0   1.35   1.00   3.35   0.81   1.36   1.13   1.01   1.03   1.39
         5       12.32  2244.0   0.60   1.54   0.55   3.97   1.73   0.76   0.58   0.71   0.72
         8        9.46   721.0   3.84   1.18   1.15   0.76   2.72   0.87   1.80   1.00   0.76
         7        4.41   548.0   5.25   3.28   1.12   1.17   0.76   1.80   1.33   1.19   1.12
         6        4.04  3228.0   1.04   1.04   1.04   0.25   0.21  12.98   0.80   0.57   2.45
         0        2.96   578.0   3.41   1.19   1.48   1.00   0.67   1.59   0.51   1.58   1.44
         3        2.25   358.0   6.56   0.83   3.47   1.30   1.47   0.84   1.61   1.28   0.80
         9        1.56   147.0   8.97   2.52   0.47   1.73   1.28   1.44   0.71   0.65   0.53
         4        1.21    16.0   6.50  11.36   1.81   0.52   3.40   2.03   0.72   1.50   0.43

If your society is currently controlling the virus enough that it is not growing, it would be very worrying to see this kind of explosive growth. But in these models this is misleading. In reality, this variant got lucky and tracking it through the period of its most intense luck produces a very high figure for R0. As it represents more and more of the population, its luck trends towards the average for the overall virus. And in time, it will be replaced by a new, functionally-identical, lucky variant.
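A simple way to compute an effective R0 history like this is to divide a variant’s case count in each wave by its count in the wave before. The Python sketch below assumes that definition, which may not be exactly how the table above was produced, and uses made-up counts for a variant that starts from a single lucky infection.

```python
def effective_r0_history(counts):
    """Wave-on-wave growth ratios for one variant's case counts."""
    return [later / earlier
            for earlier, later in zip(counts, counts[1:])
            if earlier > 0]

# Hypothetical counts: one infection super-spreads, then growth settles down.
lucky_variant = [1, 500, 800, 900, 880]
print([round(r, 2) for r in effective_r0_history(lucky_variant)])
```

The first ratio is enormous because it is measured against a single infection, and later ratios drift back towards 1 even though nothing about the variant has changed.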

K and improving transmissibility

These simulations show that results very similar to what has been seen in the real world and attributed to explosive variants might also be explained by how very overdispersed viruses and variants interact. But this model is not like the real world and is cheating in that we have ruled out the very real possibility that a variant might be worse than the original variant. What do different kinds of overdispersion mean if we make it more like the real world and allow the effectiveness of the virus to change over time? In general, we would expect that the R0 of the variant population should increase over time, because variants with higher R0 will produce more of themselves. As this is entirely theoretical, there are no costs borne by the virus in other respects to becoming more transmissible.

Running this kind of simulation found that all values of k on average had the expected increase in R0, but the scale of this was affected by k. Lower values of k had less of an increase in average R0 by the end of the simulation. One way of thinking about this is that at lower values of k success is far more governed by randomness than merit. A slight change in mean R0 is less significant than being the lucky virus that infects the majority of people in the next generation. On the other hand, when many more viruses contribute to the next generation, there is a slow and steady reward for even slightly higher R0. For some of the simulations with lower values of k, the variant population had a lower average transmission rate than it started with. Viruses with a very low k will make slower progress towards a more infectious variant, but it is more likely to happen in a series of steps rather than a steady climb. There may even be retrograde steps, where a new virus could become dominant despite being less infectious than the previous wave. In short, very overdispersed simulations had weaker evolutionary pressure for better transmission.
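A sketch of how heritable differences in transmissibility could be added to a toy model like this is below, in Python. It is an illustration under loudly stated assumptions, not the simulation behind these results: each variant carries its own R0, each new mutant’s R0 is its parent’s R0 nudged by a normally distributed percentage (5% standard deviation here, an arbitrary choice), and the Poisson draws are approximated in the same way as any quick stdlib-only sampler would. All names are hypothetical.

```python
import math
import random

def poisson(lam, rng):
    """Poisson draw: Knuth's method for small means, normal approximation above 30."""
    if lam <= 0:
        return 0
    if lam > 30:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold, count, product = math.exp(-lam), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def step(variants, k, mutation_rate, r0_sd, rng):
    """One wave. variants is a list of (count, r0) pairs, one per variant."""
    next_variants = []
    for count, r0 in variants:
        # Negative binomial offspring for the whole variant, via gamma-Poisson.
        rate = rng.gammavariate(count * k, r0 / k)
        offspring = poisson(rate, rng)
        mutants = min(offspring, poisson(offspring * mutation_rate, rng))
        if offspring > mutants:
            next_variants.append((offspring - mutants, r0))
        for _ in range(mutants):
            # Assumed mutation model: small random change, better or worse.
            mutant_r0 = max(0.01, r0 * (1 + rng.gauss(0, r0_sd)))
            next_variants.append((1, mutant_r0))
    return next_variants

def mean_r0(variants):
    """Average R0 across the infected population, weighted by variant size."""
    total = sum(count for count, _ in variants)
    return sum(count * r0 for count, r0 in variants) / total

if __name__ == "__main__":
    rng = random.Random(0)
    variants = [(2000, 1.0)]
    for _ in range(50):
        variants = step(variants, k=0.1, mutation_rate=0.001, r0_sd=0.05, rng=rng)
        if not variants:
            break
    if variants:
        print(f"population-weighted R0 after the run: {mean_r0(variants):.3f}")
```

Because nothing holds the population steady in this sketch, a run can die out or grow as the average R0 moves away from 1, so it is only useful for short runs.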

What does this mean in the real world?

Designing and running these simulations, I found that the behaviour I imagined exists, but is much more noticeable at unreasonably low values of k. The lowest value I could see suggested for Covid is around 0.1, which is around where the simulation could reliably produce a switch-over effect in a reasonable number of generations. If k is actually significantly higher than this, all of this has less and less relevance. Some of this may be to do with the parameters I chose for the simulation. In particular, the assumption that only 1 in 1,000 viruses is different enough to be an identifiable variant governs how fast the simulation can run. If the figure is far higher than this, you might get change-over more frequently and this might have more real-world applicability. It is unclear whether that would be a reasonable assumption, however. Looking at the simulations, I’m not convinced this feature of very overdispersed viruses is a sufficiently satisfying explanation to be confident that other explanations are worse.

Where overdispersion may be interesting, even when only slightly present, is in simulations that account for different densities in different populations. For instance, the idea that the growth of the alpha variant in December 2020 was a product of kids being in schools is aided by the idea of overdispersion, because schools were one of the few places where large groups of people were still meeting (also a big concern during the current increase). I’ve seen various uses of the idea that a society has a “contact budget” in a pandemic, where only a certain amount of contact is possible without dangerous growth. What overdispersion would add to this is that not all contact risks are equally costly, and that larger class sizes are far more expensive in terms of overall spread than you might expect.

A potentially generalisable point is that an overdispersed variant that is not any more transmissible may appear to be very transmissible during its early stages, driven by super-spreading events. However, it is not clear that this changes how you should read that kind of data in the real world, because the same would apply for a variant that was genuinely more transmissible. Even if you said there was only a small chance that a variant that appeared to have higher transmission actually did, that is more than enough to justify a strong response. There is a strong case for society as a whole to over-react to even low probabilities of more infectious variants while exponential growth in the unvaccinated population is unmanageable (the consequences of being wrong are very bad). Indeed, the basic message of the blog post about schools linked above is that the exact mechanism is probably good to find out, but ultimately unimportant, because if the number is going up, current policy is not working.