What is going on with the coronavirus epidemic

By Kevin Roche

Some people are freaking out because more cases are being reported in some parts of the country. As I have explained in earlier posts, I don’t think this is anything to be excessively concerned about. Hospitalizations as a function of cases continue to decline, as does the death rate. One thing the increase might suggest is that the virus isn’t particularly seasonal, especially since some of the case increase is in warmer states. It is truly mysterious that there appears to be significant variation across regions. The continuing failure of governments either to run antibody or other tests to determine true prevalence, or to give us the information from the tests they have run, is frustrating. Along those lines, the CDC did drop us a hint yesterday that the antibody test results they are seeing, and they must have nearly ten million such tests from various sources, indicate that there are ten times as many cases as the infection tests have confirmed. (CDC Story) That would imply almost 30 million cases in the US. It also would imply that the hospitalization rate and death rate as a function of cases are very low. I do not understand why those antibody test results are not being released.

Other issues that have surfaced recently are that some positive test results are multiples of the same person, as hospitals retest to ascertain when the virus has cleared a patient, or as individuals simply continue to get tested repeatedly to be sure they aren’t infected. I have not seen information from various states on how they ensure they aren’t double-counting people. In addition, the standard infection test apparently does not distinguish between live virus, which means a person could be infectious, and virus fragments, which may be present for some time after the infection is over.

The statistics are confirming two things that I and others have said for some time. One is that almost all cases are asymptomatic or mild. Severe illness is pretty rare. In fact, there have been around 35,000 cumulative hospitalizations so far. If the CDC is right, and they are the ones seeing the antibody test results, that is around a 0.12% hospitalization rate, or about 12 in every ten thousand cases. The second is that we saw an excess of the most severe cases first. This is related to the front-loading concept I have mentioned. I want to see if I can describe that a little better, because I think it is critical to understanding the shape of the epidemic.
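The hospitalization-rate arithmetic above can be checked in a few lines. This is only a sketch using the two figures quoted in this post; the variable names are mine, and the 30 million total is an extrapolation from the CDC's "ten times" remark, not a measured count.

```python
# Figures quoted in the post. The case total is the post's extrapolation
# from the CDC's "ten times as many cases" remark, not a measured count.
cumulative_hospitalizations = 35_000
implied_total_cases = 30_000_000

# Share of all implied cases that ended up in the hospital.
rate = cumulative_hospitalizations / implied_total_cases
per_ten_thousand = rate * 10_000

print(f"hospitalization rate: {rate:.2%}")
print(f"hospitalizations per 10,000 cases: {per_ten_thousand:.1f}")
```

The result is only as good as the two inputs. If the true multiple of confirmed cases were lower than ten, the rate would rise in proportion.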

Imagine a population of people. They have various personal characteristics–sex, age, race, ethnic background, residence type, health history and conditions, genetic makeup, occupation, etc. Where they live may also have certain characteristics–population density, income level, use of mass transit, health resource levels. When a new pathogen appears, in this case a respiratory one, who and at what rate that pathogen infects people is dependent on many of these factors.  Most models don’t take them into account, which is one of their primary flaws.

So in your population, X% of people are going to be male or female, are going to fall into a certain age group, will have certain health conditions, will be of a certain race and so on. As the pathogen meets the population, you could make several varying, but critical, assumptions about how it proceeds through that population. You could assume that everyone is equally likely to become infected. Or you could assume that there will be some variability in susceptibility to infection; perhaps some people don’t get infected on exposure. Similarly, not everyone who is infected may be equally infectious, or capable of passing the disease to others. So your assumption in that regard is also very important. And if you are modeling or considering possible outcomes of infection, like hospitalization or death, you could assume everyone is equally likely to become hospitalized or die, or you could assume some variation in the seriousness of the disease.

Obviously, it is very important to have data that supports whatever assumptions you make. And now you see the fundamental problem with a forecasting model, especially in a situation like an epidemic caused by a new pathogen. At the start you have very little real data, so your model is basically worthless; it is just a guess. So why supposed experts don’t publish these models with a huge warning that says “not fit for use in decision-making” is beyond my comprehension. And why they do things like assuming equal susceptibility to infection, when basic epidemiology says people always vary in susceptibility, is also extremely puzzling.

So back to our population and the spread of the pathogen through the population. You would need to understand how transmission occurs. As far as we know now, this virus spreads almost exclusively through the respiratory route; that is, we breathe it in or we have it on our hands and we place those near our mouth or nose. And most of that transmission appears to be from person to person, although there may be some limited transmission that occurs from virus on surfaces. So one critical fact to consider is how many contacts Person A has with other people in a unit of time, say a day, and how likely it is that any one of these people is infected and infectious and could pass the virus to Person A. Many models address this with a contact matrix that typically varies by age and/or other factors. The Minnesota epidemic model incorporates one of these contact matrices, although it clearly has performed extremely poorly. The contact models tend to have children with the highest number of contacts; young and middle-aged adults have a relatively high number, because they are working, among other factors; and older, especially retired, persons have the fewest.
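To make the contact matrix idea concrete, here is a minimal sketch. Every number in it is made up for illustration, and the names (`CONTACTS`, `PREVALENCE`, `infectious_contacts_per_day`) are mine; nothing here is taken from the Minnesota model or any other published model.

```python
AGE_GROUPS = ["children", "adults", "older"]

# Hypothetical average daily contacts: CONTACTS[i][j] is how many people in
# group j a person in group i meets per day. Illustrative numbers only.
CONTACTS = [
    [10.0, 4.0, 1.0],  # children
    [4.0, 8.0, 2.0],   # young and middle-aged adults
    [1.0, 2.0, 3.0],   # older persons
]

# Hypothetical share of each group that is infectious right now.
PREVALENCE = [0.01, 0.01, 0.01]


def infectious_contacts_per_day(contacts, prevalence):
    """Expected number of infectious people each group's member meets per day."""
    return [sum(c * p for c, p in zip(row, prevalence)) for row in contacts]


exposure = infectious_contacts_per_day(CONTACTS, PREVALENCE)
for group, value in zip(AGE_GROUPS, exposure):
    print(f"{group}: {value:.2f} expected infectious contacts per day")
```

With equal prevalence in every group, exposure simply follows the row totals, so children come out highest and older persons lowest. Note that this says nothing about whether an exposure leads to infection or to serious illness, which is the gap discussed in the rest of this post.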

If you are considering all these factors properly, when the virus hits the population, it isn’t being assumed to have an equal chance to infect each person in the population, rather the likelihood that it infects a person is dependent on their personal characteristics, the characteristics of where they live, and their individual likely level of contacts, and the same is true of that person’s infectiousness and their likelihood of certain disease outcomes.  If you haven’t adjusted these assumptions or even recognized that you were making an assumption of equal likelihood, then your model will be really screwed up, especially if the factor or characteristic is an important one in regard to epidemic course.

If we look at this epidemic and what we know about it so far, a very notable feature is that children have a lower likelihood of being infected; in fact, likelihood of infection rises with age, and there is a very dramatic difference in risk of serious illness or death with increasing age and with some residential settings. A model that appropriately adjusts level of contacts alone might pick up some or most of the differential spread of infections, but it won’t pick up the difference in severity of illness. And if it doesn’t have some ability to adjust for type of residence, especially for the older segment, it will miss a clustering effect that occurs in group living settings, such as nursing homes, assisted living facilities, group homes for the mentally disabled or recovering drug or alcohol addicts, etc. A model which doesn’t incorporate these adjustments will assume the population (as modified for the adjustments it does make, like contacts) has an equal likelihood of being infected and being seriously ill. So as the epidemic proceeds, those cases roll out equally over time for the various segments of the population.

So simplistically, a model that does incorporate contact levels, but not other factors, would have more children getting infected earlier in the epidemic, because they have more contacts; young and middle-aged adults would get infected at a somewhat lower rate, and older persons would bring up the rear in terms of infections because they have fewer contacts. And if you consider people in group living settings, they may have the fewest contacts, other than with staff and other residents. And we see some of this effect in per population case counts (which are somewhat unreliable for this purpose because of uneven testing). But we also see that this simplistic model would completely miss that many children and younger adults simply don’t get infected on exposure and that the oldest group has a very high likelihood of serious illness and death. And it would miss the differences in severity of illness.

In fact, for a model to match reality, it needs to incorporate what I would call front-loading. The oldest and most vulnerable segments get infected easily and have high rates of hospitalization and death. And while they may have few contacts, for those in group living settings, once the virus is in the residential facility, it has a high likelihood of infecting everyone living, or working, there and those residents have a high likelihood of being hospitalized or dying. You can imagine exactly how this happens. For some extended period of time there are no infections in the residence. Then a staff member or a visitor who is infected and infectious comes in and spreads the virus, which easily moves via staff members or the residents themselves to other residents. Suddenly you have a very high number of cases and serious illness in these facilities. This is exactly what we see in Minnesota and other states. The ease of infection and severity of illness characteristic of these segments outweighs their more limited contacts.

Once those infections start in the group living settings, and when they reach the frail elderly at home, hospitalizations and deaths skyrocket. If you don’t understand why that is happening, it would be easy to misread the epidemic as having high rates of severe illness for everyone. But the flip side of this phenomenon is that once the epidemic has burned through these group settings and the elderly, it largely has the younger populations left to infect, and we are going to see much lower levels of severe illness in those groups. And that is exactly what is happening now: we are experiencing an apparent rise in cases, while hospitalizations and deaths are not increasing as rapidly or are even declining. I view this as potentially beneficial from a public policy perspective. We are gaining population immunity, while seeing less risk of serious illness and death to the elderly.
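A toy calculation can show the shape of this front-loading effect. It is a sketch under loudly stated assumptions, not a transmission model: the two groups, their sizes, the daily share of remaining susceptible people who get infected, and the hospitalization rates are all hypothetical numbers I picked to illustrate the mechanism. It shows only how the mix of new cases shifts over time, not whether total case counts rise or fall.

```python
# All numbers are hypothetical and chosen only to show the shape of the effect.
# name: (population, daily share of remaining susceptibles infected,
#        hospitalization rate among those infected)
GROUPS = {
    "vulnerable": (50_000, 0.05, 0.20),   # small group, infected fast, often severe
    "general": (950_000, 0.002, 0.005),   # large group, infected slowly, rarely severe
}


def simulate(days):
    """Return a list of (new_cases, new_hospitalizations) for each day."""
    remaining = {name: pop for name, (pop, _, _) in GROUPS.items()}
    daily = []
    for _ in range(days):
        cases = 0.0
        hospitalizations = 0.0
        for name, (_, daily_share, hosp_rate) in GROUPS.items():
            new_cases = remaining[name] * daily_share
            remaining[name] -= new_cases
            cases += new_cases
            hospitalizations += new_cases * hosp_rate
        daily.append((cases, hospitalizations))
    return daily


results = simulate(120)
for day in (0, 30, 60, 119):
    cases, hospitalizations = results[day]
    print(f"day {day + 1}: {hospitalizations / cases:.2%} of new cases hospitalized")
```

Because the vulnerable group is used up quickly, its share of each day's new cases shrinks, and the hospitalization rate per new case falls toward the general group's rate even though no individual's risk has changed. Someone looking only at the early days would overestimate the severity of illness for everyone.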

So for a model to forecast or even track what is happening with this epidemic, it needs to incorporate many factors and assign them in the proper percentages to the population being modeled. For this epidemic, it needs to assume variation in susceptibility. It needs to incorporate variation in infectiousness. It needs to capture residential setting and, for those persons living in a group setting, it needs to be able to adjust likelihood of infection upward significantly for all residents once there is any case in a staff person or resident. It needs to have rates of hospitalization and death that are appropriately adjusted for age and other factors.

Finally, a word again about advance directives. I believe this is a significant hidden factor in deaths. It would be very useful to have this information, which isn’t as easy to gather as most of the characteristics listed above. If a high percentage of the elderly have advance directives, that probably means many are dying without hospitalization, and many are dying who were near the end of life and chose to die without attempts to stop the disease.