Every day, Governor Hogan and members of his team communicate news on the COVID-19 situation in Maryland via Twitter (and other media): @GovLarryHogan, @riccimike, @katadhall, … (and of course @MDHealthDept too!). A number of data enthusiasts are also parsing the MD Department of Health data: @TylerFogarty7, @MikeBReporter, @JauquetW, @PrayagGordy, … (and of course @jepoirrier). And this is only on Twitter!
But also every day, there is one thing that constantly changes: how everyone calculates the COVID-19 positivity rate. Today (July 26), for instance, the different daily positivity rates announced are: 3.77% (Hogan), 4.47% (Hogan again in the same tweet, Hall, Ricci, MD Health Department, Fogarty) and ~6% (for me; the exact number behind the ~ is 6.14%). And that is without counting the 7-day (or n-day) averages and other measures. And this is only on Twitter.
Why are these numbers different? Which one is correct?
First, let me get rid of the second question: all of them are mathematically correct. How much value you give to each of them depends heavily on what you are looking at or looking for.
So how are these numbers different? Let’s calculate all of them … Today, the Maryland Department of Health COVID-19 dashboard indicates:
This gives a total of 838,572 cumulative unique tests (# confirmed cases + # persons tested negative) since the beginning of the pandemic. And it gives an overall unique positivity rate of 10.06% (# unique confirmed cases / # unique total). I added “unique” because these numbers count each person only once per result (if someone tests negative several times, they show up as only 1 negative test). As shown in the chart below, this overall unique positivity rate rises quickly when cases are increasing but falls very slowly when cases are diminishing. As a consequence, this overall positivity rate will only approach 0% in a very, very distant future (and will never quite reach it, since the cases from the beginning will always be counted).
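To make that asymmetry concrete, here is a minimal sketch comparing a daily positivity rate with the cumulative one. The daily counts are made up for illustration (they are not Maryland data), and the function names are my own.

```python
from itertools import accumulate

# Hypothetical daily counts (NOT Maryland data): a surge, then a sharp decline.
# Each day has 1,000 newly counted people in total.
daily_positive = [100, 200, 400, 400, 50, 20, 10, 5]
daily_negative = [900, 800, 600, 600, 950, 980, 990, 995]


def daily_positivity(positives, negatives):
    """Positivity rate (in %) computed separately for each day."""
    return [100 * p / (p + n) for p, n in zip(positives, negatives)]


def cumulative_positivity(positives, negatives):
    """Positivity rate (in %) computed on running totals since day 1."""
    cum_pos = accumulate(positives)
    cum_neg = accumulate(negatives)
    return [100 * p / (p + n) for p, n in zip(cum_pos, cum_neg)]


daily = daily_positivity(daily_positive, daily_negative)
cumulative = cumulative_positivity(daily_positive, daily_negative)

for day, (d, c) in enumerate(zip(daily, cumulative), start=1):
    print(f"day {day}: daily {d:5.1f}%   cumulative {c:5.1f}%")
```

In this toy series the daily rate drops below 1% by the last day, while the cumulative rate is still well above 10% because the early positives never leave the numerator.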
Now we may be interested in the total testing volume (1,097,361 today): this is the total of all tests, whether the results for the same person are always the same or differ over time. Imagine a doctor being tested every week for COVID-19; for 3 weeks, she is negative (= 1 unique negative test but 3 negative tests in total) until she is found positive on week 4 (= 1 unique positive test = 1 positive test in total); after 2 weeks in quarantine at home, she tests negative again before returning to work (= 1 unique negative test but 1 positive and 4 negative tests in total). The total testing volume is simply the sum of all tests ever done in Maryland. If you divide the # of unique confirmed cases by the total testing volume, you get an overall positivity rate of 7.64%. I personally don’t like this metric because it mixes unique positive cases with a total that includes repeated tests. That said, as seen in the plausible example above, the number of unique positive tests and the number of all positive tests are probably very close (unless people test positive several times), so it could still give a good estimate of the positivity rate.
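The doctor's example can be counted mechanically. This is only a sketch: the counting rule (one "unique" count per person and per result) is my reading of the example above, not a documented dashboard rule, and `count_tests` is a hypothetical helper.

```python
# Weekly results for the hypothetical doctor: 3 negatives, 1 positive,
# then 1 negative after quarantine.
results_by_person = {
    "doctor": ["negative", "negative", "negative", "positive", "negative"],
}


def count_tests(results_by_person):
    """Return (unique, total) counts of positive and negative results.

    Assumption: 'unique' counts a person at most once per result type.
    """
    unique = {"positive": 0, "negative": 0}
    total = {"positive": 0, "negative": 0}
    for results in results_by_person.values():
        for outcome in total:
            n = results.count(outcome)
            total[outcome] += n
            if n > 0:
                unique[outcome] += 1
    return unique, total


unique, total = count_tests(results_by_person)
volume = sum(total.values())  # total testing volume

# Three ways to compute a positivity rate (in %) from the same tests.
unique_rate = 100 * unique["positive"] / sum(unique.values())
mixed_rate = 100 * unique["positive"] / volume  # unique cases / all tests
total_rate = 100 * total["positive"] / volume   # all positives / all tests

print(unique, total)
print(unique_rate, mixed_rate, total_rate)
```

With a single positive result, the mixed rate and the all-tests rate coincide, which is the point made above: they only diverge when people test positive more than once.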
But to create even more confusion, positive cases are reported from ESSENCE (Electronic Surveillance System for the Early Notification of Community-based Epidemics, click on Biosurveillance here; they write weekly reports that are a trove of information, which may be for a later post). Negative cases are reported from NEDSS (National Electronic Disease Surveillance System, also from the CDC). And the total testing volume comes from all lab results transmitted electronically to the state. It is clearly stated that results transmitted non-electronically are not taken into account. Having 3 different sources, each counting cases differently, doesn’t help reporting, but it highlights how difficult it is to present a comprehensive figure. If we plot them all on the same figure, this is what it gives:
As discussed above, the % positive from the cumulative count (green line) will always be high and go down slowly. The % positive of daily reported cases (violet line) fluctuates a lot and often seems to be higher than the % positive of daily tests reported electronically (blue line). This high level of fluctuation is the reason why the MD Health Department uses a 5-day average of the % positive of daily tests reported electronically (red line).
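For readers who want to see how such an average smooths a noisy series, here is a minimal sketch of a trailing 5-day mean. The daily rates are invented, and a plain mean of daily rates is an assumption on my part: the Department may compute its average differently (for instance from summed counts).

```python
def rolling_mean(values, window=5):
    """Trailing mean over `window` days; None until a full window exists."""
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical daily % positive (NOT Maryland data), deliberately jumpy.
daily_rate = [6.0, 4.0, 7.0, 3.0, 5.0, 8.0, 2.0]
smoothed = rolling_mean(daily_rate)

for day, (raw, avg) in enumerate(zip(daily_rate, smoothed), start=1):
    shown = "n/a" if avg is None else f"{avg:.1f}%"
    print(f"day {day}: daily {raw:.1f}%   5-day average {shown}")
```

The raw series swings between 2% and 8%, while the averaged values stay within a much narrower band.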
Understanding the positivity rate is important because it gives an indication of the severity of the disease. In this respect, we see that Maryland has done well so far in reducing the severity of this disease, with a positivity rate going down since early May. The positivity rate can also be read as an indicator that the state is doing relatively well on testing (usually, a high positivity rate is associated with too little testing, where only the most severe cases get tested). But the positivity rate can be influenced by many factors that cannot be understood from these graphs alone … One of these factors is test selection: now that Maryland allows anyone to be tested, one could reasonably think that the samples tested are more representative of the disease in the state than when only a very restricted set of patients could be tested (before May 19, 2020). Another key parameter is how long a test takes to give results. All the numbers above are dated by when test results were reported. When these tests were performed is not disclosed (there are discussions online saying that test results take several days to several weeks to arrive; if this is true, the % positive we see now is merely a snapshot of what happened mid-July and not now or last week). And to add to the confusion, I’m sure test results from different labs are reported at different speeds.
All in all, the data we see here are a fuzzy picture of what happened in the relatively recent past. If figures go down, fine. If they tend to go up, we’ll have to be careful: we may already be further up than estimated here.
Better metrics for the severity of this COVID-19 pandemic may be deaths or intensive care unit admissions. I briefly wrote about COVID-19 deaths in Maryland related to counties, to races and inequalities, to age or to gender. Maybe it’s time to look at ICU admissions in a future post …
To be continued …
As usual, you’ll find other graphs on my page about COVID-19 in Maryland (the figures above are updated as new data appear), and the data, code and figures are on GitHub (including these ones).