Month: June 2020

Why would Maryland remove COVID-19 data from nursing homes?

Since the beginning of the COVID-19 pandemic, we suspected, and then saw, that nursing homes and other facilities where people are grouped together (prisons, …) could be at higher risk of transmission. Nursing homes drew particular attention because deaths seem to disproportionately affect the older population that resides there, and because they are also home to frail people with comorbidities.

In its dashboard, the Maryland Department of Health quickly started to build a dedicated page with numbers from different “congregate facility settings”. As I did for other metrics from this dashboard, I made a chart of what seemed to be the cumulative total of cases, differentiating staff (who are stuck working there) and residents (who are stuck living in these facilities):

Besides the weekly update (contrasting with the daily update on the main dashboard), the strange thing is that the curves are going down! If this were a true cumulative curve, it would either keep growing (new cases are added) or go flat at the level it reached (no new cases, so we keep the total from the previous day or week).
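
The property described here (a cumulative total never decreases) is easy to check in code. The following is a minimal sketch; the function name `find_decreases` and the weekly totals are hypothetical, for illustration only, and are not taken from the MDH dashboard.

```python
def find_decreases(series):
    """Return (index, previous, current) for every point where the series drops."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(series, series[1:]), start=1)
        if cur < prev
    ]


# Hypothetical weekly totals, for illustration only (not MDH data).
weekly_totals = [120, 450, 900, 1300, 1250, 1100]

drops = find_decreases(weekly_totals)
for index, previous, current in drops:
    print(f"week {index}: total went from {previous} to {current}")
```

An empty result means the series behaves like a true cumulative count; any entry flags a week where the reported total went down.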

Then you read the note below the dashboard (before the tables) and it says:

Facilities listed above report at least one confirmed case of COVID-19 as of the current reporting period. Facilities are removed from the list when health officials determine 14 days have passed with no new cases and no tests pending.

I could imagine that the reason is pragmatic: somewhere, someone stops adding cases (or deaths) if the facility doesn’t send a new case (or death) count for 14 days. But it doesn’t make sense to actively remove the facility from the list and therefore remove the cases (or deaths) that were reported earlier. Especially if the dashboard misleads viewers by labeling the y-axis “Total # of Cases”:

Melissa Schweisguth shared an article from the Baltimore Sun pointing to this discrepancy and the current state of the discussion (but no solution reported) in this tweet:

The article quotes the Department of Health mentioning that the other data presented is cumulative but I couldn’t find this … Indeed all datasets available include the same caveat that facilities not reporting within 14 days are removed:

If I take an example among the first few facilities that reported cases, we clearly see that this one (whichever it is, it doesn’t matter here) reported cases up to June 10. Since I’m writing this on June 25, more than 14 days have passed since it stopped reporting, and the dataset doesn’t include this facility anymore (the latest data points in the dataset are for June 24):

This is a pity because, besides the difference between residents and staff, these datasets also present cases and deaths among youth and inmates. It would have been nice to understand the evolution of the burden of COVID-19 in these populations. But the curve is clearly not cumulative, as we can see on the charts below: after about June 2nd-10th, the curves going down probably indicate the removal of facilities from the total count.

As mentioned in the Baltimore Sun article, with this kind of reporting, you cannot know the real toll in nursing homes, prisons and other congregate facility settings and therefore you cannot respond to it appropriately (i.e. the toll is now underestimated).

Also, you can’t put things in perspective because you can’t have a reliable proportion of cases in congregate facility settings compared to the total number of COVID-19 cases in Maryland. This total number of cases is cumulative and we see an artificial decrease in % of cases in these facilities, as illustrated below:
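
To make the arithmetic behind this artificial decrease concrete, here is a small sketch. All numbers are hypothetical and chosen only for illustration: the "reported" facility total drops once facilities are removed, the "kept" total assumes removed facilities stay in the count, and the statewide total keeps growing.

```python
def share_percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Hypothetical weekly values, for illustration only (not MDH data).
state_cases = [50000, 52000, 54000]          # statewide cumulative total
facility_reported = [9000, 9500, 9200]       # drops when facilities are removed
facility_kept = [9000, 9500, 9900]           # same weeks, removed facilities kept

reported_share = [
    round(share_percent(f, s), 1) for f, s in zip(facility_reported, state_cases)
]
kept_share = [
    round(share_percent(f, s), 1) for f, s in zip(facility_kept, state_cases)
]
print(reported_share)
print(kept_share)
```

In this toy example the share computed from the reported totals falls in the last week, while the share computed with removed facilities kept in the count stays stable.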

Now, what can we do? One clear solution is for the Maryland Department of Health to change its reporting and really report the correct cumulative number of cases in congregate facility settings. Besides that, I have a technical solution in mind but I had no time today to code it yet …
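
One possible workaround, sketched here under assumptions and not necessarily the technical solution alluded to above, is to keep the last count each facility reported before it was removed from the list. The function name `carry_forward_totals`, the snapshot layout (one dictionary per weekly release, mapping facility name to count) and the facility names are all hypothetical.

```python
def carry_forward_totals(snapshots):
    """Rebuild a non-decreasing total from successive snapshots.

    snapshots: list of dicts {facility_name: reported_case_count}, oldest first.
    A facility missing from a snapshot keeps its last reported count.
    """
    last_known = {}
    totals = []
    for snapshot in snapshots:
        for facility, count in snapshot.items():
            # Keep the highest count seen so far for this facility.
            last_known[facility] = max(count, last_known.get(facility, 0))
        totals.append(sum(last_known.values()))
    return totals


# Hypothetical snapshots, for illustration only (not MDH data).
snapshots = [
    {"Facility A": 10, "Facility B": 5},
    {"Facility A": 12, "Facility B": 5, "Facility C": 3},
    {"Facility A": 15, "Facility C": 4},  # Facility B was removed from the list
]

reported = [sum(snapshot.values()) for snapshot in snapshots]
rebuilt = carry_forward_totals(snapshots)
print(reported)
print(rebuilt)
```

This only works if every past snapshot was saved, and it assumes each facility's count is itself cumulative. A facility that returns to the list with a new outbreak and a restarted count would be undercounted by this sketch.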

To be continued …

As usual, you’ll find other graphs on my page about COVID-19 in Maryland (and figures above are updated with new data as they appear) and the data, code and figures are on Github (including these ones).


Post-scriptum on June 26, 2020: the day after I posted this, Maryland Governor Larry Hogan announced a safe and phased reopening plan for Maryland’s assisted living facilities. Although I welcome any initiative targeting the protection of everyone and especially the most vulnerable populations, the first 2 prerequisites are still tied to this absence of new cases in 14 days (which is fine), but this is still not a reason to intentionally remove facilities from the count. And I couldn’t see the phased approach, but I guess this will be followed up in another post here. To be continued …

COVID-19 inequalities in Maryland

The recent Black Lives Matter protests made me think a lot: as a white man, as a husband and dad, as a biologist by training, as a health economist by day, as someone interested in COVID-19 data where I live by night … as a human, in summary. I don’t have grandiose pieces of advice or any deep thoughts, not for here (but if you call me, we can talk ;-)). Here, let’s continue our exploration of COVID-19 data in Maryland.

The MDH dashboard provides only 2 metrics by race: confirmed cases and deaths (and probable deaths, but as this number is small and imprecise, let’s put it aside for the moment).

Today (June 12, 2020), the communities worst hit (in crude numbers) are African Americans and Hispanics in terms of cases (17,345=28% and 16,293=27%) and African Americans and Whites in terms of confirmed deaths (1,133=41% and 1,164=42%). This is represented in the figure below. Note also the high number of “race not available” in the cases chart (this could mean a worse impact for some communities, as some people would fear negative consequences of disclosing their race).

But this means little if we don’t know how many Marylanders are in each category. Numbers vary and I couldn’t find the following data from the Census or the CDC directly (the 2 sources I would consider the most reliable on this): the number of people categorized in 1 and only 1 race at a time (which is an approximation of reality but allows for easier calculations below). I found the following data from SuburbanStats: in Maryland there are approximately

  • 1.7 million African Americans (~27%),
  • 318 thousand Asians (~5%),
  • 479 thousand Hispanics (~7%),
  • 3.3 million Whites (~53%) and
  • 410 thousand “others”.

Given this, we can see a different picture …

In this figure, on top, we see the evolution of crude case numbers since April (up to June 11). We also see the rapid rise of cases in Hispanics since they were separated from the “Others” (April 14). At the bottom, I show the evolution of cases relative to the population. Here we can clearly see that, very early on, Hispanics accumulated cases in a larger proportion relative to their population of less than half a million. Yesterday (June 11, 2020), there were 3,461 Hispanic COVID-19 cases per 100,000 population (compare that to 350 in Whites).

In the following figure, on top, we see the evolution of confirmed deaths since April (also up to June 11). Here, African Americans and Whites are close to each other and far ahead of the other communities. At the bottom, the evolution of deaths relative to the population is shown. Here we can clearly see that African Americans (especially) and Hispanics are the worst hit communities relative to their population. Yesterday (June 11, 2020), there were 66 African American deaths per 100,000 population and 58 Hispanic deaths per 100,000 population (compare that to 34 in Whites).
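
The "per 100,000 population" figures are a simple rescaling of the crude counts. The sketch below uses the June 12 case counts and the rounded population figures quoted earlier in this post, so its results are close to, but not identical to, the June 11 values in the text (which were presumably computed from exact population counts). The function name `per_100k` is hypothetical.

```python
def per_100k(count, population):
    """Number of events per 100,000 people."""
    return count / population * 100_000


# Case counts as of June 12, 2020 and rounded populations, both quoted in this post.
cases = {"African Americans": 17_345, "Hispanics": 16_293}
population = {"African Americans": 1_700_000, "Hispanics": 479_000}

rates = {group: per_100k(cases[group], population[group]) for group in cases}
for group, rate in rates.items():
    print(f"{group}: about {rate:.0f} cases per 100,000")
```

The same function applies to deaths: replace the case counts with death counts for each community.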

The table below summarizes cases and deaths relative to population on June 11, 2020, in Maryland:

Community | Cumulative COVID-19 cases (per 100,000 pop.) | COVID-19-specific death rate (deaths per 100,000 pop.) | Share of the general population
African Americans | 1,020 | 66 | 27%
Asians | 368 | 32 | 5%
Hispanics | 3,461 | 58 | 7%
Whites | 350 | 34 | 53%
Community-related cases and deaths in Maryland on June 11, 2020

So even in Maryland, a US state ranked 6th best state overall and #8 for healthcare in 2019, disparities exist. Hispanics are the worst hit in COVID-19 cases (27% of cases and > 3,000 cases per 100,000) while they represent only 7% of the population. And African Americans are the worst hit in COVID-19 confirmed deaths (41% of deaths and > 60 deaths per 100,000) while representing only 27% of the population. The CDC has an interesting summary of the main causes of these disparities and of what people and organizations can do about them; a good read to start doing something about these inequalities.
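
One way to put these shares side by side is to divide a community's share of cases (or deaths) by its share of the population: a value above 1 means the community is over-represented. This ratio is an illustration added here, not a metric from the MDH dashboard, and the function name `representation_ratio` is hypothetical. The percentages are the ones quoted in this post.

```python
def representation_ratio(share_of_outcome, share_of_population):
    """Ratio of a group's share of an outcome to its share of the population."""
    return share_of_outcome / share_of_population


# Percentages quoted in this post (cases/deaths as of June 12, 2020).
hispanic_cases = representation_ratio(27, 7)
african_american_deaths = representation_ratio(41, 27)
white_deaths = representation_ratio(42, 53)

print(f"Hispanics, cases: {hispanic_cases:.1f}x their population share")
print(f"African Americans, deaths: {african_american_deaths:.1f}x")
print(f"Whites, deaths: {white_deaths:.1f}x")
```

Because the inputs are rounded percentages, the ratios are only indicative.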

To be continued …

As usual, you’ll find other graphs on my page about COVID-19 in Maryland (and figures above are updated with new data as they appear) and the data, code and figures are on Github.


Post Scriptum – but still important … Methodologically, there are a few caveats for all this. First, the concept of race is linked with so many other parameters that COVID-19 is probably exacerbating these other issues (with an indirect effect on people of color) rather than targeting a specific population (the virus itself does not choose who it will infect). Second, there is no explanation of how race information is collected: with a question on the test form (with all the reporting bias it contains), by linking the names or social security number to a previously recorded race identity, …? This is another source of potential bias. Third, we have here the 2 extreme metrics: cases and deaths. There is no information on hospitalizations, despite requests to the MD Department of Health or the Governor’s staff (no hospitalization info for counties either, btw). I suspect here that race collection in hospitals is not performed (because unethical?) and/or there would be HIPAA issues if this data were transmitted from hospitals to the state, for instance.

Weekly seasonality in COVID-19 deaths reported in Maryland

On its dashboard, the Maryland Department of Health is reporting confirmed deaths due to COVID-19 in two ways: by date of report and by date of death (updated as amendments to the death record are received). The definition of confirmed death is:

A death is classified as confirmed if the person had a laboratory-confirmed positive COVID-19 test result.

What intrigued me is that reporting seems to follow a pattern influenced by the day of the week (see figure below). The top chart (cumulative) is just a running sum. A plateau would be welcome: it would indicate that the death rate is slowing down. Today, the COVID-19 death rate is 41 / 100,000 population. The bottom chart shows the number of deaths due to COVID-19 each day: the black line represents the number of deaths by the day they were reported; the grey line represents the number of deaths by the day they occurred.

Evolution of coronavirus confirmed deaths in Maryland, as of June 3, 2020

In both lines, one can see two kinds of patterns. The first is an overall upward trend until the beginning of May, followed by a decrease since then. The second is a weekly cycle: a big peak, then a decrease with 2 smaller peaks and a big dip, then up again, a decrease with 2 peaks and a big dip, and so on. As data was reported, we saw intuitively that the big dip came on Sundays, the big peak on Tuesdays and the rest of the week was a decrease towards Sunday.
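
A quick way to test this intuition is to average the daily counts by day of the week. The sketch below uses only the Python standard library; the function name `mean_by_weekday` and the two weeks of daily counts are hypothetical, for illustration only, and are not the MDH figures.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_by_weekday(start, daily_counts):
    """Average daily counts per weekday, given the date of the first count."""
    buckets = defaultdict(list)
    for offset, count in enumerate(daily_counts):
        day = start + timedelta(days=offset)
        buckets[WEEKDAYS[day.weekday()]].append(count)
    return {name: sum(values) / len(values) for name, values in buckets.items()}


# Hypothetical reported deaths for two weeks starting Monday, May 4, 2020.
daily_reported = [40, 60, 50, 48, 45, 35, 20,
                  38, 55, 47, 44, 40, 30, 18]

averages = mean_by_weekday(date(2020, 5, 4), daily_reported)
highest = max(averages, key=averages.get)
lowest = min(averages, key=averages.get)
print(f"highest average: {highest}, lowest average: {lowest}")
```

With real data, the day with the highest and the lowest average gives a first indication of the weekly pattern, before any formal decomposition.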

And this is confirmed by the analysis of seasonality for confirmed death by reported date:

Here, the top chart is just the data we observed before. Below, the trend shows that, indeed, there was an increase up to end of April and we then see a slow decline. The third graph (“seasonal”) shows the pattern I mentioned earlier. This confirms the lowest reporting on Sundays and the highest reporting on Tuesdays. The bottom chart (“irregular”) shows that, even if there is a pattern, there are a lot of irregularities added to the seasonality.
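
The three components described above (trend, seasonal, irregular) can be illustrated with a classical additive decomposition over a 7-day period. This is a minimal standard-library sketch of that general technique, assumed here for illustration; it is not necessarily the method or software used to produce the charts. The function name `decompose_weekly` and the synthetic series are hypothetical.

```python
def decompose_weekly(series, period=7):
    """Classical additive decomposition: series = trend + seasonal + irregular."""
    n = len(series)
    if period % 2 == 0 or n < 2 * period:
        raise ValueError("need an odd period and at least two full periods")
    half = period // 2

    # Trend: centered moving average over one full period.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / period

    # Seasonal: mean detrended value per position in the cycle, centered on zero.
    sums = [0.0] * period
    counts = [0] * period
    for i in range(n):
        if trend[i] is not None:
            sums[i % period] += series[i] - trend[i]
            counts[i % period] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    mean_raw = sum(raw) / period
    pattern = [r - mean_raw for r in raw]
    seasonal = [pattern[i % period] for i in range(n)]

    # Irregular: what is left once trend and seasonal are removed.
    irregular = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, irregular


# Synthetic series (not MDH data): declining level plus a Monday-to-Sunday pattern.
weekly_pattern = [0, 15, 5, 3, 0, -8, -15]
series = [60 - i + weekly_pattern[i % 7] for i in range(28)]

trend, seasonal, irregular = decompose_weekly(series)
print(seasonal[:7])
```

On this synthetic series the seasonal component recovers the weekly pattern that was built in and the irregular component is zero; on real data the irregular part would capture everything the trend and the weekly cycle do not explain.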

The same patterns can be observed for the deaths by date of death (when they occurred; see chart below). This shows that the number of deaths each day is currently also decreasing (fortunately!). The pattern here is that the number of deaths increases from the lowest on Saturday to the peak on Friday (with an intermediate peak on Wednesday). Again, note the large number of irregularities (at the bottom).

In my opinion, these regular patterns come from the reporting system. I don’t see why COVID-19 patients would die more towards the end of the week and less during the weekend. But please tell me if you have more information about this (in the comments below or by email)!

To be continued …

As usual, you’ll find other graphs on my page about COVID-19 in Maryland and the data, code and figures are on Github.

P.S. I’m not counting probable deaths. The MD Department of Health reports this variable but, as it depends on a later confirmation, it fluctuates a lot and is not necessarily representative of deaths due to COVID-19. If confirmed, these probable deaths are included in the confirmed deaths (counted here).