After my previous post on the age of COVID-19 cases in Maryland, it was logical that I write about the age of COVID-19 deaths in Maryland. So far, media and State Departments of Health have all agreed that the older someone is, the higher this person’s risk of dying from coronavirus.
So far, this is unfortunately also true in Maryland. In the graph below, we clearly see that people 50-59 years old have more than 250 deaths, people 60-69 have more than 500 deaths, people 70-79 have more than 750 deaths and people 80+ have nearly … 1,500 deaths! The graph at the bottom also clearly shows that people in age categories 60 and above account for most of the new daily deaths due to COVID-19 (even if we came back down from a peak at about 40 deaths in 80+ at the end of April).
The simpler section at the latest date for which death data by age is available (i.e. today, July 9th, 2020) also shows this curve highly skewed towards older age groups (at the bottom, compare that to cumulative cases, on top):
The two graphs below confirm that people in old age are at much higher risk of death due to COVID-19. On top, if we report the deaths in each age group relative to the population of that age group in Maryland, we also see that deaths disproportionately affect the 80+ age group, reaching a COVID-19-specific mortality rate of 629 per 100,000 pop.! The table under the graph gives all the data points.
And when we look at the relative importance of each age group compared to the total number of deaths, we see again that people aged 80+ account for 46% of all deaths, followed by people 70-79 (25%) and people 60-69 (16%).
COVID-19-specific mortality rate, by age group, in Maryland, on July 9th, 2020
As opposed to cases by age, we don’t see here any shift in the most affected age group: the older someone is, the higher the risk of dying from COVID-19 (and part of the problem is the close living conditions in nursing homes). There aren’t 1,000 solutions to protect them: wear a mask and practice physical distancing, especially when there is a risk of meeting elderly people and transmitting the disease to them!
We recently heard in the US media that, while COVID-19 affected mostly the older population at the beginning of 2020, the younger population was now more affected, especially young adults (various reasons were mentioned: the various academic breaks, being more active or “forced” to work, the sentiment of invincibility …). I wanted to see if one could see a similar trend in Maryland.
If you look at the section of the Maryland population by age (graph below), as of today (July 9, 2020), you see that cumulatively, people 30-39 have the most cases, followed by people aged 40-49, 50-59 and 20-29 years old. There are relatively few cases above 70 years old and fewer cases below 20 years old.
This snapshot doesn’t show a trend we indeed saw in the past few weeks. In the chart below, representing the cumulative cases by age categories, one can see a faster increase of cases in 20-29 years old (than the increase in, let’s say, 40-49 years old) – since mid-May. This fast increase is such that one could predict that 20-29 years old will soon have more cases than 40-49 years old and become the 3rd age group with most cases.
Two other age groups also saw their number of new cases accelerate, at a lower rate than 20-29 but still: children (both groups below 20 years old) seem to catch up with the older groups (both groups above 70 years old). This needs to be watched and, ideally, prevented!
Note the bottom graph shows the number of daily new cases. Although it’s messy, we can see that all age groups are now adding fewer cases than in May but the middle-aged groups (20-59) still add more cases every day than the younger (< 20) or older (> 70) ones. I could smooth it with a 7- or 14-day average but then we wouldn’t see new trends emerge.
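For readers who want to try the smoothing mentioned above, here is a minimal sketch in Python. It is not the code behind the charts in this post: the function names are mine and the numbers are placeholders, not Maryland data. It derives daily new cases from a cumulative series and then applies a trailing 7-day average.

```python
def daily_new(cumulative):
    """Turn a cumulative series into daily increments (the first day has none)."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]


def trailing_average(values, window=7):
    """Trailing moving average; returns one value per full window."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Placeholder cumulative counts for one age group (NOT real Maryland data).
cumulative_cases = [100, 110, 125, 130, 150, 155, 170, 190, 195, 215]

new_cases = daily_new(cumulative_cases)
smoothed = trailing_average(new_cases, window=7)

print(new_cases)
print(smoothed)
```

The trade-off is the one described above: a longer window gives a calmer line, but a new trend only shows up once it has filled a good part of the window.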
The direct impact of COVID-19 cases on each age category can be better grasped in the next chart, where the evolution of cases is again displayed but this time relative to the respective population in each age category. These populations by age were found from a projection from 2018, for 2020 by the Maryland Department of Planning. This demographic spread is a bit odd because all age groups below 70 years old are between 700k and 800k (I would have expected more a bell/Gaussian distribution):
Age group (years old)
Projected total population by 2020
Age pyramid of Maryland, projection from 2018 for the year 2020 From the Maryland Department of Planning, August 2018 / OpenData Maryland
In the top chart, below, one can see the evolution of cumulative cases relative to the total number of people (sick and healthy) in each age category (for instance: how many cases in 70-79 years old per 100,000 individuals in this age category). Because of the relatively constant number of people in each age category (see table above), we find approximately the same mix of curves again. However, we should first note the high toll on people 80+, who have the highest number of cases per 100,000. We should also note the fast increase in the 20-29 years old population: they were just above the groups below 20 years old at the beginning of the pandemic; they are now the 4th age group in relative cases. The table below indicates the relative cases for yesterday (July 8, 2020):
Age group (years old)
Relative COVID-19 cases (cases / 100,000 pop.)
Cumulative number of COVID-19 cases relative to population, by age group, in Maryland, on July 8th, 2020.
Another way to look at it is to see the relative importance of each age group compared to the total number of cases. This is done in the last chart, above. We can see that around mid-April, COVID-19 cases in adults 80+ “carved” their share of number of cases. Starting in May, the share of COVID-19 cases in children below 20 also started to increase (from 1.9% on March 29 to 8.5% on July 8). Despite this, 20-29 increased their share of cases (from 13.3% on March 29 to 15.1% on July 8); 30-39 also increased their share of cases (from 16.3% on March 29 to 18.7% on July 8).
All this indicates a shift in new cases, with more and more new cases being discovered in the young adult population. This can be due to a number of factors … The first one is probably that tests were no longer restricted (or became widely available, without restriction) from mid-May: this would have allowed younger people to be tested and therefore would have increased their share of cases. Another parameter could be that younger adults are still in the workforce and therefore more exposed and more often exposed than older adults. A last parameter could also be that some younger adults may care less about their health, may be less willing to follow state and federal rules, may be composed of more Hispanics or African-Americans – two populations specifically at risk for COVID-19 … Nevertheless, this increase / these populations should be watched carefully and reminded that they are also at risk of COVID-19 (maybe fewer deaths – that’s for a follow-up post – but the disease itself and its long-term consequences).
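The share computation itself is simple. As a sketch (the counts below are placeholders I made up for illustration, not the Maryland figures, and the function name is mine), each group's count is divided by the total across groups on a given date:

```python
def shares(counts):
    """Percentage of the total represented by each group."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cases to share out")
    return {group: 100 * n / total for group, n in counts.items()}


# Placeholder cumulative cases by age group on one date (NOT real data).
cases_by_age = {"0-19": 50, "20-29": 150, "30-39": 200, "40+": 600}

for group, pct in shares(cases_by_age).items():
    print(f"{group}: {pct:.1f}%")
```

Repeating this for every date gives the stacked share chart; a group's share can rise even when its daily cases fall, as long as the other groups fall faster.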
Since the beginning of the COVID-19 pandemic, we suspected and saw that nursing homes and other facilities where people are grouped together (prisons, …) could be at higher risk of transmission. The focus on nursing homes was because deaths seem to disproportionately affect the older population that also resides there. And nursing homes are also home for frail people with comorbidities.
Besides the weekly update (contrasting with the daily update on the main dashboard), the strange thing is that curves are going down! If it were a true cumulative curve, it would either keep growing (new cases are added) or go flat at the level it reached (no new case, we keep the total from the last day or week).
Then you read the note below the dashboard (before the tables) and it says:
Facilities listed above report at least one confirmed case of COVID-19 as of the current reporting period. Facilities are removed from the list when health officials determine 14 days have passed with no new cases and no tests pending.
I could imagine that the reason is pragmatic: somewhere, someone stops adding cases (or deaths) if the facility doesn’t send new case (or new death) counts for 14 days. But it doesn’t make sense to actively remove the facility from the list and therefore remove the cases (or deaths) that were reported earlier. Especially if the dashboard misleads viewers by stating “Total # of Cases” as y-axis:
The article quotes the Department of Health mentioning that the other data presented is cumulative but I couldn’t find this … Indeed all datasets available include the same caveat that facilities not reporting within 14 days are removed:
If I take an example among the first few facilities that reported cases, we clearly see that this one (whichever it is, it doesn’t matter here) reported cases up to June 10. Since I’m writing this on June 25, more than 14 days have passed since they stopped reporting, and the dataset doesn’t include this facility anymore (the latest data points in the dataset are for June 24):
This is a pity because, besides the difference between residents and staff, these datasets also present cases and deaths among youth and inmates. It would have been nice to understand the evolution of the burden of COVID-19 in these populations. But the curve is clearly not cumulative, as we can see on the charts below: after about June 2nd-10th, curves going down probably indicate removal of facilities from the total count.
As mentioned in the Baltimore Sun article, with this kind of reporting, you cannot know the real toll in nursing homes, prisons and other congregate facility settings and therefore you cannot respond to it appropriately (i.e. the toll is now underestimated).
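A quick way to test whether a published series is really cumulative is to look for any point where it goes down. The sketch below uses placeholder weekly totals (not the MDH figures) and a function name of my own:

```python
def decreases(series):
    """Return (index, previous, current) for every point where the series drops."""
    return [
        (i, series[i - 1], series[i])
        for i in range(1, len(series))
        if series[i] < series[i - 1]
    ]


# Placeholder weekly "total" counts (NOT real data).
weekly_totals = [120, 180, 240, 260, 255, 230]

drops = decreases(weekly_totals)
if drops:
    print("Not cumulative, the total goes down at:", drops)
else:
    print("No decrease found; the series could be cumulative.")
```

An empty result does not prove the series is cumulative (removals could be hidden by new cases elsewhere), but a single drop is enough to show that it is not.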
Also, you can’t put things in perspective because you can’t have a reliable proportion of cases in congregate facility settings compared to the total number of COVID-19 cases in Maryland. This total number of cases is cumulative and we see an artificial decrease in % of cases in these facilities, as illustrated below:
Now, what can we do? One clear solution is that the Maryland Department of Health changes its reporting and reports the correct cumulative number of cases in congregate facility settings. Besides that, I have a technical solution in mind but I had no time today to code it yet …
Post-scriptum on June 26, 2020: the day after I posted this, Maryland Governor Larry Hogan announced a safe and phased reopening plan for Maryland’s assisted living facilities. Although I welcome any initiative targeting the protection of everyone and especially the most vulnerable populations, the first 2 prerequisites are still tied to this absence of new cases in 14 days (which is fine) – this is still not a reason to intentionally remove facilities from the count. And I couldn’t see the phased approach – but I guess this will be followed up in another post here. To be continued …
The recent Black Lives Matter protests made me think a lot – as a white man, as a husband and dad, as a biologist by training, as a health economist by day, as someone interested in COVID-19 data where I live by night … as a human, in summary. I don’t have grandiose pieces of advice or any deep thoughts, not for here (but if you call me, we can talk ;-)). Here, let’s continue our exploration of COVID-19 data in Maryland.
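One possible approach (a sketch only, and not necessarily the technical solution alluded to above) is to keep every weekly snapshot and carry forward the last count reported by each facility once it disappears from the list. This assumes the per-facility counts are themselves cumulative while the facility is listed, and that past snapshots were archived. The facility names and numbers below are placeholders.

```python
def carried_forward_totals(reports):
    """Rebuild a cumulative total from snapshots that drop inactive facilities.

    reports: list of (date_label, {facility: cumulative_cases}) in
    chronological order, each snapshot listing only the facilities
    still on the published list that week.
    """
    last_known = {}
    totals = []
    for date_label, listed in reports:
        last_known.update(listed)  # removed facilities keep their last value
        totals.append((date_label, sum(last_known.values())))
    return totals


# Placeholder snapshots (NOT real data): facility "B" is removed in week 3.
snapshots = [
    ("week 1", {"A": 10, "B": 5}),
    ("week 2", {"A": 12, "B": 5}),
    ("week 3", {"A": 15}),
]

for label, total in carried_forward_totals(snapshots):
    print(label, total)
```

In this toy example the published totals would read 15, 17, 15, while the carried-forward totals read 15, 17, 20.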
There are only 2 metrics that the MDH dashboard provides, around races: confirmed cases and deaths (and probable deaths but as this is not precise and small, let’s put this aside for the moment).
Today (June 12, 2020), communities worst hit (in crude numbers) are African Americans and Hispanics in terms of cases (17,345=28% and 16,293=27%) and African Americans and Whites in terms of confirmed deaths (1,133=41% and 1,164=42%). This is represented in the figure below. Note also the high number of “race not available” in the cases chart (this could mean a worse impact for some communities as some would fear negative consequences of disclosing their race).
But this means little if we don’t know how many Marylanders are in each category. Numbers vary and I couldn’t find the following data from the Census or the CDC directly (the 2 sources I would consider the most reliable on this): number of people categorized in 1 and only 1 race at a time (which is an approximation of reality but allows for easier calculations below). I found the following data from SuburbanStats: in Maryland there are approximately
1.7 million African Americans (~27%),
318 thousand Asians (~5%),
479 thousand Hispanics (~7%),
3.3 million Whites (~53%) and
410 thousand “others”.
Given this, we can see a different picture …
In this figure, on top, we see the evolution of crude case numbers since April (up to June 11). We also see the rapid rise of cases in Hispanics since they were separated from the “Others” (April 14). But at the bottom, I show the evolution of cases relative to the population. And here we can clearly see that, very early on, Hispanics accumulated cases in larger proportion compared to their less than half million population. Yesterday (June 11, 2020), there were 3,461 Hispanic COVID-19 cases per 100,000 population (compare that to 350 in Whites).
In the following figure, on top, we see the evolution of confirmed deaths since April (also up to June 11). Here, both African Americans and Whites are close and widely distancing the other communities. But at the bottom, the evolution of deaths relative to the population is shown. And here we can clearly see that African Americans (especially) and Hispanics are the worst hit communities compared to their general population. Yesterday (June 11, 2020), there were 66 African American deaths per 100,000 population and 58 Hispanic deaths per 100,000 population (compare that to 34 in Whites).
The table below summarizes cases and deaths relative to population on June 11, 2020, in Maryland:
Cumulative COVID-19 (cases / 100,000 pop.)
COVID-19-specific death rate (deaths / 100,000 pop.)
Share of the general population
Community-related cases and deaths in Maryland on June 11, 2020
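As a worked example of how such rates are obtained, the sketch below divides the counts quoted earlier in this post for June 12 by the approximate SuburbanStats populations. Because it uses the June 12 counts and rounded populations, the results are close to, but not identical with, the June 11 values quoted above. The function name is mine.

```python
def rate_per_100k(count, population):
    """Number of events per 100,000 people."""
    return 100_000 * count / population


# Approximate populations quoted above (SuburbanStats).
population = {"African American": 1_700_000, "Hispanic": 479_000, "White": 3_300_000}

# Counts quoted in this post for June 12, 2020.
cases = {"African American": 17_345, "Hispanic": 16_293}
deaths = {"African American": 1_133, "White": 1_164}

for group, n in cases.items():
    print(f"{group}: {rate_per_100k(n, population[group]):.0f} cases per 100,000")
for group, n in deaths.items():
    print(f"{group}: {rate_per_100k(n, population[group]):.1f} deaths per 100,000")
```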
So even in Maryland, a US state ranked 6th best state overall and #8 for healthcare in 2019, disparities exist. Hispanics are the worst hit in COVID-19 cases (27% of cases and > 3,000 cases per 100,000) while they represent only 7% of the population. And African Americans are the worst hit in COVID-19 confirmed deaths (41% of deaths and > 60 deaths per 100,000) while representing only 27% of the population. The CDC has an interesting summary of main causes of these disparities but also what people and organizations can do about it; a good read to start doing something about these inequalities.
Post Scriptum – but still important … Methodologically, there are a few caveats for all this. First, the concept of race is linked with so many other parameters that COVID-19 is probably exacerbating these other issues (with an indirect effect on people of color) rather than targeting a specific population (the virus itself does not choose who it will infect). Also, there is no explanation on how race information is collected: with a question on the test form (with all the reporting bias it contains), by linking the names or social security number to a previously recorded race identity, …? This is another source of potential bias. Third, we have here the 2 extreme metrics: cases and deaths. There is no information on hospitalizations, despite requests to the MD Department of Health or the Governor’s staff (no hospitalizations info for counties either btw). I suspect here that race collection in hospitals is not performed (because unethical?) and/or there would be HIPAA issues if this data were transmitted from hospitals to the state, for instance.
On its dashboard, the Maryland Department of Health is reporting confirmed deaths due to COVID-19 in two ways: by date of report and by date of death (updated as amendments to the death record are received). The definition of confirmed death is:
A death is classified as confirmed if the person had a laboratory-confirmed positive COVID-19 test result.
What intrigued me is that reporting seems to follow a pattern influenced by the day of the week (see figure below). The top chart (cumulative) is just an addition. A plateau would be welcome: it would indicate the death rate is slowing down. Today, the COVID-19 death rate is 41 / 100,000 population. The bottom chart shows the number of deaths due to COVID-19 reported each day: the black line represents the number of deaths by the day they were reported; the grey line represents the number of deaths by the day they occurred.
One can see that in both lines, there are two kinds of patterns. The first is an overall trend upwards until the beginning of May, followed by a decrease since then. The second is a weekly cycle: a big peak followed by a decrease with 2 smaller peaks and a big dip – then an up, a decrease with 2 peaks and a big dip – etc. As data was reported, we saw intuitively that the big dip came on Sundays, the big peak on Tuesdays and the rest of the week was a decrease towards Sunday.
And this is confirmed by the analysis of seasonality for confirmed death by reported date:
Here, the top chart is just the data we observed before. Below, the trend shows that, indeed, there was an increase up to end of April and we then see a slow decline. The third graph (“seasonal”) shows the pattern I mentioned earlier. This confirms the lowest reporting on Sundays and the highest reporting on Tuesdays. The bottom chart (“irregular”) shows that, even if there is a pattern, there are a lot of irregularities added to the seasonality.
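To illustrate what such a decomposition does, here is a minimal sketch of a classical additive decomposition with a 7-day period. This is an assumption on my side for illustration: it is not necessarily the method or the tool used to produce the charts in this post, and the series below is a synthetic placeholder that starts on a Sunday, not the Maryland data.

```python
def decompose_weekly(values, period=7):
    """Classical additive decomposition: values = trend + seasonal + irregular."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles an odd period")
    n = len(values)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2

    # Trend: centred moving average, undefined (None) at both edges.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(values[i - half:i + half + 1]) / period

    # Seasonal: mean detrended value per position in the week, centred on zero.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(values[i] - trend[i])
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period
    pattern = [r - offset for r in raw]
    seasonal = [pattern[i % period] for i in range(n)]

    # Irregular: whatever is left once trend and seasonal are removed.
    irregular = [
        None if trend[i] is None else values[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, irregular


# Synthetic placeholder: flat level of 20 plus a weekly pattern
# (Sunday lowest, Tuesday highest), repeated for 4 weeks. NOT real data.
weekly_pattern = [-3, 0, 4, 1, 1, -1, -2]
series = [20 + weekly_pattern[i % 7] for i in range(28)]

trend, seasonal, irregular = decompose_weekly(series)
print(seasonal[:7])
```

On real data the irregular component would not be zero; its size relative to the seasonal component tells how much of the day-to-day variation the weekly pattern actually explains.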
The same patterns can be observed for the deaths by date of death (when they occurred; see chart below). This shows we are currently also in a decreasing number of deaths, each day (fortunately!). The pattern here is that the number of deaths increases from the lowest on Saturday to the peak on Friday (with an intermediary peak on Wednesday). Again, note the important number of irregularities (at the bottom).
In my opinion, these regular patterns come from the reporting system. I don’t see why COVID-19 patients would die more towards the end of the week and less during the weekend. But please tell me if you have more information about this (in the comments below or by email)!
P.S. I’m not counting probable deaths. The MD Department of Health reports this variable but, as it is dependent on confirmation, it is highly fluctuating and not necessarily representative of deaths due to COVID-19. If confirmed, these probable deaths are included in the confirmed deaths (counted here).
A publicly-available MD COVID-19 metric that I didn’t investigate much is cases per ZIP code. I created a dashboard where you can highlight one zip code at a time. Tyler Fogarty built a cool Treemap Explorer. Silver Chips has a nice heatmap of all zip codes as part of their extensive dashboard (a bit like the MDH dashboard). How can we make sense of all this?
Today (May 30, 2020), 4 ZIP codes have more than 1,000 cumulative cases: 2 in Prince George’s county (20783, Hyattsville, and 20706, Lanham) and 2 in Montgomery county (20906 and 20902, both in Silver Spring). But among the ZIP codes with the most recent daily positive cases, 21223 and 21224 are also in the top 5, both in Baltimore City. All these ZIP codes are in counties that are closed or partially opened, highlighting the need for these regions to remain vigilant and enforce stay at home, wearing a mask and social distancing (at least).
I’m looking at protests in Baltimore and I can’t imagine how detrimental the spread of coronavirus will be and how much it will add to the other issues. Here is what Prof. Murray advises to protect protesters (on top of any other precautions):
There is no public data about hospitalization per ZIP code nor deaths per ZIP code. But there are certainly other ways to make sense of this metric …
Since my previous post (May 21), tests were broadened in some drive-thru locations for anyone to be tested (5/19 actually) and new testing sites were opened (map of sites here) and we had the Memorial Day weekend (5/25). On May 28, Gov. Hogan mentioned that “hospitalizations, ICUs, and testing positivity rates are the key metrics in determining Maryland’s road to recovery”. On May 27, Gov. Hogan announced that further reopening steps were taking place (outdoor dining, some outdoor activities for kids allowed, …) but still within Stage 1 (I called it “Stage 1b”).
In terms of hospitalizations, the graph above shows the number of patients currently hospitalized (green line). Since the beginning of May, hospitalizations decreased, especially thanks to the decrease of patients in acute care (red line). Patients in ICU (Intensive Care Unit) decreased much more slowly. This is probably due to the severity of these patients’ condition, making them stay for a long time and be released from ICU at a slower rate than patients in other departments. And the graph also shows that Maryland never needed the additional hospital beds prepared for a worst-case scenario.
The third key metric the Governor is looking at is testing positivity rate. The chart above represents, on top, the total number of tests reported on the MD Health Department dashboard (adding positive and negative test results). We learned that the Governor is actually not looking at the same positivity rate as the one we can compute from the dashboard:
The dashboard reports unique positive and negative tests. If someone was tested twice or more with the same result, it would have been reported only once. If the test result would change, it would have been reported once in each category.
The Governor is looking at all positive and negative tests. If someone was tested twice or more, independently of the result, all tests results would have been counted here.
This difference probably explains why we see a daily number lower than 10,000, despite 500,000+ tests received by the Governor from South Korea. But we can’t really say in which direction this difference would drive the testing positivity rate. If more positive tests were under-counted (i.e. counted once instead of the several times they were performed/received), the Governor would have seen a higher positivity rate than on the dashboard. More likely, if more negative tests were under-counted (i.e. negative people tested several times, but counted once), the Governor would have seen a lower positivity rate than on the dashboard. This last option would explain why the Governor decided to go on Stage 1 sooner than expected by just watching the dashboard.
Technically, as a side note, the data for the testing positivity rate that the Governor is looking at is not publicly shared. There is just a PDF with graphs. This difference in what is reported may also explain why, since test broadening (5/19), there were 5 days of ups and downs after which the rate stayed at about 10%.
At the level of the State of Maryland, we are not yet looking at the full picture: the last element (that doesn’t seem to be part of the key metrics) is deaths. So far (since mid-March), there have been 2,390 deaths due to COVID-19 in Maryland with a majority of them occurring in congregate facilities (nursing homes, prisons, etc.). With an about-weekly pattern (see below), the daily number of confirmed deaths also seems to decrease (although much slower than hospitalizations or positivity rate).
But if things seem good at the State level, the decision to reopen Maryland came with the empowerment of Counties (the government level below State) to follow or not the reopening. As noted before, if most counties followed the State in Stage 1, some counties did not (some like Prince Georges and Montgomery even remained “closed”). There is no straightforward way to follow hospitalizations in counties (they are not reported on the MD dashboard). But we can follow deaths in counties in the graph below. There it is a bit surprising to see that, in counties that re-opened, the % of deaths compared to May 15 is actually increasing (i.e. more daily deaths in counties that re-opened) (see blue dots and average in the blue line). On the other hand, % of daily deaths seems to decrease in counties that partially reopened or remained closed. But one should also note the huge confidence intervals around these averages.
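A toy example makes the two counting rules concrete. The records below are invented for illustration (four hypothetical people, eight tests), and the function names are mine:

```python
def positivity_all_tests(tests):
    """Positivity when every test counts, repeats included."""
    positives = sum(1 for _, result in tests if result == "positive")
    return 100 * positives / len(tests)


def positivity_unique(tests):
    """Positivity when each (person, result) pair counts only once."""
    unique = set(tests)
    positives = sum(1 for _, result in unique if result == "positive")
    return 100 * positives / len(unique)


# Invented test records: (person, result). NOT real data.
tests = [
    ("A", "positive"), ("A", "positive"),                      # same result twice
    ("B", "negative"), ("B", "negative"), ("B", "negative"),   # repeated negatives
    ("C", "negative"),
    ("D", "positive"), ("D", "negative"),                      # result changed
]

print(positivity_all_tests(tests))  # 3 positives out of 8 tests
print(positivity_unique(tests))     # 2 positives out of 5 unique results
```

Here the repeated negative tests of person B pull the all-tests rate (37.5%) below the unique-results rate (40%), which is the second scenario described above. With mostly repeated positives, the order would flip.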
Finally, about counties, the situation is about to get messier: since yesterday, Anne Arundel, Baltimore City and Howard counties further allowed some outdoor activities; and starting June 1st, Montgomery and Prince George’s counties, initially closed, will also start to allow some outdoor activities.
Here is a first attempt to look at the fate of the different counties. My idea here is to set the number of cases in all counties on May 14, 2020 (start date of Stage 1) to 100% and see how counties evolve in terms of number of new daily cases.
On top of the figure below, I represent the cumulative, 7-day average (*) daily new COVID-19-confirmed cases in the different counties of Maryland. The chart at the bottom assigns the number of daily cases on May 14, 2020 to 100% for each county and follows the % evolution over the next days. In this chart, the blue lines represent counties that follow Stage 1 (e.g. Garrett or Kent), the green lines represent counties that partially follow Stage 1 (e.g. Anne Arundel or Frederick) and the red lines represent counties that remain “closed” (Baltimore City, Charles, Prince Georges and Montgomery). The counties that remain closed are the ones that have the most cases and deaths.
I must say that 6 days after Stage 1 (May 20), there is no clear trend. First, it’s normal because any downward or upward trend in number of cases will take a few days to appear (transmission or absence of transmission, incubation, decision to consult and tests, and lag in test reporting). It’s too early to see something. We will also see a confounding factor with the recent decision by the Governor to allow testing of people who do not present any symptoms (in some testing sites). Nevertheless, I was expecting to “see something”; here it just seems it’s the same.
But another reason for “not seeing anything” might be that the cases are not a relevant metric. We can already see that it is fluctuating widely every day. There are even days when fewer cases were reported than the day before (it might have been a data entry error on my side). The only other parameter that the MDH displays in its dashboard is the number of deaths by counties. I plotted this and it’s the same bizarre chart. How to improve this? Any idea? Don’t hesitate to comment below or to send me an email.
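The indexing step can be sketched as follows. The county names and values are placeholders, not the Maryland data, and the function name is mine; the first value of each series stands for the baseline date (May 14 in the chart).

```python
def index_to_baseline(series, baseline_index=0):
    """Express every value as a percentage of the value on the baseline day."""
    base = series[baseline_index]
    if base == 0:
        raise ValueError("baseline value is zero, the index is undefined")
    return [100 * value / base for value in series]


# Placeholder 7-day average daily cases, first value = baseline day (NOT real data).
daily_cases = {
    "County X": [50, 40, 60, 55],
    "County Y": [4, 5, 3, 4],
}

for county, series in daily_cases.items():
    print(county, index_to_baseline(series))
```

One design caveat: small counties start from a small baseline, so one or two extra cases move their index by tens of percent (County Y above jumps from 100% to 125% with a single extra case). This may be part of why the lines fluctuate so widely.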
Update on May 24, 2020: I updated the chart of cases after Stage 1 (see above). Currently the confidence intervals (the shades) are so overlapping that differences that we could see are meaningless. Cases may not be a good metric.
I also created the same chart for deaths (see below). Here we see clearly a positive picture: in all counties that are partially open or closed, the mean number of deaths is decreasing. Note however that we are only 10 days (today is 5/24) from May 15 and this may just be a trend that existed before and not something new due to the decision to remain (partially) closed.
In counties that are in Stage 1, the mean number of deaths is actually increasing. The same comment applies: it may be too early to actually see an impact of the opening (especially since deaths can occur long after case detection). Besides, the confidence intervals (the blue shades) are very wide. Hopefully things may become clearer in a few days (and for the best, given we are talking about a disease and people dying from it).
For a few weeks, I have been reporting the raw number of COVID-19 deaths in Maryland counties. If this gives an idea of the cumulative number of deaths – which is interesting – it doesn’t reflect the fact that some counties have more inhabitants than others. That’s why I plotted below the number of COVID-19 deaths adjusted for the population (i.e. the COVID-19-specific death rate):
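To give an idea of why the shades are so wide, here is a sketch of a mean with an approximate 95% confidence interval across the counties of one group on one day. It uses a normal approximation (mean ± 1.96 standard errors), which is an assumption for illustration: it is not necessarily how the shades in the charts are computed, and it is rough with only a handful of counties. The values are placeholders.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Mean and approximate 95% confidence interval (normal approximation)."""
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * std_error, mean + z * std_error


# Placeholder: % of baseline deaths in 3 counties of one group (NOT real data).
group_values = [90, 100, 110]

mean, low, high = mean_with_ci(group_values)
print(f"mean {mean:.1f}%, 95% CI {low:.1f}% to {high:.1f}%")
```

With few counties per group and large differences between them, the standard error stays large, so the intervals of the groups overlap.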
Today (May 16, 2020), in terms of absolute number of deaths, Montgomery, Prince Georges and Baltimore County are the top 3 counties (this is the same for cases but not in the same order). In terms of confirmed deaths per 100,000 population, the top 3 counties are Kent, Prince Georges and Montgomery.
For a few weeks, I have been reporting the raw number of COVID-19 cases in Maryland counties. If this gives an idea of the cumulative number of cases – which is interesting – it doesn’t reflect the fact that some counties have more inhabitants than others. That’s why I plotted below the number of COVID-19 cases adjusted for the population:
Today (May 11, 2020), in terms of absolute number of cases, Prince Georges, Montgomery and Baltimore County are the top 3 counties. In terms of confirmed cases per 100,000 population, the top 3 counties are Prince Georges, Montgomery and Wicomico (due to a recent surge in cases).
Rank on May 11, 2020
Absolute # of COVID-19 cases
COVID-19 cases per 100,000 population
Prince Georges (9,687)
Prince Georges (1,057)
Baltimore County (3,948)
Baltimore City (3,353)
Anne Arundel (2,492)
Baltimore City (544)
This is a lot given that, today, the average for Maryland is 401/100,000 (source: CDC) and the average for the US is 552/100,000 (source: OurWorldInData).