Dataviz

This afternoon I received a bunch of data accompanied by stacked bar graphs for each dataset. For example, this one: the chart shows the incidence of disease X in various age ranges, with that incidence split into 8 severity levels. It shows that the disease especially affects age ranges 4 and 5, at different severity levels. However, I didn’t feel comfortable … What are the different levels of severity in age ranges 1, 2 and 3? How can we compare levels C, D and E in age ranges 4 and 5? Is there any severity A anywhere? (It’s even worse when some age ranges don’t show any incidence at all: what is happening there?) And so on. I looked on the web but couldn’t find much information apart from the fact that “The Economist says they’re so bad at conveying information, that they’re a great way to hide a bad number amongst good ones” (yet they still use them in their Graphic detail section) or that “a stacked column chart with percentages should always extend to 100%” (which doesn’t really apply here). Then, in a post on Junk Charts, someone mentioned Stephen Few, who reportedly said “not to use stacked bar charts because you cannot compare individual values very easily and as a rule [he] avoid[s] stacked bars with more than six or seven divisions”. Stephen Few also took part in a discussion on his forum here. ...

Visualizing categorical data in mosaic with R

About stacked bar graphs