Visualizing categorical data in mosaic with R

A few posts ago I wrote about my discomfort with stacked bar graphs and the fact that I prefer to use a simple table with gradients as background. My only regret then was that the table was built in a spreadsheet. I would have liked to keep the data as it is but also have a nice representation of these categorical data. This evening I spent some time analysing results from a survey and took the opportunity to build these representations in R. ...

May 15, 2012 · 2 min · jepoirrier

About stacked bar graphs

This afternoon I received a bunch of data accompanied by stacked bar graphs for each dataset. In one example, the chart shows the incidence of disease X in various age ranges. That incidence is split by 8 severity levels. The chart shows that the disease especially affects age ranges 4 and 5, at different severity levels. However, I didn’t feel comfortable: What are the different levels of severity in age ranges 1, 2 and 3? How can we compare levels C, D and E in age ranges 4 and 5? Is there any severity A anywhere? (It’s even worse when some age ranges don’t have any incidence at all: what is happening?) And so on. I looked on the web but couldn’t find much information apart from the fact that "The Economist says they’re so bad at conveying information, that they’re a great way to hide a bad number amongst good ones" (though they still use them in their Graphic detail section) or that "a stacked column chart with percentages should always extend to 100%" (this doesn’t really apply here). Then, in a post on Junk Charts, someone mentioned Stephen Few, who reportedly said “not to use stacked bar charts because you cannot compare individual values very easily and as a rule [he] avoid[s] stacked bars with more than six or seven divisions”. Stephen Few also took part in the discussion on his own forum. ...

February 8, 2012 · 3 min · jepoirrier