At their core, statistical analyses always involve making a comparison of some kind. We might want to know how voter turnout differs between Democrats and Republicans. Scientists might test the effects of a new treatment by comparing an experimental group to a control group. Or we might just want to see how global temperature has changed over time. We could think up endless examples, but the key point is that analytics involves making comparisons of some sort.
It makes sense, then, that Principle #1 in making effective data visualizations is to Show Comparisons, Contrasts, Differences (in an interesting and appropriate way). A good display will show how two or more things (groups of people, treatment levels, years, states, etc.) compare to one another. Not only that, but our visualizations should draw out this comparison in a way that is engaging to the reader. Republicans might be best represented in red, for example, and Democrats in blue. Or if only one group differs from a number of other groups in some meaningful way, it would be sensible to draw attention to that one group visually (perhaps by coloring its line on a line chart, while keeping all the other lines black). So, in sum, a good visualization not only represents a comparison in the data, but does so in an interesting and sensible way.
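To make the "color one line, keep the rest black" idea concrete, here is a minimal sketch in Python. It assumes matplotlib for the drawing step. The function names (highlight_styles, plot_highlighted), the group names, and the numbers are all made up for illustration; this is just one reasonable way to do it, not the only one.

```python
def highlight_styles(groups, focus, focus_color="crimson", base_color="black"):
    """Map each group name to line-style keyword arguments, emphasizing `focus`."""
    styles = {}
    for name in groups:
        is_focus = name == focus
        styles[name] = {
            "color": focus_color if is_focus else base_color,
            "linewidth": 2.5 if is_focus else 1.0,
            "zorder": 3 if is_focus else 2,  # draw the focus line on top
        }
    return styles


def plot_highlighted(series, focus, path="highlight.png"):
    """Draw one line per group and save the figure; requires matplotlib."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, style in highlight_styles(series, focus).items():
        ax.plot(series[name], label=name, **style)
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Made-up values purely for illustration (not real data).
example = {
    "Group A": [1, 2, 3, 4],
    "Group B": [1, 2, 2, 3],
    "Group C": [1, 4, 7, 11],  # the group that differs from the others
}
styles = highlight_styles(example, focus="Group C")
```

Calling plot_highlighted(example, "Group C") would then save a chart in which only Group C's line is colored and thickened, so the reader's eye lands on the comparison that matters.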
Take a look at this example of a poor visualization. I got this from the site WTF Visualizations (which can help you learn what not to do in data science, or can at least give you a few laughs!). Anyway, this violates our first principle in a number of ways.
First, there does not seem to be any comparison being made in the display. The visualization simply lists the functions and benefits of vitamin B3. No comparisons, no contrasts, nothing. This makes for really boring content to visualize right from the get-go. Even the slickest graphics in the world would not make this “story” (if you can call it that) interesting.
Not only that, but much to the reader’s confusion, parts of this visualization make it seem like a comparison is being made where there is none. For example, two of the statements hint at having an empirical basis (“significant decrease in heart disease” and “reduction in the risk of Alzheimers” both use terminology that sounds like statistical findings from a research publication), whereas the other two do not (“Assists in the energy production of cells” and “promotes healthy skin” are broader and read more like conjecture). The reader is left wondering why the statements are phrased differently at all. Adding to the confusion, the two sets of statements also differ grammatically: one set begins with verbs (“assists” and “promotes”), whereas the other is built around nouns (“reduction” and “decrease”). In both cases, the inconsistency misleads the reader into thinking a comparison is being made between the sets of statements (when, in fact, there is none). This is not only annoying, but costs the reader energy and time. And always remember: No good visualization should make the reader do more work than is necessary.
The second major flaw in this visualization (which, to me, is actually more bothersome than the lack of comparisons) is that the designers present the four statements in a Venn-diagram style, which implies conceptual overlap. Yet there is no such overlap (nor would we expect any in a display that simply lists a few facts). If the designers were going to stick with the bland content, they should at least have thought up a more appropriate way to represent their facts. I actually think a simple list format (albeit extremely boring) would be more suitable in this case. There are many, many good ways to format these statements, and four overlapping circles are not one of them. Leave that for scenarios in which four statements do overlap in some meaningful way.
The one and only area in which this visualization might be said to succeed is in the choice of using coffee (a food source that is high in vitamin B3) to represent the function of vitamin B3. I find this to be an appropriate use of imagery. And I also like that it separates the function of vitamin B3 from its benefits (by presenting the coffee mug as distinct from the three transparent circles). All that said, in my opinion, these points of strength don’t outweigh the major flaws of this visualization.
Remember, our Principle #1 states that effective visualizations 1) make a comparison, and 2) represent that comparison in an appropriate and interesting way. The visualization above meets neither of these criteria and is therefore a poor visualization.
So far we’ve seen how not to demonstrate Principle #1. In the next entry we will turn briefly to a visualization that is a good demonstration of our principle. Stay tuned!