I think it is much more interesting to live with uncertainty than to live with answers that might be wrong —  Richard Feynman

## An Outstanding Graphic

If statistics is the process of turning data into information, then our most useful tool is the graphic (intermediated, of course, by a model). I would be interested and grateful if readers of the blog could point me to what they consider to be the most successful and/or innovative graphics they have seen. Links would be especially useful so that I can collect them for another post. The example below has been claimed to be the best statistical graphic ever drawn! The man who made that claim obviously lived in an age of hyperbole, as he was himself described as “the Leonardo da Vinci of Data”! Anyway, it is a pretty impressive graphic and I want to describe it below.

The chart describes the tragedy of Napoleon’s Russian campaign of 1812. It was created some 50 years after that war by Minard, during his retirement after a successful career as a civil engineer. Part of his job had involved displaying information, and he had developed innovative techniques for displaying flows, which he adapted to display the flow of troops.

Minard’s chart shows six types of information: geography, time, temperature, the course and direction of the army’s movement, and the number of troops remaining. The outward march of troops is in gold and the return journey in black. The width of the band is proportional to the number of troops, so wherever you see the band narrow you are seeing mass casualties. (The Grand Army left Poland with a force of 422,000; only 100,000 reached Moscow; and only 10,000 returned.) This in itself is pretty clever, I think: you see both the path and the number of soldiers. You can see how the troops tried, and mostly failed, to cross the Bérézina river as the width of the black line halves: another 20,000 or so gone. The French now use the expression “C’est la Bérézina” to describe a total disaster. Geographical features and major battles are marked and named, and temperatures on the return journey are shown along the bottom, so you can see where low temperatures resulted in deaths.
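To make the proportionality concrete, here is a rough sketch in Python that uses only the three troop figures quoted above. The maximum band width of 40 (in arbitrary units) and the function name `band_width` are my own assumptions for illustration, not anything taken from Minard's chart.

```python
# Troop counts quoted in the text above.
START = 422_000
troops = {
    "left Poland": 422_000,
    "reached Moscow": 100_000,
    "returned": 10_000,
}

MAX_WIDTH = 40.0  # assumed width (arbitrary units) of the band for the full force


def band_width(n, n_max=START, max_width=MAX_WIDTH):
    """Width of the flow band, linearly proportional to the troop count."""
    return max_width * n / n_max


# Width of the band at each stage of the campaign.
widths = {stage: band_width(n) for stage, n in troops.items()}

for stage, n in troops.items():
    share = n / START
    print(f"{stage:>15}: width {widths[stage]:5.2f}, {share:.1%} of the starting force")
```

On these figures the band at Moscow is under a quarter of its starting width, and the band that returns is a sliver of about one fortieth.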

In 1871, the year after Minard died, his obituarist cited particularly his graphical innovations:

For the dry and complicated columns of statistical data, of which the analysis and the discussion always require a great sustained mental effort, he had substituted images mathematically proportioned, that the first glance takes in and knows without fatigue, and which manifest immediately the natural consequences or the comparisons unforeseen.

The particular chart of the Russian Campaign of 1812 is singled out for special mention: it “inspires bitter reflections on the cost to humanity of the madnesses of conquerors and the merciless thirst of military glory”. I would especially like to see a chart of the casualties in Iraq over the past five years and have it reproduced on the front page of The Australian.

I admit that the original graph reproduced below is a little difficult to see, which is a fatal weakness for a graphic! A higher definition version is here and a modern version of this same graph is available on Encyclopaedia Britannica Online. As mentioned in the preamble, Edward Tufte, whose book “The Visual Display of Quantitative Information” was once a bible to statisticians, calls it “the best statistical graphic ever drawn”. What do you think? Can you find a better example?

### 12 Responses to “An Outstanding Graphic”

1. Harry Beck’s map of the London Underground has certainly been an influential graphic. Wikipedia has a brief history: http://en.wikipedia.org/wiki/Tube_map

2. Melanie Bahlo Says:

A good picture is worth a thousand words, isn’t that what they say?

Over Christmas I had a look through the Christmas issues of “Scientific American” and the “New Scientist” and found the following link to a lecture on world demographics by Hans Rosling, a Swedish medical scientist.

It is well worth a look. Here it’s the animations in particular which leave you with the important messages. Hans Rosling is a passionate presenter with charisma.

Cheers,
Melanie

3. My favourite graph is the train timetable by Marey. Appears in Tufte and also on Michael Friendly’s website: http://www.math.yorku.ca/SCS/Gallery/milestone/thumb6/

4. Janet Walstab Says:

I currently have a book on ILL, “The Visual Display of Quantitative Information” (2nd Ed) by Edward R Tufte, Graphics Press, Cheshire, Connecticut. The graphic included in your article is reproduced in this book. It contains many imaginative examples of graphics - some so imaginative that the meaning can be rather obscure. I particularly like the one on P73 (bottom).

Janet Walstab, Murdoch Childrens Research Institute.

5. “I would especially like to see a chart of the casualties in Iraq over the past five years” .. do we have those data points, Chris? Has the survey that produced the 600,000 estimate been repeated?

I saw recently on Stat-L that Leland Wilkinson (ex “the Grammar of Graphics” and SYSTAT) is giving a free seminar in Melbourne:

A free presentation by Dr. Leland Wilkinson and Professor T. Krishna:

• Tuesday 19th February, 9:30 - 11:30, Parkview Hotel, 562 St Kilda Rd, Melbourne.
• Thursday 21st February, 9:30 - 11:30, Vibe Hotel, Rushcutters Bay, Sydney.

Leland Wilkinson will present his research and talk about the development of visual analytics and statistics. This research has been published recently in IEEE Transactions on Visualization and Computer Graphics and presented at Harvard University, the University of California, Berkeley and the American Association for the Advancement of Science.

http://www.hearne.com.au/products/systat/details/781/

so you might ask him about the “best graphic”. There is a collection of links (data visualization, not “statistical graphics”) in 2 of my posts

which you might find of use or interest (included are links to R graphics, and Graphviz), and also
http://infosthetics.com/ and http://abeautifulwww.com/ and “manyEyes” http://services.alphaworks.ibm.com/manyeyes/home

which has some featured visualizations. Of course Tufte has his own (large) site, where you can find some graphics that he considers to be “brilliant”, including a link to the Amanda Cox graphic HERE.

Well, I could go on.

ManyEyes has an interesting statement “Many Eyes is a bet on the power of human visual intelligence to find patterns”. I am not sure if I would bet for or against that proposition .. at least not without it being heavily qualified.

These days, very large and complex graphs can easily be produced, and I think they can overwhelm the visual cortex .. certainly mine, at least. (Btw, the Minard graph contains a fair amount of text; a lot of modern graphics do not.) I personally like extremely simple graphs, graphs that display order relationships (essentially not much more than a proportionately spaced ordered list) or graphs that have a few points in 2D. I don’t, for instance, warm to the Cox graph linked to above. I don’t like most graphs produced by commercial packages, don’t like pie charts, don’t much like bar charts .. I’d rather just have the data, suitably ordered.

There is some theoretical/biological support for “simplicity” (in User Interface and Web design and presumably for any graphical representation) in Vision and Art: The Biology of Seeing by Margaret S. Livingstone .. briefly, that there are two types of vision system of which the “Where” system is much faster and is sensitive to contrast, movement, direction and edges. The slower “What” system is sensitive to color, detail and texture. There may be some clues there.

meta comment deleted (font fixed)

6. hmm, curious .. the font is not fixed in my version of Firefox (2.0.0.11) but it seems fine in Opera and IE. Usually if there is an “issue” it is IE that is out of step with the others. Anyway, if you can’t fix it, I am sure we can live with it.

7. Megan Pledger Says:

Here is a graphic that is representing wind direction over time (in a home weather station, close to where I live).
http://zl2ts.net.nz/dirplot.gif

The points closest to the centre are the oldest points.

Anyone have any ideas about how to represent the data better i.e. so that the variation in wind direction doesn’t look like it is increasing with time. (Although, in this particular graph the wind looks like it is getting more variable with time.)

8. I quite liked this graphic on the odds of dying
“http://chance.dartmouth.edu/chancewiki/index.php/Image:Odds_dying.jpg”
.. it would be quite nice, Chris, if we could see some of these favourite graphics on this post, but I don’t believe Wordpress has a feature for an upload file (except for the administrator, of course).

This was referenced from Chance News 21 http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_21
which also has a big/complex/maybe attractive graphic on “Death and Taxes” (ex boingboing), and a simplified one on mandatory spending.

9. Just looking at Megan’s circular wind direction graph: it has the artefact that points closer to the center are more radially compressed, which gives a false sense that wind direction is getting more variable over time. It seems to me that a simple line chart would avoid this artefact: put time on the x axis, center the series on its mean (in this case about 10 degrees, say) and plot all points as deviations from the mean, e.g. a NW point would be treated as a deviation of -55 degrees from the mean of +10 degrees.
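A minimal sketch of that idea in Python, under some assumptions of mine: the readings are made up (they are not taken from the weather station plot), the function names are hypothetical, and I center on the circular (vector) mean rather than the arithmetic mean, since an arithmetic mean misbehaves for directions near 0/360 degrees.

```python
import math


def circular_mean(directions):
    """Mean direction in degrees [0, 360), computed by averaging unit vectors."""
    s = sum(math.sin(math.radians(d)) for d in directions)
    c = sum(math.cos(math.radians(d)) for d in directions)
    return math.degrees(math.atan2(s, c)) % 360.0


def signed_deviation(direction, center):
    """Deviation of direction from center, wrapped into [-180, 180) degrees."""
    return (direction - center + 180.0) % 360.0 - 180.0


# Hypothetical wind directions in degrees, in time order (315 = NW).
readings = [350.0, 5.0, 20.0, 315.0, 30.0, 40.0]

center = circular_mean(readings)
deviations = [signed_deviation(d, center) for d in readings]

# These (time index, deviation) pairs are what the line chart would plot.
for t, dev in enumerate(deviations):
    print(t, round(dev, 1))
```

With a center of +10 degrees, `signed_deviation(315.0, 10.0)` gives -55.0, matching the NW example above.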

I can see a difficulty, however, if one wished to compare multiple such series: the ambiguity arising from where two series which have means close to 0/360 degrees should be placed .. perhaps the only solution then would be to inscribe them on a cylinder!

There might be some insights in Time Series Analysis of Circular Data
N. I. Fisher, A. J. Lee
Journal of the Royal Statistical Society. Series B (Methodological), Vol. 56, No. 2 (1994), pp. 327-339

fwiw

10. Just to be a stick-in-the-mud, the one piece of information included in Minard’s graph that is difficult to grasp is time. That’s because it is not included graphically. When looking at the graph I found myself wondering how quickly the army was moving on the coldest days and when it suffered the greatest losses. As the time points were indecipherable to me, I could not answer this. I thought the graphic could be improved by using tick marks equi-spaced in time (end of each day) and labeled every Sunday (say). Of course Minard may not have had this level of precision in his data.

11. To John Aitchison,

The link to Amanda Cox’s graphic didn’t work for me. I suspect it is because of the full stop “.” at the end of the URL. (Link fixed by administrator)

Also, Hans Rosling’s World Health has a downloadable application that can be used to explore and compare the public health of different countries and regions over time. The application is regularly updated. I remember being able to examine the health populations by economic quintiles within nations and being struck by how much inequality there was in (say) Brazil versus (say) Japan.

12. for those interested in where-what vision systems, I found an article at evolt:

To Bruce Tabor: http://data.un.org/ has health data, and a lot more