The only way to make money in a casino is to own one — Steve Wynn, Casino Magnate

The worst statistical summary ever?

A recent article in The Age pointed out that Melbourne’s dams are approaching 50% capacity for the first time in 4 years. They included a time series of % storage for the past 40 years, which tells the whole story pretty well, placing the last four years of drought in context, as well as the recent rains. Unfortunately, they added a statistical summary which, in my opinion, ranks with the worst ever*.

They listed, for each year, the number of days on which capacity was less than 50%. Not surprisingly, for many wet years there were no days under 50%. So there are lots of zeros. It’s almost as dumb as summarising my growth curve by listing the number of days during each year of my life I was over 5 feet tall. Bearing in mind how highly correlated the underlying measurement is from one day to the next, why would you ever try to summarise it by counting how many values fall in a range? If you were going to move away from the mean, the yearly high and low would be much more informative.
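To make the comparison concrete, here is a minimal sketch in Python. The storage figures are made up purely for illustration (four years of steadily falling levels); they are not Melbourne's data, and the names `summarise`, `ramp` and `YearSummary` are my own. It computes both summaries side by side: the count of days under the threshold, and the yearly low and high.

```python
from typing import Dict, List, NamedTuple


class YearSummary(NamedTuple):
    days_below: int  # number of days with storage under the threshold
    low: float       # lowest daily % storage in the year
    high: float      # highest daily % storage in the year


def summarise(daily_pct: Dict[int, List[float]],
              threshold: float = 50.0) -> Dict[int, YearSummary]:
    """Summarise each year's daily % storage in two different ways."""
    return {
        year: YearSummary(
            days_below=sum(1 for v in values if v < threshold),
            low=min(values),
            high=max(values),
        )
        for year, values in daily_pct.items()
    }


def ramp(start: float, end: float, days: int = 365) -> List[float]:
    """Hypothetical daily series moving linearly from start to end."""
    step = (end - start) / (days - 1)
    return [start + step * i for i in range(days)]


# Made-up data: storage falls steadily over four years.
storage = {
    1: ramp(95.0, 80.0),
    2: ramp(80.0, 55.0),
    3: ramp(55.0, 45.0),
    4: ramp(45.0, 30.0),
}

summaries = summarise(storage)
for year, s in summaries.items():
    print(f"year {year}: days under 50% = {s.days_below:3d}, "
          f"low = {s.low:5.1f}, high = {s.high:5.1f}")
```

In this toy series the day count is zero for the first two years even though storage falls from 95% to 55%, and it sits at 365 for the last year no matter how far below 50% the level goes. The yearly low and high track the decline the whole way down.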

But it gets worse. They indicated on the table when new dams were opened, including the humungous Thomson dam. Now when a dam opens it is surely empty. The state Premier cuts the ribbon when the wall and sluice gates are complete. And then the water flows in over a number of years. So with a huge dam like the Thomson, which accounts for most of Melbourne’s capacity, you would drop below 50% on the day it opens and for a few years after that while it fills. But the table doesn’t show this. In fact, the opening of the Thomson dam does not seem to affect their daft statistic “number of days under 50%” much at all. So they managed to add a confusing bit of extra information to a meaningless list of numbers.
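The arithmetic behind that expectation is simple enough to sketch. The volumes below are hypothetical round numbers, not the real capacities of Melbourne's dams, and the sketch assumes the percentage is calculated against total capacity including the new, still empty dam.

```python
def percent_full(stored_gl: float, capacity_gl: float) -> float:
    """Storage as a percentage of total capacity (volumes in gigalitres)."""
    return 100.0 * stored_gl / capacity_gl


# Hypothetical figures, chosen only to illustrate the effect.
existing_capacity_gl = 700.0   # capacity before the new dam opens
stored_gl = 560.0              # water held on opening day
new_dam_capacity_gl = 1100.0   # the new dam, assumed empty when it opens

before = percent_full(stored_gl, existing_capacity_gl)
after = percent_full(stored_gl, existing_capacity_gl + new_dam_capacity_gl)

print(f"before opening: {before:.1f}% full")
print(f"after opening:  {after:.1f}% full")
```

With these numbers the system goes from comfortably above 50% to well below it overnight, without a single litre being lost. A statistic based on a 50% threshold ought to show that jump if the new capacity is counted from opening day.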

*Unfortunately, the graphic I am describing is not on the link. It only appeared in hard copy.



4 Responses to “The worst statistical summary ever?”

  1. Andrew Robinson Says:

    Agreed! But furthermore, the percentage capacity is a meaningless statistic.

    First, it’s misleading because it refers to the absolute holding rather than the amount of potable water available. We’ll run out of water way before that number gets down to 0%.

    Second, it doesn’t tell us anything useful. I’d much rather know how many years or days of water we have left assuming current consumption patterns and zero rain.

  2. I was more impressed by The Age reporting (back in August from memory) that having friends reduced the probability of death by 50%.

  3. “I’d much rather know how many years or days of water we have left assuming current consumption patterns and zero rain.” What politician with a self-preservation instinct would ever allow reporting of a figure like that?!

  4. Nice one Ian! The Age may have been sourcing the story from here: http://www.physorg.com/news199431314.html, which is a news service that is supposed to specialise in science.
