The Optimiser’s Curse
I have recently come across an interesting paper in the journal Management Science which rings a few bells for me. It concerns decision analysis and how well we can estimate the added value of the optimal decision. The bottom line is that the same data is used to make the optimal decision as is used to estimate its added value. And this results in an overly optimistic assessment.
The paper is entitled “The Optimiser’s Curse” by James Smith and Robert Winkler. Download it HERE. It is a pretty easy read but here is a quick overview. When we do decision analysis we have to estimate the values and probabilities associated with the different decision options - the values on the various branches of the decision tree, if you like. When we identify the highest-utility decision we will typically quote the value of this optimised utility (compared to some baseline) as a measure of the value of optimising the decision. This is certainly what consultants do. Leave aside the possibility that the values and probabilities we estimated may be biased. Even if they are unbiased, we will tend to over-estimate the value of the optimal decision because we have used the same data to select and evaluate the decision.
The most extreme but simplest case is when all choices have equal true utility (zero, without loss of generality). In this case, decision analysis is pointless and has value zero. Yet if we estimate these utilities (with some unacknowledged error) and select the highest one, we will think we have gained value. But the gain is illusory. And the more choices there are, the greater this bias will be. On the other hand, if the choices have different utilities then the effect will be smaller, but it is still present.
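To make the extreme case concrete, here is a minimal simulation sketch. It assumes every choice has a true utility of exactly zero and that each estimate carries independent, unbiased Gaussian noise. The function name, the noise level and the number of trials are my own illustrative choices, not anything taken from the paper.

```python
import random
import statistics


def mean_selected_estimate(n_choices, noise_sd=1.0, n_trials=20000, seed=0):
    """Average estimated utility of the choice that looks best.

    Assumption: every choice has true utility 0, and each estimate is
    0 plus independent Gaussian noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    picked = []
    for _ in range(n_trials):
        # Unbiased but noisy estimates of utilities that are all truly zero.
        estimates = [rng.gauss(0.0, noise_sd) for _ in range(n_choices)]
        # The analyst selects the highest estimate and quotes it as the gain.
        picked.append(max(estimates))
    return statistics.mean(picked)


if __name__ == "__main__":
    for k in (1, 3, 10, 30):
        print(k, "choices: average quoted gain =", round(mean_selected_estimate(k), 2))
```

With a single choice the average quoted gain sits near zero, as it should. As the number of choices grows, the average quoted gain climbs even though the true gain is zero every time.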
The solution to this problem in a frequentist setting would be to use one set of data to select the best decision and another independent set to evaluate the utility. I am not sure at all how one would do this. Perhaps you could get two sets of experts to estimate the values and probabilities driving the decisions. But I cannot see how the two sets of estimates would be independent.
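If two genuinely independent sets of estimates were available, the split would look something like the sketch below. It again assumes all true utilities are zero and that the two sets have independent Gaussian errors, which is exactly the assumption I doubt would hold in practice. All names here are hypothetical.

```python
import random
import statistics


def split_sample_comparison(n_choices=10, noise_sd=1.0, n_trials=20000, seed=0):
    """Compare evaluating the chosen option on the same vs. a fresh estimate set.

    Assumption: true utilities are all 0 and the two estimate sets have
    independent Gaussian errors with standard deviation noise_sd.
    Returns (mean value quoted from the selection set,
             mean value quoted from the independent evaluation set).
    """
    rng = random.Random(seed)
    same_set, fresh_set = [], []
    for _ in range(n_trials):
        selection = [rng.gauss(0.0, noise_sd) for _ in range(n_choices)]
        evaluation = [rng.gauss(0.0, noise_sd) for _ in range(n_choices)]
        # Choose using the first set only.
        best = max(range(n_choices), key=lambda i: selection[i])
        same_set.append(selection[best])   # selected and evaluated on the same data
        fresh_set.append(evaluation[best])  # evaluated on independent data
    return statistics.mean(same_set), statistics.mean(fresh_set)


if __name__ == "__main__":
    same, fresh = split_sample_comparison()
    print("evaluated on selection data:  ", round(same, 2))
    print("evaluated on independent data:", round(fresh, 2))
```

The value quoted from the selection data is inflated, while the value quoted from the independent set averages out near the true value of zero. If the two sets shared any of their errors, some of the bias would come back.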
So we are driven to the Bayesian universe again. Bayesian methods can resolve any problem once you sign on to the idea - a bit like your favourite conspiracy theory. In this case, all we need do is describe the uncertainty in our estimated values and probabilities with a statistical model and also assign priors to the true utilities of the various decisions. We then use the posterior expectation of the value of the best decision, given the value estimates. And this is unbiased for simple models. How you come up with priors for your valuations though is pretty unclear to me. I would have thought that before you look at your estimated values you would have very little idea at all about the utilities of the different decisions. Perhaps you could get a panel of experts to assign values, average them, and use the mean value as the estimate and our old friend “s over root n” as a variability measure. This would cover the priors. And then you could estimate your own values and assign “gut feeling” errors to these. At this stage, it all starts to feel less like management science and more like yoga.
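For one simple model, the Bayesian adjustment can be sketched in a few lines. The sketch assumes true utilities are drawn from a common normal prior and that estimates add independent normal noise with a known standard deviation, so the posterior mean is the usual precision-weighted shrinkage of each estimate toward the prior mean. The prior and noise parameters are illustrative assumptions of mine, and the paper treats more general cases than this.

```python
import random
import statistics


def simulate_shrinkage(n_choices=10, prior_mean=0.0, prior_sd=1.0,
                       noise_sd=1.0, n_trials=20000, seed=0):
    """Compare raw and posterior-mean valuations of the selected choice.

    Assumed model: true utility ~ Normal(prior_mean, prior_sd^2) and
    estimate = true utility + Normal(0, noise_sd^2) noise.
    Returns (mean raw estimate, mean posterior mean, mean true utility)
    for the choice with the highest estimate.
    """
    rng = random.Random(seed)
    # Normal-normal conjugate update: weight placed on the estimate.
    weight = prior_sd ** 2 / (prior_sd ** 2 + noise_sd ** 2)
    raw, shrunk, truth = [], [], []
    for _ in range(n_trials):
        truths = [rng.gauss(prior_mean, prior_sd) for _ in range(n_choices)]
        estimates = [t + rng.gauss(0.0, noise_sd) for t in truths]
        best = max(range(n_choices), key=lambda i: estimates[i])
        raw.append(estimates[best])
        # Posterior expectation of the true utility given its estimate.
        shrunk.append(prior_mean + weight * (estimates[best] - prior_mean))
        truth.append(truths[best])
    return statistics.mean(raw), statistics.mean(shrunk), statistics.mean(truth)


if __name__ == "__main__":
    raw, shrunk, truth = simulate_shrinkage()
    print("raw estimate of chosen option:", round(raw, 2))
    print("posterior mean of chosen option:", round(shrunk, 2))
    print("true utility of chosen option:", round(truth, 2))
```

Under these assumptions the raw estimate of the chosen option overshoots its true utility on average, while the posterior mean tracks it closely. The catch is the one already raised: the result depends entirely on having a prior you believe.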
This paper makes me wonder how many opportunities people in our field are missing to make contributions in other related fields. I teach decision trees every term and I am very familiar with accounting for the effects of uncertainty in any analysis. Why didn’t I identify and solve this problem? How much more low-hanging fruit is out there?