Saturday, September 01, 2007

Statistical Frustration

Okay... I must be getting old.

Anytime you find yourself saying things like, "Friggen kids these days know nothing about statistics", you pretty much know that oldness has happened. It's just driving me crazy that people don't understand what the hell "expected value" is or what the difference between the mean and the median is. Golly, people have a hard time with standard deviation, how the heck am I supposed to explain confidence in an estimate to them?

At the heart of this is the problem that most people don't get what is meant by probability or what is probable. Suppose for a minute that the VP of Customer Placation comes into your office and asks, "How long will it take to add frammerstammers to the hoopydo?"

Now, because you're not a n00b at estimation you hand him back a range.

"That'll take 5-20 days", you say. And you add, "I'm about 90% confident."

This is where about an hour and a half of useless discussion about "what the heck you mean" is going to go on because Mr. (or Mrs., Ms., Miss) VP just doesn't really want to know what is going on here. They just want a date that you promise that the hoopydo will have frammerstammers.

While willful ignorance is pathetic and deplorable (like spam) it is just a fact of life (like spam). Lecturing on probability and what it means to be confident will do you no good. They want a social contract not a lesson in Stat101.

So what to do?

Well, I think you need to know more about statistics than they do and be able to abstract away the math and crap like that from the discussion so that they come away with a good english language understanding of what you have promised them.

You need to explain that you don't know how long it is going to take. "Whadda I look like? The amazing Randy?" Then you can go on to explain that

You need to explain to them that if they make you promise to give them frammerstammers for the hoopydo on a specific date that the date will need to be out past the end of your range and that no, you can't just take the average and call it the promise. Be wary, be very wary, of giving a single number at this point. They will want to start using one. Don't fall for it.

Because the minute that you use a single number it becomes the default promise date. 3 months from now nobody but you will remember that you said 5-10 weeks. They'll only remember the 7 week number that everyone kept saying in the meeting. "Seven is in the middle, right?"

Yeah... uh... right.

If people could just grasp a couple of simple concepts from statistics this would all be so much simpler. We'd have a shared vocabulary to use when talking about estimates.

At its heart a good estimate is a range of values that you think the actual value will fall inside of. The narrower the range, the less likely you are to be right. Let me give an example. I'm going to give several estimates of the number of teeth you have.

32 is a bad estimate. It's a perfectly good guess. But a bad estimate. Because it is only correct if you have the normal number of teeth for an adult human and have not had your wisdom teeth removed.

0-50 is a better (though nearly useless estimate) since I'm almost certain that the actual number of teeth you have falls somewhere inside that range (100% confidence). Knowing nothing else I'd estimate the number of teeth you have at 25-32 and 80% confidence.

With more information my estimates get better. If you tell me that you are a 90 year old ex-hockey player I would estimate that you have 0-20 teeth. For a 16 year old bookworm I'll estimate 27-32 teeth. Again, 80% confidence.

The confidence part there is important. It helps tell you how statistically likely I think the actual value is to fall within my range. Once you start viewing these things as statements of probabilities a whole world of possibilities opens up. I'll explain some of the fantastic things that simply fall out of using probabilities in 2-5 future posts.

1 comment:

Matthew Swank said...

Are you familiar with E.T. Jaynes' "Probability Theory: The Logic of Science"? Not a standard treatment, but a wonderful (so far) explanation of a Bayesian point of view.