Saturday, October 6, 2012

How To Lie About Statistics

I left you the other day with a homework problem one of my students brought me. I am often dismayed by what goes on in the education system, but this little item sums it all up pretty nicely for me. I'll let you read the problem, and see if it bothers you in any way. Then I'll tell you what's wrong with it.

Here is the problem:

"You're going to a two-day conference and can't decide what shoes to pack. You own 5 pairs of flats, 7 pairs of heels, and 8 pairs of boots. To save time you decide to randomly pick your shoes. If you sample two pairs of shoes, one at a time, with replacement, what is the probability you will get a pair of heels and a pair of boots in that order"

I always like word problems in math. The nice thing about word problems is they don't (or shouldn't) tell you what formula to use: you have to understand what's going on and put two and two together for yourself. And of course, that's why a lot of students don't like word problems. It's not that they don't like to think: it's that the education system has never really encouraged them to think for themselves. The system is all about teaching you the rules for how to solve specific problems. If you are a good student, and learn the rules, you will be able to solve the problems. Within this paradigm, it is completely irrelevant whether you understand what you are doing. The system puts a premium on the ability to follow instructions.

Except when it comes to word problems. That is where a small complication creeps in: you have to interpret the problem and decide which formula to use in solving it. Students who are conditioned to learning by rules are uncomfortable with this, and so the teachers cater to them. They give a whole worksheet of word problems all based on the same formula, so you don't really have to think about what's going on; you just have to pick out the relevant numbers. It's fundamentally dishonest, because it pretends to encourage students to interpret math realistically, whereas it actually ends up being all about plugging numbers into formulas.

That's all very well for me to make these claims, but what is it about this particular problem that I find so offensive? Well, let's read it over again. I was okay about the woman needing to pick two random pairs of shoes, although it really is a very awkward proposition. We all know how to pick a random card out of a deck, or a handful of Scrabble tiles from a bag. But how do you pick a random pair of shoes out of a closet? Do you close your eyes and reach in, fumbling around on your hands and knees? It really doesn't make a convincing scenario, but for the sake of the math, we can try to work around it. We might imagine that all her shoes are in boxes, and the boxes are lined up in a row on her shelf, in random order. I don't know any woman who stores her shoes that way, but let it be.

The red flags start to fly when the jargon words appear: we are told the woman samples two pairs of shoes. How the hell do you sample a pair of shoes? Do you take a bite out of the heel? It doesn't make any sense. Unless...you ignore the whole business about the woman going on the trip and you flip through your math book for a formula that relates to something about "sampling". I don't know any such formulas. I understand the idea of probabilities, and I can figure out how to calculate them in all kinds of situations, but I really don't know any formulas about sampling. So I'm starting to get annoyed.

Then it gets worse: we are told that the woman samples her shoes "one at a time, with replacement". What does this even mean? Surely she selects her shoes two at a time...in pairs, that is. She "randomly" chooses a pair of boots, and then a pair of flats or whatever. Surely we don't expect her to take a left heel and then a right flat. Why do they tell us she samples them one at a time?

And what does it mean when they tell us she samples them "with replacement"? She is going on a trip. She takes two pairs of shoes from her closet. What could "replacement" mean in that context?

It took me some time, but all at once I saw what was going on. The students were given a formula for "sampling with replacement", and the professor put those code words into the problem statement so the students would know which formula to use. The whole story about the woman who needs to pack for a trip has nothing to do with anything. The clincher is the last line of the problem, where it asks "what are the chances she will get a pair of heels and a pair of boots....in that order?"
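For the record, here is roughly what that formula wants you to compute, written as a small Python sketch. The function name is my own invention, and I'm assuming every one of the 20 pairs is equally likely on each draw, which is what "randomly" presumably means here.

```python
from fractions import Fraction

# Counts of each kind of shoe, taken from the problem statement.
SHOES = {"flats": 5, "heels": 7, "boots": 8}

def ordered_with_replacement(first, second, counts=SHOES):
    """Chance that draw 1 is `first` and draw 2 is `second`.

    Hypothetical helper: with replacement the closet is the same
    for both draws, so the two chances simply multiply.
    """
    total = sum(counts.values())  # 20 pairs in all
    return Fraction(counts[first], total) * Fraction(counts[second], total)

answer = ordered_with_replacement("heels", "boots")
print(answer, float(answer))  # 7/50 0.14
```

That is (7/20) times (8/20), and nothing in the calculation cares that these are shoes or that anyone is going on a trip.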

What difference does the order make if she's packing for a trip? All you care about is what she ends up with. It's true that in probability and statistics, sometimes the order matters and sometimes it doesn't. But the beauty of math is that when something matters, it matters for a reason. In this case it doesn't matter because once you have your two pairs of shoes, it doesn't matter what order you packed them in. So to write up a math problem where the order doesn't matter, and then at the very end to tell the student to use the formula for when it does matter....well, that's a total perversion of everything that math is supposed to be all about.
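To see how much the words "in that order" change the answer, you can simply count cases. The sketch below is mine, not anything from the course: it takes the problem's rules at face value (two draws, with replacement, all 20 pairs equally likely) and lists every possible ordered draw.

```python
from fractions import Fraction
from itertools import product

# One label per pair of shoes in the closet: 5 + 7 + 8 = 20 pairs.
closet = ["flats"] * 5 + ["heels"] * 7 + ["boots"] * 8

# Every equally likely (first draw, second draw) outcome, with replacement.
draws = list(product(closet, repeat=2))  # 20 * 20 = 400 outcomes

# Heels first, boots second: the case the problem asks about.
in_order = sum(1 for a, b in draws if (a, b) == ("heels", "boots"))

# One pair of heels and one pair of boots, in either order:
# the case a person packing a suitcase would care about.
either_order = sum(1 for a, b in draws if {a, b} == {"heels", "boots"})

p_in_order = Fraction(in_order, len(draws))
p_either_order = Fraction(either_order, len(draws))
print(p_in_order, p_either_order)  # 7/50 7/25
```

Dropping the order requirement doubles the chance, from 7/50 to 7/25, so the phrase is not harmless decoration: it picks out a different answer from the one the story calls for.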

It's a perversion because it tells the student in no uncertain terms: don't try and think about what the problem means. Don't try to make sense of what is going on. Just use the formula you were taught in class. Otherwise you will fail.

I hope you see why I don't like this problem. But I'm not done yet. There is one more outrageous aspect to this question that I haven't yet explained, although it's been mentioned in passing. Do you know what it is? I'll let you think about it....
*
*
*
*
*
*
*
*
*
*
*
spoiler alert
*
*
*
*
*
*
*
*
*
*
*
*
...did you figure out what's wrong about "sampling with replacement"? I'll take up this topic when we return.

