Sampling

January 9, 2014 Angus Andrea Grieve-Smith

In my previous post, I discussed the differences between existential and universal statements. In particular, the standard of evidence is different: to be sure that an existential statement is correct we only need to see one example, but to be sure a universal is correct we have to have examined everything.

But what if we don’t have the time to examine everything, and we don’t have to be absolutely sure? As it turns out, a lot of times we can be pretty sure. We just need a representative sample of everything. It’s quicker than examining every member of your population, and it may even be more accurate, since there are always measurement errors, and measuring a lot of things increases the chance of an error.

Pierre-Simon Laplace figured that out for the French Empire. In the early nineteenth century, Napoleon had conquered half of Europe, but he didn’t have a good idea of how many subjects he had. Based on the work of Thomas Bayes, Laplace knew that a relatively small sample of data would give him a good estimate. He also figured out that he needed the sample to be representative to get a good estimate.

“The most precise method consists of (1) choosing districts distributed in a roughly uniform manner throughout the Empire, in order to generalize the result, independently of local circumstances,” wrote Laplace in 1814. If you didn’t have a uniform distribution, you might wind up getting all your data from mountainous districts and underestimating the population, or getting data from urban districts and overestimating. Another way to avoid basing your generalizations on unrepresentative data is random sampling.
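To make the mountain-versus-city problem concrete, here is a rough Python sketch. It is not Laplace’s actual calculation: the empire, the district types, the population ranges, and the helper names `make_districts` and `estimate_total` are all invented for illustration. It simply scales the average population of 100 sampled districts up to all 1,000 districts, and compares a random sample with samples drawn only from mountain districts or only from urban ones.

```python
import random

def make_districts(rng):
    """Build a made-up empire as (kind, population) pairs. All numbers are invented."""
    districts = [("mountain", rng.randint(200, 800)) for _ in range(300)]
    districts += [("rural", rng.randint(1000, 3000)) for _ in range(600)]
    districts += [("urban", rng.randint(8000, 20000)) for _ in range(100)]
    return districts

def estimate_total(sample, n_districts):
    """Scale the mean population of the sampled districts up to all districts."""
    mean_population = sum(pop for _, pop in sample) / len(sample)
    return mean_population * n_districts

rng = random.Random(1814)  # fixed seed so the run is repeatable
districts = make_districts(rng)
true_total = sum(pop for _, pop in districts)

# Three ways of choosing 100 districts out of 1,000.
random_sample = rng.sample(districts, 100)
mountain_sample = [d for d in districts if d[0] == "mountain"][:100]
urban_sample = [d for d in districts if d[0] == "urban"][:100]

random_estimate = estimate_total(random_sample, len(districts))
mountain_estimate = estimate_total(mountain_sample, len(districts))
urban_estimate = estimate_total(urban_sample, len(districts))

for label, estimate in [
    ("random", random_estimate),
    ("mountain", mountain_estimate),
    ("urban", urban_estimate),
]:
    print(f"{label:>9}: {estimate:>12,.0f}  (true total {true_total:,})")
```

With these invented ranges, the mountain-only estimate always comes out too low and the urban-only estimate always comes out too high, because every mountain district is smaller than the overall average and every urban district is larger. The random sample can still miss, but it has no built-in reason to miss in one direction.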

A lot of social scientists, including linguists, understand the value of sampling. But many of them don’t understand that it’s representative sampling that has value. Unrepresentative samples are worse than no samples, because they can give you a false sense of certainty.

A famous example mentioned in Wikipedia is when a Literary Digest poll forecast that Alfred M. Landon would defeat Franklin Delano Roosevelt in the 1936 Presidential election. That poll was biased because the sample was taken from lists of people who owned telephones and automobiles, and those people were not representative of the voters overall. The editors of the Literary Digest were not justified in generalizing from that sample to universal statements about the electorate as a whole, and thus failed to predict Roosevelt’s re-election.
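The same trap can be sketched in a few lines of Python. The numbers below are invented, not the real 1936 figures: I assume an electorate of 10,000 in which 3,000 “owners” (of telephones and automobiles) favor Landon 60 to 40, while the other 7,000 voters favor Roosevelt 70 to 30. The function `landon_share` is a hypothetical helper. The point is that a large poll drawn only from owners still gets the wrong answer, while a much smaller random poll does not.

```python
import random

# Invented electorate of 10,000 voters (not the real 1936 figures).
owners = ["Landon"] * 1800 + ["Roosevelt"] * 1200      # 60% Landon
non_owners = ["Landon"] * 2100 + ["Roosevelt"] * 4900  # 30% Landon
electorate = owners + non_owners

def landon_share(votes):
    """Fraction of the given votes that go to Landon."""
    return votes.count("Landon") / len(votes)

rng = random.Random(1936)  # fixed seed so the run is repeatable

true_share = landon_share(electorate)

# A big poll drawn only from the owner lists, and a small poll of everyone.
big_biased_poll = landon_share(rng.sample(owners, 2500))
small_random_poll = landon_share(rng.sample(electorate, 500))

print(f"True Landon share:          {true_share:.1%}")
print(f"Poll of 2,500 owners:       {big_biased_poll:.1%}")
print(f"Random poll of 500 voters:  {small_random_poll:.1%}")
```

In this toy setup the owner-only poll points to a Landon win even though he has well under half of the full electorate. Making the biased sample bigger only makes the wrong answer look more precise.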

“Average Italian Female” by Colin Spears

What can be deceiving is that you get things that look like averages and percentages. And they are averages and percentages! But they’re not necessarily averages of the things you want an average of. A striking example comes from a blogger named Colin Spears, who was intrigued by a “facial averaging” site set up by some researchers at the University of Aberdeen (they’ve since moved to Glasgow). Spears uploaded pictures from 41 groups, including “Chad and Cameroonian” and created “averages.” These pictures were picked up by a number of websites, stripped of their credits, and bundled with all kinds of misleading and inaccurate information, as detailed by Lisa De Bruine, one of the creators of the software used by Spears.

Some bloggers, like Jezebel’s Margaret Hartmann, noted that the “averages” all looked to be around twenty years old, which is not the median age for most countries according to the CIA World Factbook (which presumably relies on better samples). In fact, the median age for Italian women (see image) is 45.6. The face in the image looks to be in its twenties, because that’s the age of the women in the images that Spears uploaded to the Aberdeen site. So we got averages of some Italian women, but nothing that actually represents the average (of all) Italian women. (Some blog posts about this even showed a very light-skinned face for “Average South African Woman,” but that was just a mislabeled “Average Argentine Woman.”)

Keep this in mind the next time you see an average or a percentage. What was their sampling method? If it wasn’t uniform or random, it’s not an average or percentage of anything meaningful. If you trust it, you may wind up spreading inaccuracies, like a prediction for President Landon or a twentysomething average Italian woman. And won’t your face be red!