A biography of my grandfather from a local business club

The Scotch Referendum

The American linguist Lauren Hall-Lew, currently living in Edinburgh, was musing on Twitter recently about how both Scotch and Oriental are considered offensive when categorizing people, but not offensive when describing alcohol or rugs. Her main point is valid and very important: as I’ve discussed before, emotions can run very high when discussing how to categorize people, because so much more is at stake.

A bunch of us derailed the discussion by questioning Hall-Lew’s assertion that scotch was offensive, and I want to continue that derail here. I had heard that assertion, but never from my father, whose own father came from Dundee in 1909 and whose mother was descended from “Scotch-Irish” immigrants from County Antrim.

Hall-Lew provided links to Wikipedia, the Grammarphobia Blog and the Urban Dictionary, which all reported that “many Scots” objected to using Scotch to categorize them, sometimes on the basis that “scotch is a type of liquor.”

The Wikipedia and Grammarphobia articles are particularly intriguing, because they tell us that the taboo declaration for Scotch describing people has been contested for its entire life, beginning with Robert Burns who said “The appellation of a Scotch Bard, is by far my highest pride; to continue to deserve it is my most exalted ambition.” While people in Scotland seem fairly united on declaring Scotch taboo to refer to people, people of Scottish/Scotch heritage living in North America have shown strong resistance to the taboo. Canadian politician Tommy Douglas referred self-deprecatingly to “my thick Scotch head.”

This controversy over Scotch reminds me of similar verbal hygiene practices among another group that I belong to, the transgender community. If you talk to certain people, as Jessica Roy of Open Source TV did, you can get the impression that we’re united in declaring transvestite and tranny taboo, and we love the word cis.

But just as with transvestite, nobody actually went around Scotland asking people if they agreed on this. There were simply some “community leaders” who decided that Scotch was bad, and convinced everyone who had any significant power in Scotland to go along with this. But they didn’t think to go talk to my grandfather, or Tommy Douglas, or any of the other people of Scottish heritage living outside Scotland.

Overvaluing the opinions of vocal “community leaders” can get you into trouble. For over a century “everyone knew” that if you were really from San Francisco you didn’t call it Frisco, you called it San Fran or SF or something. There was even a Don’t Call it Frisco Laundromat. But on New Year’s, Joe Eskenazi in the SF Weekly found that not only were younger residents embracing the name Frisco, but that it had long been popular among the city’s African American residents – it was primarily rejected by white people.

As I discussed in my post, you can’t have a vote of the transgender community because so many of us are in the closet or stealth. You could actually have a vote of the “Scottish community,” and in fact residents of Scotland will vote in September on whether Scotland should become an independent country again. There is some controversy over whether expatriates born in Scotland, including some 800,000 living in other parts of the United Kingdom, should be allowed to vote. They could even open up the vote to expatriates of Scottish ancestry like myself, but it doesn’t look like that will happen.

If you can have a vote on independence, you can at least have an opinion poll, based on a decent sample, on issues like the usage of Scotch. This might not have been feasible a hundred years ago, but it is certainly doable in this day and age for linguistic researchers to partner with opinion pollsters on questions of the acceptability of certain terms.

Over email, Hall-Lew clarified that by saying that Scotch “is offensive,” she didn’t mean to signal that she was accepting the word of “the community” on verbal hygiene issues. She was simply pointing to the existence of ideologies that mark Scotch and Oriental as offensive with regard to people. This kind of nuance is difficult to convey on Twitter, which is why I followed up with tweets and emails.

I actually believe that it’s very difficult, even among well-informed linguists, to say that something “is offensive” without implying that it’s a universally held opinion in the community. In public, it’s well-nigh impossible; someone will always assume that the group is united on this issue.

I’ve observed, particularly on Tumblr, that this is in fact how some of these ideas are transmitted: someone will declare that transgendered is offensive, and unless that is challenged it will be taken by others as a statement of community consensus. Even if, as with Frisco, the consensus leaves out the city’s black population.

For linguists (and grammarians and encyclopedia editors), especially those who try to be impartial observers, there is an observer’s paradox here. Just by stating that something “is offensive,” we can reinforce that ideology. We should be aware of this, and take care with our words. At the very least we should say “is considered offensive by some.” We can take steps to identify the most vocal opinion-makers. And if we’re really interested, we can verify the extent to which the population in question really agrees with a particular opinion or not.

The case of “Frisco” shows clearly that this is not a pedantic matter of crossing “i”s and dotting “t”s. It’s a matter of basic fairness. If people hear that “San Franciscans” don’t like Frisco, it excludes black San Franciscans and implies that they don’t matter. If they hear that people don’t like Scotch, they get a message that people in the Scottish diaspora don’t matter. That’s not right.

Choose your Own Speech Role Model

In a couple of recent posts I talked about the idea of speech role models for language learning, specifically on fluent, clear non-native speakers providing more accessible models for students learning after the teenage years. I ended with a caution against “cloning” a single non-native speaker, raising the specter of a class of students who all come out speaking English like Javier Bardem. I believe this can be avoided by giving students a greater range of options for role models, and a greater role in choosing them.

Again, I can speak from personal experience in this area. As a second-language learner of French and later Portuguese I chose a variety of speech role models. No one has ever said I sound like Jacques Dutronc or Karl Zéro when speaking French, but I was motivated to reach for those goals because I believed I could sound kind of like them.

Thinking back on my speech role models for French, and even for my native English, it is clear that my unique voice is a result of having a diversity of speech role models, and that my comfort with my voice is due to the fact that I chose all those role models. I sound like me because I sound like a combination of several people that I have admired over the years.

As language teachers, we owe it to our students not to turn them into Javier Bardem clones, or to discourage those who feel like they could never be Bardem. John Murphy’s study of reactions to Bardem is valuable because it establishes that a non-native speaker can be an acceptable role model, but we can’t stop at him, or even at the other fourteen that Murphy lists in his Appendix A.

With sites like YouTube at their fingertips, students have access to millions of non-native English speakers. We need to give them the opportunity to choose several non-native speakers, and be prepared to evaluate those speakers as potential role models, so that they can sound like their unique selves, but speaking clear, fluent English (or French or Hmong or whatever).

Why I probably won’t take your survey

I wrote recently that if you want to be confident in generalizing observations from a sample to the entire population, your sample needs to be representative. But maybe you’re skeptical. You might have noticed that a lot of people don’t pay much attention to representativeness, and somehow there are hardly any consequences for them. But that doesn’t mean that there are never consequences, for them or other people.

In the “hard sciences,” sampling can be easier. Unless there is some major impurity, a liter of water from New York usually has the same properties as one from Buenos Aires. If you’re worried about impurities you can distill the samples to increase the chance that they’re the same. Similarly, the commonalities in a basalt column or a wheel often outweigh any variation. A pigeon in New York is the same as one in London, right? A mother in New York is the same as a mother in Buenos Aires?

Well, maybe. As we’ve seen, a swan in New York can be very different from a swan in Sydney. And when we get into the realm of social sciences, things get more complex and the complexity gets hard to avoid. There are probably more differences between a mother in New York and one in Buenos Aires than for pigeons or stones or water, and the differences are more important to more people.

This is not just speculation based on rigid rules about sampling. As Bethany Brookshire wrote last year, psychologists are coming to realize the drawbacks of building so much of their science around WEIRD people. And when she says WEIRD, she means WEIRD like me: White, Educated and from an Industrialized, Rich, Democratic country. And not just any WEIRD people, but college sophomores. Brookshire points out how much that skews the results in a particular study of virginity, but she also links to a review by Henrich, Heine and Norenzayan (2010) that examines several studies and concludes that “members of WEIRD societies, including young children, are among the least representative populations one could find for generalizing about humans.”

I think about this whenever I get an invitation to participate in a social science study. I get them pretty frequently, probably at least twice a week, on email lists and Twitter, and occasionally Tumblr and even Facebook. Often they’re directly from the researchers themselves: “Native English speakers, please fill out my questionnaire on demonstratives!” That means that they’re going primarily to a population of educated people, most of whom are white from an industrialized, rich, democratic country.

(A quick reminder, in case you just tuned in: This applies to universal observations – percentages, averages and all or none statements. It does not apply to existential statements, where you simply say that you found ten people who say “less apples.” You take those wherever you find them, as long as they’re reliable sources.)

I don’t have a real problem with using non-representative samples for pilot studies. You have a hunch about something, you want to see if it’s not just you before you spend a lot of time sending a survey out to people you don’t know. I have a huge problem with it being used for anything that’s published in a peer-reviewed journal or disseminated in the mainstream media. And yeah, that means I have a huge problem with just about any online dialect survey.

I also don’t like the idea of students generalizing universal observations from non-representative online surveys for their term papers and master’s theses. People learn skills by doing. If they get practice taking representative samples, they’ll know how to do that. If they get practice making qualitative, existential observations, they’ll be able to do those. If they spend their time in school making unfounded generalizations from unrepresentative samples (with a bit of handwaving boilerplate, of course!), most of them will keep doing that after they graduate.

So that’s my piece. I’m actually going to keep relatively quiet about this because some of the people who do those studies (or their friends) might be on hiring committees, but I do want to at least register my objections here. And if you’re wondering why I haven’t filled out your survey, or even forwarded it to all my friends, this is your answer.

Non-native speech role models

In a recent post, I talked about using speech role models to teach English as a Second Language (ESL). In my class at Saint John’s University I told my students to find a native English speaker that they admired and wanted to sound like, but some of the students seemed discouraged and the distance between their accents and the accents of their role models was very large. I guessed that they may have felt that the gap was insurmountable.

I wondered if non-native English speakers might make better role models, so I asked the students to find online video clips of people who were from their country and native speakers of their own language, and who they felt spoke English well. As examples, I showed them clips of interviews with native English speakers speaking other languages, like New York Mayor Michael Bloomberg in Spanish (this was before the El Bloombito nastiness, which deserves its own post) and John Beyrle, then US Ambassador to Russia, in Russian.

The students’ answers revealed two problems with the assignment. The first was that some of the speakers were too good, and for a specific reason: they had the unfair advantage of living in the United States as teenagers, which made them almost native speakers. Some, like boxer Oscar de la Hoya, were from immigrant families. Others, like tennis player Maria Sharapova, were sports stars who moved to the US as teenagers for training camps. The English of these role models was as inaccessible to my students as that of people who had lived in the US their entire lives.

The second problem was that it was simply hard to find examples of non-native speakers with accents who were not stigmatized. Some of my students found good examples: Colombian singer Shakira, Russian tennis player Elena Dementieva, Chinese television presenter Rui Chenggang, Serbian tennis player Jelena Jankovic and Chinese basketball star Yao Ming. For students who were unable to find an acceptable role model, I found UN Secretary General Ban Ki-Moon from Korea, Salvadoran computer scientist Luis von Ahn and Mexican film director Guillermo del Toro. The students readily accepted these speakers as role models.

I followed this up with transcription tasks and two further assignments: “Your Second Speech Role Model’s Accent,” where the students identified a feature of their role model that marked them as non-native, and “Outdo your Second Speech Role Model,” where the students recorded themselves trying to say the same sentences without that marked feature. I have the impression that this was valuable for the students, but I did not have a chance to study it systematically.

In my own searches, I came to appreciate the difficulty of finding good non-native role models, and of second language acquisition in general. I was simply unable to find a single non-native speaker who had achieved nativelike pronunciation in English without being immersed in English during the critical period of adolescence. Discussions with other ESL faculty confirmed this. I had already prioritized clarity over correctness, and this confirmed that I was on the right track. I took this into account when grading the students’ in-class presentations and assignments.

While it is difficult to find non-native speakers who express themselves clearly in English and have prestige, the existence of people like Yao Ming, Guillermo del Toro and Ban Ki-Moon shows that they are out there. It would be valuable to introduce non-native role models like these earlier, to help the students with setting goals and to give them perspective on the second language enterprise.

I was a bit disturbed by the term “cloning” coined by Joanne Kenworthy and Jennifer Jenkins and used by Robin Walker, because to me it implies copying another person’s accent wholesale, leading me to imagine an ESL program where every graduate sounds like Javier Bardem. There are two elements that can counteract this: having a variety of role models and allowing the students to participate as much as possible in choosing their role models. I’ll talk about those more in a future post.

You can’t get significance without a representative sample

Recently I’ve talked about the different standards for existential and universal claims, how we can use representative samples to estimate universal claims, and how we know if our representative sample is big enough to be “statistically significant.” But I want to add a word of caution to these tests: you can’t get statistical significance without a representative sample.

If you work in social science you’ve probably seen p-values reported in studies that aren’t based on representative samples. They’re probably there because the authors took one required statistics class in grad school and learned that low p-values are good. It’s quite likely that these p-values were actually expected, if not explicitly requested, by the editors or reviewers of the article, who took a similar statistics class. And they’re completely useless.

P-values tell you whether your observation (often a mean, but not always) is based on a big enough sample that you can be 99% (or whatever) sure it’s not the luck of the draw. You are clear to generalize your representative sample to the entire population. But if your sample is not representative, it doesn’t matter!

Suppose you need 100% pure Austrian pumpkin seed oil, and you tell your friend to make sure he gets only the 100% pure kind. Your friend brings you 100% pure Australian tea tree oil. They’re both oils, and they’re both 100% pure, so your friend doesn’t understand why you’re so frustrated with him. But purity is irrelevant when you’ve got the wrong oil. P-values are the same way.
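Here is a minimal Python sketch of the same problem, using the margin of error rather than a p-value. Everything in it is invented for illustration: the population, the two groups, the scores and the sample sizes are assumptions, not data from any study, and the margin of error uses a simple normal approximation. The "convenience" sample is drawn only from the smaller group, the way a survey passed around on one mailing list might be.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 80% of people score around 50, 20% around 70.
majority = [rng.gauss(50, 10) for _ in range(8000)]
minority = [rng.gauss(70, 10) for _ in range(2000)]
population = majority + minority
true_mean = mean(population)


def estimate(sample, confidence=0.95):
    """Return (sample mean, approximate margin of error).

    Normal approximation, which is reasonable at these sample sizes.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean(sample), z * stdev(sample) / sqrt(len(sample))


# Representative: every member of the population has the same chance.
representative = rng.sample(population, 200)
# Convenience: only the people who were easy to reach.
convenience = rng.sample(minority, 200)

for label, sample in [("representative", representative),
                      ("convenience", convenience)]:
    m, moe = estimate(sample)
    print(f"{label}: {m:.1f} +/- {moe:.1f} (true mean {true_mean:.1f})")
```

Both samples come back with a respectably small margin of error, because the arithmetic only looks at the size and spread of the sample. The convenience sample's estimate is still far from the population mean: it is a precise answer to a question about the wrong group.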

So please, don’t report p-values if you don’t have a representative sample. If the editor or reviewer insists, go ahead and put it in, but please roll your eyes while you’re running your t-tests. But if you are the editor or reviewer, please stop asking people for p-values if they don’t have a representative sample! Oh, and you might want to think about asking them to collect a representative sample?

Speech role models

John Murphy of Georgia State published an article about using non-native speakers, and specifically the Spanish actor Javier Bardem, as models for teaching English as a Second Language (ESL) or as a foreign language (EFL). Mura Nava tweeted a blog post from Robin Walker connecting Murphy’s work to similar work by Kenworthy and Jenkins, Peter Roach and others. I tried something like this when I taught ESL back in 2010, more or less unaware of all the previous work that Murphy cites, and Mura Nava was interested to know how it went, so here’s the first part of a quick write-up.

When I was asked to teach a class in ESL Speech, “Advanced Oral/Aural Communication,” at Saint John’s University in the fall of 2010, I had taught French and Linguistics, but I had only tutored English one-on-one. My wife is an experienced professor of ESL and was a valuable source of advice, but our student populations and our goals were different, so I did not simply copy her methods.

One concept that I introduced was that of a Speech Role Model. When I was learning French, I found it invaluable to imitate entertainers; I’ve never met Jacques Dutronc, but I often say that he was one of my best French teachers because of the clever lyricists he worked with and his clear, wry delivery. He was just one of the many French people that I imitated to improve my pronunciation.

This was all back in the days of television and cassettes, and most of the French culture that we had access to here in the United States was filtered through the wine, Proust and Rohmer tastes of American Francophiles. As a geeky kid with a fondness for comedy I found Edith Piaf and even Gérard Depardieu too alien to emulate. I found out about Dutronc in college through a bootleg tape made for me by a student from France who lived down the hall, and then I had to study abroad in France to find more role models.

With today’s multimedia Internet technology, we have an incredible ability to listen to millions of people from around the world. At Saint John’s I asked my students to choose a Speech Role Model for English: a native speaker that they personally admired and wanted to sound like. I was surprised by the number of students who named President Obama as their role model, including female students from China, but on reflection it was an obvious choice, as he is a clear, forceful and eloquent speaker. Other students chose actresses Meryl Streep and Jennifer Aniston, talk-show host Bill O’Reilly and local newscaster Pat Kiernan.

One notable choice, hip-hop artist Eminem, gave me the opportunity to discuss covert prestige and its challenges. Another, the character of Sheldon Cooper from the television series “The Big Bang Theory,” was too scripted, and I was debating whether to accept it when I discovered that it was just a cover so that the student could plagiarize crowdsourced transcriptions.

In subsequent assignments I asked the students to find a YouTube video of their role model and to transcribe a short excerpt. I then asked the students to record themselves imitating that excerpt from their Speech Role Models. Some of the students were engaged and interested, but others seemed frustrated and discouraged. When I listened to my students and compared their speech to their chosen role models, I had an idea why. The students who were engaged were either naturally enthusiastic or good mimics, but the challenge was to motivate the others. There was so much distance between them and the native English speakers, much more than could be covered in a semester. That was when I thought of adding a non-native Second Speech Role Model. I’ll have to leave that for another post.

How big a sample do you need?

In my post last week I talked about the importance of representative samples for making universal statements, including averages and percentages. But how big should your sample be? You don’t need to look at everything, but you probably need to look at more than one thing. How big a sample do you need in order to be reasonably sure of your estimates?

One of the pioneers in this area was a mysterious scholar known only to the public as Student. He took that name because he had been a student of the statistician Karl Pearson, and because he was generally a modest person. After his death, he was revealed to be William Sealy Gosset, Head Brewer for the Guinness Brewery. He had published his findings (PDF) under a pseudonym so that the competing breweries would not realize the relevance of his work to brewing.

Pearson had connected sampling to probability, because for every item sampled there is a chance that it is not a good example of the population as a whole. He used the probability integral transformation, which required relatively large samples. Pearson’s preferred application was biometrics, where it was relatively easy to collect samples and get a good estimate of the probability integral.

The Guinness brewery was experimenting with different varieties of barley, looking for ones that would yield the most grain for brewing. The cost of sampling barley added up over time, and the number of samples that Pearson used would have been too expensive. Student’s t-test saved his employer money by making it easy to tell whether they had the minimum sample size that they needed for good estimates.

Both Pearson’s and Student’s methods resulted in equations and tables that allowed people to estimate the probability that the mean of their sample is inaccurate. This can be expressed as a margin of error or as a confidence interval, or as the p-value of the mean. The p-value depends on the number of items in your sample, and how much they vary from each other. The bigger the sample and the smaller the variance, the smaller the p-value. The smaller the p-value, the more likely it is that your sample mean is close to the actual mean of the population you’re interested in. For Student, a small p-value meant that the company didn’t have to go out and test more barley crops.
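The dependence on sample size and variance can be sketched in a few lines of Python. This is only an illustration: the three samples are made-up numbers, and the function uses the normal approximation rather than Student's own t tables, so for samples this small a real t-based margin would come out somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, stdev


def margin_of_error(sample, confidence=0.95):
    """Approximate half-width of a confidence interval for the mean.

    Normal approximation (a z critical value); Student's t would give
    a wider interval for small samples.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * stdev(sample) / sqrt(len(sample))


# Invented measurements, for illustration only.
small = [48, 52, 50, 47, 53]
large = small * 4                 # similar spread, four times as many items
tight = [49.5, 50.5, 50, 49, 51]  # same size as `small`, less variation

for name, sample in [("small", small), ("large", large), ("tight", tight)]:
    print(name, round(margin_of_error(sample), 2))
```

The larger sample and the less variable sample both produce a narrower margin than the small, spread-out one, which is the relationship the tables encode.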

Before you gather your sample, you decide how much uncertainty you’re willing to tolerate, in other words, a maximum p-value designated by α (alpha). When a sample’s p-value is lower than the α-value, it is said to be significant. One popular α-value is 0.05, but this is often decided collectively, and enforced by journal editors and thesis advisors who will not accept an article where the results don’t meet their standards for statistical significance.
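As a rough sketch of that decision rule, here is one common way it is set up: a test of whether a sample mean differs from a reference value. The reference values, the measurements and the function names are all assumptions made for this example, and the p-value uses a normal approximation instead of Student's t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

ALPHA = 0.05  # the tolerance for uncertainty, fixed before sampling


def two_sided_p_value(sample, reference_mean):
    """Approximate p-value for 'the population mean equals reference_mean'.

    Normal approximation; for small samples Student's t distribution
    would be the usual choice and would give a somewhat larger p-value.
    """
    standard_error = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - reference_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(sample, reference_mean, alpha=ALPHA):
    """Apply the rule: significant when the p-value is below alpha."""
    return two_sided_p_value(sample, reference_mean) < alpha


# Invented measurements, repeated to give a sample of 30.
yields = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7, 10.0, 10.2] * 3

print(is_significant(yields, 10.0))
print(is_significant(yields, 9.5))
```

With these invented numbers the sample mean sits close to 10.0 and well away from 9.5, so only the second comparison comes out significant. Changing `ALPHA` changes the threshold, not the data, which is why it has to be chosen before you look.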

The tests of significance determined by Pearson, Student, Ronald Fisher and others are hugely valuable. In science it is quite common to get false positives, where it looks like you’ve found interesting results but you just happened to sample some unusual items. Achieving statistical significance tells you that the interesting results are probably not just an accident of sampling. These tests protect the public from inaccurate data.

Like every valuable innovation, tests of statistical significance can be overused. I’ll talk about that in a future post.

A tool for annotating corpora

My dissertation focused on the evolution of negation in French, and I’ve continued to study this change. In order to track the way that negation was used, I needed to collect a corpus of texts and annotate them. I developed a MySQL database to store the annotations (and later the texts themselves) and a suite of PHP scripts to annotate the texts and store them in the database. I then developed another suite of PHP scripts to query the database and tabulate the data in a form that could be imported into Microsoft Excel or a more specialized statistics package like SPSS.
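For readers who want a feel for the general shape of such a setup, here is a small Python sketch. It is not my actual system: it uses SQLite from the standard library as a stand-in for MySQL, and the table names, column names, negation labels and rows are all hypothetical placeholders rather than my real schema or data.

```python
import csv
import io
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE texts (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    year INTEGER NOT NULL
);
CREATE TABLE annotations (
    id INTEGER PRIMARY KEY,
    text_id INTEGER NOT NULL REFERENCES texts(id),
    token_offset INTEGER NOT NULL,
    negation_type TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with a few placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO texts (id, title, year) VALUES (?, ?, ?)",
        [(1, "Text A", 1650), (2, "Text B", 1850)],
    )
    conn.executemany(
        "INSERT INTO annotations (text_id, token_offset, negation_type) "
        "VALUES (?, ?, ?)",
        [(1, 12, "ne alone"), (1, 40, "ne ... pas"),
         (2, 7, "ne ... pas"), (2, 19, "ne ... pas")],
    )
    conn.commit()
    return conn


def tabulate(conn):
    """Count annotations per year and negation type."""
    return conn.execute(
        """
        SELECT t.year, a.negation_type, COUNT(*) AS n
        FROM annotations AS a
        JOIN texts AS t ON t.id = a.text_id
        GROUP BY t.year, a.negation_type
        ORDER BY t.year, a.negation_type
        """
    ).fetchall()


def to_csv(rows):
    """Render the counts as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["year", "negation_type", "count"])
    writer.writerows(rows)
    return buffer.getvalue()


conn = build_demo_db()
print(to_csv(tabulate(conn)))
```

Keeping annotations in their own table, keyed to the text and to a position within it, is what lets one query tabulate everything at once; the CSV step stands in for the export to Excel or SPSS.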

I am continuing to develop these scripts. Since I finished my dissertation, I added the ability to load the entire text into the database, and revamped the front end with AJAX to streamline the workflow. The new front end actually works pretty well on a tablet and even a smartphone when there’s a stable internet connection, but I’d like to add the ability to annotate offline, on a workstation or a mobile device. I also need to redo the scripts that query the database and generate reports. Here’s what the annotation screen currently looks like:

I’ve put many hours of work into this annotation system, and it works so well for me, that it’s a shame I’m the only one who uses it. It would take some work to adapt it for other projects, but I’m interested in doing that. If you think this system might work for your project, please let me know (grvsmth@panix.com) and I’ll give you a closer look.