African American English has accents too

Diversity is notoriously subjective and difficult to pin down. In particular, we tend to be impressed if we know the names of a lot of categories for something. We might think there are more mammal species than insect species, but biologists tell us that there are hundreds of thousands of species of beetles alone. This is true in language as well: we think of the closely-related Romance and Germanic languages as separate, while missing the incredible diversity of “dialects” of Chinese or Arabic.

This is also true of English. As an undergraduate I was taught that there were four dialects in American English: New England, North Midland, South Midland and Coastal Southern. Oh yeah, and New York and Black English. The picture for all of those is more complicated than it sounds, and when I went to Chicago I discovered that there are regional varieties of African American English.

In 2012 Annie Minoff, a blogger for Chicago public radio station WBEZ, took this oversimplification for truth: “AAE is remarkable for being consistent across urban areas; that is, Boston AAE sounds like New York AAE sounds like L.A. AAE, etc.” Fortunately a commenter, Amanda Hope, challenged her on that assertion. Minoff confirmed the pattern in an interview with variationist Walt Wolfram, and posted a correction in 2013.

In 2013 I was preparing to teach a unit on language variation and didn’t want to leave my students as misinformed as I – or Minoff – had been. Many of my students were African American, and I saw no reason to spend most of the unit on white varieties and leave African American English as a footnote. But the documentation is spotty: I know of no good undergraduate-level discussion of variation in African American English.

A few years before I had found a video that some guy took of a party in a parking lot on the West Side of Chicago. It wasn’t ideal, but it sort of gave you an idea. The link was dead, so I typed “Chicago West Side” into Google. The results were not promising, so on a whim I added “accent” and that’s how I found my first accent tag video.

Accent tag videos are an amazing thing, and I could write a whole series of posts about them. Here was a young black woman from Chicago’s West Side, not only talking about her accent but illustrating it, with words and phrases to highlight its differences from other dialects. She even talks (as many people do in these videos) about how other African Americans hear her accent in other places, like North Carolina. You can compare it (as I did in class) with a similar video made by a young black woman from Raleigh (or New York or California), and the differences are impossible to ignore.

In fact, when Amanda Hope challenged Minoff’s received wisdom on African American regional variation, she used accent tag videos to illustrate her point. These videos are amazing, particularly for teaching about language and linguistics, and from then on I made extensive use of them in my courses. There’s also a video made by two adorable young English women, one from London and one from Bolton near Manchester, where you can hear their accents contrasted in conversation. I like that I can go not just around the country but around the world (Nigeria, Trinidad, Jamaica) illustrating the diversity of English just among women of African descent, who often go unheard in these discussions. I’ll talk more about accent tag videos in future posts.

You can also find evidence of regional variation in African American English on Twitter. Taylor Jones has a great post about it that also goes into the history of African American varieties of English.

Describing differences in pronunciation

Last month I wrote that instead of only two levels of phonetic transcription, “broad” and “narrow,” what people do in practice is to adjust their level of detail according to the point they want to make. In this it is like any other form of communication: too much detail can be a distraction.

But how do we decide how much detail to put in a given transcription, and how can we teach this to our students? In my experience there is always some kind of comparison. Maybe we’re comparing two speakers from different times or different regions, ethnicities, first languages, social classes, anatomies. Maybe we’re comparing two utterances by the same person in different phonetic, semantic, social or emotional contexts.

Sometimes there is no overt comparison, but at those times there is almost always an implicit comparison. If we are presenting a particular pronunciation it is because we assume our readers will find it interesting, because it is pathological or nonstandard. This implies that there is a normal or standard pronunciation that we have in our heads to contrast it to.

The existence of this comparison tells us the right level of detail to include in our transcriptions: enough to show the contrasts that we are describing, maybe a little more, but not so much as to distract from this contrast. If the contrast we want to focus on involves, say, tone, place of articulation or laryngeal timing, we will include those details and leave out details about nasality, vowel tongue height or segment length.

This has implications for the way we teach transcription. For our students to learn the proper level of detail to include, they need practice comparing two pronunciations, transcribing both, and checking whether their transcriptions highlight the differences that they feel are most relevant to the current discussion.
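As a rough illustration of that checking step, here is a minimal Python sketch that lines up two transcriptions and reports only the stretches where they differ. The function name and the sample transcriptions are hypothetical, invented for this example, and the comparison is character by character, so it knows nothing about phonetic features; it only shows whether the symbols a student chose make the intended contrast visible.

```python
from difflib import SequenceMatcher


def transcription_differences(first, second):
    """Return (first_part, second_part) pairs where two transcriptions differ."""
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "equal" stretches carry no contrast, so skip them.
        if tag != "equal":
            differences.append((first[i1:i2], second[j1:j2]))
    return differences


# Hypothetical transcriptions of "pin" by the same student:
# one that marks aspiration and one that does not.
with_aspiration = "pʰɪn"
without_aspiration = "pɪn"
print(transcription_differences(with_aspiration, without_aspiration))
```

If the list comes back empty, the two transcriptions are identical and the contrast under discussion has not been written down at all.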

I can illustrate this with a cautionary tale from my teaching just this past semester. I had found this approach of identifying differences to be useful, but students found the initial assignments overwhelming. Even as I was jotting down an early draft of this blog post, I just told my students to transcribe a single speech sample. I put off comparison assignments for later, and then put them off again.

As a result, I found myself focusing too much on some details while dismissing others. I could sense that my students were a bit frustrated, but I didn’t make the connection right away. I did ask them to compare two pronunciations on the final exam, and it went well, but not as well as it could have if they had been practicing it all semester. Overall the semester was a success, but it could have been better.

I’ll talk about how you can find comparable pronunciations in a future post.

Levels of phonetic description

When I first studied phonetic transcription I learned about broad and narrow transcription, where narrow transcription contains much more detail, like the presence of aspiration on consonants and fine distinctions of tongue height. Of course it makes sense that you wouldn’t always want to go into such detail, but at the time I didn’t think about what detail was excluded from broad transcription and why.

In phonology we learned about phonemes, and how phoneme categories glossed over many of those same details that were excluded from broad transcription. For reasons I never quite grasped, though, we were told that phonemic transcription was a very different thing from broad transcription, and we were not to confuse them. Okay.

I got a better explanation from my first phonetics professor, Jacques Filliolet, who used three levels of analysis: niveau généralisant, niveau pertinent and niveau particularisant. We can translate them as general, specific and detailed levels.

When I started teaching phonology, I realized that the broad vs. narrow distinction did not reflect what I read in books and papers and saw at conferences. When people are actually using phonetic transcription there is no consistent set of features that they leave out or include.

What people do instead is include the relevant features and leave out the irrelevant ones. Which features are relevant depends on the topic of discussion. If it’s a paper about aspiration, or a paper about variation where aspiration may or may not be relevant, they will include aspiration. If it isn’t, they won’t.
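This habit of keeping only the relevant features can be sketched in a few lines of Python. The table of detail marks and the function name below are hypothetical, chosen just for illustration, and the table covers only three diacritics; real transcribers make this choice by judgment, not by lookup.

```python
# Hypothetical grouping of a few IPA detail marks by the feature they encode.
DETAIL_MARKS = {
    "aspiration": {"\u02b0"},  # ʰ, modifier letter small h
    "length": {"\u02d0"},      # ː, length mark
    "nasality": {"\u0303"},    # combining tilde written over a vowel
}


def at_relevant_level(narrow, relevant_features):
    """Drop the detail marks whose feature is not relevant to the discussion."""
    dropped = set()
    for feature, marks in DETAIL_MARKS.items():
        if feature not in relevant_features:
            dropped |= marks
    return "".join(ch for ch in narrow if ch not in dropped)


# A hypothetical detailed transcription of "pan": pʰæ̃ːn
narrow = "p\u02b0\u00e6\u0303\u02d0n"

# In a paper about aspiration, keep aspiration and drop the rest.
print(at_relevant_level(narrow, {"aspiration"}))  # pʰæn

# In a discussion where none of these details matter, drop them all.
print(at_relevant_level(narrow, set()))  # pæn
```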

I realized that sometimes linguists need to go into more detail than phonetic transcription can easily handle, so they use even finer-grained representations like formant frequencies, gestural scores and voice onset times.

Recently I realized that this just means phonetic transcription is a form of communication. In all forms of communication we adjust the level of detail we provide to convey the relevant information to our audience and leave out the irrelevant parts.

Phonemes are another, more organic way that we do this. This explains why phonemic transcription is not the same as broad transcription: we often want to talk about what sounds go into a phoneme without adding other details. For example, we may want to talk about how English /t/ typically includes both aspirated and unaspirated stops, without talking about fundamental frequency or lip closure.

Another possible translation of Filliolet’s niveau pertinent is “the appropriate level.” This is really what we’re all aiming for: the level of detail that is most appropriate for the circumstances.

Finding the right level of detail for phonetic transcription is actually not hard for students to learn; they do it all the time in regular language. The simplest way to teach it is to give the students assignments that require a particular level of detail.

Students are sometimes frustrated that there is not a single way to transcribe a given utterance. In addition to these differences of level of description, there are stylistic differences: do you write [r] instead of [ɹ] for an English bunched /r/?

Of course the International Phonetic Alphabet was sold as just such a consistent system: one symbol for one sound, in contrast with the messy reality of writing systems. To me this feels very Modernist and Utopian, and it is no accident that it was invented at the same time as other big modernist projects like Esperanto, Principia Mathematica, and International Style architecture.

The IPA falls short of the ideal consistent representation that was sold to people, but has largely succeeded in providing enough consistency, and keeping enough of the mess at bay, for specific purposes like documenting language variation and language acquisition.

Teaching phonetic transcription in the digital age

When I first taught phonetic transcription, almost seven years ago, I taught it almost the same way I had learned it twenty-five years ago. Today, the way I teach it is radically different. The story of the change is actually two stories intertwined. One is a story of how I’ve adapted my teaching to the radical changes in technology that occurred in the previous eighteen years. The other is a story of the more subtle evolution of my understanding of phonetics, phonology, phonological variation and the phonetic transcription that allows us to talk about them.

When I took Introduction to Linguistics in 1990 all the materials we had were pencil, paper, two textbooks and the ability of the professor to produce unusual sounds. In 2007 and even today, the textbooks have the same exercises: Read this phonetic transcription, figure out which English words were involved, and write the words in regular orthography. Read these words in English orthography and transcribe the way you pronounce them. Transcribe in broad and narrow transcription.

The first challenge was moving the homework online. I already assigned all the homework and posted all the grades online, and required my students to submit most of the assignments online; that had drastically reduced the amount of paper I had to collect and distribute in class and schlep back and forth. For this I had the advantage that tuition at Saint John’s pays for a laptop for every student. I knew that all of my students had the computing power to access the Blackboard site.

Thanks to the magic of Unicode and Richard Ishida’s IPA Picker, my students were able to submit their homework in the International Phonetic Alphabet without having to fuss with fonts and keyboard layouts. Now, with apps like the Multiling Keyboard, students can even write in the IPA on phones and tablets.
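The reason this works is that every IPA symbol has its own Unicode code point, so a transcription is ordinary text that any submission box can store. Here is a minimal Python sketch that lists the code points in a transcription, using the standard unicodedata module; the function name and the sample word are just assumptions for the example.

```python
import unicodedata


def describe_symbols(transcription):
    """List each character with its Unicode code point and official name."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in transcription
    ]


# A hypothetical transcription of "red".
for symbol, code_point, name in describe_symbols("ɹɛd"):
    print(symbol, code_point, name)
```

A listing like this is also a quick way to check whether a student typed [ɹ] (U+0279) or a plain keyboard r (U+0072), two characters that a grader could otherwise confuse at a glance.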

The next problem was that instead of transcribing, some students would look up the English spellings on dictionary sites, copy the standard pronunciation guides, and paste them into the submission box. Other students would give unusual transcriptions, but I couldn’t always tell whether these transcriptions reflected the students’ own pronunciations or just errors.

At first, as my professors had done, I made up for these homework shortcomings with lots of in-class exercises and drills, but they still all relied on the same principle: reading English words and transcribing them. Both in small groups and in full-class exercises, we were able to check the transcriptions and correct each other because everyone involved was listening to the same sounds. It wasn’t until I taught the course exclusively online that I realized there was another way to do it.

When I tell some people that I teach online courses, they imagine students from around the world tuning in to me lecturing at a video camera. This is not the way Saint John’s does online courses. I do create a few videos every semester, but the vast majority of the teaching I do is through social media, primarily the discussion forums on the Blackboard site connected with the course. I realized that I couldn’t teach phonetics without a way to verify that we were listening to the same sounds, and without that classroom contact I no longer had a way.

I also realized that with high-speed internet connections everywhere in the US, I had a new way to verify that we were listening to the same sounds: use a recording. When I took the graduate Introduction to Phonetics in 1993, we had to go to the lab and practice with the cassette tapes from William Smalley’s Manual of Articulatory Phonetics, but if I’m remembering right we didn’t actually do any transcription of the sounds; we just practiced listening to them and producing them. Some of us were better at that than others.

In 2015 we are floating in rivers of linguistic data. Human settlements have always been filled with the spontaneous creation of language, but we used to have to pore over their writings or rely on our untrustworthy memories. In the twentieth century we had records and tape, film and video, but so much of what was on them was scripted and rehearsed. Even if we could get recordings of unscripted language, it was hard to store, copy and distribute them.

Now people create language in forms that we can grab and hold: online news articles, streaming video, tweets, blog posts, YouTube videos, Facebook comments, podcasts, text messages, voice mails. A good proportion of these are even in nonstandard varieties of the language. We can read them and watch them and listen to them – and then we can reread and rewatch and relisten, we can cut and splice in seconds what would have taken hours – and then analyze them, and compare our analyses.

Instead of telling my students to read English spelling and transcribe in IPA, now I give them a link to a video. This way we’re working from the exact same sequence of sounds, a sequence that we can replay over and over again. I specifically choose pronunciations that don’t match what they find on the dictionary websites. This is precisely what the IPA is for.

Going the other way, I give my students IPA transcriptions and ask them to record themselves pronouncing the transcriptions and post the recordings to Blackboard. Sure, my professor could have assigned us something like this in 1990, but then he would have had to take home a stack of cassettes and spend time rewinding them over and over. Now all my students have smartphones with built-in audio recording apps, and I could probably listen to all of their recordings on my own smartphone if I didn’t have my laptop handy.

So that’s the story about technology and phonetic transcription. Stay tuned for the other story, about the purpose of phonetic transcription.

Choose your Own Speech Role Model

In a couple of recent posts I talked about the idea of speech role models for language learning, specifically about how fluent, clear non-native speakers can provide more accessible models for students learning after the teenage years. I ended with a caution against “cloning” a single non-native speaker, raising the specter of a class of students who all come out speaking English like Javier Bardem. I believe this can be avoided by giving students a greater range of options for role models, and a greater role in choosing them.

Again, I can speak from personal experience in this area. As a second-language learner of French and later Portuguese I chose a variety of speech role models. No one has ever said I sound like Jacques Dutronc or Karl Zéro when speaking French, but I was motivated to reach for those goals because I believed I could sound kind of like them.

Thinking back on my speech role models for French, and even for my native English, it became clear that my unique voice is a result of having a diversity of speech role models, and that my comfort with my voice is due to the fact that I chose all those role models. I sound like me because I sound like a combination of several people that I have admired over the years.

As language teachers, we owe it to our students not to turn them into Javier Bardem clones, or to discourage those who feel like they could never be Bardem. John Murphy’s study of reactions to Bardem is valuable because it establishes that a non-native speaker can be an acceptable role model, but we can’t stop at him, or even at the other fourteen that Murphy lists in his Appendix A.

With sites like YouTube at their fingertips, students have access to millions of non-native English speakers. We need to give them the opportunity to choose several non-native speakers, and be prepared to evaluate those speakers as potential role models, so that they can sound like their unique selves, but speaking clear, fluent English (or French or Hmong or whatever).

Non-native speech role models

In a recent post, I talked about using speech role models to teach English as a Second Language (ESL). In my class at Saint John’s University I told my students to find a native English speaker that they admired and wanted to sound like, but some of the students seemed discouraged and the distance between their accents and the accents of their role models was very large. I guessed that they may have felt that the gap was insurmountable.

I wondered if non-native English speakers might make better role models, so I asked the students to find online video clips of people who were from their country and native speakers of their own language, and who they felt spoke English well. As examples, I showed them clips of interviews with native English speakers speaking other languages, like New York Mayor Michael Bloomberg in Spanish (this was before the El Bloombito nastiness, which deserves its own post) and John Beyrle, then US Ambassador to Russia, in Russian.

The students’ answers revealed two problems with the assignment. The first was that some of the speakers were too good, and for a specific reason: they had the unfair advantage of living in the United States as teenagers, which made them almost native speakers. Some, like boxer Oscar de la Hoya, were from immigrant families. Others, like tennis player Maria Sharapova, were sports stars who moved to the US as teenagers for training camps. The English of these role models was as inaccessible to my students as that of people who had lived in the US their entire lives.

The second problem was that it was simply hard to find examples of non-native speakers with accents who were not stigmatized. Some of my students found good examples: Colombian singer Shakira, Russian tennis player Elena Dementieva, Chinese television presenter Rui Chenggang, Serbian tennis player Jelena Jankovic and Chinese basketball star Yao Ming. For students who were unable to find an acceptable role model, I found UN Secretary General Ban Ki-Moon from Korea, Salvadoran computer scientist Luis von Ahn and Mexican film director Guillermo del Toro. The students readily accepted these speakers as role models.

I followed this up with transcription tasks and two further assignments: “Your Second Speech Role Model’s Accent,” where the students identified a feature of their role model that marked them as non-native, and “Outdo your Second Speech Role Model,” where the students recorded themselves trying to say the same sentences without that marked feature. I have the impression that this was valuable for the students, but I did not have a chance to study it systematically.

In my own searches, I came to appreciate the difficulty of finding good non-native role models, and of second language acquisition in general. I was simply unable to find a single non-native speaker who had achieved nativelike pronunciation in English without being immersed in English during the critical period of adolescence. Discussions with other ESL faculty confirmed this. I had already prioritized clarity over correctness, and this confirmed that I was on the right track. I took this into account when grading the students’ in-class presentations and assignments.

While it is difficult to find non-native speakers who express themselves clearly in English and have prestige, the existence of people like Yao Ming, Guillermo del Toro and Ban Ki-Moon shows that they are out there. It would be valuable to introduce non-native role models like these earlier, to help the students with setting goals and to give them perspective on the second language enterprise.

I was a bit disturbed by the term “cloning” coined by Joanne Kenworthy and Jennifer Jenkins and used by Robin Walker, because to me it implies copying another person’s accent wholesale, leading me to imagine an ESL program where every graduate sounds like Javier Bardem. There are two elements that can counteract this: having a variety of role models and allowing the students to participate as much as possible in choosing their role models. I’ll talk about those more in a future post.

Speech role models

John Murphy of Georgia State published an article about using non-native speakers, and specifically the Spanish actor Javier Bardem, as models for teaching English as a Second Language (ESL) or as a foreign language (EFL). Mura Nava tweeted a blog post from Robin Walker connecting Murphy’s work to similar work by Kenworthy and Jenkins, Peter Roach and others. I tried something like this when I taught ESL back in 2010, more or less unaware of all the previous work that Murphy cites, and Mura Nava was interested to know how it went, so here’s the first part of a quick write-up.

When I was asked to teach a class in ESL Speech, “Advanced Oral/Aural Communication,” at Saint John’s University in the fall of 2010, I had taught French and Linguistics, but I had only tutored English one-on-one. My wife is an experienced professor of ESL and was a valuable source of advice, but our student populations and our goals were different, so I did not simply copy her methods.

One concept that I introduced was that of a Speech Role Model. When I was learning French, I found it invaluable to imitate entertainers; I’ve never met Jacques Dutronc, but I often say that he was one of my best French teachers because of the clever lyricists he worked with and his clear, wry delivery. He was just one of the many French people that I imitated to improve my pronunciation.

This was all back in the days of television and cassettes, and most of the French culture that we had access to here in the United States was filtered through the wine, Proust and Rohmer tastes of American Francophiles. As a geeky kid with a fondness for comedy I found Edith Piaf and even Gérard Depardieu too alien to emulate. I found out about Dutronc in college through a bootleg tape made for me by a student from France who lived down the hall, and then I had to study abroad in France to find more role models.

With today’s multimedia Internet technology, we have an incredible ability to listen to millions of people from around the world. At Saint John’s I asked my students to choose a Speech Role Model for English: a native speaker that they personally admired and wanted to sound like. I was surprised by the number of students who named President Obama as their role model, including female students from China, but on reflection it was an obvious choice, as he is a clear, forceful and eloquent speaker. Other students chose actresses Meryl Streep and Jennifer Aniston, talk-show host Bill O’Reilly and local newscaster Pat Kiernan.

One notable choice, hip-hop artist Eminem, gave me the opportunity to discuss covert prestige and its challenges. Another, the character of Sheldon Cooper from the television series “The Big Bang Theory,” was too scripted, and I was debating whether to accept it when I discovered that it was just a cover so that the student could plagiarize crowdsourced transcriptions.

In subsequent assignments I asked the students to find a YouTube video of their role model and to transcribe a short excerpt. I then asked the students to record themselves imitating that excerpt from their Speech Role Models. Some of the students were engaged and interested, but others seemed frustrated and discouraged. When I listened to my students and compared their speech to that of their chosen role models, I had an idea why. The students who were engaged were either naturally enthusiastic or good mimics, but the challenge was to motivate the others. There was so much distance between them and the native English speakers, much more than could be covered in a semester. That was when I thought of adding a non-native Second Speech Role Model. I’ll have to leave that for another post.