Teaching phonetic transcription online

When I was teaching introductory linguistics, I had a problem with the phonetic transcription exercises in the textbooks I was using: they asked students to transcribe “the pronunciation” of individual words – implying that there is a single correct pronunciation with a single correct transcription. I worked around it in face-to-face classes by listening to the students’ accents and asking them to pronounce any words whose transcriptions differed from what I expected. I was also able to illustrate the pronunciation of various IPA symbols by pronouncing the sounds in class.

In the summer of 2013 I taught linguistics online for the first time, and it was much more difficult to give students a sense of the sounds I expected them to produce, and to get a sense of the sounds they associated with particular symbols. On top of that I discovered I had another challenge: I couldn’t trust these students to do the work if the answers were available anywhere online. Some of them would google the questions, find the answers, copy and paste. Homework done!

Summer courses move so fast that I wasn’t able to change the exercises until it was too late. In the fall of 2014 I taught the course again, and created several new exercises. I realized that there was now a huge wealth of speech data available online, in the form of streaming and downloadable audio, created for entertainment, education and archives. I chose a podcast episode that seemed relatively interesting and asked my students to transcribe specific words and phrases.

It immediately became clear to me that instead of listening to the sounds and using Richard Ishida’s IPA Picker or another tool to transcribe what they heard, the students were listening to the words, looking them up one by one in the dictionary, and copying and pasting word transcriptions. In some cases the pronunciations of the podcast’s host, Roman Mars, were different from the dictionary transcriptions, but they were close enough that my low grades felt like quibbling to the students.

I tried a different strategy: I noticed that another reporter on the podcast, Joel Werner, spoke with an Australian accent, so I asked the students to transcribe his speech. They began to understand: “Professor, do we still have to transcribe the entire word even though a letter from the word may not be pronounced due to an accent?” asked one student. Others noticed that the long vowels were shifted relative to American pronunciations.

For tests and quizzes, I found that I could make excerpts of sound and video files using editing software like Audacity and Microsoft Movie Maker. That allowed me to isolate particular words or groups of words so that the students didn’t waste time locating content in a three-minute video, or a twenty-minute podcast.
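For anyone who would rather script this step than open an editor, the same kind of excerpt can be cut with a few lines of Python. This is only a minimal sketch, not the Audacity workflow described above: it assumes the recording has already been saved as an uncompressed WAV file, and the function name `cut_excerpt` and the file names in the usage note are hypothetical.

```python
import wave


def cut_excerpt(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) into a new WAV file.

    Returns the number of audio frames written to dst_path.
    Assumes src_path is an uncompressed (PCM) WAV file.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        channels = src.getnchannels()
        width = src.getsampwidth()

        # Convert the time window into frame positions.
        start_frame = int(start_s * rate)
        n_frames = int((end_s - start_s) * rate)

        # Jump to the start of the excerpt and read only the frames we need.
        src.setpos(start_frame)
        frames = src.readframes(n_frames)

    with wave.open(dst_path, "wb") as dst:
        # Keep the same audio format as the source recording.
        dst.setnchannels(channels)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(frames)

    # readframes may return fewer frames than requested near the end of the file.
    return len(frames) // (width * channels)
```

A call such as `cut_excerpt("episode.wav", "word_list.wav", 62.0, 75.5)` would write the stretch from 1:02 to 1:15.5 into a new file that can be posted with a quiz question.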

This still left a problem: how much detail were the students expected to include, and how could I specify that for them in the instructions? Back in 2013, in a unit on language variation, I had used accent tag videos to replace the hierarchy implied in most discussions of accents with a more explicit, less judgmental contrast between “sounds like me” and “sounds different.” I realized that the accent tags were also good for transcription practice, because they contained multiple pronunciations of words that differed in socially meaningful ways – and describing such differences is, in fact, the very purpose that phonetic transcription was invented for. Phonetic transcription is a tool for talking about differences in pronunciation.

The following semester, Spring 2015, I created a “Comparing Accents” assignment, where I gave the students links to excerpts of two accent tag videos, containing the word list segment of the accent tag task. I then asked them to find pairs of words that the two speakers pronounced differently and transcribe them in ways that highlighted the differences. To give them practice reading IPA notation, I gave them transcriptions and asked them to upload recordings of themselves pronouncing the transcriptions.

I was pleased to find that I actually could teach phonetic transcription online, and even write tests that assessed the students’ abilities to transcribe, thanks to accent tag videos and the principle that transcription is about communicating differences.

I found these techniques to be useful for teaching other aspects of linguistics. I’ll talk about that in future posts.

Levels of phonetic description

When I first studied phonetic transcription I learned about broad and narrow transcription, where narrow transcription contains much more detail, like the presence of aspiration on consonants and fine distinctions of tongue height. Of course it makes sense that you wouldn’t always want to go into such detail, but at the time I didn’t think about what detail was excluded from broad transcription and why.

In phonology we learned about phonemes, and how phoneme categories glossed over many of those same details that were excluded from broad transcription. For reasons I never quite grasped, though, we were told that phonemic transcription was a very different thing from broad transcription, and we were not to confuse them. Okay.

I got a better explanation from my first phonetics professor, Jacques Filliolet, who used three levels of analysis: niveau généralisant, niveau pertinent and niveau particularisant. We can translate them as general, specific and detailed levels.

When I started teaching phonology, I realized that the broad vs. narrow distinction did not reflect what I read in books and papers and saw at conferences. When people are actually using phonetic transcription there is no consistent set of features that they leave out or include.

What people do instead is include the relevant features and leave out the irrelevant ones. Which features are relevant depends on the topic of discussion. If it’s a paper about aspiration, or a paper about variation where aspiration may or may not be relevant, they will include aspiration. If it isn’t, they won’t.
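The idea of keeping only the relevant features can be made concrete with a small Python sketch. It is a toy illustration, not a description of how linguists actually prepare transcriptions: the table `DETAIL_MARKS` covers just three features chosen for the example, and both it and the function name `at_level` are hypothetical. It also assumes the diacritics are typed as separate characters rather than as precomposed letters.

```python
# Hypothetical table: a few features and the IPA mark that encodes each one.
DETAIL_MARKS = {
    "aspiration": "\u02B0",    # ʰ  modifier letter small h
    "nasalization": "\u0303",  # combining tilde, as in æ̃
    "length": "\u02D0",        # ː  length mark
}


def at_level(narrow, relevant=()):
    """Return the transcription with only the relevant detail marks kept.

    narrow   -- a detailed transcription, as a string
    relevant -- names of the features that matter for the current discussion
    """
    # Every mark whose feature is not under discussion gets dropped.
    drop = {mark for feature, mark in DETAIL_MARKS.items()
            if feature not in relevant}
    return "".join(ch for ch in narrow if ch not in drop)


# In a discussion where aspiration does not matter, it is left out.
print(at_level("pʰɪn"))
# In a paper about aspiration, it stays in.
print(at_level("pʰɪn", relevant=("aspiration",)))
```

The same detailed transcription yields different outputs depending on the topic, which is the point: there is no fixed list of features that a transcription always includes or omits.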

I realized that sometimes linguists need to go into more detail than phonetic transcription can easily handle, so they use even finer-grained representations like formant frequencies, gestural scores and voice onset times.

Recently I realized that this just means phonetic transcription is a form of communication. In all forms of communication we adjust the level of detail we provide to convey the relevant information to our audience and leave out the irrelevant parts.

Phonemes are another, more organic way that we do this. This explains why phonemic transcription is not the same as broad transcription: we often want to talk about what sounds go into a phoneme without adding other details. For example, we may want to talk about how English /t/ typically includes both aspirated and unaspirated stops, without talking about fundamental frequency or lip closure.

Another possible translation of Filliolet’s niveau pertinent is “the appropriate level.” This is really what we’re all aiming for: the level of detail that is most appropriate for the circumstances.

Finding the right level of detail for phonetic transcription is actually not hard for students to learn; they do it all the time in regular language. The simplest way to teach it is to give the students assignments that require a particular level of detail.

Students are sometimes frustrated that there is not a single way to transcribe a given utterance. In addition to these differences of level of description, there are stylistic differences: do you write [r] instead of [ɹ] for an English bunched /r/?

Of course the International Phonetic Alphabet was sold as just such a consistent system: one symbol for one sound, in contrast with the messy reality of writing systems. To me this feels very Modernist and Utopian, and it is no accident that it was invented at the same time as other big modernist projects like Esperanto, Principia Mathematica, and International Style architecture.

The IPA falls short of the ideal consistent representation that was sold to people, but has largely succeeded in providing enough consistency, and keeping enough of the mess at bay, for specific purposes like documenting language variation and language acquisition.

Teaching phonetic transcription in the digital age

When I first taught phonetic transcription, almost seven years ago, I taught it almost the same way I had learned it twenty-five years ago. Today, the way I teach it is radically different. The story of the change is actually two stories intertwined. One is a story of how I’ve adapted my teaching to the radical changes in technology that occurred in the previous eighteen years. The other is a story of the more subtle evolution of my understanding of phonetics, phonology, phonological variation and the phonetic transcription that allows us to talk about them.

When I took Introduction to Linguistics in 1990 all the materials we had were pencil, paper, two textbooks and the ability of the professor to produce unusual sounds. In 2007 and even today, the textbooks have the same exercises: Read this phonetic transcription, figure out which English words were involved, and write the words in regular orthography. Read these words in English orthography and transcribe the way you pronounce them. Transcribe in broad and narrow transcription.

The first challenge was moving the homework online. I already assigned all the homework and posted all the grades online, and required my students to submit most of the assignments online; that had drastically reduced the amount of paper I had to collect and distribute in class and schlep back and forth. For this I had the advantage that tuition at Saint John’s pays for a laptop for every student. I knew that all of my students had the computing power to access the Blackboard site.

Thanks to the magic of Unicode and Richard Ishida’s IPA Picker, my students were able to submit their homework in the International Phonetic Alphabet without having to fuss with fonts and keyboard layouts. Now, with apps like the Multiling Keyboard, students can even write in the IPA on phones and tablets.
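What Unicode buys here is that every IPA symbol is an ordinary character with its own code point, so a transcription is just text that can be pasted into a submission box. The short Python sketch below illustrates that point; the function name `describe` is hypothetical and the example word is only for demonstration.

```python
import unicodedata


def describe(transcription):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in transcription
    ]


# Print one line per character of a sample transcription.
for row in describe("tʰɪŋk"):
    print(*row)
```

The velar nasal ŋ, for instance, is reported as U+014B, LATIN SMALL LETTER ENG: a character in its own right, with no special font required.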

The next problem was that instead of transcribing, some students would look up the English spellings on dictionary sites, copy the standard pronunciation guides, and paste them into the submission box. Other students would give unusual transcriptions, but I couldn’t always tell whether these transcriptions reflected the students’ own pronunciations or just errors.

At first, as my professors had done, I made up for these homework shortcomings with lots of in-class exercises and drills, but they still all relied on the same principle: reading English words and transcribing them. Both in small groups and in full-class exercises, we were able to check the transcriptions and correct each other because everyone involved was listening to the same sounds. It wasn’t until I taught the course exclusively online that I realized there was another way to do it.

When I tell some people that I teach online courses, they imagine students from around the world tuning in to me lecturing at a video camera. This is not the way Saint John’s does online courses. I do create a few videos every semester, but the vast majority of the teaching I do is through social media, primarily the discussion forums on the Blackboard site connected with the course. I realized that I couldn’t teach phonetics without a way to verify that we were listening to the same sounds, and without that classroom contact I no longer had a way.

I also realized that with high-speed internet connections everywhere in the US, I had a new way to verify that we were listening to the same sounds: use a recording. When I took the graduate Introduction to Phonetics in 1993, we had to go to the lab and practice with the cassette tapes from William Smalley’s Manual of Articulatory Phonetics, but if I’m remembering right we didn’t actually do any transcription of the sounds; we just practiced listening to them and producing them. Some of us were better at that than others.

In 2015 we are floating in rivers of linguistic data. Human settlements have always been filled with the spontaneous creation of language, but we used to have to pore over their writings or rely on our untrustworthy memories. In the twentieth century we had records and tape, film and video, but so much of what was on them was scripted and rehearsed. Even when we could get recordings of unscripted language, it was hard to store, copy and distribute them.

Now people create language in forms that we can grab and hold: online news articles, streaming video, tweets, blog posts, YouTube videos, Facebook comments, podcasts, text messages, voice mails. A good proportion of these are even in nonstandard varieties of the language. We can read them and watch them and listen to them – and then we can reread and rewatch and relisten, we can cut and splice in seconds what would have taken hours – and then analyze them, and compare our analyses.

Instead of telling my students to read English spelling and transcribe in IPA, now I give them a link to a video. This way we’re working from the exact same sequence of sounds, a sequence that we can replay over and over again. I specifically choose pronunciations that don’t match what they find on the dictionary websites. This is precisely what the IPA is for.

Going the other way, I give my students IPA transcriptions and ask them to record themselves pronouncing the transcriptions and post the recordings to Blackboard. Sure, my professor could have assigned us something like this in 1990, but then he would have had to take home a stack of cassettes and spend time rewinding them over and over. Now all my students have smartphones with built-in audio recording apps, and I could probably listen to all of their recordings on my own smartphone if I didn’t have my laptop handy.

So that’s the story about technology and phonetic transcription. Stay tuned for the other story, about the purpose of phonetic transcription.