I was inspired by two tweets I saw within minutes of each other on July Fourth. First, Médéric Gasquet-Cyrus, a professor at Aix-Marseille, posted a picture of his colleague Pascal Roméas wearing a “triangle vocalique” T-shirt designed by the linguistics YouTuber Romain Filstroff, known as Linguisticae. Gasquet-Cyrus’s tweet translates to “When you eat out with a phonetician colleague, you get a chance to practice your vowel quadrilateral!”
The vowel quadrilateral is one of the great data visualizations of linguistics: a two-dimensional diagram of the tongue height and position assigned to the vowel symbols of the International Phonetic Alphabet, as viewed from the left side of the face. It is also known as the vowel triangle, depending on how much wiggle room you think people have for their tongues when their mouths are fully open. It can even be plotted based on the formant frequencies extracted from acoustic analysis.
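To make the formant version concrete, here is a minimal sketch of how first and second formant values can be turned into chart positions. The formant numbers are rough illustrative figures for an adult male speaker of American English, not measurements from any recording discussed here, and the function names `chart_position` and `describe` are made up for this example. The only real idea is the axis flip: a low F1 goes with a high tongue position, and a high F2 goes with a front tongue position.

```python
# Illustrative (F1, F2) values in Hz. These are assumed, rounded figures
# for demonstration only, not data from the post.
FORMANTS = {
    "i": (270, 2290),
    "u": (300, 870),
    "æ": (660, 1720),
    "ɑ": (730, 1090),
}


def chart_position(f1, f2):
    """Map (F1, F2) to (x, y) so the layout mirrors the vowel chart.

    Both axes are flipped: a high F2 (front vowel) lands on the left,
    and a low F1 (high vowel) lands at the top.
    """
    return (-f2, -f1)


def describe(formants):
    """Return the vowels ordered front-to-back and high-to-low."""
    front_to_back = sorted(
        formants, key=lambda v: chart_position(*formants[v])[0]
    )
    high_to_low = sorted(
        formants, key=lambda v: chart_position(*formants[v])[1], reverse=True
    )
    return front_to_back, high_to_low


if __name__ == "__main__":
    front_to_back, high_to_low = describe(FORMANTS)
    print("front to back:", " ".join(front_to_back))
    print("high to low:  ", " ".join(high_to_low))
```

With these assumed values, [i] comes out as the highest and frontest vowel and [ɑ] as the lowest, which is the familiar shape of the chart.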
Seeing the two pictures one after the other, I realized that rather than a random grid, I could put a vowel quadrilateral on an IPA mask. Then I realized that if I placed the quadrilateral on one side, I could get it to line up with the wearer’s mouth. I also had to make a corresponding chart for the right side.
I decided that I wanted the money to go to a charity that was helping with COVID-19. Doctors Without Borders has been doing good work around the world for years, and with COVID they’ve really stepped up. Here in New York they provided support to several local organizations and operated two shower trailers in Manhattan at the height of the outbreak.
From July 16 through 29 I ran a fundraiser through Custom Ink where we raised $310 in profits for Doctors Without Borders, and masks were sent to 23 supporters.
On November 27 I started a new Custom Ink fundraiser. It will run through December 25, and if at least twenty people buy masks by then, they will be distributed in early January. The masks are soft and comfortable; I’ve worn mine almost every day since I got them in August!
When I was teaching introductory linguistics, I had a problem with the phonetic transcription exercises in the textbooks I was using: they asked students to transcribe “the pronunciation” of individual words – implying that there is a single correct pronunciation with a single correct transcription. I worked around it in face-to-face classes by hearing the students’ accents and asking them to pronounce any words if their transcriptions differed from what I expected. I was also able to illustrate the pronunciation of various IPA symbols by pronouncing the sounds in class.
In the summer of 2013 I taught linguistics online for the first time, and it was much more difficult to give students a sense of the sounds I expected them to produce, and to get a sense of the sounds they associated with particular symbols. On top of that I discovered I had another challenge: I couldn’t trust these students to do the work if the answers were available anywhere online. Some of them would google the questions, find the answers, copy and paste. Homework done!
Summer courses move so fast that I wasn’t able to change the exercises until it was too late. In the fall of 2014 I taught the course again, and created several new exercises. I realized that there was now a huge wealth of speech data available online, in the form of streaming and downloadable audio, created for entertainment, education and archives. I chose a podcast episode that seemed relatively interesting and asked my students to transcribe specific words and phrases.
It immediately became clear to me that instead of listening to the sounds and using Richard Ishida’s IPA Picker or another tool to transcribe what they heard, the students were listening to the words, looking them up one by one in the dictionary, and copying and pasting word transcriptions. In some cases Roman Mars’s pronunciations were different from the dictionary transcriptions, but they were close enough that my low grades felt like quibbling to them.
I tried a different strategy: I noticed that another reporter on the podcast, Joel Werner, spoke with an Australian accent, so I asked the students to transcribe his speech. They began to understand: “Professor, do we still have to transcribe the entire word even though a letter from the word may not be pronounced due to an accent?” asked one student. Others noticed that the long vowels were shifted relative to American pronunciations.
For tests and quizzes, I found that I could make excerpts of sound and video files using editing software like Audacity and Microsoft Movie Maker. That allowed me to isolate particular words or groups of words so that the students didn’t waste time locating content in a three-minute video, or a twenty-minute podcast.
This still left a problem: how much detail were the students expected to include, and how could I specify that for them in the instructions? Back in 2013, in a unit on language variation, I had used accent tag videos to replace the hierarchy implied in most discussions of accents with a more explicit, less judgmental contrast between “sounds like me” and “sounds different.” I realized that the accent tags were also good for transcription practice, because they contained multiple pronunciations of words that differed in socially meaningful ways – in fact, the very purpose that phonetic transcription was invented for. Phonetic transcription is a tool for talking about differences in pronunciation.
The following semester, Spring 2015, I created a “Comparing Accents” assignment, where I gave the students links to excerpts of two accent tag videos, containing the word list segment of the accent tag task. I then asked them to find pairs of words that the two speakers pronounced differently and transcribe them in ways that highlighted the differences. To give them practice reading IPA notation, I gave them transcriptions and asked them to upload recordings of themselves pronouncing the transcriptions.
I was pleased to find that I actually could teach phonetic transcription online, and even write tests that assessed the students’ abilities to transcribe, thanks to accent tag videos and the principle that transcription is about communicating differences.
When I first studied phonetic transcription I learned about broad and narrow transcription, where narrow transcription contains much more detail, like the presence of aspiration on consonants and fine distinctions of tongue height. Of course it makes sense that you wouldn’t always want to go into such detail, but at the time I didn’t think about what detail was excluded from broad transcription and why.
In phonology we learned about phonemes, and how phoneme categories glossed over many of those same details that were excluded from broad transcription. For reasons I never quite grasped, though, we were told that phonemic transcription was a very different thing from broad transcription, and we were not to confuse them. Okay.
I got a better explanation from my first phonetics professor, Jacques Filliolet, who used three levels of analysis: niveau généralisant, niveau pertinent and niveau particularisant. We can translate them as general, specific and detailed levels.
When I started teaching phonology, I realized that the broad vs. narrow distinction did not reflect what I read in books and papers and saw at conferences. When people are actually using phonetic transcription there is no consistent set of features that they leave out or include.
What people do instead is include the relevant features and leave out the irrelevant ones. Which features are relevant depends on the topic of discussion. If it’s a paper about aspiration, or a paper about variation where aspiration may or may not be relevant, they will include aspiration. If it isn’t, they won’t.
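The same idea can be sketched in a few lines of code. This is only an illustration under stated assumptions: the `FEATURE_MARKS` table and the `transcribe_for` function are hypothetical names invented here, the table covers just three features, and real transcription practice involves far more than stripping diacritics.

```python
# Hypothetical table linking a feature name to the IPA mark that shows it.
FEATURE_MARKS = {
    "aspiration": "\u02b0",    # ʰ  modifier letter small h
    "nasalization": "\u0303",  # combining tilde
    "length": "\u02d0",        # ː  triangular colon
}


def transcribe_for(narrow, relevant):
    """Keep only the marks for features relevant to the discussion.

    `narrow` is a detailed transcription; `relevant` is the set of
    feature names that matter for the topic at hand.
    """
    drop = {
        mark for feature, mark in FEATURE_MARKS.items()
        if feature not in relevant
    }
    return "".join(ch for ch in narrow if ch not in drop)


if __name__ == "__main__":
    narrow = "p\u02b0\u00e6\u0303n"  # a detailed transcription of "pan"
    print(transcribe_for(narrow, {"aspiration"}))  # paper about aspiration
    print(transcribe_for(narrow, set()))           # neither feature matters
```

A paper about aspiration would keep the [ʰ] and drop the nasalization, giving [pʰæn]. A discussion where neither feature matters would simply write [pæn].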
I realized that sometimes linguists need to go into more detail than phonetic transcription can easily handle, so they use even finer-grained representations like formant frequencies, gestural scores and voice onset times.
Recently I realized that this just means phonetic transcription is a form of communication. In all forms of communication we adjust the level of detail we provide to convey the relevant information to our audience and leave out the irrelevant parts.
Phonemes are another, more organic way that we do this. This explains why phonemic transcription is not the same as broad transcription: we often want to talk about what sounds go into a phoneme without adding other details. For example, we may want to talk about how English /t/ typically includes both aspirated and unaspirated stops, without talking about fundamental frequency or lip closure.
Another possible translation of Filliolet’s niveau pertinent is “the appropriate level.” This is really what we’re all aiming for: the level of detail that is most appropriate for the circumstances.
Finding the right level of detail for phonetic transcription is actually not hard for students to learn; they do it all the time in regular language. The simplest way to teach it is to give the students assignments that require a particular level of detail.
Students are sometimes frustrated that there is not a single way to transcribe a given utterance. In addition to these differences of level of description, there are stylistic differences: do you write [r] instead of [ɹ] for an English bunched /r/?
Of course the International Phonetic Alphabet was sold as just such a consistent system: one symbol for one sound, in contrast with the messy reality of writing systems. To me this feels very Modernist and Utopian, and it is no accident that it was invented at the same time as other big modernist projects like Esperanto, Principia Mathematica, and International Style architecture.
The IPA falls short of the ideal consistent representation that was sold to people, but has largely succeeded in providing enough consistency, and keeping enough of the mess at bay, for specific purposes like documenting language variation and language acquisition.
When I first taught phonetic transcription, almost seven years ago, I taught it almost the same way I had learned it twenty-five years ago. Today, the way I teach it is radically different. The story of the change is actually two stories intertwined. One is a story of how I’ve adapted my teaching to the radical changes in technology that occurred in the previous eighteen years. The other is a story of the more subtle evolution of my understanding of phonetics, phonology, phonological variation and the phonetic transcription that allows us to talk about them.
When I took Introduction to Linguistics in 1990 all the materials we had were pencil, paper, two textbooks and the ability of the professor to produce unusual sounds. In 2007 and even today, the textbooks have the same exercises: Read this phonetic transcription, figure out which English words were involved, and write the words in regular orthography. Read these words in English orthography and transcribe the way you pronounce them. Transcribe in broad and narrow transcription.
The first challenge was moving the homework online. I already assigned all the homework and posted all the grades online, and required my students to submit most of the assignments online; that had drastically reduced the amount of paper I had to collect and distribute in class and schlep back and forth. For this I had the advantage that tuition at Saint John’s pays for a laptop for every student. I knew that all of my students had the computing power to access the Blackboard site.
Thanks to the magic of Unicode and Richard Ishida’s IPA Picker, my students were able to submit their homework in the International Phonetic Alphabet without having to fuss with fonts and keyboard layouts. Now, with apps like the Multiling Keyboard, students can even write in the IPA on phones and tablets.
The next problem was that instead of transcribing, some students would look up the English spellings on dictionary sites, copy the standard pronunciation guides, and paste them into the submission box. Other students would give unusual transcriptions, but I couldn’t always tell whether these transcriptions reflected the students’ own pronunciations or just errors.
At first, as my professors had done, I made up for these homework shortcomings with lots of in-class exercises and drills, but they still all relied on the same principle: reading English words and transcribing them. Both in small groups and in full-class exercises, we were able to check the transcriptions and correct each other because everyone involved was listening to the same sounds. It wasn’t until I taught the course exclusively online that I realized there was another way to do it.
When I tell some people that I teach online courses, they imagine students from around the world tuning in to me lecturing at a video camera. This is not the way Saint John’s does online courses. I do create a few videos every semester, but the vast majority of the teaching I do is through social media, primarily the discussion forums on the Blackboard site connected with the course. I realized that I couldn’t teach phonetics without a way to verify that we were listening to the same sounds, and without that classroom contact I no longer had a way.
I also realized that with high-speed internet connections everywhere in the US, I had a new way to verify that we were listening to the same sounds: use a recording. When I took the graduate Introduction to Phonetics in 1993, we had to go to the lab and practice with the cassette tapes from William Smalley’s Manual of Articulatory Phonetics, but if I’m remembering right we didn’t actually do any transcription of the sounds; we just practiced listening to them and producing them. Some of us were better at that than others.
In 2015 we are floating in rivers of linguistic data. Human settlements have always been filled with the spontaneous creation of language, but we used to have to pore over people’s writings or rely on our untrustworthy memories. In the twentieth century we had records and tape, film and video, but so much of what was on them was scripted and rehearsed. If we could get recordings of unscripted language, it was hard to store, copy and distribute them.
Now people create language in forms that we can grab and hold: online news articles, streaming video, tweets, blog posts, YouTube videos, Facebook comments, podcasts, text messages, voice mails. A good proportion of these are even in nonstandard varieties of the language. We can read them and watch them and listen to them – and then we can reread and rewatch and relisten, we can cut and splice in seconds what would have taken hours – and then analyze them, and compare our analyses.
Instead of telling my students to read English spelling and transcribe in IPA, now I give them a link to a video. This way we’re working from the exact same sequence of sounds, a sequence that we can replay over and over again. I specifically choose pronunciations that don’t match what they find on the dictionary websites. This is precisely what the IPA is for.
Going the other way, I give my students IPA transcriptions and ask them to record themselves pronouncing the transcriptions and post the recordings to Blackboard. Sure, my professor could have assigned us something like this in 1990, but then he would have had to take home a stack of cassettes and spend time rewinding them over and over. Now all my students have smartphones with built-in audio recording apps, and I could probably listen to all of their recordings on my own smartphone if I didn’t have my laptop handy.
So that’s the story about technology and phonetic transcription. Stay tuned for the other story, about the purpose of phonetic transcription.