Screenshot of LanguageLab displaying the exercise "J'étais certain que j'aillais écrire à quinze ans"

Imagining an alternate language service

It’s well known that some languages have multiple national standards, to the point where you can take courses in either Brazilian or European Portuguese, for example. Most language instruction services seem to choose one variety per language: when I studied Portuguese at the University of Paris X-Nanterre it was the European variety, but the online service Duolingo only offers the Brazilian one.

I looked into some of Duolingo’s offerings for this post, because they’re the most talked about language instruction service these days. I was surprised to discover that they use no recordings of human speakers; all their speech samples are synthesized using an Amazon speech synthesis service named Polly. Interestingly, even though Duolingo only offers one variety of each language, Amazon Polly offers multiple varieties of English, Spanish, Portuguese and French.

As an aside, when I first tried Duolingo years ago I had the thought, “Wait, is this synthesized?” but it just seemed too outrageous to think that someone would make a business out of teaching humans to talk like statistical models of corpus speech. It turns out it wasn’t too outrageous, and I’m still thinking through the implications of that.

Synthesized or not, it makes sense for a company with finite resources to focus on one variety. But if that one company controls a commanding market share, or if there’s a significant amount of collusion or groupthink among language instruction services, they can wind up shutting out whole swathes of the world, even while claiming to be inclusive.

This is one of the reasons I created an open LanguageLab platform: to make it easier for people to build their own exercises and lessons, focusing on any variety they choose. You can set up your own LanguageLab server with exercises exclusively based on recordings of the English spoken on Smith Island, Maryland (population 149), if you like.

So what about excluded varieties with a few more speakers? I made a table of all the Duolingo language offerings, ranked by their number of learners among English speakers, along with the Amazon Polly dialect that is used on Duolingo. Where the variety is only vaguely specified, I made a guess.

For each of these languages I picked another variety, one with a large number of speakers. I tried to find the variety with the largest number of speakers, but these counts are always very imprecise. The result is an imagined alternate language service, one that does not automatically privilege the speakers of the most influential variety. Here are the top ten:

Language          Duolingo dialect   Alternate dialect
English           Midwestern US      India
Spanish           Mexico             Argentina
French            Paris              Quebec
Japanese          Tokyo              Kagoshima
German            Berlin             Bavarian
Korean            Seoul              Pyongyang
Italian           Florence           Rome
Mandarin Chinese  Beijing            Taipei
Hindi             Delhi              Chhattisgarhi
Russian           Moscow             Almaty

To show what could be done with a little volunteer work, I created a sample lesson for a language that I know, the third-most popular language on Duolingo, French. After France, the country with the next largest number of French speakers is Canada. Canadian French is distinct in pronunciation, vocabulary and to some degree grammar.

Canadian French is stigmatized outside Canada, to the point where I’m not aware of any program in the US that teaches it, but it is omnipresent in all forms of media in Canada, and there is quite a bit of local pride. These days at least, it would be as odd for a Canadian to speak French like a Parisian as for an American to speak English like a Londoner. There are upper and lower class accents, but they all share certain features, notably the ranges of the nasal vowels.

I chose a bestselling author and television anchor, Michel Jean, who has one grandmother from the indigenous Innu people and three grandparents presumably descended from white French settlers. I took a small excerpt from an interview with Jean about his latest novel, where he responds spontaneously to the questions of a librarian, Josianne Binette.

The sample lesson in Canadian French based on Michel Jean’s speech is available on the LanguageLab demo site. You are welcome to try it! Just log in with the username demo and the password LanguageLab.

The gesture location symbols of Stokoe notation, mapped onto a chart of the upper torso, arm and head

Teaching intro sign phonetics

A few years ago I wrote about incorporating sign linguistics when I taught Introduction to Linguistics at Saint John’s University. The other course I taught most often was Introduction to Phonology. This course was required for our majors in Speech Pathology and Audiology, and they often filled up the class. I never had a Deaf student, but almost all of my students expressed some level of interest in signed languages, and many had taken several semesters of American Sign Language.

The texts I used tended to devote a chapter to sign linguistics here or there, but did not present it systematically or include it in general discussions. I always included those chapters, and any mention of signed languages was received enthusiastically by my students, so, having a love of sign linguistics myself, I was happy to teach more.

The first thing I did was to add sign phonetics. I had previously found that I needed to start Introduction to Phonology with a comprehensive review of spoken phonetics, so I just followed that with a section on the systematic description of hand, face and upper body gestures. A lot of the spoken phonetics review was focused on phonetic transcription, and the students needed some way to keep track of the gestures they were studying, so I taught them Stokoe notation.

A list of Stokoe handshape symbols, with corresponding illustrations of the handshapes

Some of you may be remembering negative things you’ve read, or heard, or said, about Stokoe notation. It’s not perfect. But it’s granular enough for an intro phonology course, and it’s straightforward and relatively transparent. My students had no problem with it. Remember that the appropriate level of granularity depends on what you’re trying to communicate about the language.

The orientation and movement symbols from Stokoe notation, mapped onto a chart depicting the right side of a human head and attached right shoulder

I developed charts for the Stokoe symbols for locations, orientations and movements (“tab” and “sig” in Stokoe’s terminology), corresponding to the vowel quadrilateral charts developed by Pierre Delattre and others for spoken languages. To create the charts I used the StokoeTempo font that I developed back in 1995.

A list of additional movements of ASL and their symbols in Stokoe notation

The next step was to find data for students to analyze. I instructed my students to watch videos of jokes in American Sign Language posted to YouTube and Facebook by two Deaf storytellers and ASL teachers, Greg “NorthTrue” Eyben and Joseph Wheeler.

Deaf YouTuber NorthTrue makes the ASL sign for “mail”

The first exercise I gave my students was a scavenger hunt. I had previously found scavenger hunts to be useful in studying spoken language features at all levels of analysis. Here is a list of items I asked my students to find in one two-minute video:

  • A lexical sign
  • A point
  • A gesture depicting movement or location
  • An iconic gesture miming a person’s hand movement
  • A nonmanual miming a person’s emotion
  • A grammatical nonmanual indicating question, role shifting or topic

The students did well on the exercises, whether in class, for homework or for exams. Unfortunately that was pretty much all that I was able to develop during the years I taught Introduction to Phonology.

There is one more exercise I created using sign phonology; I will write about that in a future post.

How to set up your own LanguageLab

I’ve got great news! I have now released LanguageLab, my free, open-source software for learning languages and music, to the public on GitHub.

I wish I could tell you I’ve got a public site up that you can all use for free. Unfortunately, the features that would make LanguageLab easy for multiple users to share one server are later in the roadmap. There are a few other issues that also stand in the way of a massive public service. But you can set up your own server!

I’ve documented the steps in the README file, but here’s an overview. You don’t need to know how to program, but you will need to know how to set up web services, retrieve files from GitHub, edit configuration files, and run a few commands at a Linux/MacOS/DOS prompt.

LanguageLab uses Django, one of the most popular web frameworks for Python, and React, one of the most popular frameworks for Javascript. All you need is a server that can run Django and host some Javascript files! I’ve been doing my development and testing on Pythonanywhere, but I’ve also set it up on Amazon Web Services, and you should be able to run it on Google Cloud, Microsoft Azure, a University web server or even your personal computer.

There are guides online for setting up Django in all those environments. Once you’ve got a basic Django setup installed, you’ll need to clone the LanguageLab repo from GitHub to a place where it can be read by your web server. Then you’ll configure it to access the database, and configure the web server to load it. You’ll use Pip and NPM to download the Python and Javascript libraries you need, like the Django REST Framework, React and the Open Iconic font. Finally, you’ll copy all the files into the right places for the web server to read them and restart the server.

Once you’ve got everything in place, you should be able to log in! You can make multiple accounts, but keep in mind that at this point we do not have account-level access, so all accounts have full access to all the data. You can then start building your library of languages, media, exercises and lessons. LanguageLab comes with the most widely used languages, but it’s easy to set up new ones if yours are not on the list.

Media can be a bit tricky, because LanguageLab is not a media server. You can upload your media to another place on your server, or any other server – as long as it’s got an HTTPS URL you should be able to use it. If the media you’re using is copyrighted you may want to set up some basic password protection to avoid any accusations of piracy. I use a simple .htaccess password. I have to log in every time, but it works.

With the URL of your media file, you can create a media entry. Just paste that URL into the form and add metadata to keep track of the file and what it can be used for. You can then set up one or more exercises based on particular segments of that media file. It may take a little trial and error to get the exercises right.

You can then create one or more lessons to organize your exercises. You can choose to have a lesson for all the exercises in a particular media file, or you can combine exercises from multiple media files in a lesson. It’s up to you how to organize the lessons. You can edit the queues for each lesson to reorder or remove exercises.

Once you’ve got exercises, you can start practicing! The principle is simple: listen to the model, repeat into the microphone, then listen to the model again, followed by your recording. Set yourself a goal of a.certain number of repetitions per session.

After you’ve created your language and media entries, exercises and lessons, you can export the data. Importing the data is not yet implemented, but the data is exported in a human-readable JSON format, so you can recreate your setup by hand if necessary.
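Since the export is plain JSON, you can inspect a backup with a few lines of Python before recreating anything by hand. The structure here is purely illustrative – check your own export file for the actual field names:

```python
import json

# An illustrative export snippet; the real field names may differ,
# so inspect your own exported file before relying on this.
sample_export = """
{
  "languages": [{"name": "French", "code": "fr"}],
  "exercises": [
    {"name": "Interview excerpt 1", "startTime": 12.5, "endTime": 15.0},
    {"name": "Interview excerpt 2", "startTime": 15.0, "endTime": 18.2}
  ]
}
"""

data = json.loads(sample_export)

# Summarize the exercises so you know what to recreate.
for exercise in data["exercises"]:
    length = exercise["endTime"] - exercise["startTime"]
    print(f"{exercise['name']}: {length:.1f} seconds")
```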

In the near future I will go on Twitch to demonstrate how to set up exercises and lessons, and how to practice with them. I will also try to find time to demonstrate the installation process. I will record each demonstration and put it on YouTube for your future reference. You can follow me on Twitter to find out when I’m doing the demos and posting the videos.

If you try setting up a LanguageLab, please let me know how it goes! You can report bugs by creating incidents on GitHub, or you can send me an email. I’m happy to hear about problems, but I’d also like to hear success stories! And if you know some Python or Javascript, please consider writing a little code to help me add one of the features in the roadmap!

A free, open source language lab app

Viewers of The Crown may have noticed a brief scene where Prince Charles practices Welsh by sitting in a glass cubicle wearing a headset.  Some viewers may recognize that as a language lab. Some may have even used language labs themselves.

The core of the language lab technique is language drills, which are based on the bedrock of all skills training: mimicry, feedback and repetition.  An instructor can identify areas for the learner to focus on.

Because it’s hard for us to hear our own speech, the instructor also can observe things in the learner’s voice that the learner may not perceive.  Recording technology enabled the learner to take on some of the role of observer more directly.

When I used a language lab to learn Portuguese in college, it ran on cassette tapes.  The lab station played the model (I can still remember “Elena, estudante francesa, vai passar as férias em Portugal…”), then it recorded my attempted mimicry onto a blank cassette.  Once I was done recording it played back the model, followed by my own recording.

Hearing my voice repeated back to me after the model helped me judge for myself how well I had mimicked the model.  It wasn’t enough by itself, so the lab instructor had a master station where he could listen in on any of us and provide additional feedback.  We also had classroom lessons with an instructor, and weekly lectures on culture and grammar.

There are several companies that have brought language lab technology into the digital age, on CD-ROM and then over the internet.  Many online language learning providers rely on proprietary software and closed platforms to generate revenue, which is fine for them but doesn’t allow teachers the flexibility to add new language varieties.

People have petitioned these language learning companies to offer new languages, but developing offerings for a new language is expensive.  If a language has a small user base it may never generate enough revenue to offset the cost of developing the lessons.  It would effectively be a donation to people who want to promote these languages, and these companies are for profit entities.

Duolingo has offered a work-around to this closed system: they will accept materials developed by volunteers according to their specifications and freely donated.  Anyone who remembers the Internet Movie Database before it was sold to Amazon can identify the problems with this arrangement: what happens to those submissions if Duolingo goes bankrupt, or simply decides not to support them anymore?

Closed systems raise another issue: who decides what it means to learn French, or Hindi?  This has been discussed in the context of Duolingo, which chose to teach the artificial Modern Standard Arabic rather than a colloquial dialect or the classical language of the Qur’an.  Similarly, activists for the Hawai’ian language wanted the company to focus on lessons to encourage Hawai’ians to speak the language, rather than tourists who might visit for a few weeks at most.

Years ago I realized that we could make a free, open-source language lab application.  It wouldn’t have to replicate all the features of the commercial apps, especially not initially.  An app would be valuable if it offers the basic language lab functionality: play a model, record the learner’s mimicry, play the model again and finally play the recording of the learner.
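That cycle is simple enough to write down. Here is a minimal sketch of the drill sequence in Python – the function and file names are illustrative, not the app's actual API:

```python
def drill_sequence(model, learner_recording):
    """One language-lab drill cycle: play the model, record the
    learner's mimicry, replay the model, then replay the learner's
    attempt so they can compare the two."""
    return [
        ("play", model),
        ("record", learner_recording),
        ("play", model),
        ("play", learner_recording),
    ]

steps = drill_sequence("model.mp3", "attempt.webm")
for action, clip in steps:
    print(action, clip)
```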

An open system would be able to use any recording that the device can play.  This would allow learners to choose the models they practice with, or allow an instructor to choose models for their students.  The lessons don’t have to be professionally produced.  They can be created for a single student, or even for a single occasion.  I am not a lawyer, but I believe they can even use copyrighted materials.

I have created a language lab app using the Django Rest Framework and ReactJS that provides basic language lab functionality.  It runs in a web browser using responsive layout, and I have successfully tested it in Chrome and Firefox, on Windows and Android.

This openness and flexibility drastically reduces the cost of producing a lesson.  The initial code can be installed in an hour, on any server that can host Django.  The monthly cost of hosting code and media can be under $25.  Once this is set up, a media item and several exercises based on it can be added in five minutes.

This reduced cost means that a language does not have to bring in enough learners to recoup a heavy investment.  That in turn means that teachers can create lessons for every dialect of Arabic, or in fact for every dialect of English.  They can create Hawai’ian lessons for both tourists and heritage speakers.  They could even create lessons for actors to learn dialects, or master impressions of celebrities.

As a transgender person I’ve long been interested in developing a feminine voice to match my feminine visual image.  Gender differences in language include voice quality, pitch contour, rhythm and word choice – areas that can only be changed through experience.  I have used the alpha and beta versions of my app to create exercises for practicing these differences.

Another area where it helps a learner to hear a recording of their own voice is singing.  This could be used by professional singers or amateurs.  It could even be used for instrument practice.  I use it to improve my karaoke!

This week I was proud to present my work at the QueensJS meetup.  My slides from that talk contain more technical details about how to record audio through the web browser.  I’ll be pushing my source to GitHub soon. You can read more details about how to set up and use LanguageLab.  In the meantime, if you’d like to contribute, or to help with beta testing, please get in touch!

Flu. What is Thisby? A wandring knight? Quin. It is the Lady, that Pyramus must loue. Fl. Nay faith: let not me play a womã: I haue a beard cõming. Quin. Thats all one: you shall play it in a Maske: and you may speake as small as you will. Bott. And I may hide my face, let me play Thisby to: Ile speake in a monstrous little voice; Thisne, Thisne, ah Pyramus my louer deare, thy Thysby deare, & Lady deare. Qu. No, no: you must play Pyramus: & Flute, you Thysby.

The History of English through SparkNotes

Language change has been the focus of my research for over twenty years now, so when I taught second semester linguistics at Saint John’s University, I was very much looking forward to teaching a unit focused on change.  I had been working to replace constructed examples with real data, so I took a tip from my natural language processing colleague Dr. Wei Xu and turned to SparkNotes.

I first encountered SparkNotes when I was teaching French Language and Culture, and I assigned all of my students to write a book report on a work of French literature, or a book about French language or culture.  I don’t remember the details, but at times I had reason to suspect that one or another of my students was copying summary or commentary information about their chosen book from SparkNotes rather than writing their own.

When I was in high school, my classmates would make use of similar information for their book reports.  The rule was that you could consult the Cliffs Notes for help understanding the text, but you weren’t allowed to simply copy the Cliffs Notes.

Modern Text

FLUTE
Who’s Thisbe? A knight on a quest?

QUINCE
Thisbe is the lady Pyramus is in love with.

FLUTE
No, come on, don’t make me play a woman. I’m growing a beard.

QUINCE
That doesn’t matter. You’ll wear a mask, and you can make your voice as high as you want to.

BOTTOM
In that case, if I can wear a mask, let me play Thisbe too! I’ll be Pyramus first: “Thisne, Thisne!”—And then in falsetto: “Ah, Pyramus, my dear lover! I’m your dear Thisbe, your dear lady!”

QUINCE
No, no. Bottom, you’re Pyramus.—And Flute, you’re Thisbe.

When I discovered SparkNotes I noticed that for some older authors – Shakespeare, of course, but even Dickens – they not only offered summaries and commentary, but translations of the text into contemporary English.  It was this feature I drew on for the unit on language change.

While I was developing and teaching this second semester intro linguistics course at Saint John’s, I was also working as a linguistic annotator for an information extraction project in the NYU Computer Science Department.  I met a doctoral student, Wei Xu, who was studying a number of interesting corpora, including Twitter, hip-hop and SparkNotes. Wei graduated in 2014, and is now Assistant Professor of Computer Science and Engineering at Ohio State.

Wei had realized that the modern translations on SparkNotes and eNotes, combined with the original Shakespearean text, formed a parallel corpus, a collection of texts in one language variety that are paired with translations in another language variety.  Parallel corpora, like the Canadian Hansard Corpus of French and English parliamentary debates, are used in translation studies, including for training machine translation software. Wei used the SparkNotes/eNotes parallel Shakespeare corpus to generate Shakespearean-style paraphrases of contemporary movie lines, among other things.

When it came time to teach the unit on language change at Saint John’s, I found a few small exercises that asked students to compare older literary excerpts with modern translations.  Given the constraints of this being one unit in a survey course, it made sense to focus on the language of instruction, English. The Language Files had one such exercise featuring a short Chaucer passage.  In general, when working with corpora I prefer to look at larger segments, ideally an entire text but at minimum a full page.

I realized that I could cover all the major areas of language change – phonological, morphological, syntactic, semantic and pragmatic – with these texts.  Linguists have been able to identify phonological changes from changes in spelling, for example that Chaucer’s spelling of “when” as “whan” indicates that we typically put our tongues in a higher place in our mouths when pronouncing the vowel of that word than people did in the fourteenth century.

When teaching Shakespeare to college students it is common to use texts with standardized spelling, but we now have access to scans of Shakespeare’s work as it was first published in his lifetime or shortly after his death, with the spellings chosen by those printers.  This spelling modernization is even practiced with some nineteenth century authors, and similarly we have access to the first editions of most works through digitization projects like Google Books.

With this in mind, I created exercises to explore language change.  For a second semester intro course the students learned a lot from a simple scavenger hunt: compare a passage from the SparkNotes translation of Shakespeare with the Quarto, find five differences, and specify whether they are phonological, morphological, syntactic, semantic or pragmatic.  In more advanced courses students could compare differences more systematically.
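For instance, a first pass at a more systematic comparison can be scripted with Python's standard difflib, aligning a Quarto line with its modern translation word by word. The pre-tokenized lines below are illustrative simplifications, and classifying each difference is still left to the student:

```python
import difflib

# Simplified, pre-tokenized versions of a Quarto line and its
# modern translation (illustrative, not a full corpus pipeline).
quarto = "Nay faith let not me play a womã I haue a beard cõming".split()
modern = "No come on do not make me play a woman I am growing a beard".split()

# Align the two word sequences and collect the non-matching spans.
matcher = difflib.SequenceMatcher(a=quarto, b=modern)
differences = []
for op, a1, a2, b1, b2 in matcher.get_opcodes():
    if op != "equal":
        differences.append((" ".join(quarto[a1:a2]), " ".join(modern[b1:b2])))

for old, new in differences:
    print(f"{old!r} -> {new!r}")
```

Each pair the script prints is a candidate difference for the students to classify as phonological, morphological, syntactic, semantic or pragmatic.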

This comparison is the kind of thing that we always do when we read an old text: compare older spellings and wordings with the forms we would expect from a more modern text.  Wei Xu showed us that the translations and spelling changes in SparkNotes and eNotes can be used for a more explicit comparison, because they are written down based on the translators’ and editors’ understanding of what modern students will find difficult to read.

As I have detailed in my forthcoming book, Building a Representative Theater Corpus, we must be careful not to generalize universal statements, including statements about prevalence, to the language as a whole.  This is especially problematic when we are looking at authors who appealed to elite audiences, but it applies to Shakespeare and Dickens as well.  Existential observations, such as that Shakespeare used bare not (“let me not”) in one instance where SparkNotes used do-support (“don’t let me”) are much safer.

My students seemed to learn a lot from this technique.  I hope some of you find it useful in your classrooms!

Deaf scholar Ben Bahan gives a lecture about Deaf architecture

Teaching sign linguistics in introductory classes

Language is not just spoken and written, and even though I’ve been working mostly on spoken languages for the past fifteen years, my understanding of language has been tremendously deepened by my study of sign languages. At the beginning of the semester I always asked my students what languages they had studied and what aspects of language they wanted to know more about, and they were always very interested in sign language. Since they had a professor with training and experience in sign linguistics it seemed natural to spend some time on it in class.

Our primary textbook, by George Yule, contains a decent brief overview of sign languages. The Language Files integrates sign language examples throughout and has a large section on sign phonetics. I added a lecture on the history of sign languages in Europe and North America, largely based on Lane, Hoffmeister and Bahan’s Journey Into the Deaf-World (1996), and other information I had learned over the years.

I also felt it was important for my students to actually observe a sign language being used to communicate and to express feeling, so I found an online video of an MIT lecture by psychologist and master storyteller (and co-author of Journey Into the Deaf-World) Ben Bahan. Bahan’s talk does not focus exclusively on language, but demonstrates the use of American Sign Language well, and the English interpretation is well done.

A video lecture is a prime candidate for “flipped classroom” techniques, but I never got around to trying that. We watched the video in class, but before starting the video I assigned my students a simple observation task: could they find examples of the four phonological subsystems of American Sign Language – lexical signs, fingerspelling, depicting signs and nonmanual gestures?

Some of the students were completely overwhelmed by the task at first, but I made it clear that this was not a graded assignment, only introductory exploration. Other students had had a semester or more of ASL coursework, and the students with less experience were able to learn from them. Bahan, being Ben Bahan, produces many witty, thought-provoking examples of all four subsystems over the course of the lecture.

The phonological subsystems are among the easiest sign language phenomena for a novice to distinguish, but as we watched the video I pointed out other common features of ASL and other sign languages, such as topic-comment structures and stance-shifting.

Later, when I started teaching Introduction to Phonology, we had the opportunity to get deeper into sign language phonology. I’ll cover that in a future post.

Teaching with accent tags in the face-to-face classroom

In September I wrote about how I used accent tag videos to teach phonetic transcription in my online linguistics classes. Since I could not be there in person, the videos provided a stable reference that we could all consult from our computers around the country. Having two pronunciations to compare drew the students’ attention to the differences between them – one of the major reasons phonetic transcription was invented – and to the most natural level of detail to include in the answer.

In the Fall of 2015 I was back in the classroom teaching Introduction to Phonology, and I realized that those features – a stable reference and multiple pronunciations of the same word with small differences – were also valuable when we were all in the same room. I used accent tag clips in exercises on transcription and other skills, such as identifying phonetic traits like tongue height and frication.

One of my students, Alice Nkanga, pointed out a feature of YouTube that I wasn’t aware of before: you can adjust the speed of playback down to one-quarter speed, and it auto-corrects the pitch, which can help with transcription.

After reading my previous post another linguist, Jessi Grieser, said that she liked the idea, so I shared some of my clips with her. She used them in her class, including a clip I made contrasting two African American women – one from Chicago and one from New York – saying the word “oil.”

Grieser reported, “this went excellently! It really helped hammer home the idea that there isn’t a ‘right’ way to transcribe a word based on its orthography–that what we’re really looking for is a transcription which captures what the speaker did. They really had fun with ‘oil’ since many of them are /AHL/ or /UHL/ speakers themselves. It was a really great discussion starter for our second day of transcription. This is a genius idea.”

It makes me really happy to know that other people find this technique useful in their classrooms, because I was so excited when I came up with it. I would make the clips available to the public, even at no charge, but I’m not sure about the rights because I did not make the original accent tag videos. I hope you’ll all make your own, though – it’s not that hard!

And if you teach sign linguistics in your introductory courses, or are considering it, you might be interested in reading about similar techniques I used for teaching students to analyze and transcribe sign languages!

Teaching phonetic transcription online

When I was teaching introductory linguistics, I had a problem with the phonetic transcription exercises in the textbooks I was using: they asked students to transcribe “the pronunciation” of individual words – implying that there is a single correct pronunciation with a single correct transcription. I worked around it in face-to-face classes by hearing the students’ accents and asking them to pronounce any words if their transcriptions differed from what I expected. I was also able to illustrate the pronunciation of various IPA symbols by pronouncing the sounds in class.

In the summer of 2013 I taught linguistics online for the first time, and it was much more difficult to give students a sense of the sounds I expected them to produce, and to get a sense of the sounds they associated with particular symbols. On top of that I discovered I had another challenge: I couldn’t trust these students to do the work if the answers were available anywhere online. Some of them would google the questions, find the answers, copy and paste. Homework done!

Summer courses move so fast that I wasn’t able to change the exercises until it was too late. In the fall of 2014 I taught the course again, and created several new exercises. I realized that there was now a huge wealth of speech data available online, in the form of streaming and downloadable audio, created for entertainment, education and archives. I chose a podcast episode that seemed relatively interesting and asked my students to transcribe specific words and phrases.

It immediately became clear to me that instead of listening to the sounds and using Richard Ishida’s IPA Picker or another tool to transcribe what they heard, the students were listening to the words, looking them up one by one in the dictionary, and copying and pasting word transcriptions. In some cases Roman Mars’s pronunciations were different from the dictionary transcriptions, but they were close enough that my low grades felt like quibbling to them.

I tried a different strategy: I noticed that another reporter on the podcast, Joel Werner, spoke with an Australian accent, so I asked the students to transcribe his speech. They began to understand: “Professor, do we still have to transcribe the entire word even though a letter from the word may not be pronounced due to an accent?” asked one student. Others noticed that the long vowels were shifted relative to American pronunciations.

For tests and quizzes, I found that I could make excerpts of sound and video files using editing software like Audacity and Microsoft Movie Maker. That allowed me to isolate particular words or groups of words so that the students didn’t waste time locating content in a three-minute video or a twenty-minute podcast.

This still left a problem: how much detail were the students expected to include, and how could I specify that for them in the instructions? Back in 2013, in a unit on language variation, I had used accent tag videos to replace the hierarchy implied in most discussions of accents with a more explicit, less judgmental contrast between “sounds like me” and “sounds different.” I realized that the accent tags were also good for transcription practice, because they contained multiple pronunciations of words that differed in socially meaningful ways – in fact, the very purpose that phonetic transcription was invented for. Phonetic transcription is a tool for talking about differences in pronunciation.

The following semester, Spring 2015, I created a “Comparing Accents” assignment, where I gave the students links to excerpts of two accent tag videos, containing the word list segment of the accent tag task. I then asked them to find pairs of words that the two speakers pronounced differently and transcribe them in ways that highlighted the differences. To give them practice reading IPA notation, I gave them transcriptions and asked them to upload recordings of themselves pronouncing the transcriptions.

I was pleased to find that I actually could teach phonetic transcription online, and even write tests that assessed the students’ abilities to transcribe, thanks to accent tag videos and the principle that transcription is about communicating differences.

I also found these techniques to be useful for teaching other aspects of linguistics, such as language variation, and for teaching in face-to-face courses.

Teaching language variation with accent tag videos

Last January I wrote that the purpose of phonetic transcription is to talk about differences in pronunciation. Last December I introduced accent tags, a fascinating genre of self-produced YouTube videos of crowdsourced dialectology and a great source of data about language variation. I put these together when I was teaching a unit on language variation for the second-semester Survey of Linguistics course at Saint John’s University. When I learned about language variation as an undergraduate, it was exciting to see accents as a legitimate object of study, and it was gratifying to see my family’s accents taken seriously.

At the same time, the unit’s focus on one dialect at a time contrasts with the way variation is entirely absent from the discussion of English pronunciation, grammar and lexis in other units, and in the rest of the way English is typically taught. Together these imply that there is a single standard that does not vary, despite evidence from perceptual dialectology (such as Dennis Preston’s work) that language norms are fragmentary, incomplete and contested. I saw the cumulative effects of this devaluation in class discussions, when students openly denigrated features of the New York accents spoken by their neighbors, their families and often the students themselves.

At first I just wanted to illustrate variation in African American accents, but then I realized that the accent tags allowed me to set up the exercises as an explicit contrast between two varieties. I asked my students to search YouTube to find an accent tag that “sounds like you,” and one that sounded different, and to find differences between the two in pronunciation, vocabulary and grammar. I followed up with other exercises asking students to compare two accent tags from the same place but made by speakers with different ethnic, economic or gender backgrounds.

My students did a great job at finding videos that sounded like them. Most of them were from the New York area, and were able to find accent tags made by people from New York City, Long Island or northern New Jersey. Some students were African American or Latin American, and were able to find videos that demonstrated the accents, vocabulary and grammar common among those groups. The rest of the New York students did not have any features that we noticed as ethnic markers, and whether the students were Indian, Irish or Circassian, they were satisfied that the Italian or Jewish speakers in the videos sounded pretty much like them.

Some of the students were from other parts of the country, and found accent tags from California or Boston that illustrated features that the students shared. A student from Zimbabwe who is bilingual in English and Shona was not able to find any accent tags from her country, but she found a video made by a white South African and was able to identify features of English pronunciation, vocabulary and grammar that they shared.

As I wrote last year, the phonetic transcription exercises I had done in introductory linguistics and phonology courses were difficult because they implicitly referred to unspecified standard pronunciations, leading to confusion among the students about the “right” transcriptions. In the variation unit, when I framed the exercise as an explicit comparison between something that “sounds like you” and something different, I removed the implied value judgment and replaced it with a neutral investigation of difference.

I found that this exercise was easier for the students than the standard transcription problems, because it gave them two recordings to compare instead of asking them to compare one recording against their imagination of the “correct” or “neutral” pronunciation. I realized that this could be used for the regular phonetics units as well. I’ll talk about my experiences with that in a future post.

Online learning and intellectual honesty

In January I wrote that I believe online learning is possible, but I have doubts about whether online courses are an adequate substitute for in-person college classes, let alone an improvement. One of those doubts concerns trust and intellectual honesty.

Any course is an exchange. The students pay money to the college, the instructor gets a cut, and the students get something of value in return. What that something is can be disputed. In theory, the teacher gives the students knowledge: information and skills.

In practice, some of the students actually expect to receive knowledge in exchange for their tuition. Some of them want knowledge but have gotten discouraged. Some wouldn’t mind a little knowledge, but that’s not what they’re there for. Others just have no time for actual learning.

If they’re not there for knowledge, why are they there? For credentials. They want a degree, and the things that go with a degree and make it more valuable for getting a good job: a major, a course list, good grades, letters of recommendation, connections.

If learning is not important, or if the credentials are urgent enough, it is tempting to skip the learning, just going through the motions. That means pretending to learn, or pretending that you learned more than you did. Most teachers have encountered this attitude at some point.

I have seen the impulse to cheat, in various forms, in every class I’ve taught over the years. Some teachers might be tempted to shrug it off as just another cost of doing business; after all, it is hard to make a living while being completely ethical. But I fought it, for several reasons.

First, I genuinely enjoy learning and I love studying languages, and I want to share that enjoyment and passion with my students. Second, many of my students have been speech pathology majors. I have experienced speech pathology that was not informed by linguistics, and I know that a person who doesn’t take linguistics seriously is not fit to be a speech pathologist.

If that weren’t enough, I was simply not getting paid enough to tolerate cheating. At the wages of an adjunct professor, I wasn’t in it for the money. I was doing it to pass on my knowledge and gain experience, and looking the other way while students cheated was not the kind of experience I signed up for.

I’ve seen varying degrees of dishonesty in my years of teaching. In one French class, a student tried to hand in an essay in Spanish; in his haste he had chosen the wrong option on the machine translation app. I developed strategies for deterring cheating, such as multiple drafts and a focus on proper citation. But I was not prepared for how much cheating I would find when I taught an online course.

The most effective deterrent was simply to get multiple examples of a student’s work: in class discussions, in small group work, in homeworks and on exams. That allowed me to spot inconsistent quality that might turn out to be plagiarism.

In these introductory linguistics courses, the homeworks themselves were minor exercises, mainly for the students to get feedback on whether they had understood the reading. If a student skipped a reading and plagiarized the homework assignment, it would usually be obvious to both of us when we went over the material in class. That would give the student feedback so that they could change their habits before the first exam.

The first term that I taught this course online, I noticed that some students were getting all the answers right on the homeworks. I was suspicious, but I gave the students the benefit of the doubt. Maybe they had taken linguistics in high school, or read some good books.

Then I noticed that the answers were all the same, and I began to notice quirks of language that didn’t fit my students. One day I saw that the answers were all in an unusual font. I googled one of the quirky phrases and immediately found a file of answers to the questions for that chapter.
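Spotting identical or near-identical submissions can also be partly automated. This is a sketch of the idea rather than anything I used at the time: Python’s difflib can score pairwise similarity between answer texts and flag suspiciously close pairs. The 0.9 threshold is an arbitrary assumption, not a calibrated plagiarism score, and a human still has to judge each flagged pair.

```python
from difflib import SequenceMatcher
from itertools import combinations

def flag_similar(answers, threshold=0.9):
    """Return (student, student, score) triples for nearly identical answers.

    `answers` maps student name -> submitted text. The threshold is an
    arbitrary cutoff; flagged pairs still need human review."""
    flagged = []
    for (a, text_a), (b, text_b) in combinations(answers.items(), 2):
        score = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if score >= threshold:
            flagged.append((a, b, round(score, 2)))
    return flagged

# flag_similar({"A": "Phonemes are contrastive units.",
#               "B": "Phonemes are contrastive units.",
#               "C": "A phoneme distinguishes meaning."})
```

The same comparison against a known answer file found online would catch the copy-and-paste case just as easily as it catches students copying each other.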

I started searching around and found answers to every homework in the textbook. These students were simply googling the questions, copying the answers, and pasting them into Blackboard. They weren’t reading and they weren’t discussing the material. And it showed in their test results. But because this was a summer course, they didn’t have time to recover, and they all got bad grades.

I understood where they were coming from. They needed to knock out this requirement for their degree. They didn’t care about linguistics, or if they did, they didn’t have time for it. They wanted to get the work out of the way for this class and then go to their job or their internship or their other classes. Maybe they wanted to go drinking, but I knew these Speech Pathology students well enough to know that they weren’t typically party animals.

I’ve had jobs where I saw shady practices and just went along with it, but in this case I couldn’t do that, for the reasons I gave above. My compensation for this work wasn’t the meager adjunct pay that was deposited in my checking account every two weeks. It was the knowledge that I had passed on some ideas about language to these students. It was also the ability to say that I had taught linguistics, and even online.

The only solution I had to the problem was to write my own homework questions, ones that could be answered online, but where the appropriate answers couldn’t be found with a simple Google search.

The next term I taught the course online, I had to deal with students sharing answers – not collaborating in the groups I had carefully constructed so that the student finishing her degree in another state could learn through peer discussion, but simply copying the homework a friend had done. They did it on exams too, where they were supposed to answer the questions alone. This meant that I also had to come up with questions whose answers were individual and couldn’t be copied.
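One low-tech way to individualize questions is to assign each student different items from a shared pool, deterministically, so that the same student always sees the same items but two friends are unlikely to share any. Here is a sketch of that idea; the course salt, student IDs, and word pairs are all hypothetical examples, not my actual materials:

```python
import hashlib
import random

def assign_items(student_id, pool, n=2, course_salt="LING101-F2015"):
    """Deterministically pick n items from the pool for one student.

    Hashing the student ID with a per-course salt seeds the RNG, so the
    assignment is stable across sessions but differs between students."""
    digest = hashlib.sha256(f"{course_salt}:{student_id}".encode()).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    return rng.sample(pool, n)

# word_pairs = ["caught/cot", "pin/pen", "marry/merry", "ferry/furry"]
# assign_items("jsmith42", word_pairs)  # same two pairs every time for jsmith42
```

With transcription exercises in particular, assigning each student different word pairs from an accent tag video means a copied answer no longer matches the question the copier was asked.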

I worked hard at it. My student evaluations for the online courses were pretty bad for that first summer, and for the next term, and the one after that. But the term after that they were almost as good as the ones for my in-person courses.

Unfortunately, that’s when I had to tell my coordinator that I couldn’t teach any more online courses. Teaching them right required a lot of time – especially when every assignment had to be protected against students googling the answers or shouting them to each other across the room.

The good news is that in this whole process I learned a ton of interesting things about language and linguistics, and how to teach them. I’ve found that many of the strategies I developed for online teaching are helpful for in-person classes. I’m planning to post about some of them in the near future.