The gesture location symbols of Stokoe notation, mapped onto a chart of the upper torso, arm and head

Teaching intro sign phonetics

A few years ago I wrote about incorporating sign linguistics when I taught Introduction to Linguistics at Saint John’s University. The other course I taught most often was Introduction to Phonology. This course was required for our majors in Speech Pathology and Audiology, and they often filled up the class. I never had a Deaf student, but almost all of my students expressed some level of interest in signed languages, and many had taken several semesters of American Sign Language.

The texts I used tended to devote a chapter to sign linguistics here or there, but did not present it systematically or include it in general discussions. I always included those chapters, and any mention of signed languages was received enthusiastically by my students, so, having a love of sign linguistics myself, I was happy to teach more.

The first thing I did was to add sign phonetics. I had previously found that I needed to start Introduction to Phonology with a comprehensive review of spoken phonetics, so I just followed that with a section on the systematic description of hand, face and upper body gestures. A lot of the spoken phonetics review was focused on phonetic transcription, and the students needed some way to keep track of the gestures they were studying, so I taught them Stokoe notation.

A list of Stokoe handshape symbols, with corresponding illustrations of the handshapes

Some of you may be remembering negative things you’ve read, or heard, or said, about Stokoe notation. It’s not perfect. But it’s granular enough for an intro phonology course, and it’s straightforward and relatively transparent. My students had no problem with it. Remember that the appropriate level of granularity depends on what you’re trying to communicate about the language.

The orientation and movement symbols from Stokoe notation, mapped onto a chart depicting the right side of a human head and attached right shoulder

I developed charts for the Stokoe symbols for locations, orientations and movements (“tab” and “sig” in Stokoe’s terminology), corresponding to the vowel quadrilateral charts developed by Pierre Delattre and others for spoken languages. To create the charts I used the StokoeTempo font that I developed back in 1995.

A list of additional movements of ASL and their symbols in Stokoe notation

The next step was to find data for students to analyze. I instructed my students to watch videos of jokes in American Sign Language posted to YouTube and Facebook by two Deaf storytellers and ASL teachers, Greg “NorthTrue” Eyben and Joseph Wheeler.

Deaf YouTuber NorthTrue makes the ASL sign for “mail”

The first exercise I gave my students was a scavenger hunt. I had previously found scavenger hunts to be useful in studying spoken language features at all levels of analysis. Here is a list of items I asked my students to find in one two-minute video:

  • A lexical sign
  • A point
  • A gesture depicting movement or location
  • An iconic gesture miming a person’s hand movement
  • A nonmanual miming a person’s emotion
  • A grammatical nonmanual indicating question, role shifting or topic

The students did well on the exercises, whether in class, for homework or for exams. Unfortunately that was pretty much all that I was able to develop during the years I taught Introduction to Phonology.

There is one more exercise I created using sign phonology; I will write about that in a future post.

How to set up your own LanguageLab

I’ve got great news! I have now released LanguageLab, my free, open-source software for learning languages and music, to the public on GitHub.

I wish I could tell you I’ve got a public site up that you can all use for free. Unfortunately, the features that would make LanguageLab easy for multiple users to share one server are later in the roadmap. There are a few other issues that also stand in the way of a massive public service. But you can set up your own server!

I’ve documented the steps in the README file, but here’s an overview. You don’t need to know how to program, but you will need to know how to set up web services, retrieve files from GitHub, edit configuration files, and run a few commands at a Linux, macOS or Windows command prompt.

LanguageLab uses Django, one of the most popular web frameworks for Python, and React, one of the most popular frameworks for Javascript. All you need is a server that can run Django and host some Javascript files! I’ve been doing my development and testing on PythonAnywhere, but I’ve also set it up on Amazon Web Services, and you should be able to run it on Google Cloud, Microsoft Azure, a university web server or even your personal computer.

There are guides online for setting up Django in all those environments. Once you’ve got a basic Django setup installed, you’ll need to clone the LanguageLab repo from GitHub to a place where it can be read by your web server. Then you’ll configure it to access the database, and configure the web server to load it. You’ll use Pip and NPM to download the Python and Javascript libraries you need, like the Django REST Framework, React and the Open Iconic font. Finally, you’ll copy all the files into the right places for the web server to read them and restart the server.
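To give you a flavor of the database configuration step, here is a minimal sketch of the standard DATABASES block in Django’s settings.py. The engine, names and credentials below are placeholders, not LanguageLab’s actual values; the README and your host’s Django guide have the specifics.

    # settings.py – a sketch of the database configuration step.
    # All names and credentials below are placeholders; use the values
    # from your own host (PythonAnywhere, AWS, etc.) and the README.
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",  # SQLite and MySQL also work
            "NAME": "languagelab",
            "USER": "languagelab",
            "PASSWORD": "change-me",
            "HOST": "localhost",
            "PORT": "5432",
        }
    }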

Once you’ve got everything in place, you should be able to log in! You can make multiple accounts, but keep in mind that at this point we do not have per-account access controls, so all accounts have full access to all the data. You can then start building your library of languages, media, exercises and lessons. LanguageLab comes with the most widely used languages, but it’s easy to set up new ones if yours are not on the list.

Media can be a bit tricky, because LanguageLab is not a media server. You can upload your media to another place on your server, or any other server – as long as it’s got an HTTPS URL you should be able to use it. If the media you’re using is copyrighted you may want to set up some basic password protection to avoid any accusations of piracy. I use a simple .htaccess password. I have to log in every time, but it works.
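If your media directory happens to be served by Apache, the kind of password protection I mean is the standard basic-auth stanza, something like the sketch below. The paths are placeholders, and you create the password file with the htpasswd tool.

    # .htaccess – standard Apache basic authentication for a media directory.
    # The AuthUserFile path is a placeholder; create the file with htpasswd.
    AuthType Basic
    AuthName "LanguageLab media"
    AuthUserFile /home/youruser/.htpasswd
    Require valid-user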

With the URL of your media file, you can create a media entry. Just paste that URL into the form and add metadata to keep track of the file and what it can be used for. You can then set up one or more exercises based on particular segments of that media file. It may take a little trial and error to get the exercises right.

You can then create one or more lessons to organize your exercises. You can choose to have a lesson for all the exercises in a particular media file, or you can combine exercises from multiple media files in a lesson. It’s up to you how to organize the lessons. You can edit the queues for each lesson to reorder or remove exercises.

Once you’ve got exercises, you can start practicing! The principle is simple: listen to the model, repeat into the microphone, then listen to the model again, followed by your recording. Set yourself a goal of a certain number of repetitions per session.

After you’ve created your language and media entries, exercises and lessons, you can export the data. Importing is not yet implemented, but the data is exported to a human-readable JSON format, so you can recreate your setup from it by hand if necessary.
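Because the export is plain JSON, you can inspect or tidy it with a couple of lines of Python; the filename here is just an example.

    # Pretty-print an exported LanguageLab file for inspection or hand-editing.
    # "export.json" is an example name; use whatever your export produced.
    import json

    with open("export.json") as f:
        data = json.load(f)

    print(json.dumps(data, indent=2, ensure_ascii=False))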

In the near future I will go on Twitch to demonstrate how to set up exercises and lessons, and how to practice with them. I will also try to find time to demonstrate the installation process. I will record each demonstration and put it on YouTube for your future reference. You can follow me on Twitter to find out when I’m doing the demos and posting the videos.

If you try setting up a LanguageLab, please let me know how it goes! You can report bugs by creating issues on GitHub, or you can send me an email. I’m happy to hear about problems, but I’d also like to hear success stories! And if you know some Python or Javascript, please consider writing a little code to help me add one of the features in the roadmap!

A free, open source language lab app

Viewers of The Crown may have noticed a brief scene where Prince Charles practices Welsh by sitting in a glass cubicle wearing a headset. Some viewers may recognize that as a language lab. Some may have even used language labs themselves.

The core of the language lab technique is language drills, which are based on the bedrock of all skills training: mimicry, feedback and repetition. An instructor can identify areas for the learner to focus on.

Because it’s hard for us to hear our own speech, the instructor also can observe things in the learner’s voice that the learner may not perceive. Recording technology enabled the learner to take on some of the role of observer more directly.

When I used a language lab to learn Portuguese in college, it ran on cassette tapes. The lab station played the model (I can still remember “Elena, estudante francesa, vai passar as férias em Portugal” – “Elena, a French student, is going to spend the holidays in Portugal”), then it recorded my attempted mimicry onto a blank cassette. Once I was done recording it played back the model, followed by my own recording.

Hearing my voice repeated back to me after the model helped me judge for myself how well I had mimicked the model. It wasn’t enough by itself, so the lab instructor had a master station where he could listen in on any of us and provide additional feedback. We also had classroom lessons with an instructor, and weekly lectures on culture and grammar.

There are several companies that have brought language lab technology into the digital age, on CD-ROM and then over the internet. Many online language learning providers rely on proprietary software and closed platforms to generate revenue, which is fine for them but doesn’t allow teachers the flexibility to add new language varieties.

People have petitioned these language learning companies to offer new languages, but developing offerings for a new language is expensive. If a language has a small user base it may never generate enough revenue to offset the cost of developing the lessons. It would effectively be a donation to people who want to promote these languages, and these companies are for-profit entities.

Duolingo has offered a work-around to this closed system: they will accept materials developed by volunteers according to their specifications and freely donated. Anyone who remembers the Internet Movie Database before it was sold to Amazon can identify the problems with this arrangement: what happens to those submissions if Duolingo goes bankrupt, or simply decides not to support them anymore?

Closed systems raise another issue: who decides what it means to learn French, or Hindi? This has been discussed in the context of Duolingo, which chose to teach the artificial Modern Standard Arabic rather than a colloquial dialect or the classical language of the Qur’an. Similarly, activists for the Hawai’ian language wanted the company to focus on lessons to encourage Hawai’ians to speak the language, rather than tourists who might visit for a few weeks at most.

Years ago I realized that we could make a free, open-source language lab application. It wouldn’t have to replicate all the features of the commercial apps, especially not initially. An app would be valuable if it offers the basic language lab functionality: play a model, record the learner’s mimicry, play the model again and finally play the recording of the learner.

An open system would be able to use any recording that the device can play. This would allow learners to choose the models they practice with, or allow an instructor to choose models for their students. The lessons don’t have to be professionally produced. They can be created for a single student, or even for a single occasion. I am not a lawyer, but I believe they can even use copyrighted materials.

I have created a language lab app using the Django REST Framework and ReactJS that provides basic language lab functionality. It runs in a web browser using responsive layout, and I have successfully tested it in Chrome and Firefox, on Windows and Android.

This openness and flexibility drastically reduces the cost of producing a lesson. The initial code can be installed in an hour, on any server that can host Django. The monthly cost of hosting code and media can be under $25. Once this is set up, a media item and several exercises based on it can be added in five minutes.

This reduced cost means that a language does not have to bring in enough learners to recoup a heavy investment. That in turn means that teachers can create lessons for every dialect of Arabic, or in fact for every dialect of English. They can create Hawai’ian lessons for both tourists and heritage speakers. They could even create lessons for actors to learn dialects, or master impressions of celebrities.

As a transgender person I’ve long been interested in developing a feminine voice to match my feminine visual image. Gender differences in language include voice quality, pitch contour, rhythm and word choice – areas that can only be changed through experience. I have used the alpha and beta versions of my app to create exercises for practicing these differences.

Another area where it helps a learner to hear a recording of their own voice is singing. This could be used by professional singers or amateurs. It could even be used for instrument practice. I use it to improve my karaoke!

This week I was proud to present my work at the QueensJS meetup. My slides from that talk contain more technical details about how to record audio through the web browser. I’ll be pushing my source to GitHub soon. You can read more details about how to set up and use LanguageLab. In the meantime, if you’d like to contribute, or to help with beta testing, please get in touch!

Angus Grieve-Smith wears a mask of his own design, featuring IPA vowel quadrilaterals on each cheek

Show your vowels and support Doctors Without Borders!

I’m very excited about a new face mask I designed. You can order it online!

I was inspired by two tweets I saw within minutes of each other on July Fourth. First, Médéric Gasquet-Cyrus, a professor at Aix-Marseille, posted a picture of his colleague Pascal Roméas wearing a “triangle vocalique” T-shirt designed by the linguistics YouTuber Romain Filstroff, known as Linguisticae. Gasquet-Cyrus’s tweet translates to “When you eat out with a phonetician colleague, you get a chance to practice your vowel quadrilateral!”

The vowel quadrilateral is one of the great data visualizations of linguistics: a two-dimensional diagram of the tongue height and position assigned to the vowel symbols of the International Phonetic Alphabet, as viewed from the left side of the face. It is also known as the vowel triangle, depending on how much wiggle room you think people have for their tongues when their mouths are fully open. It can even be plotted based on the formant frequencies extracted from acoustic analysis.
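If you want to try the acoustic version yourself, here is a minimal Python sketch using matplotlib. The formant values are rough textbook-style averages, not measurements, and the axes are inverted so the plot comes out in the traditional orientation, with front vowels on the left and high vowels at the top.

    # Plot vowels in F2-F1 space so the chart matches the traditional
    # quadrilateral orientation. The formant values are rough illustrative
    # averages in Hz, not measurements.
    import matplotlib.pyplot as plt

    vowels = {"i": (2250, 280), "a": (1200, 700), "u": (870, 300)}  # (F2, F1)

    fig, ax = plt.subplots()
    for symbol, (f2, f1) in vowels.items():
        ax.scatter(f2, f1)
        ax.annotate(symbol, (f2, f1))
    ax.invert_xaxis()  # higher F2 to the left, so front vowels are on the left
    ax.invert_yaxis()  # lower F1 at the top, so high vowels are at the top
    ax.set_xlabel("F2 (Hz)")
    ax.set_ylabel("F1 (Hz)")
    plt.show()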

The second was a tweet by Emily Bender, a professor at the University of Washington, about face masks with a random grid of IPA symbols on them. These are designed by the Lingthusiasm podcast team of author Gretchen McCulloch and professor Lauren Gawne, using the same pattern as in their popular IPA scarves.

Seeing the two pictures one after the other, I realized that rather than a random grid, I could put a vowel quadrilateral on an IPA mask. Then I realized that if I placed the quadrilateral on one side, I could get it to line up with the wearer’s mouth. I also had to make a corresponding chart for the right side.

I decided that I wanted the money to go to a charity that was helping with COVID-19. Doctors Without Borders has been doing good work around the world for years, and with COVID they’ve really stepped up. Here in New York they provided support to several local organizations and operated two shower trailers in Manhattan at the height of the outbreak.

From July 16 through 29, and then from November 27 through December 28, I ran a fundraiser through Custom Ink where we raised $430 in profits for Doctors Without Borders, and masks were sent to 32 supporters.

There’s another way to get masks! I have made a slightly different mask design available at RedBubble.com. You can even get a mug or a phone case. This is the same store where I’ve been selling Existential Black Swan T-shirts for years. You can get a mask with the swan on it, if that’s your style. None of these are part of a fundraiser, but you can still donate directly to Doctors Without Borders!

Update, February 1, 2021: There are more virulent strains of COVID spreading, so medical experts are recommending that people wear three-layer masks, or wear a single or double layer mask over a disposable surgical mask. You should know that the white-on-black Custom Ink masks sold in the fundraisers in 2020 are single layer, and the RedBubble masks sold in 2020 are double layer. They can both be worn over surgical masks. Both services are now offering triple-layer masks, so I’ve updated the RedBubble links to the three-layer masks, and will use three-layer masks for any future fundraisers. Stay safe, everyone!

Seeing the Star Wars movies does not make you a Star Wars fan. Actual Star Wars fans have done some of the following: * Read the novelizations * Read books in the EU * Read new canon books * Read some comics * Watched the animated shows * Participated in SW discussion groups.

Coercing with categories

Recently some guy tweeted “Seeing the Star Wars movies does not make you a Star Wars fan. Actual Star Wars fans have done some of the following…” This is a great opportunity for me to talk about a particular kind of category fight: coercion.

Over the past several years I’ve written about some things people try to do with categories: watchdogging, gatekeeping, pedantry, eclipsing and splitting.  Coercion is similar to gatekeeping, which is where someone highlights category boundaries with the goal of preventing free riders from accessing benefits that they are not entitled to: the example I gave was of Dr. Nerdlove defending the category of “socially awkward men” from incursion by genuinely abusive men.  He argues that these abusive men do not deserve the accommodation that is sometimes extended to men who are simply socially awkward.

Coercion is different from gatekeeping in that the person making the accusation is shifting the category boundaries.  Ed Powell knows quite well that most people’s definition of “Star Wars fan” includes people who have not done any of the six things he lists.  So why is he insisting that “Actual Star Wars fans” have all done some of those things? Because he wants to control the behavior of people who care about whether they are considered Star Wars fans.

Why would someone care about being considered a Star Wars fan?  Because fandom is often a communal affair. Fans go to movies and conventions together, and bond over their shared appreciation for Star Wars.  As Powell says, they may participate in discussion groups. There’s a satisfaction people get in talking about Wookiees or midichlorians with people who share background knowledge and don’t have to ask what a protocol droid is.

I’ve also heard that some people get a sense of belonging from participating in these groups.  They may have been teased – and rejected from other groups – for being one of the few Star Wars fans in their high school, especially in the seventies and eighties.  There’s a satisfaction and relief in finally finding a group that you share so much with.

Of course, these groups are vulnerable to the dark side.  They contain people, and people aren’t necessarily nice just because they’ve been treated badly by other people.  Sometimes not even if they’re Star Wars fans. Sometimes people discover they can wield power within a group like that, and they’re not always interested in using that power for good.

One way to wield power is to be able to give people something they want – or to deny it to them.  And if people want the sense of belonging to a group, or the enjoyment of participating in group activities, it’s a source of power to be able to control who belongs to the group – and who doesn’t.  Some groups are arbitrary: in theory, the only person who gets to decide who belongs to “Brenda’s friends” is Brenda, and the only person who gets to decide who’s invited to Kevin’s party is Kevin.

Other groups are based on categories, like these Meetup groups that are hosting events tomorrow: the New York Haskell Users Group, Black Baby Boomers Just Want to have Fun, or First Time Upper West Side Moms.  Or like Star Wars fans. These groups are much less arbitrary: if a woman lives on the Upper West Side with her only child, it’s going to be hard to throw her out.

It’s hard to exclude people from a category-based group, but not impossible.  What if our First Time Upper West Side Mom is trans, or a stepmother? Or if she’s a stepmother and a first-time biological mother?  Or if she lives on 107th Street? Or if her kid is in college? Because categories are fuzzy, the power to draw category boundaries can be the power to exclude people from group membership.  If the group leader doesn’t like our hypothetical mom, all she has to do is draw the boundary of the Upper West Side at 106th Street. Sorry honey, there is no First Time Morningside Heights Moms. Oh gee, what a shame.

The power to exclude doesn’t even need to be exercised.  It doesn’t even need to have any direct force to have a chilling effect.  Even if the head of your local Star Wars fan club totally owns Ed Powell on Twitter, you still may be wondering if people at the next regional convention are going to look at you funny because you haven’t read Dark Force Rising.

But if you’re not actually going to use this power to exclude people, what do you use it for?  This is where the coercion comes in. You can use the threat of exclusion to bully people into doing things.  And the easiest way to do that is simply to make doing those things the criteria for inclusion.

So here’s what I think happened: Ed Powell got tired of going to conferences and not having anyone to talk about novelizations and animated series with.  All they wanted to talk about was the movies (I can’t imagine why!). So how does Powell get people to read these books? He changes the criteria for what counts as an Actual Star Wars fan.  Now they have to read them, or watch the series, if they want to be Actual Star Wars fans.

Now as far as I can tell, Ed Powell is just some guy on Twitter, and has no authority to exclude anyone from any fan club.  And he seems to be getting owned by everyone. I doubt that his shaming will have an effect on the general population of Star Wars fans.  It may serve as advertising to encourage people who have read these books and watched the animated series to talk with him about them. If it doesn’t turn them off too.

Flu. What is Thisby? A wandring knight? Quin. It is the Lady, that Pyramus must loue. Fl. Nay faith: let not me play a womã: I haue a beard cõ-(ming. Quin. Thats all one: you shall play it in a Maske: and you may speake as small as you will. Bott. And I may hide my face, let me play Thisby to: Ile speake in a monstrous little voice; Thisne, Thisne, ah Py-(ramus my louer deare, thy Thysby deare, & Lady deare. Qu. No, no: you must play Pyramus: & Flute, you Thysby.

The History of English through SparkNotes

Language change has been the focus of my research for over twenty years now, so when I taught second semester linguistics at Saint John’s University, I was very much looking forward to teaching a unit focused on change.  I had been working to replace constructed examples with real data, so I took a tip from my natural language processing colleague Dr. Wei Xu and turned to SparkNotes.

I first encountered SparkNotes when I was teaching French Language and Culture, and I assigned all of my students to write a book report on a work of French literature, or a book about French language or culture.  I don’t remember the details, but at times I had reason to suspect that one or another of my students was copying summary or commentary information about their chosen book from SparkNotes rather than writing their own.

When I was in high school, my classmates would make use of similar information for their book reports.  The rule was that you could consult the Cliffs Notes for help understanding the text, but you weren’t allowed to simply copy the Cliffs Notes.

Modern Text

FLUTE
Who’s Thisbe? A knight on a quest?

QUINCE
Thisbe is the lady Pyramus is in love with.

FLUTE
No, come on, don’t make me play a woman. I’m growing a beard.

QUINCE
That doesn’t matter. You’ll wear a mask, and you can make your voice as high as you want to.

BOTTOM
In that case, if I can wear a mask, let me play Thisbe too! I’ll be Pyramus first: “Thisne, Thisne!” And then in falsetto: “Ah, Pyramus, my dear lover! I’m your dear Thisbe, your dear lady!”

QUINCE
No, no. Bottom, you’re Pyramus. And Flute, you’re Thisbe.

When I discovered SparkNotes I noticed that for some older authors – Shakespeare, of course, but even Dickens – they not only offered summaries and commentary, but translations of the text into contemporary English.  It was this feature I drew on for the unit on language change.

While I was developing and teaching this second semester intro linguistics course at Saint John’s, I was also working as a linguistic annotator for an information extraction project in the NYU Computer Science Department.  I met a doctoral student, Wei Xu, who was studying a number of interesting corpora, including Twitter, hip-hop and SparkNotes. Wei graduated in 2014, and is now Assistant Professor of Computer Science and Engineering at Ohio State.

Wei had realized that the modern translations on SparkNotes and eNotes, combined with the original Shakespearean text, formed a parallel corpus, a collection of texts in one language variety that are paired with translations in another language variety.  Parallel corpora, like the Canadian Hansard Corpus of French and English parliamentary debates, are used in translation studies, including for training machine translation software. Wei used the SparkNotes/eNotes parallel Shakespeare corpus to generate Shakespearean-style paraphrases of contemporary movie lines, among other things.
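In its simplest form, a parallel corpus is just two line-aligned files. Here is a minimal Python sketch of pairing them up; the filenames are hypothetical, and real corpora need more careful alignment than this.

    # Pair line-aligned original and modern files into a list of
    # (original, translation) tuples. The filenames are hypothetical.
    with open("original.txt") as orig, open("modern.txt") as modern:
        pairs = [
            (o.strip(), m.strip())
            for o, m in zip(orig, modern)
            if o.strip() and m.strip()
        ]

    for original_line, modern_line in pairs[:3]:
        print(original_line, "=>", modern_line)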

When it came time to teach the unit on language change at Saint John’s, I found a few small exercises that asked students to compare older literary excerpts with modern translations.  Given the constraints of this being one unit in a survey course, it made sense to focus on the language of instruction, English. The Language Files had one such exercise featuring a short Chaucer passage.  In general, when working with corpora I prefer to look at larger segments, ideally an entire text but at minimum a full page.

I realized that I could cover all the major areas of language change – phonological, morphological, syntactic, semantic and pragmatic – with these texts.  Linguists have been able to identify phonological changes from changes in spelling, for example that Chaucer’s spelling of “when” as “whan” indicates that we typically put our tongues in a higher place in our mouths when pronouncing the vowel of that word than people did in the fourteenth century.

When teaching Shakespeare to college students it is common to use texts with standardized spelling, but we now have access to scans of Shakespeare’s work as it was first published in his lifetime or shortly after his death, with the spellings chosen by those printers.  This spelling modernization is even practiced with some nineteenth century authors, and similarly we have access to the first editions of most works through digitization projects like Google Books.

With this in mind, I created exercises to explore language change.  For a second semester intro course the students learned a lot from a simple scavenger hunt: compare a passage from the SparkNotes translation of Shakespeare with the Quarto, find five differences, and specify whether they are phonological, morphological, syntactic, semantic or pragmatic.  In more advanced courses students could compare differences more systematically.

This comparison is the kind of thing that we always do when we read an old text: compare older spellings and wordings with the forms we would expect from a more modern text.  Wei Xu showed us that the translations and spelling changes in SparkNotes and eNotes can be used for a more explicit comparison, because they are written down based on the translators’ and editors’ understanding of what modern students will find difficult to read.

As I have detailed in my forthcoming book, Building a Representative Theater Corpus, we must be careful not to generalize universal statements, including statements about prevalence, to the language as a whole.  This is especially problematic when we are looking at authors who appealed to elite audiences, but it applies to Shakespeare and Dickens as well.  Existential observations, such as that Shakespeare used bare not (“let me not”) in one instance where SparkNotes used do-support (“don’t let me”), are much safer.

My students seemed to learn a lot from this technique.  I hope some of you find it useful in your classrooms!

What is “text” for a sign language?

I started writing this post back in August, and I hurried it a little because of a Limping Chicken article guest-written by researchers at the Deafness, Cognition and Language Research Centre at University College London. I’ve known the DCAL folks for years, and they graciously acknowledged some of my previous writings on this issue. I know they don’t think the textual form of British Sign Language is written English, so I was surprised that they used the term “sign-to-text” in the title of their article and in a tweet announcing the article. After I brought it up, Dr. Kearsy Cormier acknowledged that there was potential for confusion in that term.

So, what does “sign-to-text” mean, and why do I find it problematic in this context? “Sign-to-text” is an analogy with “speech-to-text,” also known as speech recognition, the technology that enables dictation software like Dragon NaturallySpeaking. Speech recognition is also used by agents like Siri to interpret words we say so that they can act on them.

There are other computer technologies that rely on the concept of text. Speech synthesis is also known as text-to-speech. It’s the technology that enables a computer to read a text aloud. It can also be used by agents like Siri and Alexa to produce sounds we understand as words. Machine translation is another one: it typically proceeds from text in one language to text in another language. When the DCAL researchers wrote “sign-to-text” they meant a sign recognition system hooked up to a BSL-to-English machine translation system.

Years ago I became interested in the possibility of applying these technologies to sign languages, and created a prototype sign synthesis system, SignSynth, and an experimental English-to-American Sign Language system.

I realized that all these technologies make heavy use of text. If we want automated audiobooks or virtual assistants or machine translation with sign languages, we need some kind of text, or we need to figure out a new way of accomplishing these things without text. So what does text mean for a sign language?

One big thing I discovered when working on SignSynth is that (unlike the DCAL researchers) many people really think that the written form of ASL (or BSL) is written English. On one level that makes a certain sense, because when we train ASL signers for literacy we typically teach them to read and write English. On another level, it’s completely nuts if you know anything about sign languages. The syntax of ASL is completely different from that of English, and in some ways resembles Mandarin Chinese or Swahili more than English.

It’s bad enough that we have speakers of languages like Moroccan Arabic and Fujianese that have to write in a related language (written Arabic and written Chinese, respectively) that is different in non-trivial ways that take years of schooling to master. ASL and English are so totally different that it’s like writing Korean or Japanese with Chinese characters. People actually did this for centuries until someone smart invented hangul and katakana, which enabled huge jumps in literacy.

There are real costs to this, serious costs. I spent some time volunteering with Deaf and hard-of-hearing fifth graders in an elementary school, and after years of drills they were able to put English words on paper and pronounce them when they saw them. But it became clear to me that despite their obvious intelligence and curiosity, they had no idea that they could use words on paper to send a message, or that some of the words they saw might have a message for them.

There are a number of Deaf people who are able to master English early on. But from extensive reading and discussions with Deaf people, it is clear to me that the experience of these kids is typical for the vast majority of Deaf people.

It is a tremendous injustice to a child, and a tremendous waste of that child’s time and attention, for them to get to the age of twelve, at normal intelligence, without being able to use writing. This is the result of portraying English as the written form of ASL or BSL.

So what is the written form of ASL? Simply put, it doesn’t have one, despite several writing systems that have been invented, and it won’t have one until Deaf people adopt one. There will be no sign-to-text until signers have text, in their language.

I can say more about that, but I’ll leave it for another post.

Theories are tools for communication

I’ve written in the past about instrumentalism, the scientific practice of treating theories as tools that can be evaluated by their usefulness, rather than as claims that can be evaluated as true or false. If you haven’t tried this way of looking at science, I highly recommend it! But if theories are tools, what are they used for? What makes a theory more or less useful?

The process of science starts when someone makes an observation about the world. If we don’t understand the observation, we need to explore more, make more observations. We make hypotheses and test them, trying to get to a general principle that we can apply to a whole range of situations. We may then look for ways to apply this principle to our interactions with the world.

At every step of this process there is communication. The person who makes the initial observation, the people who make the further observations, who make the hypotheses, who test them, who generalize the findings, who apply them: these are usually multiple people. They need to communicate all these things (observations, hypotheses, applications) to each other. Even if it’s one single person who does it all end to end, that person needs to communicate with their past and future selves, in the form of notes or even just thinking aloud.

These observations, hypotheses and applications are always new, because that’s what science is for: processing new information. It’s hard to deal with new information, to integrate it with the systems we already have for dealing with the world. What helps us in this regard is finding similarities between the new information and things we already know about the world. Once we find those similarities, we need to record this for our own reference and to signal it to others: other researchers, technologists and the rest of the population.

In informal settings, we already have ways of finding and communicating similarities between different observations. We use similes and metaphors: a person’s eyes may be blue like the sky, not blue like police lights. These are not just idle observations, though: the similarities often have implications for how we respond to things. If someone is leaving a job and they say that they’re passing the baton to a new person, they are signaling a similarity between their job and a relay race, and the suggestion is that the new person will be expected to continue towards the same goal the way a relay runner continues along the racecourse.

Theories and models are just formalized versions of metaphors: saying that light is a wave is a way of noting that it can move through the air like a wave moves through water. That theory allowed scientists to predict that light would diffract around objects the way that water waves behave when they encounter objects, a testable hypothesis that has been confirmed. This in turn allowed technologists to design lasers and other devices that took advantage of those wavelike properties, applications that have proven useful.

Here’s a metaphor that will hopefully help you understand how theories are communication tools: another communication tool is a photograph. Sometimes I see a photograph of myself and I notice that I’ve recently lost weight. Let’s say that I have been cutting back on snacks and I see a photo like that. I have other tools for discovering that I’ve lost weight, like scales and measuring tape and what I can observe of my body with my own eyes, but seeing a photo can communicate it to me in a different way and suggest that if I continue cutting back on snacks I will continue to lose weight. Similarly, if I post that photo on Facebook my friends can see that I’ve lost weight and understand that I’m going to continue to cut back on snacks.

A theory is like a photograph in that there is no single best photograph. To communicate my weight loss I would want a photo that shows my full body, but to communicate my feelings about it, a close-up on my face might be more appropriate. Friends of mine who get new tattoos on their legs will take close-ups of the tattoos. We may have six different photos of the exact same thing (full body, face or leg, for example), and be satisfied with them all. Theories are similar: they depend entirely on the purpose of communication.

A theory is like a photograph in that the best level of detail depends on what is being communicated and who the target is. If a friend takes a close-up of four square inches of their calf, that may be enough to show off their new tattoo, but a close-up of four square inches of my calf will probably not tell me or anyone else how much weight I’ve lost. Similarly, if I get someone to take an aerial photograph of me, that may indicate where I am at the time, but it will not communicate much about my weight. This applies to theories: a model with too much detail will simply swamp the researchers, and one with too little will not convey anything coherent about the topic.

A theory is like a photograph in that its effectiveness depends on who is on the other end of the communication. If someone who doesn’t know me sees that picture, they will have no idea how much I weighed before, or that my weight has been affecting my health. They will just see a person, and interpret it in whatever way they can.

A photograph may not be the best way to communicate my weight loss to my doctor. Their methods depend on measurable benchmarks, and they would prefer to see actual measurements made with scales or tape. On the other hand, a photo is a better way to communicate my weight loss to my Facebook friends than posting scale and tape measurements on Facebook, because they (or some of them at least) are more concerned with the overall way I look.

A theory’s effectiveness similarly depends on its audience. Population researchers may be familiar with the theories of Alfred Lotka and Vito Volterra, so if I tell them that ne…pas in French follows a Lotka-Volterra model, they are likely to understand. Chemists have probably never heard of Lotka or Volterra, so if I tell them the same thing I’m likely to get a blank stare.
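For readers who are neither population researchers nor specialists in French negation: the standard two-competitor form of the Lotka-Volterra model describes two populations, here two competing constructions, each limiting the other’s growth:

    \frac{dx}{dt} = r_1 x \left(1 - \frac{x + \alpha_{12} y}{K_1}\right), \qquad
    \frac{dy}{dt} = r_2 y \left(1 - \frac{y + \alpha_{21} x}{K_2}\right)

where x and y are the frequencies of the two competitors, r their growth rates, K the carrying capacities and α the competition coefficients.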

This means that there is no absolute standard for comparing theories. We are never going to find the best theory. We may be able to compare theories for a particular purpose, with a particular level of detail, aimed at a particular audience, but even then there may be several theories that work about as well.

When I tell people about this instrumental approach to scientific theories and models, some of them get anxious. If there’s no way for theories to be true or false, how can we ever have a complete picture of the universe? The answer is that we can’t. Kurt Gödel showed decades ago with his Incompleteness Theorem that no theory or model can ever completely capture reality, not even a mathematical or computer model. Jorge Luis Borges illustrated it with his story of the map that is the same size as the territory.

Science is not about finding out everything. It’s not about getting a complete picture. That’s because reality is too big and complex for our understanding, or for the formal systems that our computers are based on. It’s just about figuring out more than we knew before. It will never be finished. And that’s okay.

The Parisian Stage Corpus (Corpus de la scène parisienne)

It is the year 1810, and you are strolling along the Grands Boulevards of Paris. You get the impression that the whole city, perhaps even all of France, has had the same idea and has come out to stroll, to see people and be seen. What do you hear?

You arrive at a theater, you show a ticket for a new play, and you go in. The play begins. What do you hear from the stage? What voices, what language?

The Corpus de la scène parisienne project seeks to answer this last question, with the idea that doing so will shed light on the first question as well. It builds on the work of the scholar Beaumont Wicks and on resources like Google Books and the Bibliothèque Nationale de France’s Gallica project to create a truly representative corpus of the language of Parisian theater.

Some corpora are built on a “principle of authority,” which tends to put the voices of aristocrats and the upper bourgeoisie in the foreground. The Corpus de la scène parisienne corrects this bias by drawing on a random sample. By incorporating popular theater in this way, the Corpus de la scène parisienne allows the language of the working classes, as represented on stage, to take its place in the linguistic picture of the period.

The first phase of construction, covering the years 1800 to 1815, has already contributed to some interesting findings. For example, in the CSP 75% of sentence negations use the ne … pas construction, but in the four plays from the same period that are part of the FRANTEXT corpus, ne … pas is used in only 49% of sentence negations.

In 2016 I created a repository on GitHub and began putting the texts of the first phase there in HTML format. You can read them for fun (Jocrisse-Maître et Jocrisse-Valet particularly amused me), stage them (I will buy tickets) or use them for your own research. You might also want to contribute to the repository by correcting errors in the texts, adding new texts from the catalogue, or converting the texts to new formats, like TEI or Markdown.

In January 2018 I created the spectacles_xix bot on Twitter. Every day it posts the descriptions of the plays that debuted on that day exactly two hundred years earlier.
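The date arithmetic behind the bot is presumably something like the following Python sketch (February 29 has no match in a non-leap year and would need special handling):

    # Find the date exactly two hundred years before today, as the
    # spectacles_xix bot does each day.
    from datetime import date

    today = date.today()
    target = today.replace(year=today.year - 200)
    print(f"Looking up plays that debuted on {target.isoformat()}")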

Feel free to use this corpus in your research, but please don’t forget to cite me, or even contact me to discuss possible collaborations!

Deaf scholar Ben Bahan gives a lecture about Deaf architecture

Teaching sign linguistics in introductory classes

Language is not just spoken and written, and even though I’ve been working mostly on spoken languages for the past fifteen years, my understanding of language has been tremendously deepened by my study of sign languages. At the beginning of the semester I always asked my students what languages they had studied and what aspects of language they wanted to know more about, and they were always very interested in sign language. Since they had a professor with training and experience in sign linguistics it seemed natural to spend some time on it in class.

Our primary textbook, by George Yule, contains a decent brief overview of sign languages. The Language Files integrates sign language examples throughout and has a large section on sign phonetics. I added a lecture on the history of sign languages in Europe and North America, largely based on Lane, Hoffmeister and Bahan’s Journey Into the Deaf-World (1996), and other information I had learned over the years.

I also felt it was important for my students to actually observe a sign language being used to communicate and to express feeling, so I found an online video of an MIT lecture by psychologist and master storyteller (and co-author of Journey Into the Deaf-World) Ben Bahan. Bahan’s talk does not focus exclusively on language, but demonstrates the use of American Sign Language well, and the English interpretation is well done.

Studying a video lecture is a prime candidate for “flipped classroom” techniques, but I never got around to trying that. We watched the video in class, but before starting the video I assigned my students a simple observation task: could they find examples of the four phonological subsystems of American Sign Language – lexical signs, fingerspelling, depicting signs and nonmanual gestures?

Some of the students were completely overwhelmed by the task at first, but I made it clear that this was not a graded assignment, only introductory exploration. Other students had had a semester or more of ASL coursework, and the students with less experience were able to learn from them. Bahan, being Ben Bahan, produces many witty, thought-provoking examples of all four subsystems over the course of the lecture.

The phonological subsystems are among the easiest sign language phenomena for a novice to distinguish, but as we watched the video I pointed out other common features of ASL and other sign languages, such as topic-comment structures and stance-shifting.

Later, when I started teaching Introduction to Phonology, we had the opportunity to get deeper into sign language phonology. I’ll cover that in a future post.