Screenshot of the "Compose new Tweet" modal on Twitter, with the "+" button and a tooltip reading "Add another Tweet". The tweet text reads "blah blah blah bl"

Dialogue and monologue in social media

I wrote most of this post in June 2022, before a lot of us decided to try out Mastodon. I didn’t publish it because I despaired of it making a difference. It felt like so many people were set in particular practices, including not reading blog posts! My experience on Mastodon has been so much better than the past several years on Twitter. I think this is connected with how Twitter and Mastodon handle threads.

A few years ago I wrote a critique of Twitter threads, tweetstorms, essays, and similar forms. I realize now that I didn’t actually talk much about what’s wrong with them: I focused on how difficult they are to read, without taking into account the ways the native Twitter website and app actually make them easier to read. So let me tell you about some of the deeper problems with threads.

In 2001 I visited some of the computational linguistics labs at Carnegie Mellon University. Unfortunately I don’t remember the researchers’ names, but they described a set of experiments that has informed my thinking about language ever since. They were looking at the size of the input box in a communication app.

These researchers did experiments where they asked people to communicate with each other using a custom application. They presented different users with input boxes of different sizes: some got only a single line, others got three or four, and maybe some got six or eight lines.

What they found was that when someone was presented with a large blank space, as in an email application or the Google Docs application I’m writing this in, they tended to take their time and write long blocks of text, and edit them until they were satisfied. Only then did they hit send. Then the other user would do the same.

When the Carnegie Mellon researchers presented users with only one line, as in a text message app, their behavior was much different. They wrote short messages and sent them off with minimal editing. The short turnaround time resulted in a dialogue that was much closer to the rhythm of spoken conversation.

This echoed my own findings from a few years before. I was searching for features of French that I heard all over the streets of Paris, but had not been taught to me in school, in particular what linguists call right dislocation (“Ils sont fous, ces Romains”) and left dislocation (“L’État, c’est moi”).

In 1998 the easiest place to look was USENET newsgroups, and I found that even casual newsgroups like fr.rec.animaux were heavy on the formal, carefully crafted types of messages I remembered from high school French class. I had already read some prior research on this kind of language variation, so I decided to try something with faster dialogue.

In Internet Relay Chat (IRC) I hit the jackpot. On the channel I studied, left and right dislocations made up between 21% and 38% of all finite clauses. I noticed that other features of conversational French, like ne-dropping, were common as well. I could even see IRC newbies adapting in real time: they would start off trying to write formal sentences the way they were taught in lycée, and soon give up and start writing the way they talked.

At this point I have to say: I love dialogue. Don’t get me wrong: I can get into a nice well-crafted monologue or monograph. And anyone who knows me knows I enjoy telling a good story or tearing off on a rant about something. But dialogue keeps me honest, and it keeps other people honest too.

Dialogue is not inherently or automatically good. On Twitter as in many other places, it is used to harass and intimidate. But when properly structured and regulated it can be a democratizing force. It’s important to remember how long our media has been dominated by monologues: newspapers, films, television. Even when these formats contain dialogues, they are often fictional dialogues written by a single author or team of authors to send a single message.

One of my favorite things about the internet is that it has always favored dialogue. Before large numbers of people were on the internet there was a large gap between privileged media sources and independent ones. Those of us who disagreed with the monologues being thrust upon us by television and newspapers were often reduced to impotently talking back at those powerful media sources, in an empty room.

USENET, email newsletters, personal websites and blogs were democratizing forces because they allowed anyone who could afford the hosting fees (sometimes with the help of advertisers) to command these monologic platforms. They were the equivalent of Speakers’ Corner in London. They were like pamphlets or letters to the editor or cable access television, but they eliminated most of the barriers to entry. But they were focused on monologues.

In the 1990s and early 2000s we had formats that encouraged dialogue, like mailing lists and bulletin boards, but they had large input boxes. As I saw on fr.rec.animaux in 1998, that encouraged long, edited messages.
We did have forums with smaller input boxes, like IRC or the group chats on AOL Instant Messenger. As I found, those encouraged people to write short messages in dialogue with each other. When I first heard about Twitter with its 140-character limit I immediately recognized it as a dialogic forum.

But what sets Twitter apart from IRC or AOL Instant Messenger? Twitter is a broadcast platform. The fact that every tweet is public by default, searchable and assigned a unique URL, makes it a “microblog” site like some popular sites in China.

If someone said something on IRC or AIM in 1999 it was very hard to share it outside that channel. I was able to compile my corpus by creating a “bot” that logged on to the channel every night and saved a copy of all the messages. What Twitter and the sites it copied, like Weibo, brought was the combination of permanent broadcast, low barrier to entry, and dialogue.
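A logging bot like that is simple enough to sketch. Here is a minimal modern-Python reconstruction; the server, channel and nick are placeholders, and the 1998 original ran on very different tooling:

```python
import socket
import time

# Placeholder connection details; the originals are long gone.
SERVER, PORT, CHANNEL, NICK = "irc.example.net", 6667, "#france", "logbot"

sock = socket.create_connection((SERVER, PORT))
sock.sendall(f"NICK {NICK}\r\nUSER {NICK} 0 * :{NICK}\r\n".encode())
time.sleep(2)               # crude: give the server a moment to register us
sock.sendall(f"JOIN {CHANNEL}\r\n".encode())

with open("irc-log.txt", "a", encoding="utf-8") as log:
    while True:
        for line in sock.recv(4096).decode(errors="replace").splitlines():
            if line.startswith("PING"):    # answer keepalives or get dropped
                sock.sendall(line.replace("PING", "PONG").encode() + b"\r\n")
            elif " PRIVMSG " in line:      # log only the channel messages
                log.write(time.strftime("%Y-%m-%d %H:%M:%S ") + line + "\n")
                log.flush()
```

Without that kind of hack, IRC conversations simply evaporated; Twitter made the broadcast automatic.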

This is why I’m bothered by Twitter threads, by screenshots of text, by the unending demands for an edit button. These are all attempts to overpower the dialogue on Twitter, to remove one of the key elements that make it special.

Without the character limits, Twitter is just a blogging platform. Of course, there’s nothing wrong with blogs! I’ve done a lot of blogging, I’ve done a lot of commenting on blogs and I’ve tweeted a lot of links to blogs. But I want to choose when to follow those links and go read those blog posts or news articles or press releases.

I want a feed full of dialogue or short statements. Threads and screenshots interrupt the dialogue. They aggressively claim the floor, crowding out other tweets. Screenshots interrupt the other tweets with large blocks of text, demanding to be read in their entirety. Threads take up even more of the timeline. The Twitter web app will show as many as three tweets of a thread, interrupting the flow of dialogue.

The experience of threads is much worse on Twitter clients that don’t manipulate the timeline, like TweetDeck (which was bought by Twitter in 2011) and HootSuite. If it’s a long thread, your timeline is screwed, and you have to scroll endlessly to get past it.

One of the things I love the most about Mastodon is the standard practice of making the first toot in a thread public, but publishing all the other toots as unlisted. That broadcasts the toot announcing the thread, and then gives readers the agency to decide whether they want to read the follow-up toots. It’s more or less the equivalent of including a link to a web page or blog post in a toot.
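This practice is also easy to automate. Here is a minimal sketch using the Mastodon.py library, with a placeholder instance and token, that posts the first toot in a thread as public and the rest as unlisted replies:

```python
from mastodon import Mastodon

# Placeholder instance and token; register your own app on your server.
api = Mastodon(access_token="YOUR_TOKEN",
               api_base_url="https://mastodon.example")

toots = ["Announcing a new blog post!",
         "Here is some follow-up detail.",
         "And a conclusion."]

# The first toot is public; the rest are unlisted replies, which keeps
# them out of the public and hashtag timelines.
previous = api.status_post(toots[0], visibility="public")
for text in toots[1:]:
    previous = api.status_post(text, in_reply_to_id=previous,
                               visibility="unlisted")
```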

There’s a lot more to say about dialogue and social media, but for now I’m hugely encouraged by the feeling of being on Mastodon, and I’m hoping it leads us in a better direction for dialogue, away from threads and screenshots.

Scan of an entry from the 1965 Dictionary of American Sign Language: the sign CbCbr~, glossed as “type,” written in Stokoe notation.

Fonts for Stokoe notation

You may be familiar with the International Phonetic Alphabet, the global standard for representing speech sounds, ideally independent of the way those speech sounds may be represented in a writing system. Did you know that sign languages have similar standards for representing hand and body gestures?

Unfortunately, we haven’t settled on a single notation system for sign languages the way linguists have mostly chosen the IPA for speech. There are compelling arguments that none of the existing systems are complete enough for all sign languages, and different systems have different strengths.

Another difference from spoken languages is that signers, by and large, do not read and write their languages. Several writing systems have been developed and promoted, but to my knowledge there is no community that sends written messages to each other in any sign language, or that writes works of fiction or nonfiction for other signers to read.

One of the oldest and best-known notation systems is the one developed by Gallaudet University professor William Stokoe (u5"tx) for his pioneering analysis of American Sign Language in the 1960s, which succeeded in convincing many people that ASL is, in ways that matter, a language like English or Japanese or Navajo. Among other things, Stokoe and his co-authors Dorothy Casterline and Carl Croneberg used this system for the entries in their 1965 Dictionary of American Sign Language (available from SignMedia). In the dictionary entry above, the sign CbCbr~ is given the English translation of “type.”

Stokoe notation is incomplete in a number of ways. Chiefly, it is optimized for the lexical signs of American Sign Language. It does not account for the wide range of handshapes used in American fingerspelling, or the wide range of locations, orientations and movements used in ASL depicting gestures. It only describes what a signer’s hands are doing, with none of the face and body gestures that have come to be recognized as essential to the grammar of sign languages. Some researchers have produced modifications for other languages, but those are not always well-documented.

Stokoe created a number of symbols, some of which bore a general resemblance to Roman letters, and some that didn’t. This made it impossible to type with existing technology; I believe all the transcriptions in the Dictionary of ASL were written by hand. In 1993 another linguist, Mark Mandel, developed a system for encoding Stokoe notation into the American Standard Code for Information Interchange (ASCII) character set, which by then could be used on almost all American computers.

In September 1995 I was in the middle of a year-long course in ASL at the ASL Institute in Manhattan. I used some Stokoe notation for my notes, but I wanted to be able to type it on the computer, not just using Mandel’s ASCII encoding. I also happened to be working as a trainer at Userfriendly, a small chain of computer labs with a variety of software available, including Altsys Fontographer, and as an employee I could use the workstations whenever customers weren’t paying for them.

One day I sat down in a Userfriendly lab and started modifying an existing public domain TrueType font (Tempo by David Rakowski) to make the Stokoe symbols. The symbols were not in Unicode, and still are not, despite a proposal to that effect on file. I arranged it so that the symbols used the ASCII-Stokoe mappings: if you typed something in ASCII-Stokoe and applied my font, the appropriate Stokoe symbols would appear. StokoeTempo was born. It wasn’t elegant, but it worked.

I made the font available for download from my website, where it’s been for the past 26-plus years. I wound up not using it for much, other than to create materials for the linguistics courses I taught at Saint John’s University, but others have downloaded it and put it to use. It is linked from the Wikipedia article on Stokoe notation.

A few years later I developed SignSynth, a web-based prototype sign language synthesis application. At the time web browsers did not offer much flexibility in terms of fonts, so I could not use Stokoe symbols and had to rely on ASCII-Stokoe, and later Don Newkirk’s (1986) Literal Orthography, along with custom extensions for fingerspelling and nonmanual gestures.

Recently, as part of a project to bring SignSynth into the 21st century, I decided to explore using fonts on the Web. I discovered a free service, FontSquirrel, that creates Web Open Font Format (WOFF and WOFF2) wrappers for TrueType fonts. I created WOFF and WOFF2 files for StokoeTempo and posted them on my site.
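FontSquirrel isn’t the only route; the same conversion can be done locally with the fontTools Python library. A minimal sketch (WOFF2 output also requires the brotli package):

```python
from fontTools.ttLib import TTFont

# Wrap a TrueType font as WOFF and WOFF2 for use as web fonts.
for flavor in ("woff", "woff2"):
    font = TTFont("StokoeTempo.ttf")
    font.flavor = flavor
    font.save(f"StokoeTempo.{flavor}")
```

A standard CSS @font-face rule pointing at those files then makes StokoeTempo available to any web page.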

I also discovered a different standard, Typeface.js, which actually uses a JSON format. This is of particular relevance to SignSynth, because it can be used with the 3D web library Three.js. There’s another free service, Facetype.js, that converts TrueType fonts to Typeface.js fonts.

Scan of the definition of CbCbr~ from page 51 of the Dictionary of American Sign Language.

To demonstrate the use of StokoeTempo web fonts, above is a scan of the definition of CbCbr~ from page 51 of the Dictionary of American Sign Language. Below I have reproduced it using HTML and StokoeTempo:

CbCbr~ (imit.: dez may be slightly bent spread 5) v type, r typewriter, typist with or without suffix _____ ?[BBv.

StokoeTempo is free to download and use by individuals and educational institutions.

Screenshot of LanguageLab displaying the exercise "J'étais certain que j'allais écrire à quinze ans"

Imagining an alternate language service

It’s well known that some languages have multiple national standards, to the point where you can take courses in either Brazilian or European Portuguese, for example. Most language instruction services seem to choose one variety per language: when I studied Portuguese at the University of Paris X-Nanterre it was the European variety, but the online service Duolingo only offers the Brazilian one.

I looked into some of Duolingo’s offerings for this post, because they’re the most talked about language instruction service these days. I was surprised to discover that they use no recordings of human speakers; all their speech samples are synthesized using an Amazon speech synthesis service named Polly. Interestingly, even though Duolingo only offers one variety of each language, Amazon Polly offers multiple varieties of English, Spanish, Portuguese and French.
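You can verify this with a few lines of boto3, Amazon’s Python SDK. This sketch (AWS credentials assumed) lists Polly’s French voices and synthesizes the same sentence in a European and a Canadian voice:

```python
import boto3

polly = boto3.client("polly")

# Polly distinguishes fr-FR and fr-CA voices, even though Duolingo
# only uses one variety per language.
for code in ("fr-FR", "fr-CA"):
    voices = polly.describe_voices(LanguageCode=code)["Voices"]
    print(code, [v["Name"] for v in voices])

for voice in ("Lea", "Chantal"):   # Léa is fr-FR, Chantal is fr-CA
    audio = polly.synthesize_speech(
        Text="Je vais passer mes vacances au Canada.",
        OutputFormat="mp3",
        VoiceId=voice,
    )
    with open(f"sample-{voice}.mp3", "wb") as f:
        f.write(audio["AudioStream"].read())
```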

As an aside, when I first tried Duolingo years ago I had the thought, “Wait, is this synthesized?” but it just seemed too outrageous to think that someone would make a business out of teaching humans to talk like statistical models of corpus speech. It turns out it wasn’t too outrageous, and I’m still thinking through the implications of that.

Synthesized or not, it makes sense for a company with finite resources to focus on one variety. But if that one company controls a commanding market share, or if there’s a significant amount of collusion or groupthink among language instruction services, they can wind up shutting out whole swathes of the world, even while claiming to be inclusive.

This is one of the reasons I created an open LanguageLab platform: to make it easier for people to build their own exercises and lessons, focusing on any variety they choose. You can set up your own LanguageLab server with exercises exclusively based on recordings of the English spoken on Smith Island, Maryland (population 149), if you like.

So what about excluded varieties with a few more speakers? I made a table of all the Duolingo language offerings, ranked by their number of English-speaking learners, along with the Amazon Polly dialect that is used on Duolingo. If the variety is only vaguely specified, I made a guess.

For each of these languages I picked another variety, one with a large number of speakers. I tried to find the variety with the largest number of speakers, but these counts are always very imprecise. The result is an imagined alternate language service, one that does not automatically privilege the speakers of the most influential variety. Here are the top ten:

Language          Duolingo dialect  Alternate dialect
English           Midwestern US     India
Spanish           Mexico            Argentina
French            Paris             Quebec
Japanese          Tokyo             Kagoshima
German            Berlin            Bavarian
Korean            Seoul             Pyongyang
Italian           Florence          Rome
Mandarin Chinese  Beijing           Taipei
Hindi             Delhi             Chhattisgarhi
Russian           Moscow            Almaty

To show what could be done with a little volunteer work, I created a sample lesson for a language that I know, the third-most popular language on Duolingo, French. After France, the country with the next largest number of French speakers is Canada. Canadian French is distinct in pronunciation, vocabulary and to some degree grammar.

Canadian French is stigmatized outside Canada, to the point where I’m not aware of any program in the US that teaches it, but it is omnipresent in all forms of media in Canada, and there is quite a bit of local pride. These days at least, it would be as odd for a Canadian to speak French like a Parisian as for an American to speak English like a Londoner. There are upper and lower class accents, but they all share certain features, notably the ranges of the nasal vowels.

I chose a bestselling author and television anchor, Michel Jean, who has one grandmother from the indigenous Innu people and three grandparents presumably descended from white French settlers. I took a small excerpt from an interview with Jean about his latest novel, in which he responds spontaneously to the questions of a librarian, Josianne Binette.

The sample lesson in Canadian French based on Michel Jean’s speech is available on the LanguageLab demo site. You are welcome to try it! Just log in with the username demo and the password LanguageLab.

A free, open source language lab app

Viewers of the Crown may have noticed a brief scene where Prince Charles practices Welsh by sitting in a glass cubicle wearing a headset. Some viewers may recognize that as a language lab. Some may have even used language labs themselves.

The core of the language lab technique is language drills, which are based on the bedrock of all skills training: mimicry, feedback and repetition. An instructor can identify areas for the learner to focus on.

Because it’s hard for us to hear our own speech, the instructor can also observe things in the learner’s voice that the learner may not perceive. Recording technology enabled the learner to take on some of the role of observer more directly.

When I used a language lab to learn Portuguese in college, it ran on cassette tapes. The lab station played the model (I can still remember “Elena, estudante francesa, vai passar as férias em Portugal”), then it recorded my attempted mimicry onto a blank cassette. Once I was done recording it played back the model, followed by my own recording.

Hearing my voice repeated back to me after the model helped me judge for myself how well I had mimicked the model. It wasn’t enough by itself, so the lab instructor had a master station where he could listen in on any of us and provide additional feedback. We also had classroom lessons with an instructor, and weekly lectures on culture and grammar.

There are several companies that have brought language lab technology into the digital age, on CD-ROM and then over the internet. Many online language learning providers rely on proprietary software and closed platforms to generate revenue, which is fine for them but doesn’t allow teachers the flexibility to add new language varieties.

People have petitioned these language learning companies to offer new languages, but developing offerings for a new language is expensive. If a language has a small user base it may never generate enough revenue to offset the cost of developing the lessons. It would effectively be a donation to people who want to promote these languages, and these companies are for-profit entities.

Duolingo has offered a work-around to this closed system: they will accept materials developed by volunteers according to their specifications and freely donated. Anyone who remembers the Internet Movie Database before it was sold to Amazon can identify the problems with this arrangement: what happens to those submissions if Duolingo goes bankrupt, or simply decides not to support them anymore?

Closed systems raise another issue: who decides what it means to learn French, or Hindi? This has been discussed in the context of Duolingo, which chose to teach the artificial Modern Standard Arabic rather than a colloquial dialect or the classical language of the Qur’an. Similarly, activists for the Hawai’ian language wanted the company to focus on lessons to encourage Hawai’ians to speak the language, rather than tourists who might visit for a few weeks at most.

Years ago I realized that we could make a free, open-source language lab application. It wouldn’t have to replicate all the features of the commercial apps, especially not initially. An app would be valuable if it offered the basic language lab functionality: play a model, record the learner’s mimicry, play the model again and finally play the recording of the learner.

An open system would be able to use any recording that the device can play. This would allow learners to choose the models they practice with, or allow an instructor to choose models for their students. The lessons don’t have to be professionally produced. They can be created for a single student, or even for a single occasion. I am not a lawyer, but I believe they can even use copyrighted materials.

I have created a language lab app using the Django Rest Framework and ReactJS that provides basic language lab functionality. It runs in a web browser using responsive layout, and I have successfully tested it in Chrome and Firefox, on Windows and Android.
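The data model at the core of an app like this is simple. Here is a hypothetical sketch of the kind of Django models involved (the names are illustrative, not the actual LanguageLab schema):

```python
from django.db import models

class MediaItem(models.Model):
    # Any audio the server can store and the browser can play.
    name = models.CharField(max_length=200)
    audio_file = models.FileField(upload_to="media/")

class Exercise(models.Model):
    # An exercise is just a time slice of a media item to mimic.
    media_item = models.ForeignKey(MediaItem, on_delete=models.CASCADE)
    start_seconds = models.FloatField()
    end_seconds = models.FloatField()
    description = models.CharField(max_length=500, blank=True)
```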

This openness and flexibility drastically reduces the cost of producing a lesson. The initial code can be installed in an hour, on any server that can host Django. The monthly cost of hosting code and media can be under $25. Once this is set up, a media item and several exercises based on it can be added in five minutes.

This reduced cost means that a language does not have to bring in enough learners to recoup a heavy investment. That in turn means that teachers can create lessons for every dialect of Arabic, or in fact for every dialect of English. They can create Hawai’ian lessons for both tourists and heritage speakers. They could even create lessons for actors to learn dialects, or master impressions of celebrities.

As a transgender person I’ve long been interested in developing a feminine voice to match my feminine visual image. Gender differences in language include voice quality, pitch contour, rhythm and word choice – areas that can only be changed through experience. I have used the alpha and beta versions of my app to create exercises for practicing these differences.

Another area where it helps a learner to hear a recording of their own voice is singing. This could be used by professional singers or amateurs. It could even be used for instrument practice. I use it to improve my karaoke!

This week I was proud to present my work at the QueensJS meetup. My slides from that talk contain more technical details about how to record audio through the web browser. I’ll be pushing my source to GitHub soon. You can read more details about how to set up and use LanguageLab. In the meantime, if you’d like to contribute, or to help with beta testing, please get in touch!

Le Corpus de la scène parisienne

It is 1810, and you are strolling along the Grands Boulevards of Paris. You have the impression that the whole city, indeed the whole of France, has had the same idea and has come out to stroll, to see people and to be seen. What do you hear?

You arrive at a theater, show a ticket for a new play, and go in. The play begins. What do you hear from the stage? What voices, what language?

The Corpus de la scène parisienne project seeks to answer this last question, with the idea that doing so will also shed light on the first. It draws on the work of the scholar Beaumont Wicks and on resources like Google Books and the Bibliothèque Nationale de France’s Gallica project to create a truly representative corpus of the language of Parisian theater.

Some corpora are built on a “principle of authority,” which tends to put the voices of aristocrats and the upper bourgeoisie in the foreground. The Corpus de la scène parisienne corrects this bias by drawing on a randomly selected sample. By incorporating popular theater in this way, it allows the language of the working classes, as represented on stage, to take its place in the linguistic picture of the period.

The first phase of construction, covering the years 1800 to 1815, has already contributed some interesting results. For example, in the CSP 75% of sentence negations use the construction ne … pas, but in the four plays from the same period in the FRANTEXT corpus, ne … pas is used in only 49% of sentence negations.

In 2016 I created a repository on GitHub and began putting the texts of the first phase there in HTML format. You can read them for fun (Jocrisse-Maître et Jocrisse-Valet particularly amused me), stage them (I will buy tickets) or use them for your own research. Perhaps you would also like to contribute to the repository, by correcting errors in the texts, adding new texts from the catalogue, or converting the texts to new formats like TEI or Markdown.

In January 2018 I created the spectacles_xix bot on Twitter. Every day it broadcasts the descriptions of the plays that debuted on that day exactly two hundred years earlier.

Feel free to use this corpus in your research, but please do not forget to cite me, or even contact me to discuss possible collaborations!

Why do people make ASL translations of written documents?

My friend Josh was puzzled to see that the City of New York offers videos of some of its documents, translated from the original English into American Sign Language, on YouTube. I didn’t know of a good, short explainer online, and nobody responded when I asked for one on Twitter, so I figured I’d write one up.

The short answer is that ASL and English are completely different languages, and knowing one is not that much help in learning the other. It’s true that some deaf people are able to lipread, speak and write fluent English, but this is generally because they have some combination of residual hearing, talent, privilege and interest in language. Many deaf people need to sign for daily conversation, even if they grew up with hearing parents.

It is incredibly difficult to learn to read and write a language that you can’t speak, hear, sign or see. As part of my training in sign linguistics I spent time with two deaf fifth grade students in an elementary school in Albuquerque. These were bright, curious children, and they spent hours every day practicing reading, writing, speaking and even listening – they both had cochlear implants.

After visiting these kids several times, talking with them in ASL and observing their reading and writing, I realized that at the age of eleven they did not understand how writing is used to communicate. I asked them to simply pass notes to each other, the way that hearing kids did well before fifth grade. They did each write things on paper that made the other laugh, but when I tried giving them specific messages and asking them to pass those messages on in writing, they had no idea what I was asking for.

These kids are in their thirties now, and they may well be able to read and write English fluently. At least one had a college-educated parent who was fluent in both English and ASL, which helps a lot. Other factors that help are the family’s income level and a general skill with languages. Many deaf people have none of these advantages, and consequently never develop much skill with English.

In principle, the City could even print some of these documents in ASL. Several writing systems have been created for sign languages, some of them less complete than others. For a variety of reasons, they haven’t caught on in Deaf communities, so using one of those would not help the City get the word out about school closures.

The reasons that the City government provides videos in ASL are thus that ASL is a completely different language from English, that many deaf people do not have the exceptional language skills necessary to read a language they don’t really speak, and that the vast majority of deaf people don’t read ASL.

On this day in Parisian theater

Since I first encountered The Parisian Stage, I’ve been impressed by the completeness of Beaumont Wicks’s life’s work: from 1950 through 1979 he compiled a list of every play performed in the theaters of Paris between 1800 and 1899. I’ve used it as the basis for my Digital Parisian Stage corpus, currently a one percent sample of the first volume (Wicks 1950), available in full text on GitHub.

Last week I had an idea for another project. Science requires both qualitative and quantitative research, and I’ve admired Neil Freeman’s @everylotnyc Twitter bot as a project that conveys the diversity of the underlying data and invites deep, qualitative exploration.

In 2016, with Timm Dapper, Elber Carneiro and Laura Silver, I forked Freeman’s everylotbot code to create @everytreenyc, a random walk through the New York City Parks Department’s 2015 street tree census. Every three hours during normal New York active time, the bot tweets information about a tree from the database, in a template written by Laura that may also include topical, whimsical sayings.

Recently I’ve encountered a lot of anniversaries. A lot of it is connected to the centenary of the First World War, but some is more random: I just listened to an episode of la Fabrique de l’histoire about François Mitterrand’s letters to his mistress that was promoted with the fact that he was born in 1916, one hundred years before that episode aired, even though he did not start writing those letters until 1962.

There are lots of “On this day” blogs and Twitter feeds, such as the History Channel and the New York Times, and even specialized feeds like @ThisDayInMETAL. There are #OnThisDay and #otd hashtags, and in French #CeJourLà. The “On this day” feeds have two things in common: they tend to be hand-curated, and they jump around from year to year. For April 13, 2014, the @CeJourLa feed tweeted events from 1849, 1997, 1695 and 1941, in that order.

Two weeks ago I was at the Annual Convention of the Modern Language Association, describing my Digital Parisian Stage corpus, and I realized that in the Parisian Stage there were plays being produced exactly two hundred years ago. I thought of the #OnThisDay feeds and @everytreenyc, and realized that I could create a Twitter bot to pull information about plays from the database and tweet them out. A week later, @spectacles_xix sent out its first automated tweet, about the play la Réconciliation par ruse.

@spectacles_xix runs on Pythonanywhere in Python 3.6, and accesses a MySQL database. It uses Mike Verdone’s Twitter API client. The source is open on GitHub.
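The core logic is short. Here is a simplified sketch of what a bot like this does, using Mike Verdone’s twitter package and pymysql; the table and column names, and of course the credentials, are placeholders rather than the actual schema:

```python
from datetime import date

import pymysql
from twitter import Twitter, OAuth   # Python Twitter Tools

today = date.today()
# The same calendar day, exactly two hundred years earlier.
target = today.replace(year=today.year - 200)

# Hypothetical schema; the real database is organized differently.
db = pymysql.connect(host="localhost", user="bot", password="...",
                     database="theatre")
with db.cursor() as cursor:
    cursor.execute("SELECT title, author, theater FROM play "
                   "WHERE premiere_date = %s", (target,))
    plays = cursor.fetchall()

api = Twitter(auth=OAuth("token", "token_secret",
                         "consumer_key", "consumer_secret"))
for title, author, theater in plays:
    api.statuses.update(
        status=f"Il y a 200 ans: {title}, de {author}, au {theater}.")
```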

Unlike other feeds, including this one from the French Ministry of Culture that just tweeted about the anniversary of the première of Rostand’s Cyrano de Bergerac, this one will not be curated, and it will not jump around from year to year. It will tweet every play that premièred in 1818, in order, until the end of the year, and then go on to 1819. If there is a day when no plays premièred, like January 16, @spectacles_xix will not tweet.
I have a couple of ideas about more features to add, so stay tuned!

And we mean really every tree!

When Timm, Laura, Elber and I first ran the @everytreenyc Twitter bot almost a year ago, we knew that it wasn’t actually sampling from a list that included every street tree in New York City. The Parks Department’s 2015 Tree Census was a huge undertaking, and was not complete by the time they organized the Trees Count! Data Jam last June. There were large chunks of the city missing, particularly in Southern and Eastern Queens.

The bot software itself was not a bad job for a day’s work, but it was still a hasty patch job on top of Neil Freeman’s original everylotbot code. I hadn’t updated the readme file to reflect the changes we had made. It was running on a server in the NYU Computer Science Department, which is currently my most precarious affiliation.

On April 28 I received an email from the Parks Department saying that the census was complete, and the final version had been uploaded to the NYC Open Data Portal. It seemed like a good opportunity to upgrade.

Over the past two weeks I’ve downloaded the final tree database, installed everything on Pythonanywhere, streamlined the code, added a function to deal with Pythonanywhere’s limited scheduler, and updated the readme file. People who follow the bot might have noticed a few extra tweets over the past couple of days as I did final testing, but I’ve removed the cron job at NYU, and @everytreenyc is now up and running in its new home, with the full database, a week ahead of its first birthday. Enjoy the dérive!
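For anyone curious about the scheduler workaround: Pythonanywhere runs scheduled tasks at fixed times, so one approach (a sketch; the exact hours here are my assumption, not the bot’s actual configuration) is to schedule the script hourly and let it decide whether the current hour is one of the every-three-hours slots during New York waking time:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def should_tweet_now():
    # Called from an hourly scheduled task; act only every third hour
    # during normal New York active time.
    now = datetime.now(ZoneInfo("America/New_York"))
    return 9 <= now.hour <= 21 and now.hour % 3 == 0

if should_tweet_now():
    pass  # pick a random tree from the census and tweet it
```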

The Photo Roster, a web app for Columbia University faculty

Since July 2016 I have been working as Associate Application Systems in the Teaching and Learning Applications group at Columbia University. I have developed several apps, including this Photo Roster, an LTI plugin to the Canvas Learning Management System.

The back end of the Photo Roster is written in Python and Flask. The front end uses JavaScript with jQuery to filter the student listings and photos, and to create a flash card app to help instructors learn their students’ names.

This is the third generation of the Photo Roster tool at Columbia. The first generation, for the Prometheus LMS, was famously scraped by Mark Zuckerberg when he extended Facebook to Columbia. To prevent future release of private student information, this version uses SAML and OAuth2 to authenticate users and securely retrieve student information from the Canvas API, and Oracle SQL to store and retrieve the photo authorizations.

It would be a release of private student information if I showed you the Roster live, so I created a demo class with famous Columbia alumni, and used a screen recorder to make this demo video. Enjoy!

Online learning: Definitely possible

There’s been a lot of talk over the past several years about online learning. Some people sing its praises without reservation. Others claim that it doesn’t work at all. I have successfully learned over the internet and I have successfully taught over the internet. It can work very well, but it requires a commitment on the part of the teacher and the learner that is not always present. In this series of posts I will discuss what has worked well and what hasn’t in my experience, specifically in teaching linguistics to undergraduate speech pathology majors.

Online learning is usually contrasted with an ideal classroom model where the students engage in two-way oral conversation, exercises and assessment with the instructor and each other, face to face in real time. In practice there are already deviations from this model: one-way lectures, independent and group exercises, asynchronous homework, take-home exams. The questions are really whether the synchronous or face-to-face aspects can be completely eliminated, and whether the internet can provide a suitable medium for instruction.

The first question was answered hundreds of years ago, when the first letter was exchanged between scholars. Since then people have learned a great deal from each other, via books and through the mail. My great-uncle Doc learned embalming through a correspondence course, and made a fortune as one of the few providers of Buddhist funerals in San Jose. So we know that people can learn without face-to-face, synchronous or two-way interaction with teachers.

What about the internet? People are learning a lot from each other over the internet. I’ve learned how to assemble a futon frame and play the cups over the internet. A lot of the core ideas about social science that inform my work today I learned in a single independent study course I took over email with Melissa Axelrod in 1999.

My most dramatic exposure to online learning was from 2003 through 2006. I read the book My Husband Betty, and discovered that the author, Helen Boyd, had an online message board for readers to discuss her book (set up by Betty herself). The message board would send me emails whenever someone posted, and I got drawn into a series of discussions with Helen and Betty, as well as Diane S. Frank, Caprice Bellefleur, Donna Levinsohn, Sarah Steiner and a number of other thoughtful, creative, knowledgeable people.

A lot of us knew a thing or two about gender and sexuality already, but Helen, having read widely and done lots of interviews on those topics, was our teacher, and would often start a discussion by posting a question or a link to an article. Sometimes the discussion would get heated, and eventually I was kicked off and banned. But during those three years I learned a ton, and I feel like I got a Master’s level education in gender politics. Of course, we didn’t pay Helen for this besides buying her books, so I’m glad she eventually got a full-time job teaching this stuff.

So yes, we can definitely learn things over the internet. But are official online courses an adequate substitute for – or even an improvement over – in-person college classes? I have serious doubts, and I’ll cover them in future posts.