@everytreenyc

At the beginning of June I participated in the Trees Count Data Jam, experimenting with the results of the census of New York City street trees begun by the Parks Department in 2015. I had seen a beta version of the map tool created by the Parks Department’s data team that included images of the trees pulled from the Google Street View database. Those images reminded me of others I had seen in the @everylotnyc twitter feed.

silver maple 20160827

@everylotnyc is a Twitter bot that explores the City’s property database. It goes down the list in order by tax ID number. Every half hour it composes a tweet for a property, consisting of the address, the borough and the Street View photo. It seems like it would be boring, but some people find it fascinating. Stephen Smith, in particular, has used it as the basis for some insightful commentary.
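Freeman’s actual source is on GitHub; purely to illustrate the mechanics, here is a minimal sketch of such a loop in Python. The lots table, file names and credentials are placeholders I made up, though the tweepy calls and the Street View Static API endpoint are real.

```python
# A minimal sketch of an @everylotnyc-style loop -- not Freeman's actual
# code. Assumes a SQLite table lots(taxid, address, borough, lat, lon)
# built from the city property database, and placeholder credentials.
import sqlite3
import time

import requests
import tweepy

STREETVIEW_URL = "https://maps.googleapis.com/maps/api/streetview"

def next_lot(conn, last_taxid):
    """Walk the property list in tax ID order, one lot at a time."""
    return conn.execute(
        "SELECT taxid, address, borough, lat, lon FROM lots"
        " WHERE taxid > ? ORDER BY taxid LIMIT 1",
        (last_taxid,),
    ).fetchone()

def streetview_photo(lat, lon, key, path="lot.jpg"):
    """Save the Street View image for a lot's coordinates."""
    resp = requests.get(STREETVIEW_URL, params={
        "size": "600x400", "location": f"{lat},{lon}", "key": key})
    resp.raise_for_status()
    with open(path, "wb") as f:
        f.write(resp.content)
    return path

def run(api, conn, streetview_key, last_taxid=0):
    """api is an authenticated tweepy.API; tweet one lot every half hour."""
    while True:
        lot = next_lot(conn, last_taxid)
        if lot is None:
            break  # ran off the end of the list
        taxid, address, borough, lat, lon = lot
        photo = streetview_photo(lat, lon, streetview_key)
        media = api.media_upload(photo)
        api.update_status(status=f"{address}, {borough}",
                          media_ids=[media.media_id])
        last_taxid = taxid
        time.sleep(30 * 60)

def main():
    conn = sqlite3.connect("lots.db")  # placeholder database
    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    run(tweepy.API(auth), conn, "STREETVIEW_KEY")
```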

It occurred to me that @everylotnyc is actually a very powerful data visualization tool. When we think of “big data,” we usually think of maps and charts that try to encompass all the data – or an entire slice of it. The winning entry in the Trees Count Data Jam was just such a project: identifying correlations between cooler streets and the presence of trees.

Social scientists, and more recently humanists, fight over quantitative and qualitative methods, but the fact is that we need them both. The ethnographer Michael Agar argues that distributional claims like “5.4 percent of trees in New York are in poor condition” are valuable, but primarily as a springboard for diving back into the data to ask more questions and answer them in an ongoing cycle. We also need to examine the world in detail before we even know which distributional questions to ask.

If our goal is to bring down the percentage of trees in Poor condition, we need to know why those trees are in Poor condition. What brought their condition down? Disease? Neglect? Pollution? Why these trees and not others?

Patterns of neglect are often due to the habits we develop of seeing and not seeing. We are used to seeing what is convenient, what is close, what is easy to observe, what is on our path. But even then, we develop filters to hide what we take to be irrelevant to our task at hand, and it can be hard to drop these filters. We can walk past a tree every day and not notice it. We fail to see the trees for the forest.

Privilege filters our experience in particular ways. A Parks Department scientist told me that the volunteer tree counts tended to be concentrated in wealthier areas of Manhattan and Brooklyn, and that many areas of the Bronx and Staten Island had to be counted by Parks staff. This reflects uneven amounts of leisure time and uneven levels of access to city resources across these neighborhoods, as well as uneven levels of walkability.

A time-honored strategy for seeing what is ordinarily filtered out is to deviate from our usual patterns, either with a new pattern or with randomness. This strategy can be traced back at least as far as the sampling techniques developed by Pierre-Simon Laplace for measuring the population of Napoleon’s empire, the forerunner of modern statistical methods. Also among Laplace’s cultural heirs are the flâneurs of late nineteenth-century Paris, who studied the city by taking random walks through its crowds, as noted by Charles Baudelaire and Walter Benjamin.

In the tradition of the flâneurs, the Situationists of the mid-twentieth century highlighted the value of random walks, which they called dérives. Here is Guy Debord (1955, translated by Ken Knabb):

The sudden change of ambiance in a street within the space of a few meters; the evident division of a city into zones of distinct psychic atmospheres; the path of least resistance which is automatically followed in aimless strolls (and which has no relation to the physical contour of the ground); the appealing or repelling character of certain places — these phenomena all seem to be neglected. In any case they are never envisaged as depending on causes that can be uncovered by careful analysis and turned to account. People are quite aware that some neighborhoods are gloomy and others pleasant. But they generally simply assume that elegant streets cause a feeling of satisfaction and that poor streets are depressing, and let it go at that. In fact, the variety of possible combinations of ambiances, analogous to the blending of pure chemicals in an infinite number of mixtures, gives rise to feelings as differentiated and complex as any other form of spectacle can evoke. The slightest demystified investigation reveals that the qualitatively or quantitatively different influences of diverse urban decors cannot be determined solely on the basis of the historical period or architectural style, much less on the basis of housing conditions.

In an interview with Neil Freeman, the creator of @everylotnyc, Cassim Shepard of Urban Omnibus noted the connections between the flâneurs, the dérive and Freeman’s work. Freeman acknowledged this: “How we move through space plays a huge and under-appreciated role in shaping how we process, perceive and value different spaces and places.”

Freeman did not choose randomness, but as he describes it in a TinyLetter, the path of @everylotnyc sounds a lot like a dérive:

@everylotnyc posts pictures in numeric order by Tax ID, which means it’s posting pictures in a snaking line that started at the southern tip of Manhattan and is moving north. Eventually it will cross into the Bronx, and in 30 years or so, it will end at the southern tip of Staten Island.

Freeman also alluded to the influence of Alfred Korzybski, who coined the phrase, “the map is not the territory”:

Streetview and the property database are both widely used because they’re big, (putatively) free, and offer a completionist, supposedly comprehensive view of the world. They’re also both products of people working within big organizations, taking shortcuts and making compromises.

I was not following @everylotnyc at the time, but I knew people who did. I had seen some of their retweets and commentaries. The bot shows us pictures of lots that some of us have walked past hundreds of times, but seeing them in our Twitter timelines makes us look at them fresh and notice new things. These are the properties we know, and yet we realize how much we don’t know them.

When I thought about those Street View images in the beta site, I realized that we could do the same thing for trees for the Trees Count Data Jam. I looked, and discovered that Freeman had made his code available on Github, so I started implementing it on a server I use. I shared my idea with Timm Dapper, Laura Silver and Elber Carneiro, and we formed a team to make it work by the deadline.

It is important to make this much clear: @everytreenyc may help to remind us that no census is ever flawless or complete, but it is not meant as a critique of the enterprise of tree counts. Similarly, I do not believe that @everylotnyc was meant as an indictment of property databases. On the contrary, just as @everylotnyc depends on the imperfect completeness of the New York City property database, @everytreenyc would not be possible without the imperfect completeness of the Trees Count 2015 census.

Without even an attempt at completeness, we could have no confidence that our random dive into the street forest was anything even approaching random. We would not be able to say that following the bot would give us a representative sample of the city’s trees. In fact, because I know that the census is currently incomplete in southern and eastern Queens, when I see trees from the Bronx and Staten Island and Astoria come up in my timeline I am aware that I am missing the trees of southeastern Queens, and awaiting their addition to the census.
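To make that concrete, here is a toy sketch, in Python, of the kind of random dive @everytreenyc takes. The file name and column names are invented stand-ins, not the actual Trees Count schema; the point is that a uniform pick is only uniform over the trees that have actually been counted.

```python
# A toy random dive into the street forest. The file name and column
# names are invented stand-ins, not the actual Trees Count 2015 schema.
import csv
import random

def random_tree(path="trees.csv"):
    with open(path, newline="") as f:
        trees = list(csv.DictReader(f))
    # random.choice is uniform over the rows that exist -- a representative
    # sample of the city's trees only insofar as the census is complete.
    tree = random.choice(trees)
    return f"{tree['species']} at {tree['address']}, {tree['borough']}"
```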

Despite that fact, the current status of the 2015 census is good enough for now. It is good enough to raise new questions: what about that parking lot? Is there a missing tree in the Street View image because the image is newer than the census, or older? It is good enough to continue the cycle of diving and coming up, of passing through the funnel and back up, of moving from quantitative to qualitative and back again.

Posted in Information Technology, Natural Language Generation, Sampling, Science, Web

On pet parents

I’m a parent. It doesn’t make me better or worse than anyone else, it’s just a category that reflects some facts about me: I conceived a new human with my wife, we are raising and caring for that human, and we expect to have a relationship with him for the rest of our lives. Some people don’t take parenthood seriously, so it doesn’t impact their lives very much, but their kids suffer. We take it very seriously, and it’s a lot of work for us.

Pet mom (Photo: 0x010C / Wikimedia)

I also take care of pets. We own three cats, and sometimes I walk my mom’s dog or take him to be groomed. It can be a lot of work, and the relationships can be very intimate at times. “Ownership” is kind of a funny word for it. In some ways it can be like certain stages of parenting: we buy all the food and make sure the animals don’t get into danger. It makes sense when I hear people refer to their pets as their “baby” or put words in their pets’ mouths calling themselves “daddy.” I even understand when I hear them refer to themselves as “pet moms.”

I understand this usage, but I do not agree with it. I have a kid, and I have pets. The relationships are similar, but different. When someone calls themself a “pet dad,” it trivializes my relationship with my kid and infantilizes my pets. It erases the work of the actual parents, and trivializes the hard work of humans who act as surrogate parents to infant pets. I am a dad: I am not a pet dad, and I am not my pets’ dad. Or their mom.

My kid will one day be an adult, and while I may always think of him as The Kid, he will be able to function as an autonomous member of society. (Note that the term “kid” itself is an animal metaphor – referring to a juvenile goat.) Only one of my cats can still be considered a juvenile by any standard; the others are five years old and twenty years old, respectively. They are adult males, and until the last century they would have been free to come and go as they wished.

If my cats are incapable of leaving our house unaccompanied, it is more because we have put cars everywhere than anything else. When I was a kid we lost three dogs to car culture. When I was eleven I saw a neighbor’s cat crushed beneath the wheels of a car, and arrived just in time to see him take his last breath. We have indoor cats and dog leashes in part because we have made the outdoors inhospitable.

I suspect one reason we hear more about “pet parents” is that so few of our pets are parents themselves. I support universal neutering, and have only adopted neutered cats from shelters or feral rescuers. It’s the best response to the overpopulation of feral animals, but it does make the pets neuter – and childless.

When I was a kid we had a cat who had a litter of kittens. I watched one of our dogs give birth to eleven puppies, and then found homes for the ten that lived. Our male cats were aggressive, sexual toms. Again, not wise in retrospect, but it was hard to think of any of the humans in the house as “moms” or “dads” of our pets while they were themselves moms and dads.

There is one human I know who would qualify as a “cat mom” in my mind. She is the woman who leads the feral cat helpers in our neighborhood. Six years ago someone found a baby kitten near some railroad tracks in Manhattan. My neighbor fostered this kitten in her apartment for five months, feeding him with an eyedropper until he was old enough to eat. She posted his picture on her website and we adopted him. If he has a “pet mom” it’s her.

Posted in Categorization, Language politics, Semantics

What Professor Bigshot said

I was feeling very nervous, sitting there in Professor Bigshot’s office. I had just been accepted into the PhD program, and was visiting the department to get to know everyone and see if it was the right fit. I hadn’t applied to any other PhD program. If I didn’t go here, I probably wouldn’t get a PhD.

You can figure out pretty easily who Professor Bigshot is, if you care. I guess you could say I’m giving her a pseudonym for SEO reasons.

The student who was showing me around the department had asked, “Oh, have you met Professor Bigshot yet?” I had not. I had heard of her, but I had absolutely no idea what her work was: what she studied, what she had written, what her theories were. I was nervous, sitting there in her office, because I was afraid she would find out that I hadn’t read anything she’d written. I was right to be nervous, but for a completely different reason.

“So Angus,” Professor Bigshot asked me, “You know that the job market in linguistics is very tight? You understand that we cannot guarantee you a job when you graduate?”

I relaxed a bit. I knew this one. I had thought long and hard about it. I said, brightly, “Oh yes. But that’s okay. I have computer skills, and I can always get another IT job if this doesn’t work out.”

“Well, at this university,” said Professor Bigshot, her face abruptly twisting into a snarl, “we are not in the business of granting recreational PhDs.”

That was the last thing I was expecting to hear. I did the only thing I could think of: I thanked Professor Bigshot politely, got up and walked out of her office.

I still had a day and a half before I left town. I had planned to visit classes and see the rest of the university.

I didn’t quite know how to tell my student guide what Professor Bigshot had said, so in a few minutes I was sitting down in Professor Littleshot’s office. I didn’t know what he had done in linguistics either, but at this point it hardly seemed to matter.

“So Angus,” said Professor Littleshot. “Have you made up your mind whether you’re going to attend our program?”

I opened my mouth. “Well…”

“Is there anything I can say to convince you?”

I shut my mouth and thought for a minute. “Well, I guess you just did.”

That was slightly over nineteen years ago. Professor Littleshot retired before I could propose a dissertation topic. I wrote a dissertation in Professor Bigshot’s theoretical framework, received my PhD in 2009, taught linguistics as an adjunct for seven years, sent out applications for tenure-track jobs and was invited to exactly zero interviews. Last week I started working as a Python developer in the IT Department at Columbia University.

Recreational PhD? Well, there have been times that I’ve enjoyed it quite a lot. And yes, I suppose you can get a back injury, chronic insomnia and thousands of dollars of debt from plenty of other recreational activities. Maybe I would have enjoyed it more if I hadn’t tried so hard to prove Professor Bigshot wrong.

Posted in Academia

Quantitative needs qualitative, and vice versa

Data Science is all the rage these days. But this current craze focuses on a particular kind of data analysis. I conducted an informal poll as an icebreaker at a recent data science party, and most of the people I talked to said that it wasn’t data science if it didn’t include machine learning. Companies in all industries have been hiring “quants” to do statistical modeling. Even in the humanities, “distant reading” is a growing trend.


There has been a reaction to this, of course. Other humanists have argued for the continued value of close reading. Some companies have been hiring anthropologists and ethnographers. Academics, journalists and literary critics regularly write about the importance of nuance and empathy.

For years, my response to both types of arguments has been “we need both!” But this is not some timid search for a false balance or inclusion. We need both close examination and distributional analysis because the way we investigate the world depends on both, and both depend on each other.

I learned this from my advisor Melissa Axelrod, and a book she assigned me for an independent study on research methods: The Professional Stranger, by the ethnographer Michael Agar. It is a guide to ethnographic field methods, but it also contains some commentary on the nature of scientific inquiry, and mixes its well-deserved criticism of quantitative social science with a frank acknowledgment of the interdependence of qualitative and quantitative methods. On page 134 Agar discusses Labov’s famous study of /r/-dropping in New York City:

The catch, of course, is that he would never have known which variable to look at without the blood, sweat and tears of previous linguists who had worked with a few informants and identified problems in the linguistic structure of American English. All of which finally brings us to the point of this example: traditional ethnography struggles mightily with the existence of pattern among the few.

Labov acknowledges these contributions in Chapter 2 of his 1966 book: Babbitt (1896), Thomas (1932, 1942, 1951), Kurath (1949, based on interviews by Guy S. Lowman), Hubbell (1950) and Bronstein (1962). His work would not be possible without theirs, and their work was incomplete until he developed a theoretical framework to place their analysis in, and tested that framework with distributional surveys.

We’ve all seen what happens when people try to use one of these methods without the other. Statistical methods that are not grounded in close examination of specific examples produce surveys that are meaningless to the people who take them and uninformative to scientists. Qualitative investigations that are not checked with rigorous distributional surveys produce unfounded, misleading generalizations. The worst of both worlds are quantitative surveys that are neither broadly grounded in ethnography nor applied to representative samples.

It’s also clear in Agar’s book that qualitative and quantitative are not a binary distinction, but rather two ends of a continuum. Research starts with informal observations about specific things (people, places, events) that give rise to open-ended questions. The answers to these questions then provoke more focused questions that are asked of a wider range of things, and so on.

The concepts of broad and narrow, general and specific, can be confusing here, because at the qualitative, close or ethnographic end of the spectrum the questions are broad and general but asked about a narrow, specific set of subjects. At the quantitative, distant or distributional end of the spectrum the questions are narrow and specific, but asked of a broad, general range of subjects. Agar uses a “funnel” metaphor to model how the questions narrow during this progression, but he could just as easily have used a showerhead to model how the subjects broaden at the same time.

The progression is not one-way, either. The findings of a broad survey can raise new questions, which can only be answered by a new round of investigation, again beginning with qualitative examination on a small scale and possibly proceeding to another broad survey. This is one of the cycles that increase our knowledge.

Rather than the funnel metaphor, I prefer a metaphor based on seeing. Recently I’ve been re-reading The Omnivore’s Dilemma, and in Chapter 8 Michael Pollan talks about taking a close view of a field of grass:

In fact, the first time I met Salatin he’d insisted that even before I met any of his animals, I get down on my belly in this very pasture to make the acquaintance of the less charismatic species his farm was nurturing that, in turn, were nurturing his farm.

Pollan then gets up from the grass to take a broader view of the pasture, but later bends down again to focus on individual cows and plants. He does this metaphorically throughout the book, as many great authors do: focusing in on a specific case, then zooming out to discuss how that case fits in with the bigger picture. Whether he’s talking about factory-farmed Steer 534, or Budger the grass-fed cow, or even the thousands of organic chickens that are functionally nameless under the generic name of “Rosie,” he dives into specific details about the animals, then follows up by reporting statistics about these farming methods and the animals they raise.

The bottom line is that we need studies from all over the qualitative-quantitative spectrum. They build on each other, forming a cycle of knowledge. We need to fund them all, to hire people to do them all, and to promote and publish them all. If you do it right, the plural of anecdote is indeed data, and you can’t have data without anecdotes.

Posted in Science

Viewing in free motion

Last month I went on a walk with my friend Ezra. It was his birthday, so we walked for almost two hours, drinking coffee, eating cinnamon rolls, and talking about semantics and coding. The funny thing is that Ezra lives on the West Coast and I live in New York, so we conducted our entire conversation by cell phone, with him walking through Ballard and Loyal Heights, and me walking through Jackson Heights and East Elmhurst.


Cell phones have been around for decades, and I’m sure we’re far from the first to walk together this way. You’ve probably done it yourself. But it reminded me of Isaac Asimov’s 1956 novel The Naked Sun, in which our hero Elijah Baley visits an Earth colony on the planet Solaria, where all the colonists live on separate estates, with at most one spouse and possibly an infant child, surrounded by robots who tend to their every need, almost never seeing one another in person. They interact socially by “viewing” each other through realistic virtual-reality projections.

Baley interviews a murder suspect, Gladia Delmarre, and is intrigued when she tells him she goes on walks together with her neighbor. “I didn’t know you could go on walks together with anyone,” says Baley.

“I said viewing,” responds Gladia. “Oh well, I keep forgetting you’re an Earthman. Viewing in free motion means we focus on ourselves and we can go anywhere we want to without losing contact. I walk on my estate and he walks on his and we’re together.”

I had no visual contact with Ezra during this walk. I’ve seen people “viewing in free motion” on FaceTime. We could probably have rigged something up with a GoPro camera and Google Glass, but it would most likely not have been much like on Solaria, where I could have looked over and seen a chunk of Seattle superimposed on Queens, with Ezra walking across it next to me.

The biggest reason not to attempt any visual presence is that it was dangerous enough for me to be crossing the street while talking; it would have been much worse if the virtual view of the cars on 24th Avenue NW were blocking my view of the cars coming at me down Northern Boulevard.

Of course, on Solaria all the cars were (or will be?) automatic, and there are armies of robots to protect the humans from danger.

Posted in Mobile tech

Printing differences and material issues in Google Books

I am looking forward to presenting my Digital Parisian Stage corpus and the exciting results I’ve gotten from it so far at the American Association for Corpus Linguistics at Iowa State in September. In the meantime I’m continuing to process texts, working towards a one percent sample from the Napoleonic period (Volume 1 of the Wicks catalog).

One of the plays in my sample is les Mœurs du jour, ou l’école des femmes, a comedy by Collin-Harleville (also known as Jean-François Collin d’Harleville). I ran the initial OCR on a PDF scanned for the Google Books project. For reasons that will become clear, I will refer to it by its Google Books ID, VyBaAAAAcAAJ. When I went to clean up the OCR text, I discovered that it was missing pages 2-6. I emailed the Google Books team about this, and got the following response:

Screenshot of the Google Books team’s response

I’m guessing “a material issue” means that those pages were missing from the original paper copy, but I didn’t even bother emailing until the other day, since I found another copy in the Google Books database, with the ID kVwxUp_LPIoC.

Comparing the OCR text of VyBaAAAAcAAJ with the PDF of kVwxUp_LPIoC, I discovered some differences in spelling. For example, throughout the text, words that end in the old-fashioned spelling -ois or -oit in VyBaAAAAcAAJ are spelled with the more modern -ais in kVwxUp_LPIoC. There is also a difference in the way “Madame” is abbreviated (“Mad.” vs. “M.me”), in which accented letters preserve their accents when set in small caps, and in pagination. Here is the entirety of Act III, Scene X in each copy:

Act III, Scene X in copy VyBaAAAAcAAJ
Act III, Scene X in copy kVwxUp_LPIoC

My first impulse was to look at the front matter and see if the two copies were identified as different editions or different printings. Unfortunately, they were almost identical, with the most notable differences being that VyBaAAAAcAAJ has an œ ligature in the title, while kVwxUp_LPIoC is signed by the playwright and marked as being a personal gift from him to an unspecified recipient. Both copies give the exact same dates: the play was first performed on the 7th of Thermidor in year VIII and published in the same year (1800).

The Google Books metadata indicate that kVwxUp_LPIoC was digitized from the Lyon Public Library, while VyBaAAAAcAAJ came from the Public Library of the Netherlands. The other copies I have found in the Google Books database, OyL1oo2CqNIC from the National Library of Naples and dPRIAAAAcAAJ from Ghent University, appear to be the same printing as kVwxUp_LPIoC, as does the copy from the National Library of France.

Since the -ais and M.me spellings are closer to the forms used in France today, we might expect that kVwxUp_LPIoC and its cousins are from a newer printing. But in Act III, Scene XI I came across a difference that concerns negation, the variable that I have been studying for many years. The decadent Parisians Monsieur Basset and Madame de Verdie question whether marriage should be eternal. Our hero Formont replies that he has no reason not to remain with his wife forever. In VyBaAAAAcAAJ he says, “je n’ai pas de raisons,” while in kVwxUp_LPIoC he says “je n’ai point de raisons.”

Act III, Scene XI (page 75) in VyBaAAAAcAAJ
Act III, Scene XI (page 78) in kVwxUp_LPIoC

In my dissertation study I found that the relative use of ne … point had already peaked by the nineteenth century, and was being overtaken by ne … pas. If this play fits the pattern, the use of the more conservative pattern in kVwxUp_LPIoC goes against the more innovative -ais and M.me spellings.

I am not an expert in French Revolutionary printing (if anyone knows a good reference or contact, please let me know!). My best guess is that kVwxUp_LPIoC and the other -ais/M.me/ne … point copies are from a limited early run, some copies of which were given to the playwright to give away, while VyBaAAAAcAAJ is from a larger, slightly later, printing.

In any case, it is clear that I should pick one copy and make my text consistent with it. Since VyBaAAAAcAAJ is incomplete, I will try dPRIAAAAcAAJ. I will try to double-check all the spellings and wordings, but at the very least I will check all of the examples of negation against dPRIAAAAcAAJ as I annotate them.
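As a rough first pass at that double-checking, a screening script could count the competing variants in each OCR text. The sketch below deliberately overcounts (words like trois keep -ois in modern spelling, and the negation regexes will catch unrelated pas and point), so its output is a list of places to look, not an annotation.

```python
# Screening counts for the variants discussed above: old -ois/-oit vs.
# modern -ais/-ait spellings, and ne ... pas vs. ne ... point negation.
# Deliberately overcounts; for manual review, not annotation.
import re
from collections import Counter

PATTERNS = {
    "-ois/-oit": r"\w+oi[st]\b",
    "-ais/-ait": r"\w+ai[st]\b",
    "ne ... pas": r"\bne\b[^.;:!?]*?\bpas\b",
    "ne ... point": r"\bne\b[^.;:!?]*?\bpoint\b",
}

def variant_counts(text):
    """Count rough matches for each variant in one OCR text."""
    return Counter({name: len(re.findall(pattern, text, re.IGNORECASE))
                    for name, pattern in PATTERNS.items()})

def compare_copies(path_a, path_b):
    """Print side-by-side counts for two copies of the play."""
    for path in (path_a, path_b):
        with open(path, encoding="utf-8") as f:
            print(path, dict(variant_counts(f.read())))
```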

Posted in Digital humanities, French, Language change, Variation, Web

Introducing Selected Birthdays

If, like me, you have an Android phone, you probably use Google Calendar. I like the way it integrates with my contacts so that I can schedule events with people. I like the idea of it integrating with my Google+ contacts to automatically create a calendar of birthdays that I don’t want to miss. There’s a glitch in that, but I’ve created a new app to get around it, called Selected Birthdays.


The glitch is that the built-in Birthdays calendar has three options: show your Google Contacts, show your contacts and the people in your Google+ circles, or nothing. I have a number of contacts who are attractive and successful people, but I’m sorry to say I have no interest in knowing when their birthdays are. Natasha Lomas has even stronger feelings.

Google doesn’t let you change the built-in Birthdays calendar, but it does let you create a new calendar and fill it with the birthdays that interest you. My new web app, Selected Birthdays, automates that process. It goes through your contacts, finds the ones who have shared their birthdays with you, and gives you a checklist. You decide whose birthdays to include, and Selected Birthdays will create a new calendar with those birthdays. It’ll also give you the option of hiding Google’s built-in birthday calendar.
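The app itself runs in the browser, but the calendar-writing half of the process boils down to two Google Calendar API calls. Here is the same idea sketched in Python with google-api-python-client rather than the app’s JavaScript, assuming creds is an already-authorized credential object with calendar scope:

```python
# A sketch of the calendar-writing half of the idea, in Python rather
# than the app's JavaScript. `creds` is assumed to be an authorized
# Google credential object with calendar scope.
from datetime import date, timedelta

from googleapiclient.discovery import build

def create_birthday_calendar(creds, birthdays):
    """birthdays: list of (name, 'YYYY-MM-DD') pairs the user checked off."""
    service = build("calendar", "v3", credentials=creds)
    cal = service.calendars().insert(
        body={"summary": "Selected Birthdays"}).execute()
    for name, day in birthdays:
        start = date.fromisoformat(day)
        end = start + timedelta(days=1)  # all-day end dates are exclusive
        service.events().insert(calendarId=cal["id"], body={
            "summary": f"{name}'s birthday",
            "start": {"date": start.isoformat()},
            "end": {"date": end.isoformat()},
            "recurrence": ["RRULE:FREQ=YEARLY"],
        }).execute()
    return cal["id"]
```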

I wrote the Selected Birthdays app in JavaScript with the Google+ and Google Calendar APIs. Ian Jones was a big help in recommending the moment.js library, which I used to manipulate dates. Bootflat helped me add a bit of visual style.

For the app to work you’ll have to authorize it to read your contacts and write your calendars. For your privacy, the app communicates directly between your browser and Google’s server; once you download it there is no further contact with my server. There is no way for me to see or edit your contacts or calendars. You can verify that in the source code.

Please let me know if you have any comments, questions or suggestions. I have also made the code available on GitHub for free under the Apache License, if you want to build on it. A number of people have said they wish they had an app like this for Facebook. If enough of you repeat that, I’ll look into it!

Posted in Android, Data security, Software, Web

Prejudice and intelligibility

Last month I wrote about the fact that intelligibility – the ability of native speakers of one language or dialect to understand a closely related one – is not constant or automatic. A major factor in intelligibility is familiarity: when I was a kid, for example, I had a hard time understanding the Beatles until I got used to them. Having lived in North Carolina, I find it much easier to understand people from Ocracoke Island than my students do.

Photo: Theonlysilentbob / Wikimedia

Prejudice can play a big role in intelligibility, as Donald Rubin showed in 1992. (I first heard about this study from Rosina Lippi-Green’s book English With an Accent.) At the time, American universities had recently increased the number of instructors they employed from East Asia, and some students complained that they had difficulty understanding their instructors’ accents.

In an ingenious experiment, Rubin demonstrated that much of this difficulty was due to prejudice. He recorded four-minute samples of “a native speaker of English raised in Central Ohio” reading a script for introductory-level lectures on two different subjects and played those samples to three groups of students.

For one group, a still photo of a “Caucasian” woman representing the instructor was projected on a screen while the audio sample was played. For the second group, a photo of “an Asian (Chinese)” woman was projected while the same audio of the woman from Central Ohio (presumably not of Asian ancestry) was played. The third group heard only the audio and was not shown a photo.

In a survey they took after hearing the clip, most of the students who saw the picture of an Asian woman reported that the speaker had “Oriental/Asian ethnicity.” That’s not surprising, because it’s essentially what they were told by being shown the photograph. But many of these students went further and reported that the person in the recording “speaks with a foreign accent.” In contrast, the vast majority of the students who were shown the “Caucasian” picture said that they heard “an American accent.”

The kicker is that immediately after they heard the recording (and before answering the survey), Rubin tested the students on their comprehension of the content of the excerpt, by giving them a transcript with every seventh word replaced by a blank. The students who saw a picture of an Asian woman not only thought they heard a “foreign accent,” but they did worse on the comprehension task! Rubin concluded that “listening comprehension seemed to be undermined simply by identifying (visually) the instructor as Asian.”

Rubin’s subjects may not have felt any particular hostility towards people from East Asia, but they had a preconceived notion that the instructor would have an accent, and they assumed that they would have difficulty understanding her, so they didn’t bother trying.

This study (and a previous one by Rubin with Kim Smith) connects back to what I was saying about familiarity, and I will discuss that and power imbalances in a future post, but this finding is striking enough to merit its own post.

Posted in Categorization, English as a Second Language, Language politics, Variation

Ten reasons why sign-to-speech is not going to be practical any time soon

It’s that time again! A bunch of really eager computer scientists have a prototype that will translate sign language to speech! They’ve got a really cool video that you just gotta see! They win an award! (from a panel that includes no signers or linguists). Technology news sites go wild! (without interviewing any linguists, and sometimes without even interviewing any deaf people).

Gee-whiz tech (Photo: Texas A&M)

…and we computational sign linguists, who have been through this over and over, every year or two, just *facepalm*.

The latest strain of viral computational sign linguistics hype comes from the University of Washington, where two hearing undergrads have put together a system that … supposedly recognizes isolated hand gestures in citation form. But you can see the potential! *facepalm*.

Twelve years ago, after already having a few of these *facepalm* moments, I wrote up a summary of the challenges facing any computational sign linguistics project and published it as part of a paper on my sign language synthesis prototype. But since most people don’t have a subscription to the journal it appeared in, I’ve put together a quick summary of Ten Reasons why sign-to-speech is not going to be practical any time soon.

  1. Sign languages are languages. They’re different from spoken languages. Yes, that means that if you think of a place where there’s a sign language and a spoken language, they’re going to be different. More different than English and Chinese.
  2. We can’t do this for spoken languages. You know that app where you can speak English into it and out comes fluent Pashto? No? That’s because it doesn’t exist. The Army has wanted an app like that for decades, and they’ve been funding it up the wazoo, and it’s still not here. Sign languages are at least ten times harder.
  3. It’s complicated. Computers aren’t great with natural language at all, but they’re better with written language than spoken language. For that reason, people have broken the speech-to-speech translation task down into three steps: speech-to-text, machine translation, and text-to-speech.
  4. Speech to text is hard. When you call a company and get a message saying “press or say the number after the tone,” do you press or say? I bet you don’t even call if you can get to their website, because speech to text suuucks:

    -Say “yes” or “no” after the tone.
    -No.
    -I think you said, “Go!” Is that correct?
    -No.
    -My mistake. Please try again.
    -No.
    -I think you said, “I love cheese.” Is that correct?
    -Operator!

  5. There is no text. A lot of people think that text for a sign language is the same as the spoken language, but if you think about point 1 you’ll realize that that can’t possibly be true. Well, why don’t people write sign languages? I believe it can be done, and lots of people have tried, but for some reason it never seems to catch on. It might just be the classifier predicates.
  6. Sign recognition is hard. There’s a lot that linguists don’t know about sign languages already. Computers can’t even get reliable signs from people wearing gloves, never mind video feeds. This may be better than gloves, but it doesn’t do anything with facial or body gestures.
  7. Machine translation is hard going from one written (i.e. written version of a spoken) language to another. Different words, different meanings, different word order. You can’t just look up words in a dictionary and string them together. Google Translate is only moderately decent because it’s throwing massive statistical computing power at the input – and that only works for languages with a huge corpus of text available.
  8. Sign to spoken translation is really hard. Remember how in #5 I mentioned that there is no text for sign languages? No text, no huge corpus, no machine translation. I tried making a rule-based translation system, and as soon as I realized how humongous the task of translating classifier predicates was, I backed off. Matt Huenerfauth has been trying (PDF), but he knows how big a job it is.
  9. Sign synthesis is hard. Okay, that’s probably the easiest problem of them all. I built a prototype sign synthesis system in 1997, I’ve improved it, and other people have built even better ones since.
  10. What is this for, anyway? Oh yeah, why are we doing this? So that Deaf people can carry a device with a camera around, and every time they want to talk to a hearing person they have to mount it on something, stand in a well-lighted area and sign into it? Or maybe someday have special clothing that can recognize their hand gestures, but nothing for their facial gestures? I’m sure that’s so much better than decent funding for interpreters, or teaching more people to sign, or hiring more fluent signers in key positions where Deaf people need the best customer service.

So I’m asking all you computer scientists out there who don’t know anything about sign languages, especially anyone who might be in a position to fund something like this or give out one of these gee-whiz awards: Just stop. Take a minute. Step back from the tech-bling. Unplug your messiah complex. Realize that you might not be the best person to decide whether or not this is a good idea. Ask a linguist. And please, ask a Deaf person!

Note: I originally wrote this post in November 2013, in response to an article about a prototype using Microsoft Kinect. I never posted it. Now I’ve seen at least three more, and I feel like I have to post this. I didn’t have to change much.

Posted in Interpreting, Language politics, Natural Language Generation, Translation

Including linguistics at literary conferences

I just got back from attending my second meeting of the Northeast Modern Language Association. My experience at both conferences has been very positive: friendly people, interesting talks, good connections. But I would like to see a little more linguistics at NeMLA, and better opportunities for linguists to attend. I’ve talked with some of the officers of the organization about this, and they have told me they welcome more papers from linguists.

Photo: Sean Weidman

One major challenge is that the session calls tend to be very specific and/or literary. Here are some examples from this year’s conference:

  • The Language of American Warfare after World War II
  • Representing Motherhood in Contemporary Italy
  • ‘Deviance’ in 19th-century French Women’s Writing

There is nothing wrong with any of these topics, but when they are all that specific, linguistic work can easily fall through the cracks. For several years I scanned the calls and simply failed to find anything where my work would fit. The two papers that I have presented are both pedagogical (in 2014 on using music to teach French, and this year on using accent tag videos to teach language variation and language attitudes). I believe that papers about the structure of language can find an audience at NeMLA, when there are sessions where they can fit.

In contrast, the continental MLA tends to have several calls with broader scope: an open call for 18th-Century French, for example, as well as ones specifically related to linguistics. When I presented at the MLA in 2012 it was at a session titled “Change and Perception of Change in the Romance Languages,” organized by Chris Palmer (a linguist and all-around nice guy).

With all that in mind, if you are considering attending next year’s NeMLA in Baltimore, I would like to ask the following:

  • Would you consider submitting a session proposal by the April 29th deadline?
  • Would you like to co-chair a session with me? (please respond by private email)
  • What topics would you find most inviting for linguistics papers at a (mostly) literature conference?

I recognize that I have readers outside of the region. For those of you who do not live in northeastern North America, have you had similar experiences with literary conferences? Do you have suggestions for session topics – or session topics to avoid?

Posted in Conferences, Events