At the beginning of June I participated in the Trees Count Data Jam, experimenting with the results of the census of New York City street trees begun by the Parks Department in 2015. I had seen a beta version of the map tool created by the Parks Department’s data team that included images of the trees pulled from the Google Street View database. Those images reminded me of others I had seen in the @everylotnyc twitter feed.

silver maple 20160827

@everylotnyc is a Twitter bot that explores the City’s property database. It goes down the list in order by tax ID number. Every half hour it composes a tweet for a property, consisting of the address, the borough and the Street View photo. It seems like it would be boring, but some people find it fascinating. Stephen Smith, in particular, has used it as the basis for some insightful commentary.

It occurred to me that @everylotnyc is actually a very powerful data visualization tool. When we think of “big data,” we usually think of maps and charts that try to encompass all the data – or an entire slice of it. The winning project from the Trees Count Data Jam was just such a project: identifying correlations between cooler streets and the presence of trees.

Social scientists, and even humanists recently, fight over quantitative and qualitative methods, but the fact is that we need them both. The ethnographer Michael Agar argues that distributional claims like “5.4 percent of trees in New York are in poor condition” are valuable, but primarily as a springboard for diving back into the data to ask more questions and answer them in an ongoing cycle. We also need to examine the world in detail before we even know which distributional questions to ask.

If our goal is to bring down the percentage of trees in Poor condition, we need to know why those trees are in Poor condition. What brought their condition down? Disease? Neglect? Pollution? Why these trees and not others?

Patterns of neglect are often due to the habits we develop of seeing and not seeing. We are used to seeing what is convenient, what is close, what is easy to observe, what is on our path. But even then, we develop filters to hide what we take to be irrelevant to our task at hand, and it can be hard to drop these filters. We can walk past a tree every day and not notice it. We fail to see the trees for the forest.

Privilege filters our experience in particular ways. A Parks Department scientist told me that the volunteer tree counts tended to be concentrated in wealthier areas of Manhattan and Brooklyn, and that many areas of the Bronx and Staten Island had to be counted by Parks staff. This reflects uneven amounts of leisure time and uneven levels of access to city resources across these neighborhoods, as well as uneven levels of walkability.

A time-honored strategy for seeing what is ordinarily filtered out is to deviate from our usual patterns, either with a new pattern or with randomness. This strategy can be traced at least as far as the sampling techniques developed by Pierre-Simon Laplace for measuring the population of Napoleon’s empire, the forerunner of modern statistical methods. Also among Laplace’s cultural heirs are the flâneurs of late nineteenth-century Paris, who studied the city by taking random walks through its crowds, as noted by Charles Baudelaire and Walter Benjamin.

In the tradition of the flâneurs, the Situationists of the mid-twentieth century highlighted the value of random walks, which they called dérives. Here is Guy Debord (1955, translated by Ken Knabb):

The sudden change of ambiance in a street within the space of a few meters; the evident division of a city into zones of distinct psychic atmospheres; the path of least resistance which is automatically followed in aimless strolls (and which has no relation to the physical contour of the ground); the appealing or repelling character of certain places — these phenomena all seem to be neglected. In any case they are never envisaged as depending on causes that can be uncovered by careful analysis and turned to account. People are quite aware that some neighborhoods are gloomy and others pleasant. But they generally simply assume that elegant streets cause a feeling of satisfaction and that poor streets are depressing, and let it go at that. In fact, the variety of possible combinations of ambiances, analogous to the blending of pure chemicals in an infinite number of mixtures, gives rise to feelings as differentiated and complex as any other form of spectacle can evoke. The slightest demystified investigation reveals that the qualitatively or quantitatively different influences of diverse urban decors cannot be determined solely on the basis of the historical period or architectural style, much less on the basis of housing conditions.

In an interview with Neil Freeman, the creator of @everylotbot, Cassim Shepard of Urban Omnibus noted the connections between the flâneurs, the dérive and Freeman’s work. Freeman acknowledged this: “How we move through space plays a huge and under-appreciated role in shaping how we process, perceive and value different spaces and places.”

Freeman did not choose randomness, but as he describes it in a tinyletter, the path of @everylotbot sounds a lot like a dérive:

@everylotnyc posts pictures in numeric order by Tax ID, which means it’s posting pictures in a snaking line that started at the southern tip of Manhattan and is moving north. Eventually it will cross into the Bronx, and in 30 years or so, it will end at the southern tip of Staten Island.

Freeman also alluded to the influence of Alfred Korzybski, who coined the phrase, “the map is not the territory”:

Streetview and the property database are both widely used because they’re big, (putatively) free, and offer a completionist, supposedly comprehensive view of the world. They’re also both products of people working within big organizations, taking shortcuts and making compromises.

I was not following @everylotnyc at the time, but I knew people who did. I had seen some of their retweets and commentaries. The bot shows us pictures of lots that some of us have walked past hundreds of times, but seeing them in our Twitter timelines makes us see them fresh again and notice new things. It is the property we know, and yet we realize how much we don’t know it.

When I thought about those Street View images in the beta site, I realized that we could do the same thing for trees for the Trees Count Data Jam. I looked, and discovered that Freeman had made his code available on GitHub, so I started implementing it on a server I use. I shared my idea with Timm Dapper, Laura Silver and Elber Carneiro, and we formed a team to make it work by the deadline.

It is important to make this much clear: @everytreenyc may help to remind us that no census is ever flawless or complete, but it is not meant as a critique of the enterprise of tree counts. Similarly, I do not believe that @everylotnyc was meant as an indictment of property databases. On the contrary, just as @everylotnyc depends on the imperfect completeness of the New York City property database, @everytreenyc would not be possible without the imperfect completeness of the Trees Count 2015 census.

Without even an attempt at completeness, we could have no confidence that our random dive into the street forest was anything even approaching random. We would not be able to say that following the bot would give us a representative sample of the city’s trees. In fact, because I know that the census is currently incomplete in southern and eastern Queens, when I see trees from the Bronx and Staten Island and Astoria come up in my timeline I am aware that I am missing the trees of southeastern Queens, and awaiting their addition to the census.

Despite that fact, the current status of the 2015 census is good enough for now. It is good enough to raise new questions: what about that parking lot? Is there a missing tree in the Street View image because the image is newer than the census, or older? It is good enough to continue the cycle of diving and coming up, of passing through the funnel and back up, of moving from quantitative to qualitative and back again.

Teaching phonetic transcription in the digital age

When I first taught phonetic transcription, almost seven years ago, I taught it almost the same way I had learned it twenty-five years ago. Today, the way I teach it is radically different. The story of the change is actually two stories intertwined. One is a story of how I’ve adapted my teaching to the radical changes in technology that occurred in the previous eighteen years. The other is a story of the more subtle evolution of my understanding of phonetics, phonology, phonological variation and the phonetic transcription that allows us to talk about them.

When I took Introduction to Linguistics in 1990 all the materials we had were pencil, paper, two textbooks and the ability of the professor to produce unusual sounds. In 2007 and even today, the textbooks have the same exercises: Read this phonetic transcription, figure out which English words were involved, and write the words in regular orthography. Read these words in English orthography and transcribe the way you pronounce them. Transcribe in broad and narrow transcription.

The first challenge was moving the homework online. I already assigned all the homework and posted all the grades online, and required my students to submit most of the assignments online; that had drastically reduced the amount of paper I had to collect and distribute in class and schlep back and forth. For this I had the advantage that tuition at Saint John’s pays for a laptop for every student. I knew that all of my students had the computing power to access the Blackboard site.

Thanks to the magic of Unicode and Richard Ishida’s IPA Picker, my students were able to submit their homework in the International Phonetic Alphabet without having to fuss with fonts and keyboard layouts. Now, with apps like the Multiling Keyboard, students can even write in the IPA on phones and tablets.

The next problem was that instead of transcribing, some students would look up the English spellings on dictionary sites, copy the standard pronunciation guides, and paste them into the submission box. Other students would give unusual transcriptions, but I couldn’t always tell whether these transcriptions reflected the students’ own pronunciations or just errors.

At first, as my professors had done, I made up for these homework shortcomings with lots of in-class exercises and drills, but they still all relied on the same principle: reading English words and transcribing them. Both in small groups and in full-class exercises, we were able to check the transcriptions and correct each other because everyone involved was listening to the same sounds. It wasn’t until I taught the course exclusively online that I realized there was another way to do it.

When I tell some people that I teach online courses, they imagine students from around the world tuning in to me lecturing at a video camera. This is not the way Saint John’s does online courses. I do create a few videos every semester, but the vast majority of the teaching I do is through social media, primarily the discussion forums on the Blackboard site connected with the course. I realized that I couldn’t teach phonetics without a way to verify that we were listening to the same sounds, and without that classroom contact I no longer had a way.

I also realized that with high-speed internet connections everywhere in the US, I had a new way to verify that we were listening to the same sounds: use a recording. When I took the graduate Introduction to Phonetics in 1993, we had to go to the lab and practice with the cassette tapes from William Smalley’s Manual of Articulatory Phonetics, but if I’m remembering right we didn’t actually do any transcription of the sounds; we just practiced listening to them and producing them. Some of us were better at that than others.

In 2015 we are floating in rivers of linguistic data. Human settlements have always been filled with the spontaneous creation of language, but we used to have to pore over their writings or rely on our untrustworthy memories. In the twentieth century we had records and tape, film and video, but so much of what was on them was scripted and rehearsed. Even if we could get recordings of unscripted language, it was hard to store, copy and distribute them.

Now people create language in forms that we can grab and hold: online news articles, streaming video, tweets, blog posts, YouTube videos, Facebook comments, podcasts, text messages, voice mails. A good proportion of these are even in nonstandard varieties of the language. We can read them and watch them and listen to them – and then we can reread and rewatch and relisten, we can cut and splice in seconds what would have taken hours – and then analyze them, and compare our analyses.

Instead of telling my students to read English spelling and transcribe in IPA, now I give them a link to a video. This way we’re working from the exact same sequence of sounds, a sequence that we can replay over and over again. I specifically choose pronunciations that don’t match what they find on the dictionary websites. This is precisely what the IPA is for.

Going the other way, I give my students IPA transcriptions and ask them to record themselves pronouncing the transcriptions and post the recordings to Blackboard. Sure, my professor could have assigned us something like this in 1990, but then he would have had to take home a stack of cassettes and spend time rewinding them over and over. Now all my students have smartphones with built-in audio recording apps, and I could probably listen to all of their recordings on my own smartphone if I didn’t have my laptop handy.

So that’s the story about technology and phonetic transcription. Stay tuned for the other story, about the purpose of phonetic transcription.

How to Connect an Insignia NS-15AT10 to ADB on Windows

I bought a nice little tablet at BestBuy, and I wanted to use it to test an Android app I’m developing. In order to do that, I have to connect the tablet to my Windows laptop and run something called ADB. Unfortunately, in order for ADB to connect to it, Windows needs to recognize it as an ADB device, and BestBuy hasn’t done the work to support that.

Photo: Insignia.

I did find a post by someone named pcdebol that tells you how to get other Insignia tablets working with ADB, and was able to get mine working using the Google USB drivers with some modifications. I wanted to post this for the benefit of other people who want to test their apps on this model of tablet.

The first thing to do is to download the Google driver, unpack it and modify the android_winusb.inf file to add the following lines in the [Google.NTamd64] section.

%SingleAdbInterface% = USB_Install, USB\VID_0414&PID_506B&MI_01
%CompositeAdbInterface% = USB_Install, USB\VID_0414&PID_506B&REV_FFFF&MI_01
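If you are adapting this for a different tablet, the same two lines can be built from the hardware ID that Device Manager shows. Here is a minimal Python sketch of that idea. The helper name inf_lines is hypothetical (it is not part of the Google driver package), and it assumes the hardware ID has the same VID/PID/REV/MI layout as the one above.

```python
import re

# Pattern for a hardware ID such as USB\VID_0414&PID_506B&REV_FFFF&MI_01.
# Assumption: the ID carries all four fields, as it does on the NS-15AT10.
HARDWARE_ID = re.compile(
    r"VID_(?P<vid>[0-9A-F]{4})&PID_(?P<pid>[0-9A-F]{4})"
    r"&REV_(?P<rev>[0-9A-F]{4})&MI_(?P<mi>[0-9A-F]{2})",
    re.IGNORECASE,
)


def inf_lines(hardware_id):
    """Return the two android_winusb.inf entries for one hardware ID.

    Hypothetical helper for illustration only.
    """
    match = HARDWARE_ID.search(hardware_id)
    if match is None:
        raise ValueError("Unrecognized hardware ID: " + hardware_id)
    vid = match["vid"].upper()
    pid = match["pid"].upper()
    rev = match["rev"].upper()
    mi = match["mi"].upper()
    return [
        f"%SingleAdbInterface% = USB_Install, USB\\VID_{vid}&PID_{pid}&MI_{mi}",
        f"%CompositeAdbInterface% = USB_Install, "
        f"USB\\VID_{vid}&PID_{pid}&REV_{rev}&MI_{mi}",
    ]


if __name__ == "__main__":
    for line in inf_lines(r"USB\VID_0414&PID_506B&REV_FFFF&MI_01"):
        print(line)
```

Run as a script, it prints the two entries for the ID it is given, ready to paste into the [Google.NTamd64] section.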

I found the “VID” and “PID” codes by looking at the hardware IDs in the Windows Device Manager. They should be the same for all NS-15AT10 tablets, but different for any other model. The next step is to edit the file adb_usb.ini in the .android folder in your user profile (for me, in Windows 7, that’s “c:\users\grvsmth\”). If there is no .android folder, you should make one, and if your .android folder has no adb_usb.ini file you should make one of those. Then put this code in the file, on a line by itself:

0x0414

It took me a little while to figure out that it’s the VID number from the Device Manager, with an 0x prefix to mark it as a hexadecimal number. Once I did that and saved the file, I was able to re-add the device in Device Manager, Windows recognized it, and I was able to connect ADB to it flawlessly and test my app. I hope you have similar success!
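For anyone who would rather script the adb_usb.ini step than edit the file by hand, here is a minimal Python sketch. The function name register_vendor_id and its home parameter are my own inventions for illustration. It assumes the .android folder lives directly under your user profile, as described above, and that the file holds one 0x-prefixed vendor ID per line.

```python
from pathlib import Path
from typing import Optional


def register_vendor_id(vid: str, home: Optional[Path] = None) -> Path:
    """Add 0x<vid> to .android/adb_usb.ini, creating folder and file if needed.

    Hypothetical helper; `home` defaults to the current user's profile folder.
    """
    base = Path.home() if home is None else home
    ini_path = base / ".android" / "adb_usb.ini"

    # Make the .android folder if it does not exist yet.
    ini_path.parent.mkdir(parents=True, exist_ok=True)

    entry = "0x" + vid  # the VID from Device Manager with a hex prefix
    lines = ini_path.read_text().splitlines() if ini_path.exists() else []

    # Only append the entry if it is not already listed.
    if entry not in lines:
        lines.append(entry)
        ini_path.write_text("\n".join(lines) + "\n")
    return ini_path


if __name__ == "__main__":
    # 0414 is the VID shown for the NS-15AT10 in the hardware IDs above.
    print(register_vendor_id("0414"))
```

Running it twice is harmless, since the entry is only written when it is missing. You still have to re-add the device in Device Manager afterwards.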