Sampling and the digital humanities

I was pleased to have the opportunity to announce some progress on my Digital Parisian Stage project in a lightning talk at the kickoff event for New York City Digital Humanities Week on Tuesday. One theme that was expressed by several other digital humanists that day was the sheer volume of interesting stuff being produced daily, and collected in our archives.

I was particularly struck by Micki McGee’s story of how working on the Yaddo archive challenged her commitment to “horizontality” – flattening hierarchies, moving beyond the “greats” and finding valuable work and stories beyond the canon. The archive was simply too big for her to give everyone the treatment they deserved. She talked about using digital tools to overcome that size, but was still frustrated in the end.

At the KeystoneDH conference this summer I found out about the work of Franco Moretti, who similarly uses digital tools to analyze large corpora. Moretti’s methods seem very useful, but on Tuesday we saw that a lot of people were simply not satisfied with “distant reading.”

I am of the school that sees quantitative and qualitative methods as two ends of a continuum of tools, all of which are necessary for understanding the world. This is not even a humanities thing: from geologists with hammers to psychologists in clinics, all the sciences rely on close observation of small data sets.

My colleague in the NYU Computer Science Department, Adam Myers, uses the same approach to do natural language processing; I have worked with him on projects like this (PDF). We begin with a close reading of texts from the chosen corpus, then decide on a set of interesting patterns to annotate. As we annotate more and more texts, the patterns come into sharper focus, and eventually we use these annotations to train machine learning routines.
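To give a flavor of that last step, here is a minimal sketch of how hand annotations might become training data for a classifier. This is Python with scikit-learn, not our actual pipeline, and the sentences and labels are invented, borrowing the ne/pas distinction from later in this post as a stand-in task:

# Minimal sketch: hand-annotated sentences become training data.
# The sentences and labels are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

annotated = [
    ("Je ne sais pas ce qu'il veut.", "ne_pas"),
    ("Je ne saurais vous le dire.", "ne_alone"),
    ("Il ne faut pas y aller.", "ne_pas"),
    ("Que ne le disiez-vous plus tôt!", "ne_alone"),
]
texts, labels = zip(*annotated)

model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["Je ne peux pas continuer."]))  # expect "ne_pas"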

One question that arises with these methods is what to look at first. There is an assumption of uniformity in physics and chemistry: scientists can expect one milliliter of ethyl alcohol to behave more or less like any other milliliter of ethyl alcohol under similar conditions. People are much less interchangeable, leading to problems like WEIRD bias in psychology. Groups of people and their conventions are even more complex, making it even less likely that the texts or images that are easiest to study will give us an accurate picture of the whole archive.

Fortunately, this is a solved problem. Pierre-Simon Laplace figured out in 1814 that he could get a reasonable estimate of the population of the French Empire by looking at a representative sample of its départements, and subsequent generations have improved on his sampling techniques.

We may not be able to analyze all the things, but if we study enough of them we may be able to get a good idea of what the rest are like. William Sealy “Student” Gosset developed his famous t-test precisely to avoid having to analyze all the things. His employers at the Guinness Brewery wanted to compare different strains of barley without testing every plant in the batch. The p-value told them whether the difference they observed was too big to be an accident of sampling.
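In modern terms, that test is a one-liner. Here is a sketch with invented yield numbers, using SciPy’s implementation of the two-sample t-test:

# Two hypothetical samples of barley yields, one per strain.
# The numbers are invented; the point is the shape of the test.
from scipy.stats import ttest_ind

strain_a = [5.2, 4.9, 5.6, 5.1, 5.3, 4.8]
strain_b = [4.5, 4.7, 4.4, 4.9, 4.6, 4.3]

t_stat, p_value = ttest_ind(strain_a, strain_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value means the difference between strains is unlikely
# to be an accident of sampling, so there is no need to test every plant.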

I share McGee’s appreciation of “horizontality” and looking beyond the greats, and in my Digital Parisian Stage corpus I achieved that horizontality with the methods developed by Laplace and Student. The creators of the FRANTEXT corpus chose its texts using the “principle of authority,” in essence just using the greats. For my corpus I built on the work of Charles Beaumont Wicks, taking a random sample from his list of all the plays performed in Paris between 1800 and 1815.
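The sampling step itself is the easy part once the catalog exists. A sketch in Python (the catalog entries here are placeholders, not Wicks’s actual titles; the size is the rough figure implied by a 31-play one percent sample):

# Draw a one percent random sample from a catalog of plays.
# "catalog" stands in for Wicks's list; the entries are placeholders.
import random

catalog = [f"Play #{i}" for i in range(1, 3101)]  # roughly 3,100 plays

random.seed(42)  # fix the seed so the sample can be reproduced
sample = random.sample(catalog, k=len(catalog) // 100)
print(len(sample), sample[:3])  # 31 plays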

What I found was that characters in the randomly selected plays used a lot less of the conservative ne alone construction to negate sentences than characters in the FRANTEXT plays. This seems to be because the FRANTEXT plays focused mostly on aristocrats making long declamatory speeches, while the randomly selected plays also included characters who were servants, peasants, artisans and bourgeois, often in faster-moving dialogue. The characters from the lower classes tended to use much more of the ne … pas construction, while the aristocrats tended to use ne alone.

Student’s t-test tells me that the difference I found in the relative frequency of ne alone in just four plays was big enough that I could be confident of finding the same pattern in other plays. Even so, I plan to produce the full one percent sample (31 plays) so that I can test for differences that might be smaller.
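Whether a difference of a given size will show up in a sample of a given size is a power calculation. That is not part of the original analysis, but it captures the reasoning; here is a sketch using statsmodels, with illustrative effect sizes:

# How many plays per group are needed to detect an effect of a given size?
# The effect sizes are illustrative, not measured values.
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()
for effect_size in (2.0, 1.0, 0.7):
    n = power.solve_power(effect_size=effect_size, alpha=0.05, power=0.8)
    print(f"effect size {effect_size}: about {n:.0f} plays per group")
# Big differences can be detected with a handful of plays;
# smaller ones need samples closer to the full 31.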

It’s important for me to point out here that this kind of analysis still requires a fairly close reading of the text. Someone might say that I just haven’t come up with the right regular expression or parser, but at this point I don’t know of any automatic tools that can reliably distinguish the negation phenomena that interest me. I find that to really get an accurate picture of what’s going on I have to not only read several lines before and after each instance of negation, but in fact the entire play. Sampling reduces the number of times I have to do that reading, to bring the overall workload down to a reasonable level.
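For the skeptics: here is what a naive regular expression for ne alone might look like, and some of the ways it goes wrong. The example lines are invented:

# A naive attempt: flag "ne" with no "pas" later in the line as "ne alone".
# It misfires constantly, which is why close reading is still required.
import re

naive_ne_alone = re.compile(r"\bne\b(?!.*\bpas\b)", re.IGNORECASE)

lines = [
    "Je ne sais.",                 # true "ne alone": correctly flagged
    "Je ne sais pas.",             # ne ... pas: correctly skipped
    "N'osez-vous le dire?",        # elided n': missed entirely
    "Je ne sais point.",           # ne ... point: wrongly flagged as "ne alone"
    "Je crains qu'il ne vienne.",  # expletive ne, not negation: wrongly flagged
]
for line in lines:
    print(bool(naive_ne_alone.search(line)), line)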

Okay, you may be saying, but I want to analyze all the things! Even a random sample isn’t good enough. Well, if you don’t have the time or the money to analyze all the things, a random sample can make the case for analyzing everything. For example, I found several instances of the pas alone construction, which is now common but was rare in the early nineteenth century. I also turned up the script for a pantomime about the death of Captain Cook that gave the original Hawaiian characters a surprising level of intelligence and agency, given what little I knew about the attitudes of the period.

If either of those findings intrigued you and made you want to work on the project, or fund it, or hire me, that illustrates another use of sampling. (You should also email me.) Sampling gives us a place to start outside of the “greats,” where we can find interesting information that may inspire others to get involved.

One final note: the first step to getting a representative sample is to have a catalog. You won’t be able to generalize to all the things until you have a list of all the things. This is why my Digital Parisian Stage project owes so much to Beaumont Wicks. This “paper and ink” humanist spent his life creating a list of every play performed in Paris in the nineteenth century – the catalog that I sampled for my corpus.

It’s got to be deontic necessity

Gretchen McCulloch has been posting about epistemic modality on her All Things Linguistic blog recently. If you don’t know what epistemic modality is, very briefly, in many languages (including English), there are words that are ambiguous in a particular way: between saying something about our social and moral codes, and saying something about our knowledge of the world. Consider this expression of necessity:

  1. She should be there by now.

Under a deontic interpretation, this “should” is telling us that she has an obligation to be there by now, but under an epistemic interpretation it means that we expect her to be there by now. Other expressions of necessity like “must, have to, got to” have similar pairs of interpretations. Now consider this expression of possibility:

  2. She could have gone there yesterday.

Under a deontic interpretation the “could” means that she was allowed to go there yesterday, but the epistemic interpretation means we have some reason to imagine that she went there yesterday. There are other expressions of possibility like “may, can, might” that allow similar ambiguity.

There are also root modality interpretations: under root necessity, (1) means that the circumstances of the world have culminated in her being there by now, and under root possibility, (2) means simply that it was possible for her to go there yesterday.

A few years ago I noticed that the Jackson Browne song “Somebody’s Baby” had an interesting twist to it. According to Harmonov it was written in 1982 for the soundtrack to Fast Times at Ridgemont High, as a theme for the character of Stacy, played by Jennifer Jason Leigh.

I was listening to “Somebody’s Baby” again yesterday not long after reading Gretchen’s post, and I realized that the lyric twist plays on the ambiguity between epistemic and deontic modality. In the first verse we hear “got to be” and “must be”:

Well, just, a look at that girl with the lights comin’ up in her eyes.
She’s got to be somebody’s baby.
She must be somebody’s baby.

The clear intent is the epistemic one: the girl is so fine that the narrator can only conclude she has a relationship based on this evidence. The guys on the corner don’t harass her because they come to the same conclusion and don’t want any trouble from “somebody.” But the twist comes in the second verse:

I heard her talkin’ with her friend when she thought nobody else was around.
She said she’s got to be somebody’s baby; she must be somebody’s baby.

The epistemic reading of these modals is ruled out by the fact that it is the girl who is saying them. Presumably she knows whether she’s somebody’s baby or not, and does not need to declare this epistemic relation to her friend. This leads us to the root necessity reading: she has a need to be somebody’s baby, and the deontic reading: she has an obligation to be somebody’s baby.

It’s actually very sad that this girl believes her beauty and worth are not validated unless she is in a relationship highlighted by metaphors of possession and infantilization. Not to fear, our narrator will use this overheard intelligence to approach her when all the other guys are too intimidated, and she can be his baby. Let’s hope he’s a decent guy.

But anyway, the point is: modals. The first verse is epistemic necessity, but the second verse has got to be deontic necessity. It must be deontic necessity. It’s so fine.

Describing differences in pronunciation

Last month I wrote that instead of only two levels of phonetic transcription, “broad” and “narrow,” what people do in practice is to adjust their level of detail according to the point they want to make. In this it is like any other form of communication: too much detail can be a distraction.

But how do we decide how much detail to put in a given transcription, and how can we teach this to our students? In my experience there is always some kind of comparison. Maybe we’re comparing two speakers from different times or different regions, ethnicities, first languages, social classes, anatomies. Maybe we’re comparing two utterances by the same person in different phonetic, semantic, social or emotional contexts.

Sometimes there is no overt comparison, but at those times there is almost always an implicit comparison. If we are presenting a particular pronunciation it is because we assume our readers will find it interesting, because it is pathological or nonstandard. This implies that there is a normal or standard pronunciation that we have in our heads to contrast it to.

The existence of this comparison tells us the right level of detail to include in our transcriptions: enough to show the contrasts that we are describing, maybe a little more, but not so much as to distract from the contrast. And because we want to focus on that contrast, if it is one of tone, place of articulation or laryngeal timing, we will include those details and leave out details about nasality, vowel tongue height or segment length.
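For example, to contrast a typical American pronunciation of water with a typical British one, something like [ˈwɔɾɚ] versus [ˈwɔːtə] shows the flapping and the r-lessness without bothering over finer details of vowel quality.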

This has implications for the way we teach transcription. For our students to learn the proper level of detail to include, they need practice comparing two pronunciations, transcribing both, and checking whether their transcriptions highlight the differences that they feel are most relevant to the current discussion.

I can illustrate this with a cautionary tale from my teaching just this past semester. I had found this approach of identifying differences to be useful, but students found the initial assignments overwhelming. So even as I was jotting down an early draft of this blog post, I was telling my students to transcribe single speech samples in isolation. I put off the comparison assignments for later, and then put them off again.

As a result, I found myself focusing too much on some details while dismissing others. I could sense that my students were a bit frustrated, but I didn’t make the connection right away. I did ask them to compare two pronunciations on the final exam, and it went well, but not as well as it could have if they had been practicing it all semester. Overall the semester was a success, but it could have been better.

I’ll talk about how you can find comparable pronunciations in a future post.

Eclipsing

I’ve written about default assumptions before: how for example people in different parts of the English-speaking world have different assumptions about what they’ll get when they order “tea” or a “burger.” In the southern United States, the subcategory of “iced tea” has become the default, while in the northern US it’s “hot tea,” and in England it’s “hot tea with milk.” But even though iced tea is the default “tea” in the South, everyone there will still agree that hot tea is “tea.” In other cases, though, one subcategory can be so salient, so familiar as to crowd out all the other subcategories, essentially taking over the category.

An example of this eclipsing is the category of “concentration camp.” When you read those words, you probably imagined a Nazi death camp like Auschwitz, where my cousin Dora was imprisoned. (Unlike many of her fellow prisoners she survived the ordeal, and died peacefully earlier this year at the age of 101.) Almost every time we hear those words, they have referred to camps where our enemies killed millions of innocent civilians as part of a genocidal project, so that is what we expect.

This expectation is why so many people wrote in when National Public Radio’s Neal Conan referred to the camps where Japanese-Americans were imprisoned in World War II as “concentration camps” in 2012. NPR ombudspeople Edward Schumacher-Matos and Lori Grisham observed that the word dates back to the Boer War. Dan Carlin goes into detail about how widely the term “campos de reconcentración” was used in the Spanish-American War. Last year, Aya Katz compared the use of “concentration camp” to that of “cage,” and earlier this year she reviewed the history of the word.

In general, the “concentration camps” of the Boer War and the Spanish-American War, as well as the “camps de regroupement” used by the French in the wars of independence in Algeria and Indochina, were a counter-insurgency tactic, whereby the colonial power controlled the movements of the civilian population in an effort to prevent insurgents from hiding among noncombatants, and to prevent noncombatants from being used as human shields.

As Roger Daniels writes in his great article “Words Do Matter: A Note on Inappropriate Terminology and the Incarceration of the Japanese Americans” (PDF), the concept of “internment” refers to the process of separating “alien enemies” – nationals of an enemy power – from the general population, and was first practiced with British subjects during the War of 1812. While this was done for citizens of Japan (and other enemy powers) during World War II, Daniels objects to the use of “internment” to describe the incarceration of American citizens on the basis of Japanese ancestry. He notes that President Roosevelt used the term “concentration camp” to describe them, and asks people to use that word instead of “internment.”

In the case of the colonial wars, the camps were used to isolate colonized people from suspected insurgents. In the case of the Japanese-American incarceration, the camps were used to isolate suspected spies from the general population. In neither case were they used to exterminate people, or to commit genocide. They were inhumane, but they were very different from Nazi death camps.

It is not hard to understand why the Nazi death camps have come to eclipse all other kinds of concentration camps. They were so horrific, and have been so widely discussed and taught, that the inhumanity of relocating the populations of entire towns and rounding up people based on ethnicity pales by comparison. It makes complete sense to spend so much more time on them. As a result, if we have ever heard the term “concentration camp” used outside of the context of extermination and genocide it doesn’t stick in our memory.

For most English speakers, “concentration camp” means a Nazi death camp, or one equally horrific. This is why Daniels acknowledges, following Alice Yang Murray, that “it is clearly unrealistic to expect everyone to agree to use the contested term concentration camp.”

The word “cisgender” is anti-trans

The word “cisgender” was coined to refer to people who aren’t transgender, as an alternative to problematic terms like “normal,” “regular” and “real.” Some have gone beyond this and asked their allies to “identify as cis,” and even treat trans people as the default realization of their genders.

As a trans person and a linguist, I disagree with these last two for a number of reasons. As I wrote last month, it’s bad etymology, and there is no evidence it will work. You might ask, well, what’s the harm in trying? The problem is that there is a cost to using “cisgender”: it divides the trans community. This may seem surprising at first, but it hinges on the fact that there are at least four different but overlapping meanings of the word “transgender.”

The original use of “transgender” was as an “umbrella” term including transvestites, transsexuals, drag queens, butch lesbians, genderqueer people and more. Another popular definition is based on “gender identity,” including everyone who believes that their essential gender is different from the one assigned to them at birth. A third sense is based on feelings like gender dysphoria, and a fourth is restricted to those trans people who transition. Trans people regularly argue about these definitions, but in my observations it is common for a single person to use more than one of these senses in the same conversation, and even the same sentence.

These overlapping meanings produce what I call the Transgender Bait and Switch. Intentionally or not, many trans people use the broader “umbrella” or “dysphoria” definitions to show the largest numbers, neediest cases or historical antecedents when they are looking to get funding, legitimacy, or political or social support, but then switch to narrower “identity” or “transition” senses when they are deciding how to allocate funding or space resources, or who is entitled to speak for the group, or who is an acceptable representation of trans people in the media.

This is a problem because the meaning of “cis” depends on the meaning of “trans.” Who are the “cis” people? Are they the opposite of “umbrella” trans – those who don’t belong to any of the categories under the umbrella? Are they the opposite of “identity” trans – those who do not believe they have a gender different from the one assigned them at birth? The opposite of “feeling” trans – those who do not feel gender dysphoria on a regular basis? Or are they the opposite of “transition” trans – those who don’t transition? I’ve heard all four uses.

For all their lofty claims about the goals of “cis,” when trans people use it they do so to exclude, and typically they focus on excluding the marginal cases as part of the Transgender Bait-and-Switch: people who fit in one definition of “trans” but not another. It has become commonplace to refer to drag queens as “cis gay men,” and gynophilic transvestites as “cis straight men.” Drag queens, transvestites, non-binary people and others are regularly challenged when we try to speak from our experiences as trans people, and the refrain is always: “You are cis, you have not transitioned, you do not have the same experience.” Meanwhile, the same people seem to have no problem presenting themselves as the representatives of the transgender umbrella when they want to, even when they do not have experiences of drag performance, fetishism or non-binary presentation.

The best known challenges to “cisgender” have come from people who are not trans under any definition: didn’t transition, don’t have a gender identity mismatch, don’t feel chronic gender dysphoria, and don’t fit in any of the identities under the umbrella. They claim that the word is used as a weapon against them. They have a point: many trans people blame “cis people” for oppressing them, conveniently ignoring the fact that we’re just as capable of oppressing each other as they are of oppressing us. And it is counterproductive: since almost all estimates – using any of the definitions – put us at less than one percent of the population, we can’t live without non-trans people.

But the reason I hate “cisgender,” the reason I’m asking you not to use it, is because it’s used as a weapon to exclude other trans people. When they want money, we’re trans. When they want to claim our legacy, we’re trans. But when we want some of the money, we’re “cis.” When we want representation, we’re “cis.” When we want to speak for the trans community, or even for our segment of the trans community, we’re “cis.”

“Cisgender” divides the trans community and reinforces a hierarchy with transitioned trans people on top and nonbinary people, drag queens and transvestites at the bottom. So next time your transgender buddy Kyle tells you to “identify as cis” to prove you’re a real ally and stay on the invite list to his parties, I’m asking you to tell him no. Tell him that your transgender buddy Angus said not to. And if he tells you that I don’t count because I’m not transitioning, tell him he just proved my point. And his parties suck anyway.

Levels of phonetic description

When I first studied phonetic transcription I learned about broad and narrow transcription, where narrow transcription contains much more detail, like the presence of aspiration on consonants and fine distinctions of tongue height. Of course it makes sense that you wouldn’t always want to go into such detail, but at the time I didn’t think about what detail was excluded from broad transcription and why.

In phonology we learned about phonemes, and how phoneme categories glossed over many of those same details that were excluded from broad transcription. For reasons I never quite grasped, though, we were told that phonemic transcription was a very different thing from broad transcription, and we were not to confuse them. Okay.

I got a better explanation from my first phonetics professor, Jacques Filliolet, who used three levels of analysis: niveau généralisant, niveau pertinent and niveau particularisant. We can translate them as general, specific and detailed levels.

When I started teaching phonology, I realized that the broad vs. narrow distinction did not reflect what I read in books and papers and saw at conferences. When people are actually using phonetic transcription there is no consistent set of features that they leave out or include.

What people do instead is include the relevant features and leave out the irrelevant ones. Which features are relevant depends on the topic of discussion. If it’s a paper about aspiration, or a paper about variation where aspiration may or may not be relevant, they will include aspiration. If it isn’t, they won’t.

I realized that sometimes linguists need to go into more detail than phonetic transcription can easily handle, so they use even finer-grained representations like formant frequencies, gestural scores and voice onset times.

Recently I realized that this just means phonetic transcription is a form of communication. In all forms of communication we adjust the level of detail we provide to convey the relevant information to our audience and leave out the irrelevant parts.

Phonemes are another, more organic way that we do this. This explains why phonemic transcription is not the same as broad transcription: we often want to talk about what sounds go into a phoneme without adding other details. For example, we may want to talk about how English /t/ typically includes both aspirated and unaspirated stops, without talking about fundamental frequency or lip closure.
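For instance, the /t/ in top is aspirated [tʰ] and the /t/ in stop is unaspirated [t], but a phonemic transcription writes both as /t/.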

Another possible translation of Filliolet’s niveau pertinent is “the appropriate level.” This is really what we’re all aiming for: the level of detail that is most appropriate for the circumstances.

Finding the right level of detail for phonetic transcription is actually not hard for students to learn; they do it all the time in regular language. The simplest way to teach it is to give the students assignments that require a particular level of detail.

Students are sometimes frustrated that there is not a single way to transcribe a given utterance. In addition to these differences of level of description, there are stylistic differences: do you write [r] instead of [ɹ] for an English bunched /r/?

Of course the International Phonetic Alphabet was sold as just such a consistent system: one symbol for one sound, in contrast with the messy reality of writing systems. To me this feels very Modernist and Utopian, and it is no accident that it was invented at the same time as other big modernist projects like Esperanto, Principia Mathematica, and International Style architecture.

The IPA falls short of the ideal consistent representation that was sold to people, but has largely succeeded in providing enough consistency, and keeping enough of the mess at bay, for specific purposes like documenting language variation and language acquisition.

The key is that almost everything we use phonetic transcription for involves comparing two or more different pronunciations. When we teach transcription, we need to highlight that.

Will “cisgender” work?

Some people have come up with the word “cisgender” to refer to people who aren’t transgender, as an alternative to problematic terms like “normal,” “regular” and “real.” Some have gone beyond this and asked their allies to “identify as cis,” and even treat trans people as the default realization of their genders. As a trans person and a linguist, I disagree with these last two for a number of reasons.

One quick objection that I have to get out of the way: “cisgender” is bad etymology. It’s true that “cis” is the opposite of “trans,” but only in the sense of location, existing on this side or the other side of a boundary. We are “trans” in the sense of direction, crossing from one gender expression to the other. In Latin as far as I know there is no prefix for something that never crosses a boundary. Of course, that’s a silly objection. We have plenty of words based on inaccurate analogies and they work just fine. I just had to get it off my chest.

Now, for real: the simplest objection is that there is no evidence “cis” will work as advertised. First of all, default status is not necessary for acceptance or admiration. Blond hair is marked in the United States, and that can make some people with blond hair unhappy. But there is no real discrimination or harassment against people with blond hair, not like that against trans people. People with English accents are marked, but they tend to be admired.

Transgender people (under almost any definition) make up less than one percent of the population. Do we even have the right to ask to be the default? Why should everyone have to think about us a hundred percent of the time when they only deal with us one percent of the time? Why should we be the default and not, say, intersex people?

Let’s say we manage to convince everyone to make us the default and themselves the marked ones. How is that going to make them more tolerant or accepting of us? There are plenty of groups who are or were the default, and even the majority, but were oppressed anyway: Catholics in British Ireland, Muslims in French Algeria, French Canadians in Quebec before the Quiet Revolution. The people who tell everyone to say “cis” don’t mention any of this.

The proponents of “cisgender” do not point to any time that this strategy has succeeded in the past, because there is no evidence of it succeeding. There are in fact intentional language changes that have some record of success, like avoiding names with implied insults. Switching the marked subcategory of a contested category is not one of them.

The main reason to not say “cisgender” is that it probably won’t work. If it were easy to get everyone to say “cis,” and it had no negative consequences, I would say that we should all just go ahead and say it, knowing full well that it probably won’t work, to humor its proponents. But the fact of the matter is that it does have negative consequences, consequences that affect me directly. I’ll talk about them in the next post in this series.

In the meantime, if you want to do something to help us, I’ve got some suggestions for you on my Trans blog. You can ask your friends and family to take a pledge not to kill us, or not to beat trans teenagers. You can even write the missing hip-hop song where a guy treats a trans woman with something other than violent contempt.

Teaching phonetic transcription in the digital age

When I first taught phonetic transcription, almost seven years ago, I taught it almost the same way I had learned it twenty-five years ago. Today, the way I teach it is radically different. The story of the change is actually two stories intertwined. One is a story of how I’ve adapted my teaching to the radical changes in technology that occurred in the previous eighteen years. The other is a story of the more subtle evolution of my understanding of phonetics, phonology, phonological variation and the phonetic transcription that allows us to talk about them.

When I took Introduction to Linguistics in 1990 all the materials we had were pencil, paper, two textbooks and the ability of the professor to produce unusual sounds. In 2007 and even today, the textbooks have the same exercises: Read this phonetic transcription, figure out which English words were involved, and write the words in regular orthography. Read these words in English orthography and transcribe the way you pronounce them. Transcribe in broad and narrow transcription.

The first challenge was moving the homework online. I already assigned all the homework and posted all the grades online, and required my students to submit most of the assignments online; that had drastically reduced the amount of paper I had to collect and distribute in class and schlep back and forth. For this I had the advantage that tuition at Saint John’s pays for a laptop for every student. I knew that all of my students had the computing power to access the Blackboard site.

Thanks to the magic of Unicode and Richard Ishida’s IPA Picker, my students were able to submit their homework in the International Phonetic Alphabet without having to fuss with fonts and keyboard layouts. Now, with apps like the Multiling Keyboard, students can even write in the IPA on phones and tablets.
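Under the hood this is just Unicode: each IPA symbol is an ordinary character with its own codepoint. A quick illustration in Python:

# IPA symbols are ordinary Unicode characters with their own codepoints.
for ch in "ʃɹəŋʰ":
    print(ch, f"U+{ord(ch):04X}")
# prints ʃ U+0283, ɹ U+0279, ə U+0259, ŋ U+014B, ʰ U+02B0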

The next problem was that instead of transcribing, some students would look up the English spellings on dictionary sites, copy the standard pronunciation guides, and paste them into the submission box. Other students would give unusual transcriptions, but I couldn’t always tell whether these transcriptions reflected the students’ own pronunciations or just errors.

At first, as my professors had done, I made up for these homework shortcomings with lots of in-class exercises and drills, but they still all relied on the same principle: reading English words and transcribing them. Both in small groups and in full-class exercises, we were able to check the transcriptions and correct each other because everyone involved was listening to the same sounds. It wasn’t until I taught the course exclusively online that I realized there was another way to do it.

When I tell some people that I teach online courses, they imagine students from around the world tuning in to me lecturing at a video camera. This is not the way Saint John’s does online courses. I do create a few videos every semester, but the vast majority of the teaching I do is through social media, primarily the discussion forums on the Blackboard site connected with the course. I realized that I couldn’t teach phonetics without a way to verify that we were listening to the same sounds, and without that classroom contact I no longer had a way.

I also realized that with high-speed internet connections everywhere in the US, I had a new way to verify that we were listening to the same sounds: use a recording. When I took the graduate Introduction to Phonetics in 1993, we had to go to the lab and practice with the cassette tapes from William Smalley’s Manual of Articulatory Phonetics, but if I’m remembering right we didn’t actually do any transcription of the sounds; we just practiced listening to them and producing them. Some of us were better at that than others.

In 2015 we are floating in rivers of linguistic data. Human settlements have always been filled with the spontaneous creation of language, but we used to have to pore over people’s writings or rely on our untrustworthy memories. In the twentieth century we had records and tape, film and video, but so much of what they captured was scripted and rehearsed. If we could get recordings of unscripted language, it was hard to store, copy and distribute them.

Now people create language in forms that we can grab and hold: online news articles, streaming video, tweets, blog posts, YouTube videos, Facebook comments, podcasts, text messages, voice mails. A good proportion of these are even in nonstandard varieties of the language. We can read them and watch them and listen to them – and then we can reread and rewatch and relisten, we can cut and splice in seconds what would have taken hours – and then analyze them, and compare our analyses.

Instead of telling my students to read English spelling and transcribe in IPA, now I give them a link to a video. This way we’re working from the exact same sequence of sounds, a sequence that we can replay over and over again. I specifically choose pronunciations that don’t match what they find on the dictionary websites. This is precisely what the IPA is for.

Going the other way, I give my students IPA transcriptions and ask them to record themselves pronouncing the transcriptions and post the recordings to Blackboard. Sure, my professor could have assigned us something like this in 1990, but then he would have had to take home a stack of cassettes and spend time rewinding them over and over. Now all my students have smartphones with built-in audio recording apps, and I could probably listen to all of their recordings on my own smartphone if I didn’t have my laptop handy.

So that’s the story about technology and phonetic transcription. Stay tuned for the other story, about the purpose of phonetic transcription.

Trans, cis and the default

In a recent post, I talked about one reason that the word “cisgender” was coined. I agree that it is a good idea to have ways of talking about people who aren’t trans without evoking a context of “real” or “normal” to imply that we are not legitimate, or to highlight our minority status. If that were the only goal, something like “non-transgender men” might be enough. But many of the arguments for “cis” go beyond this.

The first step beyond simply using “cis” is asking non-trans people to “identify as cis.” The idea is that trans women are marked as “not normal” just by virtue of having a word for ourselves, while non-trans women are the default “women.” There are similar situations for women in general, for example in soccer:

@jaclynf After this game, everyone better start calling it “soccer” and “men’s soccer” #usausausa #USWNT

Asking people to “identify as cis” – possibly as a condition of being accepted as an ally – means asking them to center trans people as the norm and mark themselves as deviating from that norm, at least in that context.

Some people have gone beyond simply asking people to “identify as cis,” and made a point of criticizing the use of unmodified “woman” in contexts that do not apply to all (or any) trans women. The idea is not just to make “trans” one acceptable default, but to exclude anything else from default status.

These three linguistic goals – replacing words like “normal,” admitting “trans” as a possible default status, and removing default status from non-trans people – are all aimed at removing the stigma associated with transgender actions. This stigma is real: I’ve received dirty looks and petty harassment for wearing women’s clothes.

Of course, I’m relatively fortunate. I have never been attacked for being trans. I have received unconditional love and support from my family, and found a reasonable amount of success in my work life and acceptance from my neighbors. Others have been fired, kicked out of their homes, beaten and even killed for “being a man” in a dress or in the women’s bathroom – or for *not* “being a man” enough in the family or the workplace.

This stigma is not fair, and it needs to stop. The question is whether a word like “cisgender” can confer default status on us, whether default status will actually help to stop it, and if so how much.

How to Connect an Insignia NS-15AT10 to ADB on Windows

I bought a nice little tablet at BestBuy, and I wanted to use it to test an Android app I’m developing. In order to do that, I have to connect the tablet to my Windows laptop and run something called ADB, the Android Debug Bridge. Unfortunately, in order for ADB to connect to the tablet, Windows needs to recognize it as an ADB device, and BestBuy hasn’t done the work to support that.

I did find a post by someone named pcdebol that tells you how to get other Insignia tablets working with ADB, and was able to get mine working using the Google USB drivers with some modifications. I wanted to post this for the benefit of other people who want to test their apps on this model of tablet.

The first thing to do is to download the Google driver, unpack it and modify the android_winusb.inf file to add the following lines in the [Google.NTamd64] section.

;NS-15AT10
%SingleAdbInterface% = USB_Install, USB\VID_0414&PID_506B&MI_01
%CompositeAdbInterface% = USB_Install, USB\VID_0414&PID_506B&REV_FFFF&MI_01

I found the “VID” and “PID” codes by looking at the hardware IDs in the Windows Device Manager. They should be the same for all NS-15AT10 tablets, but different for any other model. The next step is to edit the file adb_usb.ini in the .android folder in your user profile (for me, in Windows 7, that’s “c:\users\grvsmth\”). If there is no .android folder, you should make one, and if your .android folder has no adb_usb.ini file you should make one of those. Then put the following code in the file, on a line by itself.

0x0414
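If the folder or the file doesn’t exist yet, one way to create both from a Command Prompt (assuming your profile is in the usual place):

rem Create the .android folder; Explorer may balk at the leading dot, but cmd won't
mkdir "%USERPROFILE%\.android"
rem Append the vendor ID; the parentheses keep cmd from reading the
rem trailing digit 4 as a file-handle redirect
(echo 0x0414)>> "%USERPROFILE%\.android\adb_usb.ini"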

It took me a little while to figure out that this is just the VID number from the Device Manager, with an 0x prefix to tell Windows that it’s a hexadecimal number. Once I did that and saved the file, I was able to re-add the device in Device Manager, Windows recognized it, and I was able to connect ADB to it flawlessly and test my app. I hope you have similar success!
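One last sketch for anyone following along: after installing the driver and editing adb_usb.ini, you can check the connection with the standard ADB commands (the serial number shown for your tablet will of course differ):

rem Restart the ADB server so it rereads adb_usb.ini
adb kill-server
adb start-server
rem The tablet should now show up in the device list
adb devices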