School of Information Blogs

July 22, 2014

Ph.D. student

responding to @npdoty on ethics in engineering

Nick Doty wrote a thorough and thoughtful response to my earlier post about the Facebook research ethics problem, correcting me on a number of points.

In particular, he highlights how academic ethicists like Floridi and Nissenbaum have an impact on industry regulation. It’s worth reading for sure.

Nick writes from an interesting position. Since he works for the W3C himself, he is closer to the policy decision makers on these issues. I think this, as well as his general erudition, gives him a richer view of how these debates play out. Contrast that with the debate that happens for public consumption, which is naturally less focused.

In trying to understand scholarly work on these ethical and political issues of technology, I’m struck by how differences in where writers and audiences are coming from lead to communication breakdown. The recent blast of popular scholarship about ‘algorithms’, for example, is bewildering to me. I had the privilege of learning what an algorithm was fairly early. I learned about quicksort in an introductory computing class in college. While certainly an intellectual accomplishment, quicksort is politically quite neutral.
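For readers who have not seen it, quicksort fits in a few lines. What follows is only a minimal sketch in Python; the language and the function name `quicksort` are my choices for illustration, and this is the simple list-building variant rather than the in-place partitioning version a textbook would usually present.

```python
def quicksort(items):
    """Return a new sorted list (average O(n log n) comparisons, worst case O(n^2))."""
    # A list of zero or one elements is already sorted.
    if len(items) <= 1:
        return list(items)

    # Pick the middle element as the pivot (one common, arbitrary choice).
    pivot = items[len(items) // 2]

    # Partition the input into three groups relative to the pivot.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]

    # Recursively sort the two outer groups and concatenate the results.
    return quicksort(smaller) + equal + quicksort(larger)
```

Nothing in it depends on who is sorting or what is being sorted, only on the items being comparable.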

What’s odd is how certain contemporary popular scholarship seeks to introduce an unknowing audience to algorithms not via their basic properties–their pseudocode form, their construction from more fundamental computing components, their running time–but for their application in select and controversial contexts. Is this good for public education? Or is this capitalizing on the vagaries of public attention?

My democratic values are being sorely tested by the quality of public discussion on matters like these. I’m becoming more content with the fact that in reality, these decisions are made by self-selecting experts in inaccessible conversations. To hope otherwise is to downplay the genuine complexity of technical problems and the amount of effort it takes to truly understand them.

But if I can sit complacently with my own expertise, this does not seem like a political solution. The FCC’s willingness to accept public comment, which normally does not elicit the response of a mass action, was just tested by Net Neutrality activists. I see from the linked article that other media-related requests for comments were similarly swamped.

The crux, I believe, is the self-referential nature of the problem–that the mechanics of information flow among the public are both what’s at stake (in terms of technical outcomes) and what drives the process to begin with, when it’s democratic. This is a recipe for a chaotic process. Perhaps there are no attractors or steady states.

Following Rash’s analysis of Habermas and Luhmann’s disagreement as to the fate of complex social systems, we’ve got at least two possible outcomes for how these debates play out. On the one hand, rationality may prevail. Genuine interlocutors, given enough time and with shared standards of discourse, can arrive at consensus about how to act–or, what technical standards to adopt, or what patches to accept into foundational software. On the other hand, the layering of those standards on top of each other, and the reaction of users to them as they build layers of communication on top of the technical edifice, can create further irreducible complexity. With that complexity come further ethical dilemmas and political tensions.

A good desideratum for a communications system that is used to determine the technicalities of its own design is that its algorithms should intelligently manage the complexity of arriving at normative consensus.

by Sebastian Benthall at July 22, 2014 08:55 PM

This is truly unfortunate

This is truly unfortunate.

In one sense, this indicates that the majority of Facebook users have no idea how computers work. Do these Facebook users also know that their use of a word processor, or their web browser, or their Amazon purchases, are all mediated by algorithms? Do they understand that what computers do–more or less all they ever do–is mechanically execute algorithms?

I guess not. This is a massive failure of the education system. Perhaps we should start mandating that students read this well-written HowStuffWorks article, “What is a computer algorithm?” That would clear up a lot of confusion, I think.

by Sebastian Benthall at July 22, 2014 07:09 PM

July 16, 2014

Ph.D. student

Re: Homebrew Website Club: July 16, 2014

Sure, I'm in for tonight's Homebrew meeting. I don't have a ton of progress to report, but I've been working on academic writing that can be simultaneously posted to the Web (where it can be easily shared and annotated) and also formatted to PDF via LaTeX. Oh, and I'm excited to chat with people about OpenPGP for indieweb purposes.

P.S. While I like the idea of posting RSVPs via my website, it seems a little silly to include them in RSS feeds or the blog index page like any other blog entry. What are people doing to filter/distinguish different kinds of posts?

by at July 16, 2014 10:04 PM

July 15, 2014

Ph.D. student

Re: The Facebook ethics problem is a political problem

Thanks for writing. I’m inspired to write a couple of comments in response.

First, are academic, professional ethicists as irrelevant as you suggest? (Okay, that’s a bit of a strawman framing, but I hope the response is still useful.)

Floridi is an interesting example. I’m also a fan of his work (although I know him more for his philosophy of information work — I like to cite him on semantics/ontologies, for example (Floridi 2013) — rather than his ethics work), but he’s also in the news this week because he’s on Google’s panel of experts (their “Advisory Council”) for determining the right balance in processing right-to-be-forgotten requests.

Also, I think we see the influence of these ethical and other academic theories play out in practical terms, even if they’re not cited in a direct company response to a particular scandal. For example, you can see Nissenbaum’s contextual integrity theory of privacy (Nissenbaum 2004) throughout the Federal Trade Commission’s 2012 report on privacy (FTC 2012), even though she’s never explicitly cited. And, forgive me for rooting for the home team here, but I think Ken and Deirdre’s research of “on the ground” privacy (Bamberger and Mulligan 2011) played a pretty prominent role in the White House framework for consumer privacy (“Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy” 2012).

But second, I’m even more excited about your conclusion. Yes, decentralize!, despite the skepticism about it (Narayanan et al. 2012). But what matters more than just repeating that rallying cry (which I still think needs repeating – I’m trying to support #indieweb as my part of that) is the form of the problem.

I think a really cool project that everybody who cares about this should be working on is designing and executing on building that alternative to Facebook. That’s a huge project. But just think about how great it would be if we could figure out how to fund, design, build, and market that. These are the big questions for political praxis in the 21st century.

Politics in our century might be defined by engineering challenges, and if that’s true, then it emphasizes even more how coding is not just entangled with, but is itself a question of, policy and values. I think our institution could dedicate a group blog just to different takes on that.


Some references:

Bamberger, KA, and DK Mulligan. 2011. “Privacy on the Books and on the Ground.” Stanford Law Review.

“Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy.” 2012. White House, Washington, DC.

Floridi, Luciano. 2013. “Semantic Conceptions of Information.” In Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Spring 201.

FTC. 2012. “Protecting Consumer Privacy in an Era of Rapid Change: Recommendations for Businesses and Policymakers.” Technical report March. Federal Trade Commission.

Narayanan, Arvind, Vincent Toubiana, Helen Nissenbaum, and Dan Boneh. 2012. “A Critical Look at Decentralized Personal Data Architectures.”

Nissenbaum, Helen. 2004. “Privacy as Contextual Integrity.” Washington Law Review 79 (1): 101–139.

by at July 15, 2014 06:02 AM

July 14, 2014

MIMS 2011

Full disclosure: Diary of an internet geography project #1

First posted at the site for the new project, Connectivity, Inclusivity and Inequality that I’m involved with.

OII research fellow, Mark Graham and DPhil student, Heather Ford (both part of the CII group) are working with a group of computer scientists including Brent Hecht, Dave Musicant and Shilad Sen to understand how far Wikipedia has come in representing ‘the sum of all human knowledge’. As part of the project, they will be making explicit the methods that they use to analyse millions of data records from Wikipedia articles about places in many languages. The hope is that by experimenting with a reflexive method of doing a multidisciplinary ‘big data’ project, others might be able to use this as a model for pursuing their own analyses in the future. This is the first post in a series in which Heather outlines the team’s plans and processes.


It was a beautiful day in Oxford and we wanted to show our Minnesotan friends some Harry Pottery architecture, so Mark and I sat on a bench in the Balliol gardens while we called Brent, Dave and Shilad who are based in Minnesota for our inaugural Skype meeting. I have worked with Dave and Shilad on a paper about Wikipedia sources in the past, and Mark and Brent know each other because they both have produced great work on Wikipedia geography, but we’ve never all worked together as a team. A recent grant from Oxford University’s John Fell Fund provided impetus for the five of us to get together and pool efforts in a short, multidisciplinary project that will hopefully catalyse further collaborative work in the future.

In last week’s meeting, we talked about our goals and timing and how we wanted to work as a team. Since we’re a multidisciplinary group who really value both quantitative and qualitative approaches, we thought that it might make sense to present our goals as consisting of two main strands: 1) to investigate the origins of knowledge about places on Wikipedia in many languages, and 2) to do this in a way that is both transparent and reflexive.

In her eight ‘big tent’ criteria for excellent qualitative research, Sarah Tracy (2010, PDF) includes self-reflexivity and transparency in her conception of researcher ‘sincerity’. Tracy believes that sincerity is a valuable quality that relates to researchers being earnest and vulnerable in their work and ‘considering not only their own needs but also those of their participants, readers, coauthors and potential audiences’. Despite the focus on qualitative research in Tracy’s influential paper, we think that practicing transparency and reflexivity can have enormous benefits for quantitative research as well. One of the challenges, though, is finding ways to pursue transparency and reflexivity as a team rather than as individual researchers.


Tracy writes that transparency is about researchers being honest about the research process.

‘Transparent research is marked by disclosure of the study’s challenges and unexpected twists and turns and revelation of the ways research foci transformed over time.’

She writes that, in practice, transparency requires a formal audit trail of all research decisions and activities. For this project, we’ve set up a series of Google docs folders for our meeting agendas, minutes, Skype calls, screenshots of our video call as well as any related spreadsheets and analyses produced during the week. After each session, I clean up the meeting minutes that we’ve co-produced on the Google doc while we’re talking, and write a more narrative account about what we did and what we learned beneath that.

Although we’re co-editing these documents as a team, it’s important to note that, as the documenter of the process, it’s my perspective that is foregrounded and I have to be really mindful of this as I reflect on what happened. Our team meetings are occasions for discussion of the week’s activities, challenges and revelations which I try to document as accurately as possible, but I will probably also need to conduct interviews with individual members of the team further along in the process in order to capture individual responses to the project and the process that aren’t necessarily accommodated in the weekly meetings.


According to Tracy, self-reflexivity involves ‘honesty and authenticity with one’s self, one’s research and one’s audience’. Apart from the focus on interrogating our own biases as researchers, reflexivity is about being frank about our strengths and weaknesses, and, importantly, about examining our impact on the scene and asking for feedback from participants.

Soliciting feedback from participants is something quite rare in quantitative research but we believe that gaining input from Wikipedians and other stakeholders can be extremely valuable for improving the rigor of our results and for providing insight into the humans behind the data.

As an example, a few years ago when I was at a Wikimedia Kenya meetup, I asked what editors thought about Mark Graham’s Swahili Wikipedia maps. One respondent was immediately able to explain the concentration of geolocated articles from Turkey because he knew the editor who was known as a specialist of Turkey geography stubs. Suddenly the map took on a more human form: a reflection of the relationships between real people trying to represent their world. More recently, a Swahili Wikipedian contacted Mark about the same maps and engaged him in a conversation about how they could be made better. Inspired by these engagements, we want to really encourage those conversations and invite people to comment on our process as it evolves. To do this, we’ll be blogging about the progress of the project and inviting particular groups of stakeholders to provide comments and questions. We’ll then discuss those comments and questions in our weekly meetings and try to respond to as many of them as possible in thinking about how we move the analysis forward.

In conclusion, transparency and reflexivity are two really important aspects of researcher sincerity. The challenge with this project is trying to put this into practice in a quantitative rather than qualitative project, a project driven by a team rather than an individual researcher. Potential risks are that I inaccurately report on what we’re doing, or expose something about our process that is considered inappropriate. What I’m hoping is that we can mark these entries clearly as my initial, necessarily incomplete reflections on our process and that this can feed into the team’s reflections going forward. Knowing the researchers in the team and having worked with all of them in the past, my goal is to reflect the ways in which they bring what Tracy values in ‘sincere’ researchers: the empathy, kindness, self-awareness and self-deprecation that I know all of these team members display in their daily work.

by Heather Ford at July 14, 2014 03:10 PM

July 09, 2014

Ph.D. alumna

The Cost of Contemporary Policing: A Review of Alice Goffman’s ‘On the Run’

Growing up in Lancaster, Pennsylvania in the 80s and 90s, I had a pretty strong sense of fear and hatred for cops. I got to witness corruption and intimidation first hand, and I despised the hypocritical nature of the “PoPo.” As a teen, I worked at Subway. Whenever I had a late shift, I could rely on cops coming by. About half of them were decent. They’d order politely and, as if recognizing the fear in my body, would try to make small talk to suggest that we were on even ground in this context. And they’d actually pay their bills. The other half were a different matter. Especially when they came in in pairs. They’d yell at me, demean me, sexualize me. More importantly, I could depend on the fact that they would not pay for their food and threaten me if I tried to get them to pony up. On the job, I got one free sandwich per shift. If I was lucky, and it was only one cop, I could cover it by not eating dinner. For each additional cop, I would be docked an hour’s pay. There were nights where I had to fork over my entire paycheck.

I had it easy. Around me, I saw much worse. A girl at a neighboring school was gang raped by a group of cops after her arrest for a crime it turned out she didn’t commit but which was committed by a friend of her first cop rapist. Men that I knew got beaten up when they had a run-in. The law wasn’t about justice; it was about power and I knew to stay clear. The funny thing is that I always assumed that this was because “old” people were messed up. And cops were old people. This notion got shattered when I went back for a friend’s high school reunion. Some of his classmates had become police officers and so they decided to do a series of busts that day to provide drugs to the revelers. Much to my horror, some of the very people that I grew up with became corrupt cops. I had to accept that it wasn’t just “old” people; it was “my” people.

I did not grow up poor, although we definitely struggled. We always had food on the table and the rent got paid, but my mother worked two jobs and was always exhausted to the bones. Of course, we were white and living in a nice part of town so I knew my experiences were pretty privileged from the get-go. Most of my close friends who got arrested were arrested for hacking and drug-related offenses. Only those of color were arrested for more serious crimes. I knew straight up that my white, blonde self wasn’t going to be targeted which meant that I just needed to keep my nose clean. But in practice, that meant dumping OD’ed friends off at the steps of the hospital and driving away rather than walking through the front door.

As I aged and began researching teens, my attitude towards law enforcement became more complex. I met police officers who were far more interested in making the world a better place than those who I encountered as a kid. At the same time, I met countless youth whose run-ins were far worse than anything that I ever experienced. I knew that certain aspects of policing were far darker than I got to see first hand, but I didn’t really have the right conceptual frame for understanding what was at play with many of the teens that I met.

And then I read Alice Goffman’s On the Run.

This book has forced me to really contend with all of my mixed and complicated feelings towards law enforcement, while providing a deeper context for my own fieldwork with teens. More than anything, this book has shone a spotlight on exactly what’s at stake in our racist and classist policing practices. She brilliantly deciphers the cultural logic of black men’s relationship with law enforcement, allowing outsiders to better understand why black communities respond the way they do. In doing so, she challenges most people’s assumptions about policing and inequality in America.

Alice Goffman’s ‘On the Run’

For the better part of her undergraduate and graduate school years, Alice Goffman embedded herself in a poor black neighborhood of Philadelphia, in a community where young men are bound to run into the law and end up jailed. What began as fieldwork for a class paper turned into an undergraduate thesis and then grew into a dissertation which resulted in her first book, published by University of Chicago, called On the Run: Fugitive Life in an American City. This book examines the dynamics of a group of boys — and the people around them — as they encounter law enforcement and become part of the system. She lived alongside them, participated in their community, and bore witness to their experiences. She lived through arrests, raids, and murders. She saw it all and the account she offers doesn’t pull punches.

While I’ve seen police intimidation and corruption, the detail with which Goffman documents the practices of policing in the community she studied is both eloquent and harrowing. Through her writing, you can see what she saw, offering insight into a dynamic that few privileged people can bear witness to. What’s most striking about Goffman’s accounting is the empathy with which she approaches the community. It is a true ethnographic account, in every sense. But, at the same time, it is so accessible and delightful that I want the world to read it.

Although most Americans realize that black men are overrepresented in US jails, most people don’t realize just how bad it is. As Goffman notes in her prologue, 1 in every 107 people in the adult population is currently in jail while 3% of the adult population is under correctional supervision. Not only are 37% of those in prison black, but 60% of black men who didn’t finish high school will go to prison by their mid-30s. We’ve built a prison-industrial complex and most of our prison reform laws have only made life worse for poor blacks.

The incentive structures around policing are disgusting and, with the onset of predictive policing, getting worse. As Goffman shows, officers have to hit their numbers and they’re free to use many abusive practices to get there. Although some law enforcement officers have a strong moral compass, many have no qualms about asserting their authority in the most vicious and abusive ways imaginable. The fear that they produce in poor communities doesn’t increase lawful behavior; it undermines the very trust in authority that is necessary to a healthy democracy.

The most eye-opening chapter in Goffman’s book is her accounting of what women experience as they are forced into snitching on the men in their communities. All too often, their houses are raided and they are threatened with violence, arrest, eviction, and the loss of children. Their homes are torn apart, their money is taken, and they are constantly surveilled. Police use phone records to “prove” that their boyfriends are cheating on them or offer up witnesses who suggest that the men in their lives aren’t really looking out for them. While she describes how important loyalty is in these communities, she also details just how law enforcement actively destroys the fabric of these communities through intimidation and force. Under immense pressure, most everyone breaks. It’s a modern day instantiation of antebellum slavery practices. If you tear apart a community, authority has power.

For all of the abuse and intimidation faced by those targeted by policing practices, it delights me to see the acts of creative resistance that many of Goffman’s informants undertake. Consider, for example, the realities of banking in poor communities. Most poor folks have no access to traditional banks to store their money and keeping cash on them is tricky. Not only might they be robbed by someone in the community, but they can rely on the fact that any police officer who frisks them will take whatever cash is found. So where should they store money for safe keeping?

When you bail someone out of jail and they show up for their court dates, you can get your bail money back. But why not just leave it at the court for safe keeping? You have up to six months to recover it and it’s often safer there than anywhere else. In her analysis, Goffman offers practices like these as well as other innovative ways poor people use the unjust system to their advantage.

Seeing Police Through the Eyes of Teens

Reading Goffman’s book also allowed me to better understand the teens that I encountered through my research. Doing fieldwork with working class and poor youth of color was both the highlight of my study and the hardest to fully grok. I have countless fieldnotes about teens’ recounted problems with cops, their struggles to stay out of trouble, and the violence that they witnessed all around them. I knew the stats. I knew that many of the teens that I met would probably end up in jail, if they hadn’t already had a run-in with the law. But I didn’t really get it.

Perhaps the hardest interview I had was with a young man who had just gotten out of jail and was in a halfway house. When he was a small boy, his mom got sick of his dad and so asked him to rat out his dad when the cops showed up. He obliged and his father was sent to jail. His mom then moved him and his younger brother across the country. By the time he was a teenager, his mom would call the cops on him and his brother whenever she wanted some peace and quiet. He eventually ran away and was always looking for a place to stay. His brother made a different decision: he found older white men who would “take care of him.” The teen I met was disgusted by his brother’s activities and thought that these men were gross so one day, he planted drugs on one of the guys’ cars and called the cops on him. And so the cycle continues.

In order to better understand human trafficking, I began talking to commercially exploited youth. Here, I also witnessed some pretty horrible dynamics. Teens who were arrested for prostitution “to keep them safe,” not to mention the threats and rapes that many young people engaged in sex work encountered from the very same law enforcement officers who were theoretically there to protect them. All too often, teens told me that their abusive “boyfriends” were much better than the abusive State apparatus (and their fathers). And based on what I saw, this was a fair assessment. And so I continue to struggle with policy discussions that center on empowering law enforcement. Sure, I had met some law enforcement folks in this work that were really working to end commercial sexual abuse of minors. And I want to see law enforcement serve a healthy enforcing role. But every youth I met feared the cops far more than they feared their abusers. And I still struggle to make sense of the right path forward.

Although the teens that I met often recounted their negative encounters with police, I never fully understood the underlying dynamics that shaped what they were telling me. What I was studying theoretically had nothing to do with teens’ relationship with the law and so this data was simply context. Context I was curious about, but not context that I got to observe properly. I knew that there was a lot more going on. A lot that I didn’t see. Enough to make me concerned about how law enforcement shapes the lives of working class and poor youth, but not enough to enable me to do anything about it.

What Goffman taught me was to appreciate the way in which the teens that I met were forced into a game of survival that was far more extreme than what I imagined. They are trying to game a system that is systematically unfair, that leaves them completely disempowered, and that teaches them to trust no one. For most poor populations, authority isn’t just corrupt — it’s outright abusive. Why then should we expect marginalized populations to play within a system that is out to get them?

As Ta-Nehisi Coates eloquently explained in “The Case for Reparations,” we may speak of a post-racial society where we no longer engage in racist activities, but the on-the-ground realities are much more systemically destructive. The costs of our historical racism and the damage done by slavery are woven into the fabric of our society. “It is as though we have run up a credit-card bill and, having pledged to charge no more, remain befuddled that the balance does not disappear. The effects of that balance, interest accruing daily, are all around us.”

We cannot expect the most marginalized people in American society to simply start trusting authority when authority continues to actively fragment their communities in an abusive assertion of power. It is both unfair and unreasonable to expect poor folks to work within a system that was designed to oppress them. If we want change, we need to better understand what’s at stake.

Goffman’s On the Run offers a brilliant account of what poor black people who are targeted by policing face on a daily basis. And how they learn to live in a society where their every move is surveilled. It is a phenomenal and eye-opening book, full of beauty and sorrow. Without a doubt, it’s one of the best books I’ve read in a long time. It makes very clear just how much we need policing reform in this country.

Understanding the cultural logic underpinning poor black men’s relationship with the law is essential for all who care about equality in this country. Law enforcement has its role in society, but, as with any system of power, it must always be checked. This book is a significant check to power, making visible some of the most invisible mechanisms of racism and inequality that exist today.

(Photo by Pavel P.)

(This entry was first posted on June 9, 2014 at Medium under the title “The Cost of Contemporary Policing” as part of The Message.)

by zephoria at July 09, 2014 06:21 PM

Ph.D. student

The Facebook ethics problem is a political problem

So much has been said about the Facebook emotion contagion experiment. Perhaps everything has been said.

The problem with everything having been said is that by and large people’s ethical stances seem predetermined by their habitus.

By which I mean: most people don’t really care. People who care about what happens on the Internet care about it in whatever way is determined by their professional orientation on that matter. Obviously, some groups of people benefit from there being fewer socially imposed ethical restrictions on data scientific practice, either in an industrial or academic context. Others benefit from imposing those ethical restrictions, or cultivating public outrage on the matter.

If this is an ethical issue, what system of ethics are we prepared to use to evaluate it?

You could make an argument from, say, a utilitarian perspective, or a deontological perspective, or even a virtue ethics standpoint. Those are classic moves.

But nobody will listen to what a professionalized academic ethicist will say on the matter. If there’s anybody who does rigorous work on this, it’s probably somebody like Luciano Floridi. His work is great, in my opinion. But I haven’t found any other academics who work in, say, policy that embrace his thinking. I’d love to be proven wrong.

But since Floridi does serious work on information ethics, that’s mainly an inconvenience to pundits. Instead we get heat, not light.

If this process resolves into anything like policy change–either governmental or internally at Facebook–it will be because of a process of agonistic politics. “Agonistic” here means fraught with conflicted interests. It may be redundant to modify ‘politics’ with ‘agonistic’ but it makes the point that the moves being made are strategic actions, aimed at gain for one’s person or group, more than they are communicative ones, aimed at consensus.

Because e.g. Facebook keeps public discussion fragmented through its EdgeRank algorithm, which even in its well-documented public version is full of apparent political consequences and flaws, there is no way for conversation within the Facebook platform to result in consensus. It is not, as has been observed by others, a public. In a trivial sense, it’s not a public because the data isn’t public. The data is (sort of) private. That’s not a bad thing. It just means that Facebook shouldn’t be where you go to develop a political consensus that could legitimize power.

Twitter is a little better for this, because it’s actually public. Facebook has zero reason to care about the public consensus of people on Twitter though, because those people won’t organize a consumer boycott of Facebook, because they can only reach people that use Twitter.

Facebook is a great–perhaps the greatest–example of what Habermas calls the steering media. “Steering,” because it’s how powerful entities steer public opinion. For Habermas, the steering media control language and therefore culture. When ‘mass’ media control language, citizens no longer use language to form collective will.

For individualized ‘social’ media that is arranged into filter bubbles through relevance algorithms, language is similarly controlled. But rather than having just a single commanding voice, you have the opportunity for every voice to be expressed at once. Through homophily effects in network formation, what you’d expect to see are very intense clusters of extreme cultures that see themselves as ‘normal’ and don’t interact outside of their bubble.

The irony is that the critical left, who should be making these sorts of observations, is itself a bubble within this system of bubbles. Since critical leftism is enacted in commercialized social media which evolves around it, it becomes recuperated in the Situationist sense. Critical outrage is tapped for advertising revenue, which spurs more critical outrage.

The dependence of contemporary criticality on commercial social media for its own diffusion means that, ironically, none of them are able to just quit Facebook like everyone else who has figured out how much Facebook sucks.

It’s not a secret that decentralized communication systems are the solution to this sort of thing. Stanford’s Liberation Tech group captures this ideology rather well. There’s a lot of good work on censorship-resistant systems, distributed messaging systems, etc. For people who are citizens in the free world, many of these alternative communication platforms where we are spared from algorithmic control are very old. Some people still use IRC for chat. I’m a huge fan of mailing lists, myself. Email is the original on-line social media, and one’s inbox is one’s domain. Everyone who is posting their stuff to Facebook could be posting to a WordPress blog. WordPress, by the way, has a lovely user interface these days and keeps adding “social” features like “liking” and “following”. This goes largely unnoticed, which is too bad, because Automattic, the company that runs WordPress, is really not evil at all.

So there are plenty of solutions to Facebook being manipulative and bad for democracy. Those solutions involve getting people off of Facebook and onto alternative platforms. That’s what a consumer boycott is. That’s how you get companies to stop doing bad stuff, if you don’t have regulatory power.

Obviously the real problem is that we don’t have a less politically problematic technology that does everything we want Facebook to do only not the bad stuff. There are a lot of unsolved technical problems in getting that to work. I think I wrote a social media think piece about this once.

I think a really cool project that everybody who cares about this should be working on is designing and executing on building that alternative to Facebook. That’s a huge project. But just think about how great it would be if we could figure out how to fund, design, build, and market that. These are the big questions for political praxis in the 21st century.

by Sebastian Benthall at July 09, 2014 04:35 AM

July 08, 2014

Ph.D. student

Theorizing the Web and SciPy conferences compared

I’ve just been through two days of tutorials at SciPy 2014–that stands for Scientific Python (the programming language). The last conference I went to was Theorizing the Web 2014. I wonder if I’m the first person to ever go to both conferences. Since I see my purpose in grad school as being a bridge node, I think it’s worthwhile to write something comparing the two.

Theorizing the Web was held in a “gorgeous warehouse space” in Williamsburg, the neighborhood of Brooklyn, New York that was full of hipsters ten years ago and now is full of baby carriages but still has gorgeous warehouse spaces and loft apartments. The warehouse spaces are actually gallery spaces that only look like warehouses from the outside. On the inside of the one where TtW was held, whole rooms with rounded interior corners were painted white, perhaps for a photo shoot. To call it a “warehouse” is to appeal to the blue collar and industrial origins that Brooklyn gentrifiers appeal to in order to distinguish themselves from the elites in Manhattan. During my visit to New York for the conference, I crashed on a friend’s air mattress in the Brooklyn neighborhood I had been gentrifying just a few years earlier. The speakers included empirical scientific researchers, but these were not the focus of the event. Rather, the emphasis was on theorizing in a way that is accessible to the public. The most anticipated speaker was a porn actress. Others were artists or writers of one sort or another. One was a sex worker who then wrote a book. Others were professors of sociology and communications. Another was a Buzzfeed editor.

SciPy is taking place in the AT&T Education and Conference Center in Austin, Texas, near the UT Austin campus. I’m writing from the adjoining hotel. The conference rooms we are using are in the basement; they seat many in comfortable mesh rolling chairs on tiers so everybody can see the dual projector screens. The attendees are primarily scientists who do computationally intensive work. One is a former marine biologist who now does bioinformatics mainly. Another team does robotics. Another does image processing on electron microscope images of chromosomes. They are not trying to be accessible to the public. What they are trying to teach is hard enough to get across to others with similar expertise. It is a small community trying to enlarge itself by teaching others its skills.

At Theorizing the Web, the rare technologist spoke up to talk about the dangers of drones. In the same panel, it was pointed out how the people designing medical supply drones for use in foreign conflict zones were considering coloring them white, not black, to make them less intimidating. The implication was that drone designers are racist.

It’s true that the vast majority of attendees of the conference are white and male. To some extent, this is generational. Both tutorials I attended today–including the one on software for modeling multi-body dynamics, useful for designing things like walking robots–were interracial and taught by guys around my age. The audience has some older folks. These are not necessarily academics, but may be industry types or engineers whose firms are paying them to attend to train on cutting edge technology.

The afterparty on the first night of Theorizing the Web was in a dive bar in Williamsburg. Brooklyn’s Williamsburg has dive bars the same way Virginia’s Williamsburg has a colonial village–they are a cherished part of its cultural heritage. But the venue was alienating for some. One woman from abroad confided to me that they were intimidated by how cool the bar felt. It was my duty as an American and a former New Yorker to explain that Williamsburg stopped being cool a long time ago.

I’m an introvert and am initially uneasy in basically any social setting. Tonight’s SciPy afterparty was in the downtown office of Enthought, in the Bank of America building. Enthought’s digs are on the 21st floor, with spacious personal offices and lots of whiteboards which display serious use. As an open source product/consulting/training company, it appears to be doing quite well. I imagine really cool people would find it rather banal.

I don’t think it’s overstating things to say that Theorizing the Web serves mainly those skeptical of the scientific project. Knowledge is conceived of as a threat to the known. One panelist at TtW described the problem of “explainer” sites–web sites whose purpose is to explain things that are going on to people who don’t understand them–when they try to translate cultural phenomena that they don’t understand. It was argued that even in cases where these cultural events are public, to capture that content and provide an interpretation or narration around it can be exploitative. Later, Kate Crawford, a very distinguished scholar on civic media, spoke to a rapt audience about the “conjoint anxieties” of Big Data. The anxieties of the watched are matched by the anxieties of the watchmen–like the NSA and, more implicitly, Facebook–who must always seek out more data in order to know things. The implication is that their political or economic agenda is due to a psychological complex–damning if true. In a brilliant rhetorical move that I didn’t quite follow, she tied this in to normcore, which I’m pretty sure is an Internet meme about a fake “fashion” trend in New York. Young people in New York go gaga for irony like this. For some reason earlier this year hipsters ironically wearing unstylish clothing became notable again.

I once met somebody from L.A. who told me their opinion of Brooklyn was that all nerds gathered in one place and thought they could decide what cool was just by saying so. At the time I had only recently moved to Berkeley and was still adjusting. Now I realize how parochial that zeitgeist is, however much I may still identify with it some.

Back in Austin, I have interesting conversations with folks at the SciPy party. One conversation is with two social scientists (demographic observation: one man, one woman) from New York that work on statistical analysis of violent crime in service to the city. They talk about the difficulty of remaining detached from their research subjects, who are eager to assist with the research somehow, though this would violate the statistical rigor of their study. Since they are doing policy research, objectivity is important. They are painfully aware of the limitations of their methods and the implications this has on those their work serves.

Later, I’m sitting alone when I’m joined by an electrical engineer turned programmer. He’s from Tennessee. We talk shop for a bit but the conversation quickly turns philosophical–about the experience of doing certain kinds of science, the role of rationality in human ethics, whether religion is an evolved human impulse and whether that matters. We are joined by a bioinformatics researcher from Paris. She tells us later that she has an applied math/machine learning background.

The problem in her field, she explains, is that for rare diseases it is very hard to find genetic causes because there isn’t enough data to do significant inference. Genomic data is very highly dimensional–thousands of genes–and for some diseases there may be fewer than fifty cases to study. Machine learning researchers are doing their best to figure out ways for researchers to incorporate “prior knowledge”–theoretical understanding from beyond the data available–to improve their conclusions.
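One simple way to see what “prior knowledge” buys you in a tiny sample is a Beta-binomial estimate. The sketch below is only an illustration, not the method anyone in this conversation described; the function name, the prior parameters, and the patient counts are all hypothetical.

```python
from fractions import Fraction

def posterior_mean(carriers, cases, prior_a, prior_b):
    """Posterior mean of a frequency under a Beta(prior_a, prior_b) prior.

    With a binomial likelihood, the posterior is
    Beta(prior_a + carriers, prior_b + cases - carriers),
    whose mean is (prior_a + carriers) / (prior_a + prior_b + cases).
    """
    return Fraction(prior_a + carriers, prior_a + prior_b + cases)

# Hypothetical numbers: 3 of 10 patients carry some variant.
raw = Fraction(3, 10)                    # estimate from the data alone
informed = posterior_mean(3, 10, 1, 19)  # assumed prior belief: variant is rare (mean 1/20)

print(float(raw), float(informed))
```

With so few cases, the assumed prior pulls the estimate well below the raw frequency; with thousands of cases, the same prior would barely move it.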

Over meals the past couple days I’ve been checking Twitter, where a lot of the intellectuals who organize Theorizing the Web or are otherwise prominent in that community are active. One extended conversation is about the relative failure of the open source movement to produce compelling consumer products. My theory is that this has to do with business models and the difficulty of coming up with upfront capital investment. But emotionally my response to that question is that it is misplaced: consumer products are trivial. Who cares?

Today, folks on Twitter are getting excited about using Adorno’s concept of the culture industry to critique Facebook’s emotional contagion experiment and other media manipulation. I find this both encouraging–it’s about time the Theorizing the Web community learned to embrace Frankfurt School thought–and baffling, because I believe they are misreading Adorno. The culture industry is that sector of the economy that produces cultural products, like Hollywood and television production companies. On the Internet, the culture industry is Buzzfeed, the Atlantic, and to a lesser extent (though this is surely masked by its own ideology) The New Inquiry. My honest opinion for a long time has been that the brand of “anticapitalist” criticality indulged in on-line is a politically impotent form of entertainment equivalent to the soap opera. A concept more appropriate for understanding Facebook’s role in controlling access to news and the formation of culture is Habermas’ idea of steering media.

He gets into this in Theory of Communicative Action, vol. 2, which is underrated in America probably due to its heaviness.

by Sebastian Benthall at July 08, 2014 05:33 AM

July 06, 2014

Ph.D. student

economic theory and intellectual property

I’ve started reading Piketty’s Capital. His introduction begins with an overview of the history of economic theory, starting with Ricardo and Marx.

Both these early theorists predicted the concentration of wealth into the hands of the owners of factors of production that are not labor. For Ricardo, land owners extract rents and dominate the economy. For Marx, capitalists–owners of private capital–accumulate capital and dominate the economy.

Since those of us with an eye on the tech sector are aware of a concentration of wealth in the hands of the owners of intellectual property, it’s a good question what kind of economic theory ought to apply to those cases.

In one sense, intellectual property is a kind of capital. It is a factor of production that is made through human labor.

On the other hand, we talk about ideas being ‘discovered’ like land is discovered, and we imagine that intellectual property can in principle be ‘shared’ like a ‘commons’. If we see intellectual property as a position in a space of ideas, it is not hard to think of it like land.

Like land, a piece of intellectual property is unique and gains in value due to further improvements–applications or innovations–built upon it. In a world where intellectual property ownership never expires and isn’t shared, you can imagine that whoever holds some critical early work in some field could extract rents in perpetuity. Owning a patent would be like owning a land estate.

Like capital, intellectual property is produced by workers and often owned by those investing in the workers with pre-existing capital. The produced capital is then owned by the initiating capitalist, and accumulates.

Open source software is an important exception to this pattern. This kind of intellectual property is unalienated from those that produce it.

by Sebastian Benthall at July 06, 2014 06:38 PM

Ph.D. student

notary digital?

Recently I had the honor of swearing, and having notarized, an affidavit of bona fide marriage for a good friend as part of an immigration application. Speaking with another friend who had done the same for a friend of hers, she remarked that it was such a basic and important thing to do, that even if she did nothing else this year it would have been an accomplishment. And the formal, official process of notarization was interesting enough itself that I spent some time looking into how to become one.

Notary Public

Becoming a notary is a strange process. By its nature, it's an extremely regulated field: state law specifies exactly what a notary must do, what training they must have, what level of verification is needed for different notarized documents, exactly how much a notary may charge for each service, how the notary may advertise itself, etc. That is, you become a notary public, not just a notary. Presumably this is in part because other legal and commercial processes depend on notarization of certain kinds.

Given all those regulations, if the notary errs or forgets when conducting her duties, the law provides penalties. Forgot to thumbprint someone when you notarized their affidavit? That's $2500. Forgot to inform the Secretary of State when you moved to a new apartment? $500. Screw up the process for identifying an individual in a way that screws up someone else's business? They can sue you for damages. In short, if you're a notary, you need to buy notary errors and omissions insurance, at least $50 for four years. Also, the State wants to be sure that you can pay if you become a rogue notary who violates all these rules. As a result, as soon as you become a notary you're required to execute a bond of $15,000 with your county. In short, you pay a certified bond organization maybe $50 for the bond; if the State thinks you screwed up, they get the money directly from the bondsman and then the bondsman comes and gets the money from you.

Notary Digital?

But mostly I'm curious about this just because I've been thinking about the idea of a digital notary. (This is not to be confused with completing notary public activities with webcam verification instead of in-person, which appears to be illegal in most states, and not what I'm offering.)

That is, it seems like there are some operations we do in our digital, electronic lives these days that could benefit from some in-person verification. Those operations might otherwise just be cumbersome or awkward, but if we have an existing structure — of people who advertise themselves as carefully completing these verification operations in person — maybe that would actually work well, even with our online personas. These thoughts are, charmingly I hope, inchoate and I would appreciate your thoughts about them.

Backup / Escrow

Some really important digital files you want to back up in a secure, offline way, where you're guaranteed to be able to get them back. (Say: Bitcoin wallets; financial records; passwords, certificate revocations, private keys.) You meet with the digital notary; she confirms who you are, who can have access to the files, whether you want them encrypted in a way that she can't access them, how and when to get them back to you (offline-only, online with certain verifications, etc.). You pay her a fee then and a fee at the time if you ever need to retrieve them.

Alternatives: online "cold storage" services; university IT escrow services (not sure if this is common, but Chicago provides it for faculty and staff); bank safety deposit boxes with USB keys in them; online backup you really hope is secure.

Verification and Certification

You can go to a digital notary to get some digital confirmation that you are who you say you are online. The digital notary can give you a certificate to use that has your legal name and her signature (complete with precise verification steps) that you can use to sign electronic documents or sign/encrypt email. Sure, anyone can sign your OpenPGP key and confirm your identity, but the notary can help you set it up and give you a trusted verification (based on her well-known reputation and connection to the Web of Trust and other notaries).

And, traditional to the notary, she can sign a jurat. That is, you can swear an affidavit of some statement and she can verify that it was really you saying exactly what you said, but do so in a way that can be automatically and remotely verified.

Alternatives: key-signing parties; certificate authorities (some do this for free, others require a fee, or require a fee if it's not just personal use); creating your own key and participating in the Web of Trust in order to establish some reputation.


While we hope to see an increase in the thanatosensitivity (oh man, I've been waiting for an excuse to use that term again; here are all my bookmarks related to the topic) of online services — like Google's Inactive Account Manager — after we die, it's likely that our online accounts will become defunct and difficult for our next-of-kin to access. It would be useful to give someone instructions for what we want done with our accounts and data after death; that person will likely have to securely maintain passwords and keys and be able to verify, offline, our identities. Pay your digital notary a fee and she can execute certain actions (deleting some data, revealing some passwords to whichever family members you chose, disabling social media accounts) after your death, after verifying it using not just inactivity, but also confirmation with government or family.

Alternatives: a lawyer who understands technology well enough to execute these digital terms of your will just as they do your regular will and testament. (Does anyone know the current state of the art for lawyers who know how to handle these things?)


And actually what might be most valuable about digital notary services is that she can explain to you how these digital verifications work. That is, not only can a digital notary provide digital execution with in-person verification, she can provide the basic capability, explain how it works and then conduct it. Another advantage of in-person meetings: you can seek individualized counsel, not just formalistic execution of tasks.

It would be nice if information technology had a profession with a fiduciary responsibility to its clients; the implications of digital work are increasingly important to us but remain hard for non-experts to understand, much less control. Just as we expect with our doctors and our lawyers, we should be able to ask technological experts for advice and services that are in our own best, and varied, interests. Related, it would be useful if the law reflected that relationship and provided liability, but also confidentiality, for such transactions. That latter part will take a little while (the law is slow to change, as we know), but a description of the profession and some common ethical guidelines of its own could help.

A Shingle?

As an experiment, I offer you all and our friends the services described above — escrow of files/keys; authentication, encryption and certification of messages; execution of a digital will and testament — at a nominal $2 fee per service.

Sincerely yours,


P.S. Did you know that payment of fees is one factor used to determine that a privileged client-attorney relationship has been established?

by at July 06, 2014 04:41 AM

July 03, 2014

Ph.D. student

Preparing for SciPy 2014

I’ve been instructed to focus my attention on mid-level concepts rather than grand theory as I begin my empirical

This is difficult for me, as I tend to oscillate between thinking very big and thinking very narrowly. This is an occupational hazard of a developer. Technical minutiae accumulate into something durable and powerful. To sustain one’s motivation one has to be able to envision one’s tiny tasks (correcting the spelling of some word in a program) stepping towards a larger project.

I’m working in my comfort zone. I’ve got my software project open on GitHub and I’m preparing to present my preliminary results at SciPy 2014 next week. A colleague and mentor I met with today told me it’s not a conference for people marking up career points. It’s a conference for people to meet each other, get an update on how their community is doing as a whole, and to learn new skills from each other.

It’s been a few years since I’ve been to a developer conference. In my past career I went to FOSS4G, the open source geospatial conference, a number of times. In 2008, the conference was in South Africa. I didn’t know anybody, so I blogged about it, and got chastised for being too divisive. I wasn’t being sensitive to the delicate balance between the open source developer geospatial community and their greatest proprietary coopetitor, ESRI. I was being an ideologue at a time when the open source model in that industry was just at its inflection point and becoming mainstream. Obviously I didn’t understand the subtlety of the relationships, business and personal, threaded through the conference.

Later I attended FOSS4G in 2010 to pitch the project my team had recently launched, GeoNode. It was a very exciting time for me personally. I was very personally invested in the project, and I was so proud of my team and myself for pulling through on the beta release. In retrospect, building a system for serving spatial data modeled on a content management system seems like a no-brainer. Today there are plenty of data management startups and services out there, some industrial, some academic. But at the time we were ahead of the curve, thanks largely to the vision of Chris Holmes, who at the time was the wunderkind visionary president of OpenGeo.

Cholmes always envisioned OpenGeo turning into an anti-capitalist organization, a hacker coop with as much transparency as it could handle. If only it could get its business model right. It was incubating in a pre-crash bubble that thinned out over time. I was very into the politics of the organization when I joined it, but over time I became more cynical and embraced the economic logic I was being taught by the mature entrepreneurs who had been attracted to OpenGeo’s promise and standing in the geospatial world. While trying to wrap my head around managing developers, clients, and the budget around GeoNode, I began to see why businesses are the way they are, and how open source plays out in the industrial organization of the tech industry as a whole.

GeoNode, the project, remains a success. There is glory to that, though in retrospect I can claim little of it. I made many big mistakes and the success of the project has always been due to the very intelligent team working on it, as well as its institutional positioning.

I left OpenGeo because I wanted to be a scientist. I had spent four years there, and had found my way onto a project where we were building data plumbing for disaster reduction scientists and the military. OpenGeo had become a victim of its own success and outgrown its non-profit incubator, buckling under the weight of the demand for its services. I had deferred enrollment at Berkeley for a year to see GeoNode through to a place where it couldn’t get canned. My last major act was to raise funding for a v1.1 release that fixed the show-stopping bugs in the v1.0 version.

OpenGeo is now Boundless, a for-profit company. It’s better that way. It’s still doing revolutionary work.

I’ve been under the radar in the open source world for the three years I’ve been in grad school. But as I begin this dissertation work, I feel myself coming back to it. My research questions, in one framing, are about software ecosystem sustainability and management. I’m drawing from my experience participating in and growing open source communities and am trying to operationalize my intuitions from that work. At Berkeley I’ve discovered the scientific Python community, which I feel at home with since I learned about how to do open source from the inimitable Whit Morris, a Pythonista of the Plone cohort, among others.

After immersing myself in academia, I’m excited to get back into the open source development world. Some of the most intelligent and genuine people I’ve ever met work in that space. Like the sciences, it is a community of very smart and creative people with the privilege to pursue opportunity but with goals that go beyond narrow commercial interests. But it’s also in many ways a more richly collaborative and constructive community than the academic world. It’s not a prestige economy, where people are rewarded with scarce attention and even scarcer titles. It’s a constructive economy, where there is always room to contribute usefully, and to be recognized even in a small way for that contribution.

I’m going to introduce my research on the SciPy communities themselves. In the wake of the backlash against Facebook’s “manipulative” data science research, I’m relieved to be studying a community that has from the beginning wanted to be open about its processes. My hope is that my data scientific work will be a contribution to, not an exploitation of, the community I’m studying. It’s an exciting opportunity that I’ve been preparing for for a long time.

by Sebastian Benthall at July 03, 2014 11:18 PM

July 01, 2014

Ph.D. alumna

What does the Facebook experiment teach us?

I’m intrigued by the reaction that has unfolded around the Facebook “emotion contagion” study. (If you aren’t familiar with this, read this primer.) As others have pointed out, the practice of A/B testing content is quite common. And Facebook has a long history of experimenting on how it can influence people’s attitudes and practices, even in the realm of research. An earlier study showed that Facebook decisions could shape voters’ practices. But why is it that *this* study has sparked a firestorm?

In asking people about this, I’ve been given two dominant reasons:

  1. People’s emotional well-being is sacred.
  2. Research is different than marketing practices.

I don’t find either of these responses satisfying.

The Consequences of Facebook’s Experiment

Facebook’s research team is not truly independent of product. They have a license to do research and publish it, provided that it contributes to the positive development of the company. If Facebook knew that this research would spark the negative PR backlash, they never would’ve allowed it to go forward or be published. I can only imagine the ugliness of the fight inside the company now, but I’m confident that PR is demanding silence from researchers.

I do believe that the research was intended to be helpful to Facebook. So what was the intended positive contribution of this study? I get the sense from Adam Kramer’s comments that the goal was to determine if content sentiment could affect people’s emotional response after being on Facebook. In other words, given that Facebook wants to keep people on Facebook, if people came away from Facebook feeling sadder, presumably they’d not want to come back to Facebook again. Thus, it’s in Facebook’s better interest to leave people feeling happier. And this study suggests that the sentiment of the content influences this. This suggests that one applied take-away for product is to downplay negative content. Presumably this is better for users and better for Facebook.

We can debate all day long as to whether or not this is what that study actually shows, but let’s work with this for a second. Let’s say that pre-study Facebook showed 1 negative post for every 3 positive and now, because of this study, Facebook shows 1 negative post for every 10 positive ones. If that’s the case, was the one week treatment worth the outcome for longer term content exposure? Who gets to make that decision?
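To make the hypothetical ratios concrete, the arithmetic can be sketched as follows. The function name is made up for illustration, and the ratios are the invented ones from the paragraph above, not anything Facebook has disclosed.

```python
def negative_share(negative, positive):
    """Fraction of shown posts that are negative, given a negative:positive ratio."""
    return negative / (negative + positive)

before = negative_share(1, 3)   # 1 negative post for every 3 positive
after = negative_share(1, 10)   # 1 negative post for every 10 positive

print(f"{before:.1%} -> {after:.1%}")  # prints "25.0% -> 9.1%"
```

Under those made-up numbers, a user would go from seeing one negative post in four to roughly one in eleven, every week, indefinitely.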

Folks keep talking about all of the potential harm that could’ve happened by the study – the possibility of suicides, the mental health consequences. But what about the potential harm of negative content on Facebook more generally? Even if we believe that there were subtle negative costs to those who received the treatment, the ongoing costs of negative content on Facebook every week other than that 1 week experiment must be greater. How then do we account for positive benefits to users if Facebook increased positive treatments en masse as a result of this study? Of course, the problem is that Facebook is a black box. We don’t know what they did with this study. The only thing we know is what is published in PNAS and that ain’t much.

Of course, if Facebook did make the content that users see more positive, should we simply be happy? What would it mean that you’re more likely to see announcements from your friends when they are celebrating a new child or a fun night on the town, but less likely to see their posts when they’re offering depressive missives or angsting over a relationship in shambles? If Alice is happier when she is oblivious to Bob’s pain because Facebook chooses to keep that from her, are we willing to sacrifice Bob’s need for support and validation? This is a hard ethical choice at the crux of any decision of what content to show when you’re making choices. And the reality is that Facebook is making these choices every day without oversight, transparency, or informed consent.

Algorithmic Manipulation of Attention and Emotions

Facebook actively alters the content you see. Most people focus on the practice of marketing, but most of what Facebook’s algorithms do involves curating content to provide you with what they think you want to see. Facebook algorithmically determines which of your friends’ posts you see. They don’t do this for marketing reasons. They do this because they want you to want to come back to the site day after day. They want you to be happy. They don’t want you to be overwhelmed. Their everyday algorithms are meant to manipulate your emotions. What factors go into this? We don’t know.

Facebook is not alone in algorithmically predicting what content you wish to see. Any recommendation system or curatorial system is prioritizing some content over others. But let’s compare what we glean from this study with standard practice. Most sites, from major news media to social media, have some algorithm that shows you the content that people click on the most. This is what drives media entities to produce listicles, flashy headlines, and car crash news stories. What do you think garners more traffic – a detailed analysis of what’s happening in Syria or 29 pictures of the cutest members of the animal kingdom? Part of what media learned long ago is that fear and salacious gossip sell papers. 4chan taught us that grotesque imagery and cute kittens work too. What this means online is that stories about child abductions, dangerous islands filled with snakes, and celebrity sex tape scandals are often the most clicked on, retweeted, favorited, etc. So an entire industry has emerged to produce crappy click bait content under the banner of “news.”

Guess what? When people are surrounded by fear-mongering news media, they get anxious. They fear the wrong things. Moral panics emerge. And yet, we as a society believe that it’s totally acceptable for news media – and its click bait brethren – to manipulate people’s emotions through the headlines they produce and the content they cover. And we generally accept that algorithmic curators are perfectly well within their right to prioritize that heavily clicked content over others, regardless of the psychological toll on individuals or the society. What makes their practice different? (Other than the fact that the media wouldn’t hold itself accountable for its own manipulative practices…)

Somehow, shrugging our shoulders and saying that we promoted content because it was popular is acceptable because those actors don’t voice that their intention is to manipulate your emotions so that you keep viewing their reporting and advertisements. And it’s also acceptable to manipulate people for advertising because that’s just business. But when researchers admit that they’re trying to learn if they can manipulate people’s emotions, they’re shunned. What this suggests is that the practice is acceptable, but admitting the intention and being transparent about the process is not.

But Research is Different!!

As this debate has unfolded, whenever people point out that these business practices are commonplace, folks respond by highlighting that research or science is different. What unfolds is a high-browed notion about the purity of research and its exclusive claims on ethical standards.

Do I think that we need to have a serious conversation about informed consent? Absolutely. Do I think that we need to have a serious conversation about the ethical decisions companies make with user data? Absolutely. But I do not believe that this conversation should ever apply just to that which is categorized under “research.” Nor do I believe that academe is necessarily providing a golden standard.

Academe has many problems that need to be accounted for. Researchers are incentivized to figure out how to get through IRBs rather than to think critically and collectively about the ethics of their research protocols. IRBs are incentivized to protect the university rather than truly work out an ethical framework for these issues. Journals relish corporate datasets even when replicability is impossible. And for that matter, even in a post-paper era, journals have ridiculous word count limits that demotivate researchers from spelling out all of the gory details of their methods. But there are also broader structural issues. Academe is so stupidly competitive and peer review is so much of a game that researchers have little incentive to share their studies-in-progress with their peers for true feedback and critique. And the status games of academe reward those who get access to private coffers of data while prompting those who don’t to chastise those who do. And there’s generally no incentive for corporates to play nice with researchers unless it helps their prestige, hiring opportunities, or product.

IRBs are an abysmal mechanism for actually accounting for ethics in research. By and large, they’re structured to make certain that the university will not be liable. Ethics aren’t a checklist. Nor are they a universal. Navigating ethics involves a process of working through the benefits and costs of a research act and making a conscientious decision about how to move forward. Reasonable people differ on what they think is ethical. And disciplines have different standards for how to navigate ethics. But we’ve trained an entire generation of scholars that ethics equals “that which gets past the IRB” which is a travesty. We need researchers to systematically think about how their practices alter the world in ways that benefit and harm people. We need ethics to not just be tacked on, but to be an integral part of how *everyone* thinks about what they study, build, and do.

There’s a lot of research that has serious consequences on the people who are part of the study. I think about the work that some of my colleagues do with child victims of sexual abuse. Getting children to talk about these awful experiences can be quite psychologically tolling. Yet, better understanding what they experienced has huge benefits for society. So we make our trade-offs and we do research that can have consequences. But what warms my heart is how my colleagues work hard to help those children by providing counseling immediately following the interview (and, in some cases, follow-up counseling). They think long and hard about each question they ask, and how they go about asking it. And yet most IRBs wouldn’t let them do this work because no university wants to touch anything that involves kids and sexual abuse. Doing research involves trade-offs and finding an ethical path forward requires effort and risk.

It’s far too easy to say “informed consent” and then not take responsibility for the costs of the research process, just as it’s far too easy to point to an IRB as proof of ethical thought. For any study that involves manipulation – common in economics, psychology, and other social science disciplines – people are only so informed about what they’re getting themselves into. You may think that you know what you’re consenting to, but do you? And then there are studies like discrimination audit studies in which we purposefully don’t inform people that they’re part of a study. So what are the right trade-offs? When is it OK to eschew consent altogether? What does it mean to truly be informed? When is being informed not enough? These aren’t easy questions and there aren’t easy answers.

I’m not necessarily saying that Facebook made the right trade-offs with this study, but I think that the scholarly reaction that research is only acceptable with IRB approval plus informed consent is disingenuous. Of course, a huge part of what’s at stake has to do with the fact that what counts as a contract legally is not the same as consent. Most people haven’t consented to all of Facebook’s terms of service. They’ve agreed to a contract because they feel as though they have no other choice. And this really upsets people.

A Different Theory

The more I read people’s reactions to this study, the more that I’ve started to think that the outrage has nothing to do with the study at all. There is a growing amount of negative sentiment towards Facebook and other companies that collect and use data about people. In short, there’s anger at the practice of big data. This paper provided ammunition for people’s anger because it’s so hard to talk about harm in the abstract.

For better or worse, people imagine that Facebook is offered by a benevolent dictator, that the site is there to enable people to better connect with others. In some senses, this is true. But Facebook is also a company. And a public company for that matter. It has to find ways to become more profitable with each passing quarter. This means that it designs its algorithms not just to market to you directly but to convince you to keep coming back over and over again. People have an abstract notion of how that operates, but they don’t really know, or even want to know. They just want the hot dog to taste good. Whether it’s couched as research or operations, people don’t want to think that they’re being manipulated. So when they find out what soylent green is made of, they’re outraged. This study isn’t really what’s at stake. What’s at stake is the underlying dynamic of how Facebook runs its business, operates its system, and makes decisions that have nothing to do with how its users want Facebook to operate. It’s not about research. It’s a question of power.

I get the anger. I personally loathe Facebook and I have for a long time, even as I appreciate and study its importance in people’s lives. But on a personal level, I hate the fact that Facebook thinks it’s better than me at deciding which of my friends’ posts I should see. I hate that I have no meaningful mechanism of control on the site. And I am painfully aware of how my sporadic use of the site has confused their algorithms so much that what I see in my newsfeed is complete garbage. And I resent the fact that because I barely use the site, the only way that I could actually get a message out to friends is to pay to have it posted. My minimal use has made me an algorithmic pariah and if I weren’t technologically savvy enough to know better, I would feel as though I’ve been shunned by my friends rather than simply deemed unworthy by an algorithm. I also refuse to play the game to make myself look good before the altar of the algorithm. And every time I’m forced to deal with Facebook, I can’t help but resent its manipulations.

There’s also a lot that I dislike about the company and its practices. At the same time, I’m glad that they’ve started working with researchers and started publishing their findings. I think that we need more transparency in the algorithmic work done by these kinds of systems and their willingness to publish has been one of the few ways that we’ve gleaned insight into what’s going on. Of course, I also suspect that the angry reaction from this study will prompt them to clamp down on allowing researchers to be remotely public. My gut says that they will naively respond to this situation as though the practice of research is what makes them vulnerable rather than their practices as a company as a whole. Beyond what this means for researchers, I’m concerned about what increased silence will mean for a public who has no clue of what’s being done with their data, who will think that no new report of terrible misdeeds means that Facebook has stopped manipulating data.

Information companies aren’t the same as pharmaceuticals. They don’t need to do clinical trials before they put a product on the market. They can psychologically manipulate their users all they want without being remotely public about exactly what they’re doing. And as the public, we can only guess what the black box is doing.

There’s a lot that needs to be reformed here. We need to figure out how to have a meaningful conversation about corporate ethics, regardless of whether it’s couched as research or not. But it’s not so simple as saying that a lack of a corporate IRB or a lack of a golden standard “informed consent” means that a practice is unethical. Almost all manipulations that take place by these companies occur without either one of these. And they go unchecked because they aren’t published or public.

Ethical oversight isn’t easy and I don’t have a quick and dirty solution to how it should be implemented. But I do have a few ideas. For starters, I’d like to see any company that manipulates user data create an ethics board. Not an IRB that approves research studies, but an ethics board that has visibility into all proprietary algorithms that could affect users. For public companies, this could be done through the ethics committee of the Board of Directors. But rather than simply consisting of board members, I think that it should consist of scholars and users. I also think that there needs to be a mechanism for whistleblowing regarding ethics from within companies because I’ve found that many employees of companies like Facebook are quite concerned by certain algorithmic decisions, but feel as though there’s no path to responsibly report concerns without going fully public. This wouldn’t solve all of the problems, nor am I convinced that most companies would do so voluntarily, but it is certainly something to consider. More than anything, I want to see users have the ability to meaningfully influence what’s being done with their data and I’d love to see a way for their voices to be represented in these processes.

I’m glad that this study has prompted an intense debate among scholars and the public, but I fear that it’s turned into a simplistic attack on Facebook over this particular study rather than a nuanced debate over how we create meaningful ethical oversight in research and practice. The lines between research and practice are always blurred and information companies like Facebook make this increasingly salient. No one benefits by drawing lines in the sand. We need to address the problem more holistically. And, in the meantime, we need to hold companies accountable for how they manipulate people across the board, regardless of whether or not it’s couched as research. If we focus too much on this study, we’ll lose track of the broader issues at stake.

by zephoria at July 01, 2014 11:00 PM

June 27, 2014

Ph.D. alumna

‘Selling Out’ Is Meaningless: Teens live in the commercial world we created

In the recent Frontline documentary “Generation Like,” Doug Rushkoff lamented that today’s youth don’t even know what the term “sell-out” means. While this surprised Rushkoff and other fuddy duddies, it didn’t make me blink for a second. Of course this term means nothing to them. Why do we think it should?

The critique of today’s teens has two issues intertwined into one. First, there’s the issue of language — is this term the right term? Second, there’s the question of whether or not the underlying concept is meaningful in contemporary youth culture.

Slang Shifts Over Time

My cohort grew up with the term “dude” with zero recognition that the term was originally a slur for city slickers and dandies known for their fancy duds (a.k.a. clothing). And even as LGBT folks know that “gay” once meant happy, few realize that it once referred to hobos and drifters. Terms change over time.

Even the term “sell-out” has different connotations depending on who you ask… and when you ask. While it’s generally conceptualized as a corrupt bargain, it was originally of political origins, equivalent to traitor. For example, it was used to refer to those in the South who chose to leave the Confederacy for personal gain. Among the black community, it took a different turn, referring to those African-Americans who appeared to be too white. Of course, the version that Rushkoff is most familiar with stems from when musicians were being attacked for putting commercial interests above artistic vision. Needless to say, those who had the privilege to make these decisions were inevitably white men, so it’s not that surprising that the notion of selling out was particularly central to the punk and alternative music scenes from the 1960s-1990s, when white men played a defining role. For many other musicians, hustling was always part of the culture and you were darn lucky to be able to earn a living doing what you loved. This doesn’t mean that the music industry isn’t abusive or corrupt or corrupting. Personally, I’m glad that today’s music ecosystem isn’t as uniformly white or male as it once was.

All that said, why on earth should contemporary adults expect today’s teens to use the same terms that we old fogies have been using to refer to cultural dynamics? Their musical ecosystem is extraordinarily different than what I grew up with. RIAA types complain about how technology undercut their industry, but I would argue that the core industry got greedy and, then, abusive. Today’s teens are certainly living in a world with phenomenally famous pop stars, but they are also experiencing the greatest levels of fragmentation ever. Rather than relying on the radio for music recommendations, they turn to YouTube and share media content through existing networks, undermining industrial curatorial control. As a result, I constantly meet teens whose sense of the music industry is radically different than that of peers who live in the next town over. The notion of selling out requires that there is one reigning empire. That really isn’t the case anymore.

Of course, the issue of slang is only the surface issue. Do teens recognize the commercial ecosystem that they live in? And how do they feel about it? What I found in my research was pretty consistent on this front.

Growing Up in a Commercial World

Today’s teens are desperate for any form of freedom. In a world where they have limited physical mobility and few places to go, they’re deeply appreciative of any space that will accept them. Because we’ve pretty much obliterated all public spaces for youth to gather in, they find their freedom in commercial spaces, especially online. This doesn’t mean teens like the advertisements that are all around them, but they’ll accept this nuisance for the freedom to socialize with their friends. They know it’s a dirty trade-off and they’re more than happy to mess with the data that the systems scrape, but they are growing up in a world where they don’t feel as though they have much agency or choice.

These teens are not going to critique their friends for being sell-outs because they’ve already been sold out by the adults in their world. These teens want freedom and it’s our fault that they don’t have it except in commercial spaces. These teens want opportunities and we do everything possible to restrict those that they have access to. Why should we expect them to stand up to commercial surveillance when every adult in their world surveils their every move “for their own good”? Why should these teens lament the commercialization of public spaces when these are the only spaces that they feel actually allow them to be authentic?

It makes me grouchy when adults gripe about teens’ practice without taking into account all of the ways in which we’ve forced them into the corners that they’re trying to navigate. There’s good reason to be critical of how commercialized American society has become, but I don’t think that we should place the blame on the backs of teenagers who are just trying to find their way. If we don’t like what we see when we watch teenagers, it’s time to look in the mirror. We’ve created this commercially oriented society. Teens are just trying to figure out how to live in it.

(Thanks to Tamara Kneese for helping track down some of the relevant history for this post.)

(This entry was first posted on May 27, 2014 at Medium under the title “‘Selling Out’ Is Meaningless” as part of The Message.)

by zephoria at June 27, 2014 06:36 PM

June 25, 2014

Ph.D. student

metaphorical problems with logical solutions

There are polarizing discourses on the Internet about the following three dichotomies:

  • Public vs. Private (information)
  • (Social) Inclusivity vs. Exclusivity
  • Open vs. Closed (systems, properties, communities).

Each of these pairings enlists certain metaphors and intuitions. Rarely are they precisely defined.

Due to their intuitive pull, it’s easy to draw certain naive associations. I certainly do. But how do they work together logically?

To what extent can we fill in other octants of this cube? Or is that way of modeling it too simplistic as well?

If privacy is about having contextual control over information flowing out of oneself, then that means that somebody must have the option of closing off some access to their information. To close off access is necessarily to exclude.


But it has been argued that open sociotechnical systems exclude as well by being inhospitable to those with greater need for privacy.


These conditionals limit the kinds of communities that can exist.


Social inclusivity in sociotechnical systems is impossible. There is no such thing as a sociotechnical system that works for everybody.

There are only three kinds of systems: open systems, private systems, or systems that are neither open nor private. We can call the latter leaky systems.
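One way to check this reasoning is to enumerate the octants of the cube directly. The sketch below is only one reading of the argument, since the conditionals themselves are not spelled out in the text. It assumes three of them: private implies closed, closed implies exclusive, and open implies exclusive. The function and variable names are made up for illustration.

```python
from itertools import product


def implies(antecedent, consequent):
    """Material implication: false only when the antecedent holds and the consequent fails."""
    return (not antecedent) or consequent


def is_consistent(is_open, is_private, is_inclusive):
    """Check one octant against the three assumed conditionals."""
    is_closed = not is_open
    is_exclusive = not is_inclusive
    return (
        implies(is_private, is_closed)         # privacy requires closing off some access
        and implies(is_closed, is_exclusive)   # to close off access is to exclude
        and implies(is_open, is_exclusive)     # open systems exclude those who need privacy
    )


def kind(is_open, is_private):
    """Name the kind of system an octant describes."""
    if is_open:
        return "open"
    if is_private:
        return "private"
    return "leaky"


# Walk all eight octants and keep the ones the conditionals allow.
surviving = [
    octant for octant in product([True, False], repeat=3) if is_consistent(*octant)
]
kinds = sorted({kind(is_open, is_private) for is_open, is_private, _ in surviving})

for is_open, is_private, is_inclusive in surviving:
    print(kind(is_open, is_private), "inclusive" if is_inclusive else "exclusive")
```

Under these assumptions three of the eight octants survive, one for each kind of system, and none of them is inclusive. A different choice of conditionals would of course carve up the cube differently.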

These binary logical relations capture only the limiting properties of these systems. If there has ever been an open system, it is the Internet; but everyone knows that even the Internet isn't truly open because of access issues.

The difference between a private system and a leaky system is participants' ability to control how their data escapes the system.

But in this case, systems that we call 'open' are often private systems, since participants choose whether or not to put information into the open.

So is the only question whether and when information is disclosed vs. leaked?

by Sebastian Benthall at June 25, 2014 11:33 PM

June 24, 2014

Ph.D. student

Protected: some ruminations regarding ‘openness’

This post is password protected. You must visit the website and enter the password to continue reading.

by Sebastian Benthall at June 24, 2014 06:22 AM

June 19, 2014

Ph.D. student

turns out network backbone markets in the US are competitive after all

I’ve been depressed for a while now about the oligopolistic control of telecommunications. There’s the Web We’ve Lost; there are the Snowden leaks; there’s the end of net neutrality. I’ll admit a lot of my moodiness about this has been just that–moodiness. But it was moodiness tied to a particular narrative.

In this narrative, power is transmitted via flows of information. Media is, if not determinative of public opinion, determinative of how that opinion is acted upon. Surveillance is also an information flow. Broadly, mid-20th century telecommunications enabled mass culture due to the uniformity of media. The Internet’s protocols allowed it to support a different kind of culture–a more participatory one. But monetization and consolidation of the infrastructure has resulted in a society that’s fragmented but more tightly controlled.

There is still hope of counteracting that trend at the software/application layer, which is part of the reason why I’m doing research on open source software production. One of my colleagues, Nick Doty, studies the governance of Internet Standards, which is another piece of the puzzle.

But if the networking infrastructure itself is centrally controlled, then all bets are off. Democracy, in the sense of decentralized power with checks and balances, would be undermined.

Yesterday I learned something new from Ashwin Mathew, another colleague who studies Internet governance at the level of network administration. The man is deep in the process of finishing up his dissertation, but he looked up from his laptop for long enough to tell me that the network backbone market is in fact highly competitive at the moment. Apparently, there was a lot of dark fiberoptic cable (“dark fiber”–meaning, no light’s going through it) laid during the first dot-com boom, which has been lying fallow and getting bought up by many different companies. Since there are many routes from A to B and excess capacity, this market is highly competitive.

Phew! So why the perception of oligopolistic control of networks? Because the consumer-facing telecom end-points ARE an oligopoly. Here there’s the last-mile problem. When wire has to be laid to every house, the economies of scale are such that it’s hard to have competitive markets. Enter Comcast etc.

I can rest easier now, because I think that this means there are various engineering solutions to this (like AirJaldi networks? though I think those still aren’t last mile…; mesh networks?) as well as political solutions (like a local government running its last mile network as a public utility).

by Sebastian Benthall at June 19, 2014 10:38 PM

June 18, 2014

Ph.D. student


This post is password protected. You must visit the website and enter the password to continue reading.

by Sebastian Benthall at June 18, 2014 10:50 PM

MIMS 2010

Reworking the CourtListener Datamodel

Brian and I have been hard at work the past week figuring out how to make CourtListener able to understand more than one document type. Our goal right now is to make it possible to add:

  • oral arguments and other audio content,
  • video content if it's available,
  • content from RECAP, and
  • thousands of Ninth Circuit briefs that have recently been scanned.

The problem with our current database is that it's not organized in a way that supports linkages between content. So, if we have the oral argument and the opinion from a single case, we have no way of pointing them at each other. Turns out this is a sticky problem.

The solution we've come up with is an architecture like the following:

(we also have a more detailed version and an editable version)

And eventually, this will also have a Case table above the docket that allows multiple dockets to be associated with a single case. For now though, that's moot, as we don't have any way of figuring out which dockets go together.
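As a rough illustration of the linkage described above, here is a sketch using plain Python dataclasses rather than the project's actual models. The class and field names (`Docket`, `Opinion`, `OralArgument`, `case_name`, `audio_path`) are assumptions made up for this example. The only things taken from the post are that an opinion and an oral argument from the same case should be able to point at each other, and that they do so through a shared docket.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Docket:
    """Hypothetical parent record that all content from one proceeding points at."""
    docket_id: int
    case_name: str
    # Back-references are excluded from repr/compare to avoid recursing in circles.
    opinions: List["Opinion"] = field(default_factory=list, repr=False, compare=False)
    oral_arguments: List["OralArgument"] = field(
        default_factory=list, repr=False, compare=False
    )


@dataclass
class Opinion:
    """Hypothetical written opinion attached to a docket."""
    opinion_id: int
    docket: Docket
    text: str = ""

    def __post_init__(self) -> None:
        # Register with the parent docket so sibling content can find this item.
        self.docket.opinions.append(self)


@dataclass
class OralArgument:
    """Hypothetical audio recording attached to the same kind of docket."""
    audio_id: int
    docket: Docket
    audio_path: Optional[str] = None

    def __post_init__(self) -> None:
        self.docket.oral_arguments.append(self)


def related_audio(opinion: Opinion) -> List[OralArgument]:
    """Find the oral arguments for an opinion by going up to the docket and back down."""
    return list(opinion.docket.oral_arguments)


# Example: one docket with an opinion and an oral argument.
docket = Docket(docket_id=1, case_name="Example v. Example")
opinion = Opinion(opinion_id=10, docket=docket, text="...")
audio = OralArgument(audio_id=20, docket=docket)
print(related_audio(opinion))
```

In this sketch nothing links an opinion to an oral argument directly. Each item knows only its docket, so a new content type can be added without touching the existing ones.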

The first stage of this will be to add support for oral arguments, since they make a simple case to work with. Once that's complete the next stage will be either to add the RECAP documents or the scanned Ninth Circuit briefs.


Since this is such a big change, we're also taking this opportunity to re-work our URLs. Currently, they look like this:


For example:

A few things bug me about that. First, it doesn't tell you anything about what kind of thing you can expect to see if you click that link. Second, the alpha-numeric ID is kind of lame. It's just a reference to the database primary key for the item, and we should just show that value (in this case, "yjn" means "108713"). To fix both of these issues, the new URLs will be:



That should be easier to read and should tell you what type of item you're about to look at. Don't worry, the old URLs will keep working just fine.

And the rest of the new URLs will be:


and eventually:



We expect these changes to come with changes to the API, so we'll likely be releasing API version 1.1 that will add support for dockets and oral arguments.

The current version 1.0 should keep working just fine, since we're not changing any of the underlying data, but I expect that it will have some changes to the URLs and things like that. I'll be posting more about this in the CourtListener dev list as the changes become more clear and as we sort out what a fair policy is for the deprecation of old APIs.

by mlissner at June 18, 2014 05:26 PM

MIMS 2012

Design Process of Optimizely's Sample Size Calculator

Optimizely just released a sample size calculator, which tells people how many visitors they need for an A/B test to get results. This page began as a hack week project, which produced a functioning page, but needed some design love before being ready for primetime. So my coworker Jon (a communication designer at Optimizely) and I teamed up to push this page over the finish line. In this post, I’m going to explain the design decisions we made along the way.

The finished sample size calculator

We Started with a Functioning Prototype

The page started as an Optimizely hack week project that functioned correctly, but suffered from a confusing layout that didn’t guide people through the calculation. After brainstorming some ideas, we decided the calculator’s layout should follow the form of basic math taught in primary school. You start at the top, write down each of the inputs in a column, and calculate the answer at the bottom.

The original sample size calculator prototype

This made sense conceptually, but put the most important piece of data (the number of visitors needed) at the bottom of the page. Conventional wisdom and design theory would say our information hierarchy is backwards, and users may not even see this content. It also meant the answer is below the fold, which could increase the bounce rate.

All of these fears make sense when viewing the page through the lens of static content. But this page is interactive, and requires input from the user to produce a meaningful answer to how many visitors are needed for an A/B test. Lining up the inputs in a vertical column, and placing the answer at the bottom, encourages people to look at each piece of data going into the calculation, and enter appropriate values before getting an answer. The risk of visitors bouncing, or not seeing the answer, is minimal. Although this is counter to best practices, we felt our reasons for breaking the rules were sound.

Even so, we did our due diligence and sketched a few variations that shuffled around the inputs (e.g. horizontal; multi-column) and the sample size per variation (e.g. top; sides). None of these alternates felt right, and having the final answer at the bottom made the most sense for the reasons described above. But sketching out other ideas made us confident in our original design decision.

Power User Inputs

After deciding on the basic layout of the page, we tackled the statistical power and significance inputs. We knew from discussions with our statisticians that, mathematically speaking, these were important variables in the calculation, but most people don’t need to change them. The primary user of this page just wants to know how many visitors they’ll need to run through an A/B test; for that user, the mathematical details of these variables are unimportant. However, the values should still be clear to all users, and editable for power users who understand their effect.

To solve this challenge, we decided to display the value in plain text, but hide the controls behind an “Edit” button. Clicking the button reveals a slider to change the input. We agreed that this solution gave enough friction to deter most users from playing around with these values, but it’s not so burdensome as to frustrate an expert user who wants to change it.
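For readers curious how power and significance feed into the answer, here is a minimal sketch using the common textbook formula for comparing two conversion rates with a two-sided z-test. It is not necessarily the calculation the calculator itself performs. The input names `baseline_rate` and `relative_effect`, and the default values for power and significance, are assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_effect,
                              power=0.8, significance=0.05):
    """Estimate visitors needed per variation (textbook two-proportion z-test).

    baseline_rate:   current conversion rate, e.g. 0.10 for 10%
    relative_effect: smallest relative lift worth detecting, e.g. 0.20 for +20%
    power:           chance of detecting the lift if it is real
    significance:    accepted false-positive rate (two-sided)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_effect)
    if not (0 < p1 < 1 and 0 < p2 < 1 and p1 != p2):
        raise ValueError("rates must lie strictly between 0 and 1 and must differ")

    # Standard normal quantiles for the chosen significance and power.
    z_alpha = NormalDist().inv_cdf(1 - significance / 2)
    z_beta = NormalDist().inv_cdf(power)

    pooled = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * pooled * (1 - pooled))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Raising power, or looking for a smaller lift, both push the number up.
default_n = sample_size_per_variation(0.10, 0.20)
high_power_n = sample_size_per_variation(0.10, 0.20, power=0.9)
small_lift_n = sample_size_per_variation(0.10, 0.10)
print(default_n, high_power_n, small_lift_n)
```

The sketch also hints at why these two inputs can sit behind an “Edit” button: with sensible defaults in place, most people only need to supply the rate and the lift they care about.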

Removing the “Calculate” Button

The original version of the page didn’t spit out the number of visitors until the “Calculate” button was clicked. But once I started using the page and personally experienced the annoyance of having to click this button every time I changed the values, it was clear the whole process would be a lot smoother if the answer updated automatically anytime an input changed. This makes the page much more fluid to use, and encourages people to play with each variable to see how it affects the number of visitors their test needs.

This is a design decision that only became clear to me from using a working implementation. In a static mock, a button will look fine and come across as an adequate user experience. But it’s hard to assess the user experience unless you can actually experience a working implementation. Once I re-implemented the page, it was clear auto-updating the answer was a superior experience. But without actually trying each version, I wouldn’t have been confident in that decision.


This project was a fun cross-collaboration between product and communication design at Optimizely. I focused on the interactions and implementation, while Jon focused on the visual design, but we sat side-by-side and talked through each decision together, sketching and pushing each other along the way. Ultimately, the finished product landed in a better place from this collaboration. It was a fun little side project that we hope adds value to the customer experience.

by Jeff Zych at June 18, 2014 03:24 PM

June 13, 2014

Ph.D. alumna

San Francisco’s (In)Visible Class War

In 2003, I was living in San Francisco and working at a startup when I overheard a colleague of mine — a self-identified libertarian — spout off about “the homeless problem.” I don’t remember exactly what he said, but I’m sure it fit into a well-trodden frame about no-good lazy leeches. I marched right over to him and asked if he’d ever talked to someone who was homeless. He looked at me with shock and his cheeks flushed, so I said, “Let’s go!” Unwilling to admit discomfort, he followed.

We drove down to 6th Street, and I nodded to a group of men sitting on the sidewalk and told him to ask them about their lives. Then I watched as he nervously approached one guy and stumbled through a conversation. I was pleasantly surprised that he ended up talking for longer than I expected before coming back to me.

“He’s a vet.”
“And he said the government got him addicted and he can’t shake the habit.”
“And he doesn’t know what he should do to get a job because no one will ever talk to him.”
“I didn’t think…. He’s not doing so well…”

I let him trail off as we got back into the car and drove back to the office in silence.

San Francisco is in the middle of a class war. It’s not the first or last city to have heart-wrenching inequality tear at its fabric, challenge its values, test its support structures. But what’s jaw-dropping to me is how openly, defensively, and critically technology folks demean those who are struggling. The tech industry has a sickening obsession with meritocracy. Far too many geeks and entrepreneurs worship at the altar of zeros and ones, believing that outputs can be boiled down to a simple equation based on inputs. In a modern-day version of the Protestant ethic, there’s a sense that success is a guaranteed outcome of hard work, skills, and intelligence. Thus, anyone who is struggling can be blamed for their own circumstances.

This attitude is front and center when it comes to people who are visibly homeless on the streets of San Francisco, a mere fraction of the total homeless population in that city.

I wish that more people working in the tech sector would take a moment to talk to these men and women. Listening to their stories is humbling. Vets who fought for our country, under the banner of “freedom,” only to be cognitively imprisoned by addiction and mental illness. Abused runaways trying to find someone who will treat them with respect. People who were working hard and getting by until an accident struck and they lost their job and ended up in medical debt. Immigrants who came looking for the American Dream only to find themselves trapped. These aren’t no-good lazy leeches. They’re people. People whose lives have been a hell of a lot harder than most of us can even fathom. People who struggle on a daily basis to find food and shelter. People who we’ve systematically disenfranchised and failed to support. People who the bulk of tech workers ignore, shun, resent, and demonize.

A city without a safety net cannot be a healthy society. And nothing exacerbates this more than condescension, resentment, and dismissal. We can talk about the tech buses and the lack of affordable housing, but it all starts with appreciating those who are struggling. Only a mere fraction of San Francisco’s homeless population are visible, but those who are reveal the starkness of what’s unfolding. And, as with many things, there’s more of a desire to make the visible invisible than there is to grapple with dynamics of poverty, mental illness, addiction, abuse, and misfortune. Too many people think that they’re invincible.

If you’re living in the Bay Area and working in tech, take a moment to do what I asked my colleague to do a decade ago. Walk around the Tenderloin and talk with someone whose poverty is written on their body. Respectfully ask about their life. Where did they come from? How did they get here? Where do they want to go? Ask about their hopes and dreams, struggles and challenges. Get a sense for their story. Connect as people. Then think about what meritocracy in tech really means.

(Photo by Darryl Harris.)

Two great local organizations: Delancey Street Foundation and Homeless Children’s Network.

(This entry was first posted on May 13, 2014 at Medium under the title “San Francisco’s (In)Visible Class War” as part of The Message.)

by zephoria at June 13, 2014 04:27 PM

June 10, 2014

MIMS 2014

Spotlight: Diagknowzit


To me, data science is the combination of better decision-making with technology that enables us to actually execute those better decisions in real time. Understanding the science behind better decision making is not on its own sufficient. Technology is required to gather the relevant information, present it to people, and enable them to act upon it in real time. I learned this first-hand five years ago when I was working at the global health NGO, Partners In Health (PIH).

In rural Lesotho, PIH developed an app for feature phones that enabled community health workers to verify, via SMS, whether patients had actually taken their antiretroviral (ARV) medications. The text messages were uploaded directly into PIH’s electronic medical record (EMR) system, and allowed the medical staff to monitor ARV adherence rates in their catchment areas in real time. By using a simple feature phone app, PIH leapfrogged over the serious infrastructure challenges that are present in Lesotho, a landlocked country where there are a lot of mountains and very few roads.

Seeing PIH pioneer this solution inspired me to take a more technical turn in my career. When it came time to do my final project for my master’s degree, I reached out to PIH to see if we could find an opportunity to put data science to work for PIH. We identified the following problem:

When patients arrive at PIH clinics, someone has to document their visit in the EMR. This person may not have much medical expertise; they might not be very familiar with the ins and outs of the EMR system. Still, the way the system is set up, that person has to record a presumed diagnosis into the EMR system. To deal with the problem currently, PIH allows entry technicians to enter whatever they want into the system as a presumed diagnosis. If the system doesn’t recognize the user input, the input is stored as a placeholder value that someone has to go back and manually code at some point in the future. Manually coding the data must be done by someone with greater medical or system knowledge, perhaps even a doctor. Looking at the entire process, we concluded these workers would be more effective seeing patients instead of dealing with data entry.

To fix the problem, we built Diagknowzit, a recommendation engine built on top of Open-MRS, the open source EMR used by PIH. Diagknowzit works similarly to Google’s “Did you mean…?” feature–except whereas “Did you mean…?” gets the user to spell things correctly, Diagknowzit matches the medical condition meant by the user to the EMR system’s official representation of that condition.
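To give a feel for the “Did you mean…?” idea, here is a minimal sketch that matches free-text input against a list of official condition names using plain string similarity from Python’s standard library. This is not how Diagknowzit itself works (our engine uses a trained model), and the condition list, function name, and cutoff below are hypothetical, chosen only for illustration.

```python
from difflib import get_close_matches

# Hypothetical stand-in for the EMR's official concept names;
# a real dictionary would be far larger.
OFFICIAL_DIAGNOSES = [
    "malaria",
    "tuberculosis",
    "pneumonia",
    "hypertension",
    "diabetes mellitus",
]


def suggest_diagnoses(user_input, limit=3, cutoff=0.6):
    """Return up to `limit` official names that resemble the typed text."""
    cleaned = user_input.strip().lower()
    return get_close_matches(cleaned, OFFICIAL_DIAGNOSES, n=limit, cutoff=cutoff)


print(suggest_diagnoses("maleria"))
print(suggest_diagnoses("Tubercolosis "))
```

String similarity only catches misspellings. Matching an abbreviation or a local term to the right concept needs something that learns from past entries, which is the harder problem.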

The challenges in building Diagknowzit could fill many blog posts, but here I’ll only mention a few. For one thing, the core codebase of Open-MRS–built using Spring/Hibernate and Java–was built in the mid 2000s and is a bit dated at this point. My partners and I had to spend a significant amount of time porting our knowledge of more Pythonic frameworks back in time to be able to build with the requisite toolset.

Perhaps the biggest challenge we faced was the fact that we only received the data we needed to train our recommendation engine two weeks before the final project was due. Sharing data across organizations is never easy, and when that information contains sensitive medical information, the risks are particularly high. It took a long time for our request to filter through all the necessary layers of oversight at PIH, but we’re very happy and grateful that it worked out in the end.

Algorithm Performance

Because of the compressed timeline, we didn’t have quite as much time to explore and experiment with the data as I would have liked. Still, even with a relatively simple machine learning model, we were able to achieve pretty decent performance. Using multinomial logistic regression, our engine guessed correctly 71% of the time, which is in the ballpark of similar projects we found during our literature review.

Ultimately, our goal in doing this project was more than just to build something useful for PIH. I mentioned before how PIH’s EMR system, Open-MRS, is open source; it was actually developed by PIH and others as a response to the growing need for global health NGOs to manage their information effectively. The Open-MRS development community is thriving, but little attention has been paid thus far to the potential of data science-based tools to improve clinical decision support. We hope that our promising results inspire more development in this direction. In fact, if you are a member of the Open-MRS community and are interested in Diagknowzit or more tools like it, please don’t hesitate to get in touch :)

by dgreis at June 10, 2014 03:13 PM

June 06, 2014

Ph.D. student

i’ve started working on my dissertation // diversity in open source // reflexive data science

I’m studying software development and not social media for my dissertation.

That’s a bit of a false dichotomy. Much software development happens through social media.

Which is really the point–that software development is a computer mediated social process.

What’s neat is that it’s a computer mediated social process that, at its best, creates the conditions for it to continue as a social process (cf. Kelty’s “recursive public”).

What’s also neat is that this is a significant kind of labor that is not easy to think about given the tools of neoclassical economics or anything else really.

In particular I’m focusing on the development of scientific software, i.e. software that’s made and used to improve our scientific understanding of the natural world and each other.

The data I’m looking at is communications data between developers and their users. I’m including the code, under version control, in this. In addition to being communication between developers, you might think of source code as a communication between developers and machines, and of the process of writing code as a collaboration or conversation between people and machines.

There is a lot of this data so I get to use computational techniques to examine it. “Data science,” if you like.

But it’s also legible, readable data with readily accessible human narrative behind it. As I debug my code, I am reading the messages sent ten years ago on a mailing list. Characters begin to emerge serendipitously because their email signatures break my archive parser. I find myself Googling them. “Who is that person?”

One email I found while debugging stood out because it was written, evidently, by a woman. Given the current press on diversity in tech, I thought it was an interesting example from 2001:

From sag at Thu Nov 29 15:21:04 2001
From: sag at (Sue Giller)
Date: Thu Nov 29 15:21:04 2001
Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array
In-Reply-To: <000201c17917$ac5efec0$>
References: <>
Message-ID: <>


Well, you’re right. I did misunderstand your reply, as well as what
the various functions were supposed to do. I was mis-using the
sum, minimum, maximum as tho they were MA..reduce, and
my test case didn’t point out the difference. I should always have
been doing the .reduce version.

I apologize for this!

I found a section on page 45 of the Numerical Python text (PDF
form, July 13, 2001) that defines sum as
‘The sum function is a synonym for the reduce method of the add
ufunc. It returns the sum of all the elements in the sequence given
along the specified axis (first axis by default).’

This is where I would expect to see a caveat about it not retaining
any mask-edness.

I was misussing the MA.minimum and MA.maximum as tho they
were .reduce version. My bad.

The MA.average does produce a masked array, but it has changed
the ‘missing value’ to fill_value=[ 1.00000002e+020,]). I do find this
a bit odd, since the other reductions didn’t change the fill value.

Anyway, I can now get the stats I want in a format I want, and I
understand better the various functions for array/masked array.

Thanks for the comments/input.


I am trying to approach this project as a quantitative scientist. But the process of developing the software for analysis is putting me in conversation not just with the laptop I run the software on, but also the data. The data is a quantified representation–I count the number of lines, even the number of characters in a line as I construct the regular expression needed to parse the headers properly–but it represents a conversation in the past. As I write the software, I consult documentation written through a process not unlike the one I am examining, as well as Stack Overflow posts written by others who have tried to perform similar tasks. And now I am writing a blog post about this work. I will tweet a link of this out to my followers; I know some people from the Scientific Python community that I am studying follow me on Twitter. Will one of them catch wind of this post? What will they think of it?
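For the curious, here is a minimal sketch of the kind of parsing involved, assuming a plain-text archive in which each message starts with a “From … date” separator line like the one in the email quoted above. It is an illustration rather than my actual parser; the regular expression, function name, and sample messages (with example.org addresses) are all made up.

```python
import re
from email import message_from_string

# A separator line looks like:
# "From user at example.org  Thu Nov 29 15:21:04 2001"
# Requiring the trailing date keeps body lines that merely
# start with "From " from being mistaken for a new message.
SEPARATOR = re.compile(
    r"^From .*\b[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$",
    re.MULTILINE,
)


def split_archive(text):
    """Split archive text into parsed messages (headers plus body)."""
    starts = [match.start() for match in SEPARATOR.finditer(text)]
    ends = starts[1:] + [len(text)]
    messages = []
    for start, end in zip(starts, ends):
        # Drop the separator line, then parse the remaining headers and body.
        _, _, rest = text[start:end].partition("\n")
        messages.append(message_from_string(rest))
    return messages


SAMPLE = """From alice at example.org  Thu Nov 29 15:21:04 2001
From: alice at example.org (Alice Example)
Date: Thu Nov 29 15:21:04 2001
Subject: [Numpy-discussion] Re: masked arrays

Thanks for the comments.
From here on I will use the reduce version.

From bob at example.org  Fri Nov 30 09:00:00 2001
From: bob at example.org (Bob Example)
Date: Fri Nov 30 09:00:00 2001
Subject: [Numpy-discussion] Re: masked arrays

Glad it helped.
"""

for message in split_archive(SAMPLE):
    print(message["From"], "|", message["Subject"])
```

The body line beginning “From here on” is the sort of thing that trips up a looser pattern, much as signatures do in the real archive.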

by Sebastian Benthall at June 06, 2014 04:52 AM

June 05, 2014

Ph.D. student

Being the Machine



The age of digital fabrication is upon us and there are hundreds of articles that will tell you that it will surely change the world. My current research project called “Being the Machine” explores how the design of digital fabricators could be different and explores new possibilities for interacting with fabrication and computer numeric controlled (CNC) technologies. My approach designs for fabrication as a kind of performance rather than a tool for accomplishing a given task. As a performance, all parts of the system become aesthetically meaningful: the movements of the human, the movements of the machine, the objects that are produced, the contexts in which they are placed, and the materials used for development. The system I am building consists of a head-worn laser guide that draws G-Code paths that the user follows by hand. What this system does is guide someone in building any object in the way that a 3D printer would. This system allows us to fabricate in ways that are currently difficult with existing 3D printers, as it is completely portable and the user has a wide range of choices about what materials to use in fabrication (sand at the beach, snow in the mountains, polenta in the kitchen, etc.). Additionally, since fabrication is tied to a human rather than a machine, the user is free to explore different ways to “break” the system in order to reveal new aesthetic choices. For instance, the material properties of the objects one is building with can be unpredictable and subject to environmental factors (wind, rain, etc.). In the spirit of indeterminacy, these “unknowns” can be productive ways to expose new aesthetic possibilities. By building and studying this system, I hope to reveal new insights about the way in which value is constructed in fabricated objects and the role fabrication might play in someone’s social and emotional life.
I am currently developing this project as a Graduate Student Researcher and an Artist-In-Residence at Instructables/Autodesk. I’m posting all of my prototypes and progress here:

by admin at June 05, 2014 08:40 PM

Ph.D. alumna

Will my grandchildren learn to drive? I expect not

I rarely drive these days, and when I do, it’s bloody terrifying. Even though I grew up driving and drove every day for fifteen years, my lack of practice is palpable as I grip the steering wheel. Every time I get behind the wheel, in order to silence my own fears about all of the ways in which I might crash, I ruminate over the anxieties that people have about teenagers and driving. I try not to get distracted in my own driving by looking to see if other drivers are texting while driving, but I can’t help but muse about these things. And while I was driving down the 101 in California last week, it hit me: driving is about to become obsolete.

The history of cars in America is tied up with what it means to be American in the first place. American history, with its ups and downs, can be understood through the automobile industry. In fact, it can be summed up with one word: Detroit. Once a booming metropolis, this one-industry town iconically highlights the issues that surround globalization, class inequality, and labor identities. But entwined with the very real economic factors surrounding the automobile industry is an American obsession with freedom.

It used to be that getting access to a car was the ultimate marker of freedom. As a teenager in the nineties, I longed for my sixteenth birthday and all that was represented by a driver’s license. Today, this sentiment is not echoed by the teens that I meet. Some still desperately want a car, but it doesn’t have the same symbolic feeling that it once did. When I ask teens about driving, what they share with me reveals the burdens imposed by this supposed tool of freedom. They talk about the costs — especially the cost of gas. They talk about the rules — especially the rules that limit them from driving with other teens in the car. And they talk about the risks — regurgitating back countless PSAs on drinking or texting while driving. While plenty of teens still drive, the very notion of driving doesn’t prompt the twinkle in their eyes that I knew from my classmates.

Driving used to be hard work. Before there was power steering and automatic transmission, maneuvering a car took effort. Driving used to be a gateway for learning. Before there were computers in every part of a car, curious youth could easily tear apart their cars and tinker with their innards. Learning to drive and manipulate a car used to be admired. Driving also used to be fun. Although speed limits and safety belts have saved many lives, I still remember the ways in which we would experiment with the boundaries of a car by testing its limits in parking lots on winter days. And I will never forget my first cross-country road trip, when I embraced the openness of the road and pushed my car to the limits and felt the wind on my face. Freedom, I felt freedom.

Today, what I feel is boredom, if not misery. The actual mechanisms of driving are easy, fooling me into a lull when I get into a car. Even with stimuli all around me, all I get to do is pump the gas, hit the brakes, and steer the wheel no more than ten degrees. My body is bored and my brain turns off. By contrast, I totally get the allure of the phone—or anything that would be more interesting than trying to navigate the road while changing the radio station to avoid the incessant chatter from not-very-entertaining DJs.

It’s rare that I hear many adults talk about driving with much joy. Some still get giddy about their cars; I hear this most often from my privileged friends when they get access to a car that changes their relationship to driving, such as an electric car or a hybrid or a Tesla. But even in those cases, I hear enthusiasm for a month before people go back to moaning about traffic and parking and surveillance. Outside of my friends, I hear people lament gas prices and tolls and this, that, or the other regulation. And when I listen to parents, they’re always complaining about having to drive their kids here, there, and everywhere. Not surprisingly, the teens that I meet rarely hear people talk joyously about cars. They hear it as a hassle.

So where does this end up? Data from both the CDC and AAA suggests that fewer and fewer American teens are bothering to even get their driver’s license. There’s so much handwringing about driving dangers, so much effort towards passing new laws and restrictions targeting teens in particular, and so much anxiety about distracted driving. Not surprisingly, more and more teens are throwing their hands in the air and giving up, demanding their parents drive them because there’s no other way. This, in turn, means that parents hate driving even more. And since our government is incapable of working together to invest in infrastructure, thereby undermining any hopes of public transit in huge parts of the country, what we’re effectively doing is laying the groundwork for autonomous vehicles. It’s been 75 years since General Motors exhibited an autonomous car at the 1939 World’s Fair, but we’ve now created the cultural conditions for this innovation to fit into American society.

We’re going to see a decade of people flipping out over fear that autonomous vehicles are dangerous, even though I expect them to be a lot less dangerous than sleepy drivers, drunken drivers, distracted drivers, and inexperienced drivers. Older populations that still associate driving with freedom are going to be resistant to the very idea of autonomous vehicles, but both parents and teenagers will start to see them as more freeing than driving. We’re still a long way from autonomous vehicles being meaningfully accessible to the general population. But we’re going to get there. We’ve spent the last thirty years ratcheting up fears and safety measures around cars, and we’ve successfully undermined the cultural appeal of driving. This is what will open the doors to a new form of transportation. And the opportunities for innovation here are only just beginning.

(This entry was first posted on May 5, 2014 at Medium under the title “Will my grandchildren learn to drive? I expect not” as part of The Message.)

by zephoria at June 05, 2014 03:23 PM

May 25, 2014

MIMS 2012

I, Too, Only Work On Unshiny Products

Jeff Domke’s article about working on “unshiny” products sums up my view of my own work at Optimizely. The crux of his argument is that many designers are drawn to working on “shiny” products — products that are pretty and lauded in the design community — but “unshiny” products are a lot more interesting to work on. They’re often solving difficult problems, and have more room for you to make an impact. You’re working to make the product reach its potential. Shiny products, on the other hand, have reached that potential, and you are less able to make your mark.

Optimizely sits right in the sweet spot. We aren’t “shiny” (compared to sexy products like Square and Medium, we have a long way to go); nor are we “unshiny” (our customers describe us as well-designed and easy-to-use). Rather, we land more in the middle — we have a solid user experience that has a lot of room for improvement.

We’re also solving really hairy, complex problems. Apps that get mounds of praise tend to solve relatively simple problems (such as to-do list apps). It’s much more interesting to work on a problem space that is unexplored and is full of murky, vague, conflicting goals that must be untangled. And once you’ve made sense of the mess, you know you’ve enabled someone to do their job better.

And that’s the most exciting part of working at Optimizely — making the product fulfill its potential, and solving tough problems that impact businesses' bottom lines.

Want to come work on hard problems with me at Optimizely? Check out our jobs, or reach out @jlzych.

by Jeff Zych at May 25, 2014 06:55 PM

Ph.D. alumna

Matt Wolf’s “Teenage”

Close your eyes and imagine what it was like to be a teenager in the 1920s. Perhaps you are out late dancing swing to jazz or dressed up as a flapper. Most likely, you don’t visualize yourself stuck at home unable to see your friends like today’s teenagers. And for good reason. In the 1920s, teenagers used to complain when their parents made them come home before 11pm. Many, in fact, earned their own money; compulsory high school wasn’t fully implemented until the 1930s when adult labor became anxious about the limited number of available jobs.

Although contemporary parents fret incessantly about teenagers, most people don’t realize that the very concept of a “teenager” is a 1940s marketing invention. And it didn’t arrive overnight. It started with a transformation in the 1890s when activists began to question child labor and the psychologist G. Stanley Hall identified a state of “adolescence” that was used to propel significant changes in labor laws. By the early 1900s, with youth out of the work force and having far too much free time, concerns about the safety and morality of the young emerged, prompting reformers to imagine ways to put youthful energy to good use. Up popped the Scouts, a social movement intended to help produce robust youth, fit in body, mind, and soul. This inadvertently became a training ground for World War I soldiers who, by the 1920s, were ready to let loose. And then along came the Great Depression, sending a generation into a tailspin and prompting government intervention. While the US turned to compulsory high school and the Civilian Conservation Corps, Germany saw the rise of Hitler Youth. And an entire cohort, passionate about being a part of a community with meaning, was mobilized on the march towards World War II.

All of this (and much more) is brilliantly documented in Jon Savage’s beautiful historical account Teenage: The Creation of Youth Culture. This book helped me rethink how teenagers are currently understood in light of how they were historically positioned. Adolescence is one of many psychological and physical transformations that people go through as they mature, but being a teenager is purely a social construct, laden with all sorts of political and economic interests.

When I heard that Savage’s book was being turned into a film, I was both ecstatic and doubtful. How could a filmmaker do justice to the 576 pages of historical documentation? To my surprise and delight, the answer was simple: make a film that brings to visual life the historical texts that Savage referenced.

In his new documentary, Teenage, Matt Wolf weaves together an unbelievable collection of archival footage to produce a breathless visual collage. Overlaid on top of this visual eye candy are historical notes and diary entries that bring to life the voices and experiences of teens in the first half of the 20th century. Although this film invites the viewer to reflect on the past, doing so forces a reflection on the present. I can’t help but wonder: what will historians think of our contemporary efforts to isolate young people “for their own good”?

This film is making its way through US independent theaters so it may take a while until you can see it, but to whet your appetite, watch the trailer:


(This entry was first posted on April 25, 2014 at Medium under the title “A Dazzling Film About Youth in the Early 20th Century” as part of The Message.)

by zephoria at May 25, 2014 03:15 PM

May 20, 2014

MIMS 2012

On Being a Generalist

I recently read Frank Chimero’s excellent article, “Designing in the Borderlands”. The gist of it is that we (designers, and the larger tech community) have constructed walls between various disciplines that we see as opposites, such as print vs. digital, text vs. image, and so on. However, the most interesting design happens in the borderlands, where these different media connect. He cites examples that combine physical and digital media, but the most interesting bit for me was his thoughts on roles that span disciplines:

For a long time, I perceived my practice’s sprawl as a defect—evidence of an itchy mind or a fear of commitment—but I am starting to learn that a disadvantage can turn into an advantage with a change of venue. The ability to cross borders is an asset. Who else could go from group to group and be welcomed? The pattern happens over and over: if you’re not a part of any group, you can move amongst them all by tip-toeing across the lines that connect them.

I have felt this way many times throughout my career (especially that “fear of commitment” part). I have long felt like a generalist who works in both design and engineering, and I label myself to help people understand what I do (not to mention the necessity of a title). But I’ve never cleanly fit into any discipline.

This line was further blurred by my graduate degree from UC Berkeley’s School of Information. The program brings together folks with diverse backgrounds, and produces T-shaped people who can think across disciplines and understand the broader context of their work, whether it be in engineering, design, policy & law, sociology, or dozens of other fields that our alumni call home.

These borderlands are the best place for a designer like me, and maybe like you, because the borderlands are where things connect. If you’re in the borderlands, your different tongues, your scattered thoughts, your lack of identification with a group, and all the things that used to be thought of as drawbacks in a specialist enclave become the hardened armor of a shrewd generalist in the borderlands.

Couldn’t have said it any better. Being able to move between groups and think across disciplines is more of an advantage than a disadvantage.

by Jeff Zych at May 20, 2014 04:02 AM

May 14, 2014

Ph.D. student

A dynamically-generated robots.txt: will search engine bots recognize themselves?

In short, I built a script that dynamically generates a robots.txt file for search engine bots, which download the file when they seek direction on what parts of a website they are allowed to index. By default, it directs all bots to stay away from the entire site, but then presents an exception: only the bot that requests the robots.txt file is allowed free rein over the site. If Google’s bot downloads the robots.txt file, it will see that only Google’s bot gets to index the entire site. If Yahoo’s bot downloads the robots.txt file, it will see that only Yahoo’s bot gets to index the entire site. Of course, this is assuming that bots identify themselves to my server in a way that they recognize when it is reflected back to them.

What is a robots.txt file? Most websites have a very simple file called “robots.txt” in the root directory of their server. The robots.txt file has been around for almost two decades, and it is now a standardized way of communicating what pages search engine bots (or crawlers) should and should not visit. Crawlers are supposed to request and download a robots.txt file from any website they visit, and then obey the directives mentioned in such a file. Of course, there is nothing which prevents a crawler from still crawling pages which are forbidden in a robots.txt file, but most major search engine bots behave themselves.

In many ways, robots.txt files stand out as a legacy from a much earlier time. When was the last time you wrote something for public distribution in a .txt file, anyway? In an age of server-side scripting and content management systems, robots.txt is also one of the few public-facing files a systems administrator will actually edit and maintain by hand, manually adding and removing entries in a text editor. A robots.txt file has no changelog in it, but its revision history would be a partial chronicle of a systems administrator’s interactions with how their website is represented by various search engines.

You can specify different directives for different bots by specifying a user agent, and well-behaved bots are supposed to look for their own user agents in a robots.txt file and follow the instructions left for them. As for my own, I’m sad to report that I simply let all bots through wherever they roam, as I use a sitemap.tar.gz file which a WordPress plugin generates for me on a regular basis and submits to the major search engines. So my robots.txt file just looks like this:

User-agent: *
Allow: /
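To illustrate how per-agent directives are read, here is a small sketch using the robots.txt parser in Python’s standard library. It only shows how that one parser interprets a file; real crawlers use their own parsers and may differ. The bot names and the sample file are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one bot is shut out, everyone else is let in.
SAMPLE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(robots_txt, bot_name, path="/"):
    """Ask Python's parser whether bot_name may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot_name, path)


print(allowed(SAMPLE_ROBOTS_TXT, "BadBot", "/2014/05/"))   # matched by its own entry
print(allowed(SAMPLE_ROBOTS_TXT, "GoodBot", "/2014/05/"))  # falls through to "*"
```

A bot looks for an entry naming itself first and only falls back to the asterisk entry when it finds none.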

An interesting thing about contemporary web servers is that file formats no longer really matter as much as they used to. In fact, files don’t even have to exist as they are typically represented in URLs. When your browser requests the page, there is a directory called “wordpress” on my server, but everything after that is a fiction. There is no directory called 2014, nor a subdirectory called 05, and no file called robots-txt that existed on the server before or after you downloaded it. Rather, when WordPress receives a request to download this non-existent file, it intercepts it and interprets it as a request to dynamically generate a new HTML page on the fly. WordPress queries a database for the content of the post, inserts that into a theme, and then has the server send you that HTML page, along with linked images, stylesheets, and Javascript files, which often do actually exist as files on a server. The server probably stores the dynamically-generated HTML page in its memory, and sometimes there is caching to pre-generate these pages to make things faster, but other than that, the only time an HTML file of this page ever exists in any persistent form is if you save it to your hard drive.

Yet robots.txt lives on, doing its job well. It doesn’t need any fancy server-side scripting; it does just fine on its own. Still, I kept thinking about what it would be like to have a script dynamically generate a robots.txt file on the fly whenever it is requested. Given that the only time a robots.txt file is usually downloaded is when an automated software agent requests it, there is something strangely poetic about an algorithmically-generated robots.txt file. It is something that would, for the most part, only ever really exist in the fleeting interaction between two automated routines. So of course I had to build one.

The code required to implement this is trivial. First, I needed to modify how my web server interprets requests, so that whenever a request was made to robots.txt, the server would execute a script called robots.php and send the client the output as robots.txt. Modify the .htaccess file to add:

RewriteEngine On
RewriteBase /
RewriteRule ^robots\.txt$ /robots.php [L]

Next, the PHP script itself:

<?php
header("Content-Type: text/plain"); // robots.txt should be served as plain text
echo "User-agent: *" . "\r\n";
echo "Allow: /" . "\r\n";

Then I realized that this was all a little impersonal, and I could do better since I’m scripting. With PHP, I can easily query the user agent of the client requesting the file, that is, the identifier it sends to the web server. Normally, user agents identify the browser that is requesting the page, but bots are supposed to have an identifiable user agent like “Googlebot” or “Twitterbot” so that you can know them when they come to visit. Instead of granting access to every user agent with the asterisk, I made it so that the user agent of the requesting client is the only one that is directed to have full access.

echo "User-agent: " . $_SERVER['HTTP_USER_AGENT'] . "\r\n";
echo "Allow: /" . "\r\n";

After making sure this worked, I realized that I needed to go a little further. If the bots didn’t recognize themselves, then by default, they would still be allowed to crawl the site anyway. robots.txt works on a principle of allow by default. So I needed to add a few more lines which made it so that the robots.txt file the bot downloaded would direct all other bots to not crawl the site, but give free rein to bots with the user agent it sent the server.

echo "User-agent: *" . "\r\n";
echo "Disallow: /" . "\r\n";
echo "User-agent: " . $_SERVER['HTTP_USER_AGENT'] . "\r\n";
echo "Allow: /" . "\r\n";

This is what you get if you download it in Chrome:

User-agent: *
Disallow: /
User-agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36 
Allow: /
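
One way to sanity-check a file like this is to hand it to a standard robots.txt parser and ask what it would permit. The following is a minimal sketch in Python, using the standard library's urllib.robotparser rather than the PHP running on the server. The helper names generated_robots_txt and is_allowed, and the bot name SomeOtherBot, are made up for illustration, and real crawlers may match user agents differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def generated_robots_txt(user_agent: str) -> str:
    """Mimic the restrictive script: block everyone, then allow the requester."""
    lines = [
        "User-agent: *",
        "Disallow: /",
        f"User-agent: {user_agent}",
        "Allow: /",
    ]
    return "\r\n".join(lines) + "\r\n"


def is_allowed(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Parse the robots.txt text and ask whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    served_to_googlebot = generated_robots_txt("Googlebot")
    print(is_allowed(served_to_googlebot, "Googlebot"))     # the requester
    print(is_allowed(served_to_googlebot, "SomeOtherBot"))  # everyone else
```

Whether a long browser-style string such as the Chrome one above would match itself depends on each parser's matching rules, which is exactly the open question about how real bots will behave.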

The restrictive version is now live. I’ve also put it up on GitHub, because apparently that’s what cool kids do. I’m looking forward to seeing what will happen. Google’s webmaster tools will notify me if its crawlers can’t index my site, for whatever reason, and I’m curious whether Google’s bots will identify themselves to my servers in a way that they will recognize.

by stuart at May 14, 2014 01:37 AM

May 12, 2014

MIMS 2012

Why We Hire UI Engineers on the Design Team

This Happy Cog article by Stephen Caver perfectly encapsulates why we hire UI Engineers on the design team at Optimizely (as opposed to the engineering team). We want the folks coding a UI to be involved in the design process from the beginning, to understand the design system that underlies a user experience, and to be empowered to make design decisions while developing a UI. Successful designs must adapt to various contexts and degrade gracefully. The people most qualified to make those kinds of decisions are the ones writing the code. As said in the article, “In this new world, the best thing a developer can do is to acquire an eye for design—to be able to take design aesthetic, realize its essential components, and reinterpret them on the fly.” By embracing this mindset in our hiring and design process, we’ve found the end result is a higher quality product.

by Jeff Zych at May 12, 2014 04:13 AM

May 07, 2014

Ph.D. student


autocatalysis sustains autopoiesis

by Sebastian Benthall at May 07, 2014 05:36 PM

MIMS 2010

Ways to communicate with me that are more effective than leaving a voicemail

  • text message
  • email
  • postal mail
  • Twitter direct message
  • pager
  • skywriting
  • interpretive dance
  • smoke signals
  • drunk carrier pigeon
  • singing telegram
  • sports arena jumbotron
  • tell Suzy to tell Rachel to tell Bill to tell me
  • message in a bottle thrown into ocean
  • give me a telling look
  • send a taxi to pick me up and drive me to the coast where a crewman aboard a ship signals using flag semaphore
  • Western Union
  • telepathy

Expanded from the condensed version. Hat-tip to @ravi and @av.

by Ryan at May 07, 2014 06:18 AM

May 06, 2014

Ph.D. alumna

What if the sexual predator image you have in your mind is wrong?

(I wrote the following piece for Psychology Today under the title “Sexual Predators: The Imagined and the Real.”)

If you’re a parent, you’ve probably seen the creepy portraits of online sexual predators constructed by media: The twisted older man, lurking online, ready to abduct a naive and innocent child and do horrible things. If you’re like most parents, the mere mention of online sexual predators sends shivers down your spine. Perhaps it prompts you to hover over your child’s shoulder or rally your school to host online safety assemblies.

But what if the sexual predator image you have in your mind is wrong? And what if that inaccurate portrait is actually destructive?

When it comes to child safety, the real statistics don’t stop parental worry. Exceptions dominate the mind. The facts highlight how we fail to protect those teenagers who are most at-risk for sexual exploitation online.

If you poke around, you may learn that 1 in 7 children are sexually exploited online. This data comes from the very reputable Crimes Against Children Research Center; however, very few take the time to read the report carefully. Most children are sexually solicited by their classmates, peers, or young adults just a few years older than they are. And most of these sexual solicitations don’t upset teens. Alarm bells should go off over the tiny percentage of youth who are upsettingly solicited by people who are much older than them. No victimization is acceptable, but we need to drill into understanding who is at risk and why if we want to intervene.

The same phenomenal research group, led by David Finkelhor, went on to analyze the recorded cases of sexual victimization linked to the internet and identified a disturbing pattern. These encounters weren’t random. Rather, those who were victimized were significantly more likely to be from abusive homes, grappling with addiction or mental health issues, and/or struggling with sexual identity. Furthermore, the recorded incidents showed a more upsetting dynamic. By and large, these youth portrayed themselves as older online, sought out interactions with older men, talked about sex online with these men, met up knowing that sex was in the cards, and did so repeatedly because they believed that they were in love. These teenagers are being victimized, but the go-to solutions of empowering parents, educating youth about strangers, or verifying the age of adults won’t put a dent in the issue. These youth need professional help. We need to think about how to identify and support those at-risk, not build another ad campaign.

What makes our national obsession with sexual predation destructive is that it is used to justify systematically excluding young people from public life, both online and off. Stopping children from connecting to strangers is seen as critical for their own protection, even though learning to navigate strangers is a key part of growing up. Youth are discouraged from lingering in public parks or navigating malls without parental supervision. They don’t learn how to respectfully and conscientiously navigate new people because they are taught to fear all who are unknown.

The other problem with our obsession with sexual predators is that it distracts parents and educators. Everyone rallies to teach children to look out for and fear rare dangers without giving them the tools for managing more common forms of harm that they might encounter. Far too many young people are raped and sexually victimized in this country. Only a minuscule number of them are harmed at the hands of strangers, online or off. Most who will be abused will suffer at the hands of their classmates and peers.

In a culture of abstinence-only education, schools don’t want to address any aspect of sexual and reproductive health for fear of upsetting parents. As a result, we fail to give young people the tools to handle sexual victimization. When the message is “just say no,” we shame young people who were sexually abused or violated.

It’s high time that we walk away from our nightmare scenarios and focus on addressing the serious injustices that exist. The world we live in isn’t fair and many youth who are most at-risk do not have concerned parents looking out for them. Because we have stopped raising children as a community, adults are often too afraid to step on other parents’ toes. Yet, we need adults who are looking out for more than just their children. Furthermore, our children need us to talk candidly about sexual victimization without resorting to boogeymen.

While it’s important to protect youth from dangers, a society based on fear-mongering is not healthy. Let’s instead talk about how we can help teenagers be passionate, engaged, constructive members of society rather than how we can protect them from statistically anomalous dangers. Let’s understand those teens who are truly at risk; these teens often have the least support.

(This piece was first published at Psychology Today.)

by zephoria at May 06, 2014 01:17 AM

May 04, 2014

Ph.D. student

Re: The Great Works of Software

Hi Paul,

This "Great Works of Software" piece is fantastic. Of course I want to correct it, and I'm sure everyone does and I'm fairly confident that was the intention of it, and getting everyone to reflect and debate the greatest pieces of software is as worthy of an intention for a blog post (even one hosted on Medium) as any I can think of.

I don't dispute any of your five [0], but I was surprised by something: where are the Internet and the Web? Sure, the Web is a little young at 25, but it's old enough to have been declared dead a good handful of times and the Internet calls Word and Photoshop young whippersnappers. Does the Web satisfy your criteria of everyday, meaningful use? Of course. But I'm guessing that you didn't just forget the Web when writing about meaningful software. Instead, I suspect you very intentionally chose [1] to leave these out to illustrate an important point: that the Web isn't a single piece of software in the same sense that the programs you listed are.

The Web is made up of software (and hardware): web server software running on millions of machines all around the world; user agents running on every client machine we can think of (desktop, mobile, laptop, refrigerator); proxies and caching middleboxes; DNS servers; software and firmware running on routers and switches, in Internet Exchange Points and Internet Service Providers; software not included in this classification; crawlers constantly indexing and archiving Web pages; open source libraries which encrypt communications for Transport Layer Security; et cetera. But even if one had an overly-simplified view of Web architecture (and I wouldn't criticize anyone for this; this is the poor-man's Web architecture that I teach students all the time) consisting of servers and browsers, anyone would see that there's no singular piece of software involved. You mentioned the TCP/IP stack as a runner up, but there's no single TCP/IP implementation that's particularly great or important: what's important is that separate implementations of the relevant IETF standards interoperate [2]. Other listmakers included a browser (Kirschenbaum highlighted Mosaic [3]; PC World, Navigator) or you could imagine listing Apache as a canonical server (and the corresponding foundation and software development methodology), but even as important as those pieces were (and are!), alone they just don't make a difference.

As a thought experiment then, I submit a preliminary list for a Web software canon, listing not single pieces of software but systems of software, standards and people.

Non-exhaustive, of course, but I hope it's helpful for your next blog post, which I hope to see. Is there something distinctive to these systems of software that are intrinsically tied up with the communities that use and develop them? Whole publics that are recursive, say [4]? I hope there are a few people out there writing books and dissertations about that. (I should really get back to writing that prospectus.)


[0] Okay, I'm skeptical about Emacs -- isn't the operating system/joining of small software pieces already well-covered by Unix?

[1] By the Principle of Charity.

[2] It might be tempting, for someone who works on Web standards like I do, to claim that the Web is really just a set of interoperable standards, but that's nonsense as soon as I think about it at all. Sure, I think standards are important, but a standard without an implementation is just a bit of text somewhere. And of course, that's not hypothetical at all: standards without widespread implementation are commonplace, and bittersweet.

[3] Also, Kirschenbaum includes Hypercard in his list, with a reference to Vannevar Bush and the Memex, which I love, and it might be the closest in these lists to something that looks like the Web/hypertext but in non-networked single-piece-of-software form.

[4] Kelty, Christopher M. Two Bits: The Cultural Significance of Free Software. Duke University Press Books, 2008.

by at May 04, 2014 05:12 AM


Can you say "sacrilegious"? Can I?

Every year I serve as the judge for an Author's Spelling Bee held to benefit the nonprofit outfit Small Press Distribution in Berkeley, which distributes works (poetry, fiction, journals, translations) published by several hundred  independent presses (check them out, really). My job is a lot less harrowing than being a competitor: I just say the words to the authors and then signal their success (bell) or failure (slide whistle), the latter obliging them to flip over their name tags and withdraw from the competition. The only place where linguistic expertise plays any role is in helping the organizers winnow down the word list they've prepared, suggesting additions and deleting items that are too obscure or that have several alternate spellings (acknowledgment, say). The best words to use are the relatively familiar but tricky ones that trip up many literate people but not everyone—braggadocio, supersede, absorption, minuscule—the object being to neither eliminate everybody on the first go round (which could easily happen with an item like dieffenbachia) nor let the affair run on more than 45 minutes or so. Oh, and I check beforehand to make sure I actually know how to pronounce the words. Which is what sent me to Merriam-Webster to see what they gave for sacrilegious:

That's all?? How about Oxford Dictionaries?

And the redoubtable American Heritage, of whose usage panel I hold the august title of chair emeritus?

Hmm. What's an honest judge to do? Granted,  the [-lɪdʒəs] pronunciation (rhymes with prestigious) is by far more common than the historically correct [-li:dʒəs] (rhymes with egregious), which is the only pronunciation given in the OED's first edition (the second accepts both).

That's almost certainly because everyone folk-etymologizes the word as a derivative of religious, rather than of sacrilege, which in turn is why so many people misspell it with the i and e reversed (we lost two or three competitors before Daniel Levin Becker nailed it). And when the preponderance of literate usage favors a particular pronunciation, how could it be other than correct? As H. W. Fowler wrote, "Pronounce as your neighbours do, not better; for words in general use your neighbour is the general public." Indeed, as early as 1912, the author of a book called Correct Pronunciation was enjoining readers to say the word with /i:/ rather than /I/, which means of course that the latter was already common.

But Fowler's rule doesn't carry over to orthography: no dictionary would think of recording the word as sacreligious, even as an alternate. And if you do know how it's spelled, is it right to accede to a pronunciation that is based on what you know to be a misconception about how it's derived and written? Shouldn't the dictionary at least nod to [li:dʒəs] as an alternate, as the OED Second Edition did? It makes you wonder whether the editors themselves fretted over this, or did the people who recorded the words just use the pronunciation they were familiar with without giving it a thought?

So what to do? I couldn't bring myself to say it as if it were spelled sacreligious, not just because it would have encouraged the competitors to spell it that way, but also because, let's be frank, I couldn't bear to imagine that someone might think I didn't know myself how the word was spelled or where it came from. But I couldn't say it to rhyme with egregious (or for that matter sortilegious), which would not only suggest an unseasonable ostentation of learning (as Johnson once defined pedantry) but would likely have given the orthographic game away. So I sort of swallowed the vowel, and nobody seemed the wiser.

Fortunately, it's not a problem you're apt to encounter outside of a spelling bee. If you need the concept you could go with any of a number of near-synonyms whose pronunciation presents no problems, like impious. And anyway, what would you have done in my place—other, I mean, than judiciously dropping the word from the list?

by Geoff Nunberg at May 04, 2014 04:19 AM

MIMS 2012

Matthew Carter’s “My Life in Typefaces”

I just got around to watching Matthew Carter’s excellent TED talk, “My Life in Typefaces”. In it, he talks about his experience designing type for the past 5 decades, and how technical constraints influenced his designs. The central question he tries to answer is, “Does a constraint force a compromise? By accepting a constraint, are you working to a lower standard?” This is a question that comes up in every discipline, and with every technological change. Matthew Carter’s take on this subject is interesting because he’s experienced numerous technological changes, and has designed superb typefaces for all of them.

At first blush, it’s easy to conclude that constraints force designers to compromise their vision. But design isn’t produced in a vacuum, and ultimately must be realized through one or more mediums (print, screen, radio, etc.). Therefore, one must work within constraints to produce the best designs. To do so, designers must understand the technology that enables their designs to be experienced, be it code, the printing process, and so on. As Matthew Carter said in this talk, “I’m a pragmatist, not an idealist, out of necessity,” which is a valuable lesson that all designers should take to heart.

by Jeff Zych at May 04, 2014 03:24 AM

May 01, 2014

Ph.D. alumna

New White House Report on Big Data

I’m delighted to see that the White House has just released its report on “big data” — “Big Data: Seizing Opportunities, Preserving Values” along with an amazing collection of supporting documents. This report is the culmination of a 90-day review by the Administration, spearheaded by Counselor John Podesta. I’ve had the fortune to be a part of this process and have worked hard to share what I know with Podesta and his team.

In January, shortly after the President announced his intention to reflect on the role of big data and privacy in society, I received a phone call from Nicole Wong at the Office of Science and Technology Policy, asking if I’d help run one of the three public conferences that the Administration hoped to co-host as part of this review. Although I was about to embark on a book tour, I enthusiastically agreed, both because the goal of the project aligned brilliantly with what I was hoping to achieve with my new Data & Society Research Institute and also because one does not say no when asked to help Nicole (or the President). We hadn’t intended to publicly launch Data & Society until June nor did we have all of the infrastructure necessary to run a large-scale event, but we had passion and gumption so we teamed up with the great folks at New York University’s Information Law Institute (directed by the amazing Helen Nissenbaum) and called on all sorts of friends and collaborators to help us out. It was a bit crazy at times, but we did it.

In under six weeks, our amazing team produced six guiding documents and crafted a phenomenal event called The Social, Cultural & Ethical Dimensions of “Big Data.” On our conference page, you can find an event summary, videos of the sessions, copies of the workshop primers and discussion notes, a zip file of important references, and documents that list participants, the schedule, and production team. This amazing event was made possible through the generous gifts and institutional support of: Alfred P. Sloan Foundation, Ford Foundation, John D. and Catherine T. MacArthur Foundation, the John S. and James L. Knight Foundation, Microsoft Research, and Robert Wood Johnson Foundation. (These funds were not solicited or collected on behalf of the Office of Science & Technology Policy (OSTP), or the White House. Acknowledgment of a contributor by the Data & Society Research Institute does not constitute an endorsement by OSTP or the White House.) Outcomes from this event will help inform the National Science Foundation-supported Council on Social, Legal, and Ethical aspects of Big Data (spearheaded by the conference’s steering committee: danah boyd, Geoffrey C. Bowker, Kate Crawford, and Helen Nissenbaum). And, of course, the event we hosted helped shape the report that was released today.

Words cannot express how grateful I am to see the Administration seriously reflect on the issues of discrimination and power asymmetries as they grapple with both the potential benefits and consequences of data-centric technological development. Discrimination is a tricky issue, both because of its implications on individuals and because of what it means for society as a whole. In teasing out the issues of discrimination and big data, my colleague Solon Barocas pointed me to this fantastic quote by Alistair Croll:

Perhaps the biggest threat that a data-driven world presents is an ethical one. Our social safety net is woven on uncertainty. We have welfare, insurance, and other institutions precisely because we can’t tell what’s going to happen — so we amortize that risk across shared resources. The better we are at predicting the future, the less we’ll be willing to share our fates with others.

Navigating the messiness of “big data” requires going beyond common frames of public vs. private, collection vs. usage. Much to my frustration, the conversation around the “big data” phenomenon tends to get quickly polarized – it’s good or it’s bad, plain and simple. But it’s never that simple. The same tools that streamline certain practices and benefit certain individuals can have serious repercussions for other people and for our society as a whole. As the quote above hints at, what’s at stake is the very essence of our societal fabric. Building a healthy society in a data-centric world requires keeping one eye on the opportunities and one eye on the potential risks. While it’s not perfect, the report from the White House did a darn good job of striking this balance.

Not only did the White House team tease out many core issues for both public and private sector, but they helped scaffold a framework for policy makers. The recommendations they offer aren’t silver bullets, but they are reasonable first steps. Many will inevitably argue that they don’t go far enough (or, in some cases, go too far) – and I can definitely get nitpicky here – but that’s par for the course. This doesn’t dampen my appreciation. I’m still uber grateful to see the Administration take the time to tease out the complexity of the issues and offer a path forward that is not simply polarizing.

Please take a moment to read this important report. I’d love to hear your thoughts. Data & Society would love to hear your thoughts. And if you’re curious to know more about what I’ll be doing next with this Research Institute, please join our newsletter.

Psst: Academics – check out the last line of the report, which ends on page 68. Science and Technology Studies for teh win!

(Flickr credit: Stuart Richards)

by zephoria at May 01, 2014 05:48 PM

April 30, 2014

Ph.D. student

G-Code Visualization



This project is a G-Code visualizer, so that you can learn how 3D printing works without actually having to go out and buy a 3D printer. While it may sound difficult, I find the process of 3D printing to be deceptively simple and I’m hoping to communicate that to others using this visualization. G-Code can be described as the language that 3D printers speak. If you have a 3D model on your computer that you want to print out, your 3D printer converts the model into a series of steps that it has to draw in order to make the model. Those steps are specified in G-Code.

We humans like to think of 3D objects in logical units. For instance, if asked to build a model of a house, we would probably make four walls and then a roof. A 3D printer “thinks” about 3D objects differently. It cuts the model into a series of slices and then draws those slices layer by layer. G-Code just tells the printer the lines to draw. While it sounds complicated, it’s really quite simple. I find this process to be one of the most interesting parts of 3D printing. From an art perspective, these paths are fascinating and a fun way to rethink how we conceptualize structures in 3D.

I looked online for a place where I could visualize G-Code models and the exact process in which they are constructed and found one project that came pretty close to what I was after. I forked the project and built my own code on top of it to get an even more in-depth look at the paths. I wanted to see how the printer head would move through every layer of the model. I think it’s particularly fun to watch how shapes morph as you “fly” through the layers of a model.
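
The slice-and-draw idea can be sketched in a few lines of Python. This is only an illustration and not the code behind the visualizer: it assumes absolute coordinates and absolute extrusion values, handles only G0/G1 moves, and the sample G-Code and the function names parse_words and segments_by_layer are invented for the example.

```python
from collections import defaultdict

# A tiny hand-written sample: two lines on one layer, one line on the next.
SAMPLE_GCODE = """\
G1 Z0.2 F1200 ; move to first layer height
G1 X0 Y0 F3000 ; travel to start (no extrusion)
G1 X10 Y0 E1.0 ; draw a line while extruding
G1 X10 Y10 E2.0
G1 Z0.4 ; move up to second layer
G1 X0 Y10 E3.0
"""


def parse_words(line: str) -> dict:
    """Strip the ';' comment and return the letter/value words of a G0/G1 line."""
    code = line.split(";", 1)[0].split()
    if not code or code[0] not in ("G0", "G1"):
        return {}
    return {word[0]: float(word[1:]) for word in code[1:]}


def segments_by_layer(gcode: str) -> dict:
    """Group extruding moves into line segments keyed by Z height (the layer)."""
    x = y = z = e = 0.0
    layers = defaultdict(list)
    for line in gcode.splitlines():
        words = parse_words(line)
        if not words:
            continue
        new_x, new_y = words.get("X", x), words.get("Y", y)
        z = words.get("Z", z)
        new_e = words.get("E", e)
        # Only moves that push out filament and change XY leave a drawn line.
        if new_e > e and (new_x, new_y) != (x, y):
            layers[z].append(((x, y), (new_x, new_y)))
        x, y, e = new_x, new_y, new_e
    return dict(layers)


if __name__ == "__main__":
    for height, segments in sorted(segments_by_layer(SAMPLE_GCODE).items()):
        print(height, segments)
```

Stepping through the layers in order of height and drawing each list of segments is, roughly, what "flying" through a model amounts to.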

by admin at April 30, 2014 07:15 PM

April 29, 2014

Ph.D. alumna

Rule #1: Do no harm.

Rule #2: Fear-mongering causes harm.

I believe in the enterprise of journalism, even when it lets me down in practice. The fourth estate is critically important for holding systems of power accountable. But what happens when journalists do harm?

On Sunday, a salacious article flew across numerous news channels. In print, it was given titles like “Teenagers can no longer tell the real world from the internet, study claims” (Daily Mail) and “Real world v online world: teens do not distinguish” (The Telegraph). This claim can’t even pass the basic sniff test, but it was picked up by news programs and reproduced on blogs.

The articles make reference to a “Digital Lives” study produced by Vodafone and Google, but there’s nothing in the articles themselves that even supports the claims made by the headlines. No quotes from the authors, no explanation, no percentages (even though it’s supposedly a survey study). It’s not even remotely clear how the editors came up with that title because it’s 100% disconnected from the article itself.

So I decided to try to find the study. It’s not online. There’s a teaser page by the firm that appears to have run the study. Interestingly, they argue that the methodology was qualitative, not a survey. And it sounds like the study is about resilience and cyberbullying. Perhaps one of the conclusions is that teens don’t differentiate between bullying at school and cyberbullying? That would make sense.

Yesterday, I got a couple of pings about this study. Each time, I asked the journalist if they could find the study because I’d be happy to analyze it. Nada. No one had seen any evidence of the claim except for the salacious headline flying about. This morning, I went to do some TV for my book. Even though I had told the production team that this headline made no sense and there was no evidence to even support it, they continued to run with the story because the producer had decided that it was an important study. And yet, the best they could tell me is that they had reached out to the original journalist who said that he had interviewed the people who ran the study.

Why why why do journalists feel the need to spread these kinds of messages even once they know that there’s no evidence to support those claims? Is it the pressure of 24/7 news? Is it a Milgram-esque hierarchy where producers/editors push for messages and journalists/staffers conform even though they know better because they simply can’t afford to question their superiors given the state of journalism?

I’d get it if journalists really stood by their interpretations even though I disagreed with them. I can even stomach salacious headlines that are derived from the story. And as much as I hate fear-mongering in general, I can understand how it emerges from certain stories. But since when did the practice of journalism allow for uncritically making shit up? ::shaking head:: Where’s the fine line between poor journalism and fabrication?

As excited as I am to finally have my book out, it’s been painful to have to respond to some of the news coverage. I mean, it’s one thing to misunderstand cyberbullying but what reasonable person can possibly say with a straight face that today’s youth can no longer distinguish between the internet and everyday life!?!? Gaaaah.

(Image by Reuben Stanton)

by zephoria at April 29, 2014 12:44 PM

April 25, 2014

Ph.D. alumna

Rekindling my blogging practice: Why I’m part of “The Message” on Medium

When I started blogging in 1997, it was a social practice. It was something that my friend Andrew and I started doing connected to an independent study that we were taking at Brown. As a result, we read each other’s work and talked about it, online and off. My practice evolved over the years as I switched to LiveJournal and then forked my blogging into public and private practices. When blogging was “cool” in the mid-2000s, I was immersed in a blogging community where we were all reading and thinking about each other’s writing. As more and more people caught onto blogging, the practice became professionalized and my blog professionalized alongside that transformation. I still get angry and frustrated (“someone is wrong on the internet!!”) which prompts me to blog rants and essays, but blogging hasn’t really felt social in a long time. And I’ve been sad at how little I’ve written in recent years, especially as I reflect on all sorts of things happening around me.

A few weeks ago, the good folks at Medium came to me with an interesting proposal. They asked if I’d be willing to be a regular contributor to a collection they were putting together. Rather than simply offering me a platform with a large audience, they offered me something else: a small community to blog with. To my delight, that community included all sorts of old friends as well as folks who I don’t know but respect. Members of the group who are much funnier than I am concluded that the collection should be labeled: “The Message: A Pandaemonium Revolver Collection.”

This is an experiment. We’re trying to figure out what it means to incentivize each other in our writing, to spark ideas for each other, and to give feedback to each other as we blog about all sorts of things. For me, this is an opportunity to step back and think about blogging whatever’s on my mind – not just research-driven essays or angry rants, but reflections and commentary and all sorts of other good stuff. Per my agreement with Medium, I will not be reposting here what I blog there until 30 days after the post originally goes up on Medium. But hopefully this arrangement will allow me to start really engaging with the practice of blogging again.

I have just posted my first post: A Dazzling Film About Youth in the Early 20th Century, which is a review of the beautiful and brilliant new film “Teenage” which is currently making its rounds in independent theaters in the United States.

by zephoria at April 25, 2014 03:24 PM

April 24, 2014

MIMS 2012



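* Willpower is the harnessing of three forces: "I will," "I won't," and "I want." The part of the body that governs these three forces is the prefrontal cortex.

* Five minutes of meditation practice a day helps train willpower.

* Remember to think twice before you act: when you are not in good physical or mental shape, it is best not to make impulsive decisions.

* Self-control is a matter not only of psychology but even more of physiology, and also of environment (what you eat, where you live).

* Exercising, getting good sleep, eating healthily, spending quality time with friends and family, and taking part in religious activities all build up your willpower reserve. The best medicine for self-control is exercise! The most effective dose of exercise for improving mood and relieving stress is five minutes at a time.

* Relaxation lets you restore your willpower reserve.

* The mortal enemy of self-control is stress, whether psychological or physiological. To win a willpower challenge, we need to get into the right state of mind and body: deep, slow breathing, going out for a walk, sleep, and relaxation are all ways to do that.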

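* People's willpower is strongest in the morning and then gradually weakens as the day goes on. Willpower is a consumable: each time people use it, they become weaker.

* If you want to thoroughly change an old habit, it is best to find a simple way to train self-control and build willpower, rather than setting an overly ambitious goal.

* People do not really have limits; fatigue is just a bodily response produced by the brain.

* If we think of our behavior as having a good side and a bad side, good behavior always gives us permission to do something a little bad. Do not mistake an action that supports a goal for the goal itself; doing one thing consistent with your goal does not mean you are out of danger. Don't ask how much progress you have made; ask whether you have reached the goal.

* There is no difference between today and tomorrow, so if tomorrow's task can be done today, do it today.

* For better self-control, forget about virtue and focus on goals and values. Drop the excuses you use to give yourself permission, and keep your reasons firmly in mind.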

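* Some things naturally become associated with reward. For example, with repeatedly answering email, going on Facebook, or visiting porn sites, we naturally assume there is a chance of getting a reward. The body then produces dopamine, which gets us excited.

* A surge of dopamine amplifies the pleasure of instant gratification and makes you stop caring about long-term consequences.

* Reports of death in the news make viewers respond more positively to luxury cars, Rolex watches, and other expensive goods. When we realize that we will not live forever, we give in to temptations more easily.

* Self-criticism lowers motivation and self-control, and it is also the factor most likely to lead to depression. Self-compassion, by contrast, raises motivation and self-control. What strengthens a sense of responsibility is not guilt but self-forgiveness. Research has found that, in the face of personal setbacks, people who take a self-compassionate attitude are more willing to take responsibility than people who are self-critical, and they are also more willing to accept feedback and advice from others.

* Delay discounting: the longer you have to wait for a reward, the less it is worth to you. Once you know what triggers a craving, put it out of sight and it will no longer tempt you.

* The "ten-minute rule": if someone wants to quit smoking, then whenever he badly wants a cigarette he must tell himself that he can smoke after waiting ten minutes. After a few rounds of this, he will be able to shift his attention and forget the urge to smoke.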

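* The most patient children are usually more popular, get the best grades, and are good at handling stress.

* Most people want to avoid losing; losing 50 bucks usually affects us much more than gaining 50 bucks.

* Some economists hold that burning your boats is the best method of self-control. To achieve our goals, we must limit our options and force ourselves to act.

* Several kinds of things are contagious: 1) Unconscious mimicry: when one person crosses their arms, the person talking with them will soon cross theirs too. 2) Emotional contagion: the laugh track in a TV sitcom infects the audience into laughing. 3) When we see others give in to temptation, we do too: others eat that much, so I can eat that much; others win that much money, so I should raise my bet too.

* Catch self-control from others: when you need some extra willpower, set up a role model for yourself and ask, what would that strong-willed person do?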

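* The power of pride: make your willpower challenge public, and imagine how proud you will be once you succeed at it.

* Make the willpower challenge a group project: can you beat the others at the challenge?

* Ironic Rebound: when people try not to think about something, they end up thinking about it more than when they do not control their thoughts at all, and even more than when they deliberately try to think about it. The effect is strongest when a person is stressed, tired, or distracted.

* The way to resolve Ironic Rebound is... to give up control and let ourselves think about the thing naturally! Don't suppress yourself! The more you suppress negative emotions, the more likely you are to become depressed.

* If something is troubling you, turn your attention to your bodily sensations (tension, heart rate); once you have observed those sensations, turn your attention to your breathing.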


by admin at April 24, 2014 07:44 PM

April 17, 2014

MIMS 2012

If you’re a tech worker in California, make sure you’re not getting screwed

(obligatory disclaimer: I am not an employment lawyer, nor a lawyer of any sort.)

For my first software engineering job out of grad school, I was offered a salary of $80,000 a year. Said job was full-time, non-hourly, and located in San Francisco. I negotiated, because negotiation is Something You’re Supposed to Do, and got a bump up to $82K. Not bad, for 20 minutes of extremely uncomfortable phone conversation and feeling a bit like a bitch!

Today I learned that the salary I was originally offered was illegally low. My negotiation just pushed it back into the legal range. And when 2013 rolled around, even that amount was below the minimum salary for exempt (meaning non-overtime-paying) computer programming jobs under California law. I wasn’t (just?) paid less relative to other tech workers in the frothy, dysfunctional Bay Area tech sector: I was legally underpaid!

This job wasn’t at a tiny three-person startup without a clue, either. This was a 40+ person company, with actual in-house HR. A generous interpretation is that a lot of tech companies out there are unaware of this law–certainly most individual employees are! A few might be banking on that ignorance, though. Protect yourself.

What the law says

(Again, I am totally not a lawyer. This is just what you get once you learn to google “Computer Software Occupations Exemption”.)

By default, jobs in California entail an hourly wage with time-and-a-half overtime, lunch breaks, and other requirements. Under California law, there are a few classes of job that are exempt from these requirements (hence why on more old-school job sites and payroll tools you see the term Exempt)–for example, taxi drivers and farmers. Among white-collar jobs, “Executive”, “Administrative”, and “Professional” positions are also exempt, subject to a number of requirements, including a minimum salary corresponding to double the local minimum wage times 40 hours a week. These three exceptions do not apply to most tech workers.
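To make the "double the local minimum wage times 40 hours a week" floor concrete, here is a minimal arithmetic sketch. The function name, the 52-week annualization, and the example wage are all illustrative assumptions, not figures from the statute or from this post; plug in whatever minimum wage actually applies to you.

```python
def white_collar_exempt_floor(min_wage_per_hour, hours_per_week=40, weeks_per_year=52):
    """Annualize 'double the minimum wage times 40 hours a week'.

    The 52-week annualization is an assumption made for illustration.
    """
    weekly_floor = 2 * min_wage_per_hour * hours_per_week
    return weekly_floor * weeks_per_year


# Placeholder hourly wage for illustration only; not taken from the post.
example_wage = 8.00
print(white_collar_exempt_floor(example_wage))
```

The point of the sketch is only that this floor scales with the minimum wage, which is why it sits far below the separate software-specific minimum discussed next.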

Unlike federal law, and most other states to the best of my knowledge, California has an additional class of exempt job: one for “Computer Software Occupations”. For these positions, the minimum salary is higher. It also gets adjusted each fall by the Division of Labor Statistics and Research to keep up with cost of living. What that amounts to in terms of full-time minimum salary for a given year:

Minimum Computer Software Occupation Salary

2008: $75,000
2009: $79,050
2010: $79,050
2011: $79,050
2012: $81,026.25
2013: $83,132.93
2014: $84,130.53

“Computer Software Occupations” is defined as:

California Labor Code §515.5
(a) Except as provided in subdivision (b), an employee in the computer software field shall be exempt from the requirement that an overtime rate of compensation be paid pursuant to Section 510 if all of the following apply:
(1) The employee is primarily engaged in work that is intellectual or creative and that requires the exercise of discretion and independent judgment, and the employee is primarily engaged in duties that consist of one or more of the following:
(A) The application of systems analysis techniques and procedures, including consulting with users, to determine hardware, software, or system functional specifications.
(B) The design, development, documentation, analysis, creation, testing, or modification of computer systems or programs, including prototypes, based on and related to, user or system design specifications.
(C) The documentation, testing, creation, or modification of computer programs related to the design of software or hardware for computer operating systems.

Employees who are still learning (e.g. interns, others unable to work independently without close supervision), IT workers, people who mainly work on hardware instead of software, copywriters, and special effects artists and similar movie industry employees are exceptions to the above. (Check the full snippet of the law.)

If you think your job fits this category, you make less than $84,130.53 working full-time, and you work in California: now would be a nice time to talk to your employer’s HR department and/or consult with an actual lawyer.
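If you'd rather check a number than eyeball the table, here is a small sketch that encodes the minimums listed above and compares a salary against them. The function names are made up for illustration, and treating my original offer as a 2012 offer is an inference from the timeline in this post rather than something the table itself says. It is a sanity check, not legal advice.

```python
# Full-time minimum salaries for the Computer Software Occupations exemption,
# copied from the table in this post.
MIN_EXEMPT_SALARY = {
    2008: 75_000.00,
    2009: 79_050.00,
    2010: 79_050.00,
    2011: 79_050.00,
    2012: 81_026.25,
    2013: 83_132.93,
    2014: 84_130.53,
}


def meets_minimum(year, annual_salary):
    """True if a full-time salary is at or above that year's listed minimum."""
    return annual_salary >= MIN_EXEMPT_SALARY[year]


def shortfall(year, annual_salary):
    """Dollars by which the salary falls short of the minimum (0.0 if it does not)."""
    return round(max(0.0, MIN_EXEMPT_SALARY[year] - annual_salary), 2)


# Assuming the offer was made in 2012 (inferred, not stated outright):
print(meets_minimum(2012, 80_000))  # the original offer
print(meets_minimum(2012, 82_000))  # after negotiating
print(shortfall(2013, 82_000))      # once the 2013 minimum kicked in
```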

If you’re an employer and think this law is weird and overpays some roles: sure, whatever, but this is the freaking law. Your options are: get out of California, make your less-well-paid tech employees non-exempt (with overtime and the myriad other headaches that entails), or bump people’s salaries up. Don’t be a jerk.

by Karen at April 17, 2014 08:16 AM

MIMS 2012

Designing with instinct vs. data

Braden Kowitz wrote a great article exploring the ever increasing tension between making design decisions based on instinct versus data. As he says, “It’s common to think of data and instincts as being opposing forces in design decisions.” This is especially true at an A/B testing company — we have a tendency to quantify and measure everything.

He goes on to say, “In reality, there’s a blurry line between the two,” and I couldn’t agree more. When I’m designing, I’m in the habit of always asking, “What data would help me make this decision?” Sometimes it’s usage logs, sometimes it’s user testing, sometimes it’s market research, and sometimes there isn’t anything but my own intuition. Even when there is data, it’s all just input I use to help me reach a decision. It’s not a good idea to blindly follow data, but it’s equally bad to only use your gut. As Braden said, it’s important to balance the two.

by Jeff Zych at April 17, 2014 05:27 AM

April 14, 2014

Ph.D. alumna

Whether it’s bikes or bytes, teens are teens

(This piece was written for the LA Times, where it was published as an op-ed on April 11, 2014.)

If you’re like most middle-class parents, you’ve probably gotten annoyed with your daughter for constantly checking her Instagram feed or with your son for his two-thumbed texting at the dinner table. But before you rage against technology and start unfavorably comparing your children’s lives to your less-wired childhood, ask yourself this: Do you let your 10-year-old roam the neighborhood on her bicycle as long as she’s back by dinner? Are you comfortable, for hours at a time, not knowing your teenager’s exact whereabouts?

What American children are allowed to do, and what they are not, has shifted significantly over the last 30 years, and the changes go far beyond new technologies.

If you grew up middle-class in America prior to the 1980s, you were probably allowed to walk out your front door alone and, provided it was still light out and you had done your homework, hop on your bike and have adventures your parents knew nothing about. Most kids had some kind of curfew, but a lot of them also snuck out on occasion. And even those who weren’t given an allowance had ways to earn spending money: by delivering newspapers, say, or baby-sitting neighborhood children.

All that began to change in the 1980s. In response to anxiety about “latchkey” kids, middle- and upper-class parents started placing their kids in after-school programs and other activities that filled up their lives from morning to night. Working during high school became far less common. Not only did newspaper routes become a thing of the past but parents quit entrusting their children to teenage baby-sitters, and fast-food restaurants shifted to hiring older workers.

Parents are now the primary mode of transportation for teenagers, who are far less likely to walk to school or take the bus than any previous generation. And because most parents work, teens’ mobility and ability to get together casually with friends has been severely limited. Even sneaking out is futile, because there’s nowhere to go. Curfew, trespassing and loitering laws have restricted teens’ presence in public spaces. And even if one teen has been allowed out independently and has the means to do something fun, it’s unlikely her friends will be able to join her.

Given the array of restrictions teens face, it’s not surprising that they have embraced technology with such enthusiasm. The need to hang out, socialize, gossip and flirt hasn’t diminished, even if kids’ ability to get together has.

After studying teenagers for a decade, I’ve come to respect how their creativity, ingenuity and resilience have not been dampened even as they have been misunderstood, underappreciated and reviled. I’ve watched teenage couples co-create images to produce a portrait of intimacy when they lack the time and place to actually kiss. At a more political level, I’ve witnessed undocumented youth use social media to rally their peers and personal networks to speak out in favor of the Dream Act, even going so far as to orchestrate school walkouts and local marches.

This does not mean that teens always use the tools around them for productive purposes. Plenty of youth lash out at others, emulating a pervasive culture of meanness and cruelty. Others engage in risky behaviors, seeking attention in deeply problematic ways. Yet, even as those who are hurting others often make visible their own personal struggles, I’ve met alienated LGBT youth for whom the Internet has been a lifeline, letting them see that they aren’t alone as they struggle to figure out whom to trust.

And I’m on the board of Crisis Text Line, a service that connects thousands of struggling youth with counselors who can help them. Technology can be a lifesaver, but only if we recognize that the Internet makes visible the complex realities of people’s lives.

As a society, we both fear teenagers and fear for them. They bear the burden of our cultural obsession with safety, and they’re constantly used as justification for increased restrictions. Yet, at the end of the day, their emotional lives aren’t all that different from those of their parents as teenagers. All they’re trying to do is find a comfortable space of their own as they work out how they fit into the world and grapple with the enormous pressures they face.

Viewed through that prism, it becomes clear how the widespread embrace of technology and the adoption of social media by kids have more to do with non-technical changes in youth culture than with anything particularly compelling about those tools. Snapchat, Tumblr, Twitter and Facebook may be fun, but they’re also offering today’s teens a relief valve for coping with the increased stress and restrictions they encounter, as well as a way of being with their friends even when their more restrictive lives keep them apart.

The irony of our increasing cultural desire to protect kids is that our efforts may be harming them. In an effort to limit the dangers they encounter, we’re not allowing them to develop skills to navigate risk. In our attempts to protect them from harmful people, we’re not allowing them to learn to understand, let alone negotiate, public life. It is not possible to produce an informed citizenry if we do not first let people engage in public.

Treating technology as something to block, limit or demonize will not help youth come of age more successfully. If that’s the goal, we need to collectively work to undo the culture of fear and support our youth in exploring public life, online and off.

(More comments can be found over at the LA Times.)

by zephoria at April 14, 2014 03:07 PM

April 07, 2014

Ph.D. student

Why we need good computational models of peace and love

“Data science” doesn’t refer to any particular technique.

It refers to the cusp of the diffusion of computational methods from computer science, statistics, and applied math (the “methodologists”) to other domains.

The background theory of these disciplines, whose origin we can trace at least as far back as cybernetics research in the 1940s, is required to understand the validity of these “data science” technologies as scientific instruments, just as a theory of optics is necessary to know the validity of what is seen through a microscope. Kuhn calls these kinds of theoretical commitments “instrumental commitments.”

For most domain sciences, instrumental commitment to information theory, computer science, etc. is not problematic. It is more so with some social sciences which oppose the validity of totalizing physics or formalism.

There aren’t a lot of them left because our mobile phones more or less instrumentally commit us to the cybernetic worldview. Where there is room for alternative metaphysics, it is because of the complexity of emergent/functional properties of the cybernetic substrate. Brier’s Cybersemiotics is one formulation of how richer communicative meaning can be seen as an evolved structure on top of cybernetic information processing.

If “software is eating the world” and we don’t want it to eat us (metaphorically! I don’t think the robots are going to kill us–I think that corporations are going to build robots that make our lives miserable by accident), then we are going to need to have software that understands us. That requires building out cybernetic models of human communication to be more understanding of our social reality and what’s desirable in it.

That’s going to require cooperation between techies and humanists in a way that will be trying for both sides but worth the effort, I think.

by Sebastian Benthall at April 07, 2014 11:34 PM

April 03, 2014

Ph.D. alumna

Is the Oculus Rift sexist? (plus response to criticism)

Last week, I wrote a provocative opinion piece for Quartz called “Is the Oculus Rift sexist?” I’m reposting it on my blog for posterity, but also because I want to address some of the critiques that I received. First, the piece itself:

Is the Oculus Rift sexist?

In the fall of 1997, my university built a CAVE (Cave Automatic Virtual Environment) to help scientists, artists, and archeologists embrace 3D immersion to advance the state of those fields. Ecstatic at seeing a real-life instantiation of the Metaverse, the virtual world imagined in Neal Stephenson’s Snow Crash, I donned a set of goggles and jumped inside. And then I promptly vomited.

I never managed to overcome my nausea. I couldn’t last more than a minute in that CAVE and I still can’t watch an IMAX movie. Looking around me, I started to notice something. By and large, my male friends and colleagues had no problem with these systems. My female peers, on the other hand, turned green.

What made this peculiar was that we were all computer graphics programmers. We could all render a 3D scene with ease. But when asked to do basic tasks like jump from Point A to Point B in a Nintendo 64 game, I watched my female friends fall short. What could explain this?

At the time any notion that there might be biological differences underpinning computing systems was deemed heretical. Discussions of gender and computing centered around services like Purple Moon, a software company trying to entice girls into gaming and computing. And yet, what I was seeing gnawed at me.

That’s when a friend of mine stumbled over a footnote in an esoteric army report about simulator sickness in virtual environments. Sure enough, military researchers had noticed that women seemed to get sick at higher rates in simulators than men. While they seemed to be able to eventually adjust to the simulator, they would then get sick again when switching back into reality.

Being an activist and a troublemaker, I walked straight into the office of the head CAVE researcher and declared the CAVE sexist. He turned to me and said: “Prove it.”

The gender mystery

Over the next few years, I embarked on one of the strangest cross-disciplinary projects I’ve ever worked on. I ended up in a gender clinic in Utrecht, in the Netherlands, interviewing both male-to-female and female-to-male transsexuals as they began hormone therapy. Many reported experiencing strange visual side effects. Like adolescents going through puberty, they’d reach for doors—only to miss the door knob. But unlike adolescents, the length of their arms wasn’t changing—only their hormonal composition.

Scholars in the gender clinic were doing fascinating research on tasks like spatial rotation skills. They found that people taking androgens (steroid hormones such as testosterone) improved at tasks that required them to rotate Tetris-like shapes in their mind to determine if one shape was simply a rotation of another shape. Meanwhile, male-to-female transsexuals saw a decline in performance during their hormone replacement therapy.

Along the way, I also learned that there are more sex hormones on the retina than in anywhere else in the body except for the gonads. Studies on macular degeneration showed that hormone levels mattered for the retina. But why? And why would people undergoing hormonal transitions struggle with basic depth-based tasks?

Two kinds of depth perception

Back in the US, I started running visual psychology experiments. I created artificial situations where different basic depth cues—the kinds of information we pick up that tell us how far away an object is—could be put into conflict. As the work proceeded, I narrowed in on two key depth cues – “motion parallax” and “shape-from-shading.”

Motion parallax has to do with the apparent size of an object. If you put a soda can in front of you and then move it closer, it will get bigger in your visual field. Your brain assumes that the can didn’t suddenly grow and concludes that it’s just got closer to you.

Shape-from-shading is a bit trickier. If you stare at a point on an object in front of you and then move your head around, you’ll notice that the shading of that point changes ever so slightly depending on the lighting around you. The funny thing is that your eyes actually flicker constantly, recalculating the tiny differences in shading, and your brain uses that information to judge how far away the object is.

In the real world, both these cues work together to give you a sense of depth. But in virtual reality systems, they’re not treated equally.

The virtual-reality shortcut

When you enter a 3D immersive environment, the computer tries to calculate where your eyes are at in order to show you how the scene should look from that position. Binocular systems calculate slightly different images for your right and left eyes. And really good systems, like good glasses, will assess not just where your eye is, but where your retina is, and make the computation more precise.

It’s super easy (if you determine the focal point and do your linear matrix transformations accurately, which for a computer is a piece of cake) to render motion parallax properly. Shape-from-shading is a different beast. Although techniques for shading 3D models have greatly improved over the last two decades, to the point that a computer can now render an object as if it were lit by a complex collection of light sources of all shapes and colors, what they can’t do is simulate how that tiny, constant flickering of your eyes affects the shading you perceive. As a result, 3D graphics does a terrible job of truly emulating shape-from-shading.
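To see why the size-with-distance cue from the soda can example is the cheap one for a renderer, here is a minimal sketch under a simple pinhole-camera assumption. The function name, the focal length, and the can height are illustrative choices of mine, not values from the CAVE or from my experiments.

```python
def projected_height(object_height, distance, focal_length=1.0):
    """Apparent (image-plane) height of an object under a pinhole-camera model.

    Similar triangles give: projected = focal_length * height / distance.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return focal_length * object_height / distance


can_height = 0.12  # metres; an illustrative value only
far = projected_height(can_height, distance=2.0)
near = projected_height(can_height, distance=1.0)

# Halving the distance doubles the apparent height of the can.
print(near / far)
```

One division per object is all this cue costs. Nothing comparably simple exists for the eye-flicker-dependent shading changes, which is the asymmetry the rest of this piece is about.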

Tricks of the light

In my experiment, I tried to trick people’s brains. I created scenarios in which motion parallax suggested an object was at one distance, and shape-from-shading suggested it was further away or closer. The idea was to see which of these conflicting depth cues the brain would prioritize. (The brain prioritizes between conflicting cues all the time; for example, if you hold out your finger and stare at it through one eye and then the other, it will appear to be in different positions, but if you look at it through both eyes, it will be on the side of your “dominant” eye.)

What I found was startling (pdf). Although there was variability across the board, biological men were significantly more likely to prioritize motion parallax. Biological women relied more heavily on shape-from-shading. In other words, men are more likely to use the cues that 3D virtual reality systems rely on.

This, if broadly true, would explain why I, being a woman, vomited in the CAVE: My brain simply wasn’t picking up on signals the system was trying to send me about where objects were, and this made me disoriented.

My guess is that this has to do with the level of hormones in my system. If that’s true, someone undergoing hormone replacement therapy, like the people in the Utrecht gender clinic, would start to prioritize a different cue as their therapy progressed.

We need more research

However, I never did go back to the clinic to find out. The problem with this type of research is that you’re never really sure of your findings until they can be reproduced. A lot more work is needed to understand what I saw in those experiments. It’s quite possible that I wasn’t accounting for other variables that could explain the differences I was seeing. And there are certainly limitations to doing vision experiments with college-aged students in a field whose foundational studies are based almost exclusively on college-aged males. But what I saw among my friends, what I heard from transsexual individuals, and what I observed in my simple experiment led me to believe that we need to know more about this.

I’m excited to see Facebook invest in Oculus, the maker of the Rift headset. No one is better poised to implement Stephenson’s vision. But if we’re going to see serious investments in building the Metaverse, there are questions to be asked. I’d posit that the problems of nausea and simulator sickness that many people report when using VR headsets go deeper than pixel persistence and latency rates.

What I want to know, and what I hope someone will help me discover, is whether or not biology plays a fundamental role in shaping people’s experience with immersive virtual reality. In other words, are systems like Oculus fundamentally (if inadvertently) sexist in their design?

Response to Criticism

1. “Things aren’t sexist!”

Not surprisingly, most people who responded negatively to my piece were up in arms about the title. Some people directed that at Quartz, which was somewhat unfair. Although they originally altered the title, they reverted to my title within a few hours. My title was intentionally, “Is the Oculus Rift sexist?” This is both a genuine question and a provocation. I’m not naive enough to think that people wouldn’t react strongly to the question, just as my advisor did when I declared VR sexist almost two decades ago. But I want people to take that question seriously precisely because more research needs to be done.

Sexism is prejudice or discrimination on the basis of sex (typically against women). For sexism to exist, there does not need to be an actor intending to discriminate. People, systems, and organizations can operate in sexist manners without realizing it. This is the basis of implicit or hidden biases. Addressing sexism starts by recognizing bias within systems and discrimination as a product of systems in society.

What was interesting about what I found and what I want people to investigate further is that the discrimination that I identified is not intentional by scientists or engineers or simply the product of cultural values. It is a byproduct of a research and innovation cycle that has significant consequences as society deploys the resultant products. The discriminatory potential of deployment will be magnified if people don’t actively seek to address it, which is precisely why I dredged up this ancient work in this moment in time.

I don’t think that the creators of Oculus Rift have any intentions to discriminate against women (let alone the wide range of people who currently get nauseous in their system which is actually quite broad), but I think that if they don’t pay attention to the depth cue prioritization issues that I’m highlighting or if they fail to actively seek technological redress, they’re going to have a problem. More importantly, many of us are going to have a problem. All too often, systems get shipped with discriminatory byproducts and people throw their hands in the air and say, “oops, we didn’t intend that.”

I think that we have a responsibility to identify and call attention to discrimination in all of its forms. Perhaps I should’ve titled the piece “Is Oculus Rift unintentionally discriminating on the basis of sex?” but, frankly, that’s nothing more than an attempt to ask the question I asked in a more politically correct manner. And the irony of this is that the people who most frequently complained to me about my titling are those who loathe political correctness in other situations.

I think it’s important to grapple with the ways in which sexism is not always intentional but at the very basis of our organizations and infrastructure, as well as our cultural practices.

2. The language of gender

I ruffled a few queer feathers by using the terms “transsexual” and “biological male.” I completely understand why contemporary transgender activists (especially in the American context) would react strongly to that language, but I also think it’s important to remember that I’m referring to a study from 1997 in a Dutch gender clinic. The term “cisgender” didn’t even exist. And at that time, in that setting, the women and men that I met adamantly deplored the “transgender” label. They wanted to make it crystal clear that they were transsexual, not transgender. To them, the latter signaled a choice.

I made a choice in this essay to use the language of my informants. When referring to men and women who had not undergone any hormonal treatment (whether they be cisgender or not), I added the label of “biological.” This was the language of my transsexually-identified informants (who, admittedly, often shortened it to “bio boys” and “bio girls”). I chose this route because the informants for my experiment identified as female and male without any awareness of the contested dynamics of these identifiers.

Finally, for those who are not enmeshed in the linguistic contestations over gender and sex, I want to clarify that I am purposefully using the language of “sex” and not “gender” because what’s at stake has to do with the biological dynamics surrounding sex, not the social construction of gender.

Get angry, but reflect and engage

Critique me, challenge me, tell me that I’m a bad human for even asking these questions. That’s fine. I want people to be provoked, to question their assumptions, and to reflect on the unintentional instantiation of discrimination. More than anything, I want those with the capacity to take what I started forward. There’s no doubt that my pilot studies are the beginning, not the end of this research. If folks really want to build the Metaverse, make sure that it’s not going to unintentionally discriminate on the basis of sex because no one thought to ask if the damn thing was sexist.

by zephoria at April 03, 2014 11:35 PM

April 01, 2014

MIMS 2004

Imagine We Had No Transaction Receipts...

So, imagine you go to the store and ask to buy a coffee. There is no cash register and no transaction receipt is given to you, but you are handed the coffee. They don't say anything. Your payment is invisible. You don't know how much it will be, but you agree to the opaque terms. If you get food poisoning later, it's going to be a huge hassle proving you were there, but it's possible. However, the authorities in charge of checking out food poisoning issues would need some proof. Maybe you threw away the cup, maybe you still have it. Maybe there is video surveillance and maybe not. No receipt for tax purposes, or for proving the cost from the vendor, or for your expense report, or as documentation of what you purchased; no warranty or food safety proof; no date or time or place or anything. You just have a cup of coffee.

That's what it's like to go to a vendor online or on your phone, make an account and share some data. You do get something, but you don't really know what you "paid," you have no receipt after you agreed to get the service, and you have nothing from the vendor, other than maybe the confirmation email you received.

Now imagine the opposite: You go to a digital vendor, you see the service's rating in a crowdsourced or professional review of the way the company will treat your personal data, and you see a comparison of how other similar services would treat your data. You pick one, and "consent" to share your information. A consent receipt is built that shows you the vendor's TOU and Privacy Policy, the Consumer Reports-style rating and comparison as of the consent date, the Date, Time and Jurisdiction you are in, your identifier, your terms such as a DNT signal, and the Jurisdictional requirements for treating personal data and consent. And your receipt is sent to you and to the vendor. Some statistics hit the public website, depersonalized but showing the world how vendors are doing with personal data consents.

And you have a tweet that thanks the vendors doing good with your data, and asks the ones doing poorly why they aren't doing better. That is the Open Notice and Consent Receipt system from the user perspective. Think something like this:
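As a rough data-structure sketch of the receipt described above (not the actual Open Notice format; every field name, type, and example value here is a hypothetical chosen for illustration), the pieces could be held together like this:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentReceipt:
    """Hypothetical record of one consent event; field names are illustrative."""
    vendor: str
    terms_of_use_url: str
    privacy_policy_url: str
    rating: str                      # review-style rating at the time of consent
    consented_at: datetime           # date and time of consent
    jurisdiction: str                # where the user was when consenting
    user_identifier: str
    user_terms: dict = field(default_factory=dict)   # e.g. a DNT signal
    jurisdictional_requirements: tuple = ()


def public_statistics_view(receipt):
    """Depersonalized copy for public statistics: drops the user-specific fields."""
    record = asdict(receipt)
    record.pop("user_identifier")
    record.pop("user_terms")
    return record


receipt = ConsentReceipt(
    vendor="example-vendor",
    terms_of_use_url="https://vendor.example/tou",
    privacy_policy_url="https://vendor.example/privacy",
    rating="B",
    consented_at=datetime(2014, 4, 1, 21, 51, tzinfo=timezone.utc),
    jurisdiction="US-CA",
    user_identifier="user-123",
    user_terms={"DNT": "1"},
)
print(sorted(public_statistics_view(receipt)))
```

The same record goes to both the user and the vendor, while only the stripped-down view would feed the public statistics.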

April 01, 2014 09:51 PM

March 31, 2014

MIMS 2004

"Big Data," if Unspecific, is Ridiculous

Here is a more specific look at what Big Data means as a term. There is your data: "little data" that, when you share it, is wrapped around you as the user and centralized. "Big Data" in this first sense is really just a large amount of "Little Data." Then there is Big Data that you as a user co-create with a vendor or service, which is relatable back to you but is wrapped around objects, data models and identifiers that are first about the object and not about you. And then there is aggregated data that is depersonalized, though it may still be possible with some detective work to find you. My point in making this distinction is that talking about Big Data in an unspecific manner is a great opportunity to misunderstand, to miss potential solutions that apply to some parts of this scale but not all, and to talk past each other when we are discussing problems and solutions in the privacy arena.

March 31, 2014 07:40 PM

March 30, 2014

Ph.D. student

starting with a problem

The feedback I got on my dissertation prospectus draft when I presented it to my colleagues was that I didn’t start with a problem and then argue from there how my dissertation was going to be about a solution.

That was really great advice.

The problem of problem selection is a difficult one. “What is a problem?” is a question that basically nobody asks. Lots of significant philosophical traditions maintain that it’s the perception of problems as problems that is the problem. “Just chill out,” say Great Philosophical Traditions. This does not help one orient one’s research dissertation.

A lot of research is motivated by interest in particular problems like an engineering challenge or curing cancer. I’ve somehow managed to never acquire the kind of expertise that would allow me to address any of these specific useful problems directly. My mistake.

I’m a social scientist. There are a lot of social problems, right? Of course. However, there’s a problem here: identifying any problem as a problem in the social domain immediately implicates politics.

Are there apolitical social problems? I think I’ve found some. I had a great conversation last week with Anna Salamon about Global Catastrophic Risks. Those sound terrible! It echoes the work I used to do in support of Disaster Risk Reduction, except that there is more acknowledgment in the GCR space that some of the big risks are man-made.

So there’s a problem: arguably research into the solutions to these problems is good. On the other hand, that research is complicated by the political entanglement of the researchers, especially in the university setting. It took some convincing, but OK, those politics are necessarily part of the equation. Put another way, if there wasn’t the political complexity, then the hard problems wouldn’t be such hard problems. The hard problems are hard partly because they are so political. (This difference in emphasis is not meant to preclude other reasons why these problems are hard; for example, because people aren’t smart or motivated enough.)

Given that the political complexity is getting in the way of our solving hard problems efficiently–because these problems require collaboration across political lines, and because the inherent politics of language choice and framing create complexity that is orthogonal to the problem solution (is it?)–infrastructural solutions that manage that political complexity can be helpful.

(Counterclaim: the political complexity is not illogical complexity, rather scientific logic is partly political logic. We live in the best of all possible worlds. Just chill out. This is an empirical claim.)

The promise of computational methods for interdisciplinary collaboration is that they allow for more efficient distribution of cognitive labor across the system of investigators. Data science methodologists can build tools for investigation that work cross-disciplinarily, and the interaction between these tools can follow an apolitical logic in a way that discursive science cannot. Teleologically, we get an Internet of Scientific Things, an autonomous scientific apparatus, and draw your own eschatological conclusions.

An interesting consequence of algorithmically mediated communication is that you don’t actually need consensus to coordinate collective action. I suppose this is an argument Hayekians etc. have been making for a long time. However, the political maintenance of the system that ensures the appropriate incentive structures is itself prone to being hacked, and herein lies the problem. That and the insufficiency of the total neurological market apparatus (in Hayek’s vision) to do anything like internalize the externalities of e.g. climate change, while the Bitcoin servers burn and burn and burn.

by Sebastian Benthall at March 30, 2014 08:54 PM

March 28, 2014

Ph.D. student


This article is making me doubt some of my earlier conclusions about the role of the steering media. Habermas, I’ve got to concede, is dated. As much as skeptics would like to show how social media fails to ‘democratize’ media (not in the sense of being justly won by elections, but rather in the original sense of being mob ruled), the fragmentation is real and the public is reciprocally involved in its own narration.

What can then be said of the role of new media in public discourse? Here are some hypotheses:

  • As a first order effect, new media exacerbates shocks, both endogenous and exogenous. See Didier Sornette’s work on the application of self-exciting Hawkes processes to social systems like finance and Amazon reviews. (I’m indebted to Thomas Maillart for introducing me to this research.) This changes the dynamics because rather than being Poisson distributed, new media intervention is strategically motivated.
  • As a second order effect, since new media is acting strategically, it must make predictive assessments of audience receptivity. New media suppliers must anticipate and cultivate demand. But demand is driven partly by environmental factors like information availability. See these notes on Dewey’s ethical theory for how taste can be due to environmental adaptation with no truly intrinsic desire–hence, the inappropriateness of modeling these dynamics straightforwardly with ‘utility functions’–which upsets neoclassical market modeling techniques. Hence the ‘social media marketer’ position that engages regularly in communication with an audience in order to cultivate a culture that is also a media market. Microcelebrity practices achieve not merely a passively received branding but an actively nurtured communicative setting. Communication here is transmission (Shannon, etc.) and/or symbolic interaction, on which community (Carey) supervenes.
  • Though not driven by neoclassical market dynamics simpliciter, new media is nevertheless competitive. We should expect new media suppliers to be fluidly territorial. This creates a higher-order incentive for curatorial intervention to maintain and distinguish one’s audience as culture. A critical open question here is to what extent these incentives drive endogenous differentiation, vs. to what extent media fragmentation results in efficient allocation of information (analogously to efficient use of information in markets). There is no a priori reason to suppose that the ad hoc assemblage of media infrastructures and regulations minimizes negative cultural externalities. (What are examples of negative cultural externalities? Fascism, ….)
  • Different media markets will have different dialects, which will have different expressive potential because of description lengths of concepts. (Algorithmic information theoretic interpretation of weak Sapir-Whorf hypothesis.) This is unavoidable because man is mortal (cannot approach convergent limits in a lifetime.) Some consequences (which have taken me a while to come around to, but here it is):
    1. Real intersubjective agreement is only provisionally and locally attainable.
    2. Language use, as a practical effect, has implications for future computational costs and therefore is intrinsically political.
    3. The poststructuralists are right after all. ::shakes fist at sky::
    4. That’s ok, we can still hack nature and create infrastructure; technical control resonates with physical computational layers that are not subject to wetware limitations. This leaves us, disciplinarily, with post-positivist engineering, post-structuralist hermeneutics enabling only provisional consensus and collective action (which can, at best, be ‘society made durable’ via technical implementation or cultural maintenance (see above on media market making)), and critical reflection (advancing social computation directly).
  • There is a challenge to Pearl/Woodward causality here, in that mechanistic causation will be insensitive to higher-order effects. A better model for social causation would be Luhmann’s autopoiesis (cf. Brier, 2008). Ecological modeling (Ulanowicz) provides the best toolkit for showing interactions between autopoietic networks?
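
For concreteness, the self-exciting dynamics in the first hypothesis can be sketched as a Hawkes conditional intensity: a constant baseline rate plus a decaying bump from every past event. This is a minimal sketch assuming an exponential kernel; the function name and the parameter values (`mu`, `alpha`, `beta`) are illustrative assumptions and are not taken from Sornette’s work:

```python
import math


def hawkes_intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.0):
    """Conditional intensity of a Hawkes process with an exponential kernel.

    mu    -- baseline (exogenous) event rate
    alpha -- jump in intensity contributed by each past event
    beta  -- decay rate of that contribution
    Only events strictly before t contribute.
    """
    excitation = sum(alpha * math.exp(-beta * (t - ti))
                     for ti in event_times if ti < t)
    return mu + excitation
```

With no past events the intensity is just the baseline `mu`, which is the plain Poisson case; each additional past event raises the rate of further events, which is the “exacerbates shocks” behavior.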

This is not helping me write my dissertation prospectus at all.

by Sebastian Benthall at March 28, 2014 05:23 PM

March 27, 2014

Ph.D. alumna

Parentology: The first parenting book I actually liked

As a researcher and parent, I quickly learned that I have no patience for parenting books. When I got pregnant, I started trying to read parenting books and I threw more than my fair share of them across the room. I either get angry at the presentation of the science or annoyed at the dryness of the writing. Worse, the prescriptions make me furious because anyone who tells you that there’s a formula to parenting is lying. My hatred of parenting books was really disappointing because I didn’t want to have to do a literature review whenever I wanted to know what research said about XYZ. I actually want to understand what the science says about key issues of child development, childrearing, and parenting. But I can’t stomach the tone of what I normally encounter.

So when I learned that Dalton Conley was writing a book on parenting, my eyebrows went up. I’ve always been a huge fan of his self-deprecating autobiographical book Honky because it does such a fantastic job of showcasing research on race and class. This made me wonder what he was going to do with a book on parenting.

Conley did not disappoint. His new book Parentology is the first parenting book that I’ve read that I actually enjoyed and am actively recommending to others. Conley’s willingness to detail his own failings, neuroses, and foolish logic (and to smack himself upside the head with research data in the process) showcases the trials and tribulations of parenting. Even experts make a mess of everything, but watching them do so so spectacularly lets us all off the hook. If you read this book, you will learn a lot about parenting, even if it doesn’t present the material in a how-to fashion. Instead, this book highlights the chaos that ensues when you try to implement science on the ground. Needless to say, hilarity ensues.

If you need some comic relief, pick up this book. It’s a fantastic traversal of contemporary research presented in a fashion that will have you rolling on the floor laughing. Lesson #1: If you buy your children pet guinea pigs to increase their exposure to allergens, make sure that they’re unable to mate.

by zephoria at March 27, 2014 07:56 PM

March 22, 2014

Ph.D. student

Knight News Challenge applications

The Knight News Challenge applications are in and I find them a particularly exciting batch this year, perhaps because of a burst of activity spurred on by a handful of surveillance revelations you might have heard about. I read through all 660: below is my list of promising applications from friends and colleagues. I’m sure there are many more awesome ones, including some I already “applauded”, but I thought a starter list would still be useful. Go applaud these and add comments to help them improve.

Which are your favorites that I’ve missed? I’m keeping a running list here:

Encrypt all the things

Mailpile - secure e-mail for the masses!

Making secure email (using the OpenPGP standard) easier by developing an awesome native email client where encryption is built-in. They already have an alpha running that you might have seen on Kickstarter.

Encryption Usability Prize

Peter Eckersley, just over the Bay at EFF, wants to develop criteria for an annual prize for usable encryption software. (Noticing a theme to these encryption projects yet?) Notes SOUPS (CMU’s conference on usable security, happening this summer at Facebook) as a venue for discussion.

LEAP Encryption Access Project: Tools for Creating an Open, Federated and Secure Internet

LEAP is a project for developing a set of encryption tools, including proxies, email (with automatic key discovery) and chat, in an effort to make encryption the default for a set of at-risk users. (My colleague Harry Halpin at W3C works with them, and it all sounds very powerful.)

TextSecure: Simple Private Communication For Everyone

TextSecure is likely the most promising protocol and software project for easy-to-use widely adopted asynchronous encrypted messaging. (Android users should be using the new TextSecure already, fyi; it basically replaces your SMS app but allows for easy encryption.) Moxie (formerly of Twitter) is pretty awesome and it’s an impressive team.


Speaking of encryption, there are two proposals for standards work directly related to encryption and security.

Advancing DANE (DNS-Based Authentication of Named Entities) to Secure the Internet’s Transport Layer

This one may sound a little deep in the weeds, but DANE is a standard which promises end-to-end transport security on the Internet via DNSSEC, without relying on the brittle Certificate Authority system. Yay IETF!

Improved Privacy and Security through Web Standards

My colleagues at W3C are working on WebCrypto — a set of APIs for crypto to be implemented in the browser so that all your favorite Web applications can start implementing encryption without all making the same mistakes. Also, and this is of particular interest to me, while we’ve started to do privacy reviews of W3C specs in general via the Privacy Interest Group, this proposal suggests dedicated staff to provide privacy/security expertise to all those standards groups out there from the very beginning of their work.

Open Annotations for the Web (with lots of I School connections!) has been contributing to standards for Web annotations, so that we can all share the highlights and underlines and comments we make on web pages; they’re proposing to hire a developer to work with W3C on those standards.

Open Notice & Consent Receipts

A large handful of us I School alumni have been working in some way or another on the idea of privacy icons or standardized privacy notices. Mary Hodder proposes funding that project, to work on these notices and a “consent receipt” so you’ll know what terms you’ve accepted once you do.

Documenting practices, good and bad

Usable Security Guides for Strengthening the Internet

Joe Hall, CDT chief technologist and I School alumnus extraordinaire, has an awesome proposal for writing guides for usable security. Because it doesn’t matter how good the technology is if you don’t learn how to use it.

Transparency Reporting for Beginners: A Starter Kit and Best Practices Guide for Internet Companies, and a Readers’ Guide for Consumers, Journalists, & Advocates

Kevin Bankston (formerly CDT, formerly formerly EFF) suggests a set of best practices for transparency reports, the new hot thing in response to surveillance, but lacking standards and guidelines.

The positive projects in here naturally seem easier to build and less-likely to attract controversy, but these evaluative projects might also be important for encouraging improvement:

Ranking Digital Rights: Holding tech companies accountable on freedom of expression and privacy

@rmack on annual ranking of companies on their free expression and privacy practices.

Exposing Privacy and Security Practices: An online resource for evaluation and advocacy

CDT’s Justin Brookman on evaluating data collection and practices, particularly for news and entertainment sites.

IndieWeb and Self-Hosting

IndieWeb Fellowships for the Independent and Open Web

I’ve been following and participating in this #indieweb thing for a while now. While occasionally quixotic, I think the trend of building working interoperable tools that rely as little as possible on large centralized services is one worth applauding. This proposal from @caseorganic suggests “fellowships” to fund the indie people building these tools.

Idno: a collective storytelling platform that supports the diversity of the web

And @benwerd is one of these people building easy-to-use software for your own blog, not controlled by anyone else. Idno is sweet software and Ben and Erin are really cool.


Even if you had your own domain name, would you still forward all your email through GMail or Hotmail or some free webmail service with practices you might not understand or appreciate? This project is for “a one-click, easy-to-deploy SMTP server: a mail server in a box.”

Superuser: Internet homeownership for anyone

Eric Mill (@konlone) has been working on a related project, to make it end-user easy to install self-hosted tools (like Mail-in-a-box, or personal blog software, or IFTTT) on a machine you control, so that it’s not reserved for those of us who naturally take to system administration. (Also, Eric is super cool.)

by at March 22, 2014 11:15 PM

March 21, 2014

Ph.D. alumna

Why Snapchat is Valuable: It’s All About Attention

Most people who encounter a link to this post will never read beyond this paragraph. Heck, most people who encountered a link to this post didn’t click on the link to begin with. They simply saw the headline, took note that someone over 30 thinks that maybe Snapchat is important, and moved onto the next item in their Facebook/Twitter/RSS/you-name-it stream of media. And even if they did read it, I’ll never know it because they won’t comment or retweet or favorite this in any way.

We’ve all gotten used to wading in streams of social media content. Open up Instagram or Secret on your phone and you’ll flick on through the posts in your stream, looking for a piece of content that’ll catch your eye. Maybe you don’t even bother looking at the raw stream on Twitter. You don’t have to because countless curatorial services like digg are available to tell you what was most important in your network. Facebook doesn’t even bother letting you see your raw stream; their algorithms determine what you get access to in the first place (unless, of course, someone pays to make sure their friends see their content).

Snapchat offers a different proposition. Everyone gets hung up on how the disappearance of images may (or may not) afford a new kind of privacy. Adults fret about how teens might be using this affordance to share inappropriate (read: sexy) pictures, projecting their own bad habits onto youth. But this isn’t what makes Snapchat utterly intriguing. What makes Snapchat matter has to do with how it treats attention.

When someone sends you an image/video via Snapchat, they choose how long you get to view the image/video. The underlying message is simple: You’ve got 7 seconds. PAY ATTENTION. And when people do choose to open a Snap, they actually stop what they’re doing and look.

In a digital world where everyone’s flicking through headshots, images, and text without processing any of it, Snapchat asks you to stand still and pay attention to the gift that someone in your network just gave you. As a result, I watch teens choose not to open a Snap the moment they get it because they want to wait for the moment when they can appreciate whatever is behind that closed door. And when they do, I watch them tune out everything else and just concentrate on what’s in front of them. Rather than serving as yet-another distraction, Snapchat invites focus.

Furthermore, in an ecosystem where people “favorite” or “like” content that is inherently unlikeable just to acknowledge that they’ve consumed it, Snapchat simply notifies the creator when the receiver opens it up. This is such a subtle but beautiful way of embedding recognition into the system. Sometimes, a direct response is necessary. Sometimes, we need nothing more than a simple nod, a way of signaling acknowledgement. And that’s precisely why the small little “opened” note will bring a smile to someone’s face even if the recipient never said a word.

Snapchat is a reminder that constraints have a social purpose, that there is beauty in simplicity, and that the ephemeral is valuable. There aren’t many services out there that fundamentally question the default logic of social media and, for that, I think that we all need to pay attention to and acknowledge Snapchat’s moves in this ecosystem.

(This post was originally published on LinkedIn. More comments can be found there.)

by zephoria at March 21, 2014 03:34 PM

March 20, 2014

Ph.D. student

real talk

So I am trying to write a dissertation prospectus. It is going…OK.

The dissertation is on Evaluating Data Science Environments.

But I’ve been getting very distracted by the politics of data science. I have been dealing with the politics by joking about them. But I think I’m in danger of being part of the problem, when I would rather be part of the solution.

So, where do I stand on this, really?

Here are some theses:

  • There is a sense of “data science” that is importantly different from “data analytics”, though there is plenty of abuse of the term in an industrial context. That claim is awkward because industry can easily say they “own” the term. It would be useful to lay out specifically which computational methods constitute “data science” methods and which don’t.
  • I think that it’s useful analytically to distinguish different kinds of truth claims because it sheds light on the value of different kinds of inquiry. There is definitely a place for rigorous interpretive inquiry and critical theory in addition to technical, predictive science. I think politicking around these divisions is lame and I only talk about it to make fun of the situation.
  • New computational science techniques have done and will continue to do amazing work in the physical and biological and increasingly environmental sciences. I am jealous of researchers in those fields because I think that work is awesome. For some reason I am a social scientist.
  • The questions surrounding the application of data science to social systems (which can include environmental systems) are very, very interesting. Qualitative researchers get defensive about their role in “the age of data science” but I think this is unwarranted. I think it’s the quantitative social science researchers who are likely more threatened methodologically. But since I’m not well-trained as a quantitative social scientist really, I can’t be sure of that.
  • The more I learn about research methods (which seems to be all I study these days, instead of actually doing research–I’m procrastinating), the more I’m getting a nuanced sense of how different methods are designed to address different problems. Jockeying about which method is better is useless. If there is a political battle I think is worth fighting any more, it’s the battle about whether or not transdisciplinary research is productive or possible. I hypothesize that it is. But I think this is an empirical question whose answer may be specific: how can different methods be combined effectively? I think this question gets quite deep and answering it requires getting into epistemology and statistics in a serious way.
  • What is disruptive about data science is that some people have dug down into statistics in a serious way, come up with a valid general way of analyzing things, and then automated it. That makes it in theory cheaper to pick up and apply than the quantitative techniques used by other researchers, and usable at larger scale. On the whole this is pretty good, though it is bad when people don’t understand how the tools they are using work. Automating science is a pretty good thing over all.
  • It’s really important for science, as it is automated, to be built on open tools and reproducible data because (a) otherwise there is no reason why it should have the public trust, and (b) it will remove barriers to training new scientists.
  • All scientists are going to need to know how to program. I’m very fortunate to have a technical background. A technical background is not sufficient to do science well. One can use technical skills to assist in both qualitative (visualization) and quantitative work. The ability to use tools is orthogonal to the ability to study phenomena, despite the historic connection between mathematics and computer science.
  • People conflate programming, which is increasingly a social and trade skill, with the ability to grasp high level mathematical concepts.
  • Computers are awesome. The people that make them better deserve the credit they get.
  • Sometimes I think: should I be in a computer science department? I think I would feel better about my work if I were in CS. I like the feeling of tangible progress and problem solving. I think there are a lot of really important problems to solve, and that the solutions will likely come from computer science related work. What I think I get from being in a more interdisciplinary department is a better understanding of what problems are worth solving. I don’t mean that in a way that diminishes the hard work of problem solving, which I think is really where the rubber hits the road. It is easy to complain. I don’t work as hard as computer science students. I also really like being around women. I think they are great and there aren’t enough of them in computer science departments.
  • I’m interested in modeling and improving the cooperation around open scientific software because that’s where I see some real potential value add. I’ve been an engineer and I’ve managed engineers. Managing engineers is a lot harder than engineering, IMO. That’s because management requires navigating a social system. Social systems are really absurdly complicated compared to even individual organisms.
  • There are three reasons why it might be bad to apply data science to social systems. The first is that it could lead to extraordinarily terrible death robots. My karma is on the line. The second is that the scientific models might be too simplistic and lead to bad decisions that are insensitive to human needs. That is why it is very, very important that the existing wealth of social scientific understanding is not lost but rather translated into a more robust and reproducible form. The third reason is that social science might be in principle impossible due to its self-referential effects. This would make the whole enterprise a colossal waste of time. The first and third reasons frequently depress me. The second motivates me.
  • Infrastructure and mechanism design are powerful means of social change, perhaps the most powerful. Movements are important but civil society is so paralyzed by the steering media now that it is more valuable to analyze movements as sociotechnical organizations alongside corporations etc. than to view them in isolation from the technical substrate. There are a variety of ideological framings of this position, each with different ideological baggage. I’m less concerned with that, ultimately, than the pragmatic application of knowledge. I wish people would stop having issues with “implications for design.”
  • I said I wanted to get away from politics, but this is one other political point I actually really think is worth making, though it is generally very unpopular in academia for obvious reasons: the status differential between faculty and staff is an enormous part of the problem of the dysfunction of universities. A lot of disciplinary politics are codifications of distaste for certain kinds of labor. In many disciplines, graduate students perform labor unexpertly in service of their lab’s principal investigators; this labor is a way of paying one’s dues that has little to do with the intellectual work of their research expertise. Or is it? It’s entirely unclear, especially when what makes the difference between a good researcher and a great one are skills that have nothing to do with their intellectual pursuit, and when mastering new tools is so essential for success in one’s field. But the PIs are often not able to teach these tools. What is the work of research? Who does it? Why do we consider science to be the reserve of a specialized medieval institution, and call it something else when it is done by private industry? Do academics really have a right to complain about the rise of the university administrative class?

Sorry, that got polemical again.

by Sebastian Benthall at March 20, 2014 05:45 AM

March 17, 2014

Ph.D. alumna

TIME Magazine Op-Ed: Let Kids Run Wild Online

I wrote the following op-ed for TIME Magazine. This was published in the March 13, 2014 issue under the title “Let Kids Run Wild Online.” To my surprise and delight, the op-ed was featured on the cover of the magazine.

Trapped by helicopter parents and desperate to carve out a space of their own, teens need a place to make mistakes.

Bicycles, roller skates and skateboards are dangerous. I still have scars on my knees from my childhood run-ins with various wheeled contraptions. Jungle gyms are also dangerous; I broke my left arm falling off one. And don’t get me started on walking. Admittedly, I was a klutzy kid, but I’m glad I didn’t spend my childhood trapped in a padded room to protect me from every bump and bruise.

“That which does not kill us makes us stronger.” But parents can’t handle it when teenagers put this philosophy into practice. And now technology has become the new field for the age-old battle between adults and their freedom-craving kids.

Locked indoors, unable to get on their bicycles and hang out with their friends, teens have turned to social media and their mobile phones to gossip, flirt and socialize with their peers. What they do online often mirrors what they might otherwise do if their mobility weren’t so heavily constrained in the age of helicopter parenting. Social media and smartphone apps have become so popular in recent years because teens need a place to call their own. They want the freedom to explore their identity and the world around them. Instead of sneaking out (should we discuss the risks of climbing out of windows?), they jump online.

As teens have moved online, parents have projected their fears onto the Internet, imagining all the potential dangers that youth might face–from violent strangers to cruel peers to pictures or words that could haunt them on Google for the rest of their lives.

Rather than helping teens develop strategies for negotiating public life and the potential risks of interacting with others, fearful parents have focused on tracking, monitoring and blocking. These tactics don’t help teens develop the skills they need to manage complex social situations, assess risks and get help when they’re in trouble. Banning cell phones won’t help a teen who’s in love cope with the messy dynamics of sexting. “Protecting” kids may feel like the right thing to do, but it undermines the learning that teens need to do as they come of age in a technology-soaked world.

The key to helping youth navigate contemporary digital life isn’t more restrictions. It’s freedom–plus communication. Famed urban theorist Jane Jacobs used to argue that the safest neighborhoods were those where communities collectively took interest in and paid attention to what happened on the streets. Safety didn’t come from surveillance cameras or keeping everyone indoors but from a collective willingness to watch out for one another and be present as people struggled. The same is true online.

What makes the digital street safe is when teens and adults collectively agree to open their eyes and pay attention, communicate and collaboratively negotiate difficult situations. Teens need the freedom to wander the digital street, but they also need to know that caring adults are behind them and supporting them wherever they go. The first step is to turn off the tracking software. Then ask your kids what they’re doing when they’re online–and why it’s so important to them.

by zephoria at March 17, 2014 01:31 AM

March 16, 2014


"Slide down my cellar door"

In a 2010 NYT “On Language” column, Grant Barrett traced the claim that “cellar door” is the most beautiful phrase in English back as far as 1903. I posted on the phrase a few years ago ("The Romantic Side of Familiar Words"), suggesting that there was a reason why linguistic folklore fixed on that particular phrase, when you could make the same point with other pedestrian expressions like linoleum or oleomargarine:

…The undeniable charm of the story, the source of the enchantment that C. S. Lewis reported when he saw cellar door rendered as Selladore, lies in the sudden falling away of the repressions imposed by orthography … to reveal what Dickens called "the romantic side of familiar things." … In the world of fantasy, that role is suggested literally in the form of a rabbit hole, a wardrobe, a brick wall at platform 9¾. Cellar door is the same kind of thing, the expression people use to illustrate how civilization and literacy put the primitive sensory experience of language at a remove from conscious experience.

But that doesn't explain why the story emerged when it did. Could it have had to do with the song "Playmates," with its line "Shout down my rain barrel, slide down my cellar door"? There's no way to know for sure, but the dates correspond, and in fact those lines had an interesting life of their own…

"Playmates" was a big hit for Philip Wingate and Henry W. Petrie in 1894, in an age swilling in lachrymose sentimentality about childhood. The original lyrics were:

Say, say, oh playmate,
Come out and play with me
And bring your dollies three,
Climb up my apple tree.
Shout down my rain barrel,
Slide down my cellar door,
And we'll be jolly friends forevermore.

Wingate and Petrie followed it up in the same year with an even more popular sequel, “I Don’t Want to Play in Your Yard,” which contained the phrase “You’ll be sorry when you see me sliding down our cellar door.” The song figures a couple of times in the 1981 Warren Beatty movie Reds, most unforgettably as sung by Peggy Lee.

In various forms, “slide down my cellar door” became a kind of catchphrase to suggest innocent friendship. In an 1896 letter to a friend, the poet Vaughan Moody wrote “Are n’t [sic] you going to speak to me again? Is my back-yard left irredeemably desolate? Have your rag dolls and your blue dishes said inexorable adieu to my cellar-door? The once melodious rain barrel answers hollow and despairing to my plaints….”

More generally, “You shan’t slide down my cellar door,” and the like were invoked to suggest childish truculence. Google Books and Newspaperarchive turn up numerous hits, which don’t tail off until the 1930s or so.

I would not let an operator that did not have a card, carry my lunch basket or slide down my cellar door: not to say give him a "square" or fix him for a ride over the road. Trans-Communicator, 1895

Commenting on a recent press dispatch Spain has refused the customary permission to the British garrison at Gibraltar to play polo and golf on Spanish territory, the Baltimore Sun says : — " This suggests the stern retaliatory methods of childhood : ' You shan't play in my back yard, you shan't slide down my cellar door.'  National Review, 1898

If you see my friend Prince Krapotpin tell him I should be glad to have him holler down my rain barrel or slide down my cellar door any time. It is a hard thing to be a czar. Oak Park (IL) Argus, 1901

William Waldorf Astor seems to have carried into maturity the youthful feelings so beautifully expressed in ballads of the " you can't slide down my cellar door " school. Munsey’s magazine, 1901

And Greece has said to Roumania, "You can't slide down my cellar-door any more." Religious Telescope, 1906

I am not desirous of having him slide down my cellar door. So far as I am concerned he can stay in his own back-yard, his own puddle or whatever his habitat may be. Louisiana Conservation Review, 1940

The Abbe was gentle and courteous, not to say whimsical, and the very soul of cheerfulness, cordiality, and hospitality, but the blunt fact remained that he wouldn't play ball in my back lot or slide down my cellar door. Wine Journeys 1949

That’s the last instance of the phrase that I can find where it's used that way. The song “Playmates” enjoyed a renewed popularity when it was recorded by Kay Kyser in 1940 and of course remains popular as a children’s clapping song today. (Willie Nelson recorded a version a few years ago.) Notably, Kyser substituted “look down my rain barrel” for “shout down my rain barrel,” the acoustic charms of rain barrels having faded from memory along with the containers themselves, even as sloping exterior cellar doors were becoming scarce. A 1968 article in the Lima (Ohio) News began:

“Shout down my rain barrel, Slide down my cellar door, And we'll be jolly friends forever more.”   Modern kids would have a hard time making friends that way, for gone are the rain barrels and outside cellar doors. Lima (Ohio) News 1968

Could the songs have been the immediate inspiration for the claim that “cellar door” is the most beautiful phrase in the English language? Well, the dates are suggestive, particularly given that the phrase was literally in the air when the claim first emerged, and occasionally, no doubt, mondegreenized into something else (the way later generations often transform "rain barrel" to "rainbow"). And I think it counts for something that the perception of the phrase's beauty requires a regressive capacity, as I put it in the earlier post, to "transcend not just its semantics but its orthography, to recover the pre-alphabetic innocence that comes when we let 'the years of reading fall away,' in Auden's phrase, and attune ourselves with sonorities that are hidden from the ear behind the overlay of writing"; that is, you have to assume, as the songs ask you to, a child's point of view.

But this account of the origin will have to be left speculative unless, of course, someone digs up a pre-1894 citation for the claim, in which case the theory is toast.

by Geoff Nunberg at March 16, 2014 08:37 PM

March 11, 2014

MIMS 2012

Did you A/B test the redesigned preview tool?

A lot of people have asked me if we A/B tested the redesigned preview tool. The question comes in two flavors: did we use A/B testing to validate impersonation was worth building (a.k.a. fake door testing); and, did we A/B test the redesigned UI against the old UI? Both are good questions, but the short answer is no. In this post I’m going to dig into both and explain why.

Fake Door Testing

Fake door testing (video) is a technique to measure interest in a new feature by building a “fake door” version of it that looks like the final version (but doesn’t actually work) and measuring how many people engage with it. Trying to use the feature gives users an explanation of what it is, that it’s “Coming soon”, and usually the ability to “vote” on it or send feedback (the specifics vary depending on the context). This doesn’t need to be run as an A/B test, but setting it up as one lets you compare user behavior.

We could have added an “Impersonate” tab to the old UI, measured how many people tried to use it, and gathered feedback. This would have been cheap and easy. But we didn’t do this because the feature was inspired by our broader research around personalization. Our data pointed us towards this feature, so we were confident it would be worthwhile.

But more than that, measuring clicks and votes doesn’t tell you much. People can click out of curiosity, which doesn’t tell you if they’d actually use the feature or if it solves a real problem. Even if people send feedback saying they’d love it, what users say and what they do are different. Actually talking to users to find pain points yields robust data that leads to richer solutions. The new impersonate functionality is one such example: no one had thought of it before, and it wasn’t on our feature request list or product roadmap.

However, not everyone has the resources to conduct user research. In that situation, fake door testing is a good way of cheaply getting feedback on a specific idea. After all, some data is better than no data.

A/B Testing the Redesigned UI

The second question is, “Did you A/B test the redesigned preview against the previous one?” We didn’t, primarily because there’s no good metric to use as a conversion goal. We added completely new functionality to the preview tool, so most metrics are the equivalent of comparing apples to oranges. For example, measuring how many people impersonate a visitor is meaningless because the old UI doesn’t even have that feature.

So at an interface level, there isn’t a good measurement. But what about at the product level? We could measure larger metrics, such as the number of targeted experiments being created, to see if the new UI has an effect. There are two problems with this. First, it will take a long time (many months) to reach a statistically significant difference because the conversion rates on most product metrics are low. Second, if we eventually measured a difference, there’s no guarantee it was caused by adding impersonation. The broader the metric, the more factors influence it (such as other product updates). To overcome this you could freeze people in the “A” and “B” versions of the product, but given how long it takes to reach significance, this isn’t a good idea.
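
To make the "many months" point concrete, here is a rough sketch of the arithmetic using the standard sample-size approximation for comparing two proportions. Every number in it (the 2% baseline rate, the 10% relative lift, the daily user counts) is made up for illustration and is not our real data, and the function names are just ones I picked for this example.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed in EACH variant of a two-sided test
    comparing two conversion rates (normal approximation)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_significance(n_per_variant, daily_users, variants=2):
    """Days needed if daily_users are split evenly across all variants."""
    return math.ceil(n_per_variant * variants / daily_users)


if __name__ == "__main__":
    # Hypothetical inputs: a 2% baseline conversion rate, and we hope
    # to detect a 10% relative improvement (2.0% -> 2.2%).
    n = sample_size_per_variant(baseline_rate=0.02, relative_lift=0.10)
    print("users needed per variant:", n)
    for daily_users in (500, 5_000, 5_000_000):  # hypothetical traffic levels
        print(daily_users, "users/day ->", days_to_significance(n, daily_users), "days")
```

The required sample grows quickly as the baseline rate and the expected lift shrink, so a low-traffic product can need months of data for a test that a very high-traffic site could finish in a day.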

Companies like Facebook and Google have enough traffic that they actually are able to roll out new features to a small percentage of users (say, 5%), and measure the impact on their core metrics. If any take a plunge, they revert users to the previous UI and keep iterating. When you have the scale of Facebook and Google, you can get significant data in a day. Unfortunately, like most companies, we don’t have this scale, so it isn’t an option.

So How Do You Know The Redesign Was Worth It?

What people are really asking is how do we know the redesign was worth the effort? Being at an A/B testing company, everyone wants to A/B test everything. But in this case, there wasn’t a place for it. Like any method, split testing has its strengths and weaknesses.

No, Really — How Do You Know The Redesign Was Worth It?

Primarily via qualitative feedback (i.e. talking to users), which at a high level has been positive (but there are some improvements we can make). We’re also measuring people’s activities in the preview tool (e.g. changing tabs, impersonating visitors, etc.). So far, those are healthy. Finally, we’re keeping an eye on some product-level metrics, like the number of targeted experiments created. These metrics are part of our long-term personalization efforts, and we hope in the long run to see them go up. But the preview tool is just one piece of that puzzle, so we don’t expect anything noticeable from that alone.

The important theme here is that gathering data is important to ensure you’re making the best use of limited resources (time, people, etc.). But there’s a whole world of data beyond A/B testing, such as user research, surveys, product analytics, and so on. It’s important to keep in mind the advantages and disadvantages of each, and use the most appropriate ones at your disposal.

by Jeff Zych at March 11, 2014 04:50 PM

March 08, 2014

MIMS 2014

5 Minutes of Fame

On the Internet, everyone gets a chance to be famous, even if it lasts for all of 5 minutes. Last year, a couple of friends from school and I worked on this visualization for a class project. Two weeks later, we were featured on LifeHacker, and we thought that was our 5 minutes of fame.

Now, almost a year later, the FlowingData blog (which I love) picked it up and featured it yet again. It has set off a domino-like reaction with multiple sites and people talking about it, and also referring to our creation as the ‘Pandora of Beers’.

What I find most interesting however is how there are multiple versions of our story, the most common of which is that we are Stanford students – even though we are Berkeley students who used a Stanford dataset (a fact clearly mentioned on the website). Watching this story get re-tweeted, and republished is an interesting study of viral effects, and how some inaccuracies get pushed far and wide across the web.

Ah, well – at the very least I can say that we did get more than our share of the 5 minutes of fame.

by muchnessofd at March 08, 2014 09:28 PM

March 03, 2014

Ph.D. alumna

What’s Behind the Free PDF of “It’s Complicated” (no, no, not malware…)

As promised, I put a free PDF copy of “It’s Complicated” on my website the day the book officially launched. But as some folks noticed, I didn’t publicize this when I did so. For those who are curious as to why, I want to explain. And I want you to understand the various issues at play for me as an author and a youth advocate.

I didn’t write this book to make money. I wrote this book to reach as wide an audience as I possibly could. This desire to get as many people engaged as possible drove every decision I made throughout this process. One of the things that drew me to Yale was their willingness to let me put a freely downloadable CC-licensed copy of the book online on the day the book came out. I knew that trade presses wouldn’t let a first time author pull that one off. Heck, they still get mad at Paulo Coelho for releasing his books online and he’s sold more books worldwide than anyone else!

As I prepared for publication, it became clear that I really needed other people’s help in getting the word out. I needed journalistic enterprises to cover the book. I needed booksellers to engage with the book. I needed people to collectively signal that this book was important. I needed people to be willing to take a bet on me. When one of those allies asked me to wait a week before publicizing the free book, I agreed.

If you haven’t published a book before, it’s pretty unbelievable to see all of the machinery that goes into getting the book out once the book exists in physical form. News organizations want to promote books that will be influential or spark a conversation, but they are also anxious about having their stories usurped by others. Booksellers make risky decisions about how many copies they think they can sell ahead of time and order accordingly. (And then there’s the world of paying for placement which I simply didn’t do.) Booksellers’ orders – as well as actual presales – are influential in shaping the future of a book, just like first weekend movie sales matter. For example, these sales influence bestseller and recommendation lists. These lists are key to getting broader audiences’ attention (and for getting the attention of certain highly influential journalistic enterprises). And, as an author trying to get a message out, I realized that I needed to engage with this ecosystem and I needed all of these actors to believe in my book.

The bestseller aspect of this is the part that I struggle with the most. I don’t actually care whether or not my book _sells_ a lot; I care whether or not it’s _read_ a lot. But there’s no bestread-ed list (except maybe Goodreads). And while many books that are widely sold aren’t widely read, most books that are widely read are widely sold. My desire to be widely read is why I wanted to make the book freely available from the get-go. I get that not everyone can afford to buy the book. I get that it’s not available in certain countries. I get that people want to check it out first. I get that we haven’t figured out how to implement ‘grep’ in physical books. So I really truly get the importance of making the book accessible.

But what I started to realize is that when people purchase the book, they signal to outside folks that the book is important. This is one of the reasons that I asked people who value this book to buy it. For them or for others. I love it when people buy the book and give it away to a poor grad student, struggling parent, or library. I don’t know if I’ll make any bestseller list, but the reason I decided to try is because sales rankings – especially in the first few weeks of a book’s life – really do help attract more attention which is key to getting the word out. And so I’ve begged and groveled, asking people to buy my book even though it makes me feel squeamish, solely because I know that the message that I want to offer is important. So, to be honest, if you are going to buy the book at some point, I’d really really appreciate it if you’d buy a copy. And sooner rather than later. Your purchasing decisions help me signal to the powers that be that this book is important, that the message in the book is valuable.

That said, if you don’t have the resources or simply don’t want to, don’t buy it. I’m cool with that. I’m beyond delighted to give the book away for free to anyone who wants to read it, assign it in their classes, or otherwise engage with it. If you choose to download it, thank you! I’m glad you find it valuable!

If you feel like giving back, I have a request. Please help support all of the invisible people and organizations that helped get word of my book out there. I realize that there are folks out there who want to “support the author,” but my ask of you is to help me support the whole ecosystem that made this possible.

Go buy a different book from Yale University Press to thank them for being willing to publish me. Buy a random book from an independent bookseller to say thank you (especially if you live near Harvard Book Store, Politics & Prose, or Book People). Visit The Guardian and click on their ads to thank them for running a first serial. Donate to NPR for their unbelievable support in getting the word out. Buy a copy or click on the ads of BoingBoing, Cnet, Fast Company, Financial Times, The Globe & Mail, LA Times, Salon, Slate, Technology Review, The Telegraph, USA Today, Wired, and the other journalistic venues whose articles aren’t yet out to thank them for being so willing to cover this book. Watch the ads on Bloomberg and MSNBC to send them a message of thanks. And take the time to retweet the tweets or write a comment on the blogs of the hundreds of folks who have been so kind to write about this book in order to get the word out. I can’t tell you how grateful I am to all of the amazing people and organizations who have helped me share what I’ve learned. Please shower them in love.

If you want to help me, spread the message of my book as widely as you possibly can. I wrote this book so that more people will step back, listen, and appreciate the lives of today’s teenagers. I want to start a conversation so that we can think about the society that we’re creating. I will be forever grateful for anything that you can do to get that message out, especially if you can help me encourage people to calm down and let teenagers have some semblance of freedom.

More than anything, thank *you* soooo much for your support over the years!!! I am putting this book up online as a gift to all of the amazing people who have been so great to me for so long, including you. Thank you thank you thank you.


PS: Some folks have noticed that Amazon seems to not have any books in stock. There was a hiccup but more are coming imminently. You could wait or you could support IndieBound, Powell’s, Barnes & Noble, or your local bookstore.

by zephoria at March 03, 2014 04:52 PM

March 02, 2014

MIMS 2012

When Do You Do User Testing?

In response to my preview redesign post, my uncle asked, “When does the designer go with their own analysis of a design and when do they do a usability test?” I’ve been asked this question a lot, and we often discuss it in product development meetings. It’s also something I wanted to elaborate on more in the post, but it didn’t fit. So I will take this opportunity to roughly outline when we do user testing and why.

The Ideal

In an ideal world, we would be testing all the time. Having a tight feedback loop is invaluable. By which I mean, being able to design a solution to a problem and immediately validate that it solves said problem would be amazing. And in product design, the best validation is almost always user testing.

The Reality

In reality, there’s no such thing as instant feedback. You can’t design something and receive immediate feedback. There’s no automated process that tells you if a UI is good or not. You have to talk to actual humans, which takes time and effort.

This means there’s a trade-off between getting as much feedback as possible, and actually releasing something. The more user testing you do, the longer it will take to release.

What we do at Optimizely

At Optimizely, we weigh the decision to do user testing against deadlines, how important the questions are (e.g. are they core to the experience, or an edge case), how likely we are to get actionable insights (e.g. testing colors will usually give you a bunch of conflicting opinions), and what other research we could be doing instead (i.e. opportunity cost).

With the preview redesign, we started with exploratory research to get us on the right track, but didn’t do much testing beyond that. This was mainly because we didn’t make time for it. It was clear from our generative research that adding impersonation to the preview tool would be a step in the right direction. But it’s only one part of a larger solution, and won’t be used by everyone. We didn’t want to slow down our overall progress by spending too much time trying to perfect one piece of a much larger puzzle.

So with the preview tool, I had to rely on my instincts and feedback from other designers, engineers, and product managers to make decisions. One such example is when I decided to hide the impersonation feature by default and have it slide out when an icon is clicked. Of this solution I said:

But it worked too well [at solving the problem of the impersonation UI being distracting]. The icon was too cryptic, and overall the impersonate functionality was too hidden for anyone to find.

As my uncle pointed out, I didn’t do any user testing to make this call. I looked at what I had created, and could tell it wasn’t a great solution (which was also confirmed by my teammates). I was confident the decision to keep iterating was right based on established usability heuristics and my own experience.

However, not all design decisions can be guided by general usability guidelines. One such example is when I went in circles designing how a person sets impersonation values. I tried using established best practices and talking to other designers, but neither method led me to an answer. At this point, user testing was my only out.

In this case, we opted for guerrilla usability testing. Recruiting actual users would have required more time and resources than we wanted to spend. So I called in two of our sales guys, who made for good proxies of semi-experienced users that are technically middle-of-the-road (i.e. they have some basic code knowledge, but aren’t developers), which covers the majority of our users. Their feedback made the decision easy, and successfully got me out of this jam.

In Summary

In a perfect world we would be testing all the time, but in reality that just isn’t feasible. So we do our best to balance the time and resources required to test a UI against the overall importance of that feature. But usability testing won’t find every flaw. Eventually you have to ship — you can’t refine forever, and no design is perfect.

by Jeff Zych at March 02, 2014 11:37 PM

February 25, 2014

Ph.D. alumna

want a signed copy of “It’s Complicated”?

Today is the official publication date of “It’s Complicated: The Social Lives of Networked Teens”. While many folks have received their pre-orders already, this is the date on which all U.S. book stores that promised to carry the book at launch should have a copy. It’s also the day on which I officially start my book tour in Cambridge.

In many ways, I’m thinking of my book tour as a thank-you tour. I’m trying to visit as many cities as I can that have been really good to me over the years. Some are cities where I was educated. Some are field site cities. Some are places where there are people who have been angels in my life. But I sadly won’t get everywhere. Although the list of events is not complete, I have discussions underway to be in the following cities this spring: Cambridge, DC, Seattle, Austin, Nashville, Berkeley/SF, Providence, Charlottesville, and Chicago. After that, I’m going to need to take a break. But I’m really hoping to see lots of friends and allies in the cities I visit. And I want to offer a huge apology to those outside of the US who have been so amazing to me. Given my goal of seeing my young son every week amidst this crazy, I simply can’t do an international tour right now.

I know that I won’t be able to get everywhere and I know folks have been asking me for signed copies, so I want to make an offer. Buy a book this week and send it to me and I will sign it and send it back to you. You can either buy it from your favorite bookseller and then mail it to me or have it shipped directly from your favorite online retailer.

danah boyd
Microsoft Research
641 6th Ave, 7th Floor
New York, NY 10011

When you send me the book, include a note (“gift note” for online retailers) that includes your name, email (in case something goes wrong), snail mail address for shipment, and anything I should know when signing the book.

If you have a local bookseller that’s selling it, start there. If you’d prefer to use an online retailer, my book is now available at:

IndieBound, Powell’s, Amazon, Barnes & Noble

Thank you soooo much for all of your amazing support over the years! I wrote this book to share all that I’ve learned over the years, in the hopes that it will prompt people to step back and appreciate teens’ lives from their perspective. My goal is to share this book as widely as possible precisely because so many teens are struggling to get their voices heard. Fingers crossed that we can get this book into the hands of many people this week and that this, in turn, will prompt folks to spread the message further!


by zephoria at February 25, 2014 02:44 PM