School of Information Blogs

April 17, 2014

MIMS 2012

If you’re a tech worker in California, make sure you’re not getting screwed

(obligatory disclaimer: I am not an employment lawyer, nor a lawyer of any sort.)

For my first software engineering job out of grad school, I was offered a salary of $80,000 a year. Said job was full-time, non-hourly, and located in San Francisco. I negotiated, because negotiation is Something You’re Supposed to Do, and got a bump up to $82K. Not bad, for 20 minutes of extremely uncomfortable phone conversation and feeling a bit like a bitch!

Today I learned that the salary I was originally offered was illegally low. My negotiation just pushed it back into the legal range. And when 2013 rolled around, even that amount was below the minimum salary for exempt (meaning non-overtime-paying) computer programming jobs under California law. I wasn’t (just?) paid less relative to other tech workers in the frothy, dysfunctional Bay Area tech sector– I was legally underpaid!

This job wasn’t at a tiny three-person startup without a clue, either. This was a 40+ person company, with actual in-house HR. A generous interpretation is that a lot of tech companies out there are unaware of this law–certainly most individual employees are! A few might be banking on that ignorance, though. Protect yourself.

What the law says

(Again, I am totally not a lawyer. This is just what you get once you learn to google “Computer Software Occupations Exemption”.)

By default, jobs in California entail an hourly wage with time-and-a-half overtime, lunch breaks, and other requirements. Under California law, there are a few classes of job that are exempt from these requirements (hence why on more old-school job sites and payroll tools you see the term Exempt)–for example, taxi drivers and farmers. Among white-collar jobs, “Executive”, “Administrative”, and “Professional” positions are also exempt, subject to a number of requirements, including a minimum salary corresponding to double the local minimum wage times 40 hours a week. These three exceptions do not apply to most tech workers.

Unlike federal law, and most other states to the best of my knowledge, California has an additional class of exempt job: one for “Computer Software Occupations”. For these positions, the minimum salary is higher. It also gets adjusted each fall by the Division of Labor Statistics and Research to keep up with cost of living. What that amounts to in terms of full-time minimum salary for a given year:

Minimum Computer Software Occupation Salary

2008: $75,000
2009: $79,050
2010: $79,050
2011: $79,050
2012: $81,026.25
2013: $83,132.93
2014: $84,130.53
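The table above turns into a quick self-check. Here is a minimal sketch in Python (the dictionary and function names are my own invention; the thresholds are copied from the table, and none of this is legal advice):

```python
# Sketch: check whether a full-time exempt "Computer Software Occupation"
# salary meets California's minimum for a given year. Thresholds are the
# ones listed above; this is an illustration, not legal advice.

CA_CSO_MIN_SALARY = {
    2008: 75_000.00,
    2009: 79_050.00,
    2010: 79_050.00,
    2011: 79_050.00,
    2012: 81_026.25,
    2013: 83_132.93,
    2014: 84_130.53,
}

def is_legally_exempt_salary(salary: float, year: int) -> bool:
    """Return True if `salary` meets the exemption floor for `year`."""
    return salary >= CA_CSO_MIN_SALARY[year]

print(is_legally_exempt_salary(80_000, 2012))  # the original offer: False
print(is_legally_exempt_salary(82_000, 2012))  # after negotiating: True
print(is_legally_exempt_salary(82_000, 2013))  # the next year: False
```

Plugging in the numbers from my own story shows exactly the situation described above: the initial $80K offer was under the 2012 floor, the negotiated $82K cleared it, and by 2013 even that was too low.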

“Computer Software Occupations” is defined as:

California Labor Code §515.5
(a) Except as provided in subdivision (b), an employee in the computer software field shall be exempt from the requirement that an overtime rate of compensation be paid pursuant to Section 510 if all of the following apply:
(1) The employee is primarily engaged in work that is intellectual or creative and that requires the exercise of discretion and independent judgment, and the employee is primarily engaged in duties that consist of one or more of the following:
(A) The application of systems analysis techniques and procedures, including consulting with users, to determine hardware, software, or system functional specifications.
(B) The design, development, documentation, analysis, creation, testing, or modification of computer systems or programs, including prototypes, based on and related to, user or system design specifications.
(C) The documentation, testing, creation, or modification of computer programs related to the design of software or hardware for computer operating systems.

Employees who are still learning (e.g. interns, others unable to work independently without close supervision), IT workers, people who mainly work on hardware instead of software, copywriters, and special effects artists and similar movie industry employees are exceptions to the above. (Check the full snippet of the law.)

If you think your job fits this category, you make less than $84,130.53 working full-time, and you work in California: now would be a nice time to talk to your employer’s HR department and/or consult with an actual lawyer.

If you’re an employer and think this law is weird and overpays some roles: sure, whatever, but this is the freaking law. Your options are: get out of California, make your less-well-paid tech employees non-exempt (with overtime and the myriad other headaches that entails), or bump people’s salaries up. Don’t be a jerk.

by Karen at April 17, 2014 08:16 AM

MIMS 2012

Designing with instinct vs. data

Braden Kowitz wrote a great article exploring the ever increasing tension between making design decisions based on instinct versus data. As he says, “It’s common to think of data and instincts as being opposing forces in design decisions.” This is especially true at an A/B testing company — we have a tendency to quantify and measure everything.

He goes on to say, “In reality, there’s a blurry line between the two,” and I couldn’t agree more. When I’m designing, I’m in the habit of always asking, “What data would help me make this decision?” Sometimes it’s usage logs, sometimes it’s user testing, sometimes it’s market research, and sometimes there isn’t anything but my own intuition. Even when there is data, it’s all just input I use to help me reach a decision. It’s not a good idea to blindly follow data, but it’s equally bad to only use your gut. As Braden said, it’s important to balance the two.

by Jeff Zych at April 17, 2014 05:27 AM

April 14, 2014

Ph.D. alumna

Whether it’s bikes or bytes, teens are teens

(This piece was written for the LA Times, where it was published as an op-ed on April 11, 2014.)

If you’re like most middle-class parents, you’ve probably gotten annoyed with your daughter for constantly checking her Instagram feed or with your son for his two-thumbed texting at the dinner table. But before you rage against technology and start unfavorably comparing your children’s lives to your less-wired childhood, ask yourself this: Do you let your 10-year-old roam the neighborhood on her bicycle as long as she’s back by dinner? Are you comfortable, for hours at a time, not knowing your teenager’s exact whereabouts?

What American children are allowed to do — and what they are not — has shifted significantly over the last 30 years, and the changes go far beyond new technologies.

If you grew up middle-class in America prior to the 1980s, you were probably allowed to walk out your front door alone and — provided it was still light out and you had done your homework — hop on your bike and have adventures your parents knew nothing about. Most kids had some kind of curfew, but a lot of them also snuck out on occasion. And even those who weren’t given an allowance had ways to earn spending money — by delivering newspapers, say, or baby-sitting neighborhood children.

All that began to change in the 1980s. In response to anxiety about “latchkey” kids, middle- and upper-class parents started placing their kids in after-school programs and other activities that filled up their lives from morning to night. Working during high school became far less common. Not only did newspaper routes become a thing of the past but parents quit entrusting their children to teenage baby-sitters, and fast-food restaurants shifted to hiring older workers.

Parents are now the primary mode of transportation for teenagers, who are far less likely to walk to school or take the bus than any previous generation. And because most parents work, teens’ mobility and ability to get together casually with friends has been severely limited. Even sneaking out is futile, because there’s nowhere to go. Curfew, trespassing and loitering laws have restricted teens’ presence in public spaces. And even if one teen has been allowed out independently and has the means to do something fun, it’s unlikely her friends will be able to join her.

Given the array of restrictions teens face, it’s not surprising that they have embraced technology with such enthusiasm. The need to hang out, socialize, gossip and flirt hasn’t diminished, even if kids’ ability to get together has.

After studying teenagers for a decade, I’ve come to respect how their creativity, ingenuity and resilience have not been dampened even as they have been misunderstood, underappreciated and reviled. I’ve watched teenage couples co-create images to produce a portrait of intimacy when they lack the time and place to actually kiss. At a more political level, I’ve witnessed undocumented youth use social media to rally their peers and personal networks to speak out in favor of the Dream Act, even going so far as to orchestrate school walkouts and local marches.

This does not mean that teens always use the tools around them for productive purposes. Plenty of youth lash out at others, emulating a pervasive culture of meanness and cruelty. Others engage in risky behaviors, seeking attention in deeply problematic ways. Yet, even as those who are hurting others often make visible their own personal struggles, I’ve met alienated LGBT youth for whom the Internet has been a lifeline, letting them see that they aren’t alone as they struggle to figure out whom to trust.
And I’m on the board of Crisis Text Line, a service that connects thousands of struggling youth with counselors who can help them. Technology can be a lifesaver, but only if we recognize that the Internet makes visible the complex realities of people’s lives.

As a society, we both fear teenagers and fear for them. They bear the burden of our cultural obsession with safety, and they’re constantly used as justification for increased restrictions. Yet, at the end of the day, their emotional lives aren’t all that different from those of their parents as teenagers. All they’re trying to do is find a comfortable space of their own as they work out how they fit into the world and grapple with the enormous pressures they face.

Viewed through that prism, it becomes clear how the widespread embrace of technology and the adoption of social media by kids have more to do with non-technical changes in youth culture than with anything particularly compelling about those tools. Snapchat, Tumblr, Twitter and Facebook may be fun, but they’re also offering today’s teens a relief valve for coping with the increased stress and restrictions they encounter, as well as a way of being with their friends even when their more restrictive lives keep them apart.

The irony of our increasing cultural desire to protect kids is that our efforts may be harming them. In an effort to limit the dangers they encounter, we’re not allowing them to develop skills to navigate risk. In our attempts to protect them from harmful people, we’re not allowing them to learn to understand, let alone negotiate, public life. It is not possible to produce an informed citizenry if we do not first let people engage in public.

Treating technology as something to block, limit or demonize will not help youth come of age more successfully. If that’s the goal, we need to collectively work to undo the culture of fear and support our youth in exploring public life, online and off.

(More comments can be found over at the LA Times.)

by zephoria at April 14, 2014 03:07 PM

April 07, 2014

Ph.D. student

Why we need good computational models of peace and love

“Data science” doesn’t refer to any particular technique.

It refers to the cusp of the diffusion of computational methods from computer science, statistics, and applied math (the “methodologists”) to other domains.

The background theory of these disciplines–whose origin we can trace at least as far back as cybernetics research in the 1940s–is required to understand the validity of these “data science” technologies as scientific instruments, just as a theory of optics is necessary to know the validity of what is seen through a microscope. Kuhn calls these kinds of theoretical commitments “instrumental commitments.”

For most domain sciences, instrumental commitment to information theory, computer science, etc. is not problematic. It is more so with some social sciences which oppose the validity of totalizing physics or formalism.

There aren’t a lot of them left, because our mobile phones more or less instrumentally commit us to the cybernetic worldview. Where there is room for alternative metaphysics, it is because of the complexity of emergent/functional properties of the cybernetic substrate. Brier’s Cybersemiotics is one formulation of how richer communicative meaning can be seen as an evolved structure on top of cybernetic information processing.

If “software is eating the world” and we don’t want it to eat us (metaphorically! I don’t think the robots are going to kill us–I think that corporations are going to build robots that make our lives miserable by accident), then we are going to need to have software that understands us. That requires building out cybernetic models of human communication to be more understanding of our social reality and what’s desirable in it.

That’s going to require cooperation between techies and humanists in a way that will be trying for both sides but worth the effort I think.

by Sebastian Benthall at April 07, 2014 11:34 PM

April 03, 2014

Ph.D. alumna

Is the Oculus Rift sexist? (plus response to criticism)

Last week, I wrote a provocative opinion piece for Quartz called “Is the Oculus Rift sexist?” I’m reposting it on my blog for posterity, but also because I want to address some of the critiques that I received. First, the piece itself:

Is the Oculus Rift sexist?

In the fall of 1997, my university built a CAVE (Cave Automatic Virtual Environment) to help scientists, artists, and archeologists embrace 3D immersion to advance the state of those fields. Ecstatic at seeing a real-life instantiation of the Metaverse, the virtual world imagined in Neal Stephenson’s Snow Crash, I donned a set of goggles and jumped inside. And then I promptly vomited.

I never managed to overcome my nausea. I couldn’t last more than a minute in that CAVE and I still can’t watch an IMAX movie. Looking around me, I started to notice something. By and large, my male friends and colleagues had no problem with these systems. My female peers, on the other hand, turned green.

What made this peculiar was that we were all computer graphics programmers. We could all render a 3D scene with ease. But when asked to do basic tasks like jump from Point A to Point B in a Nintendo 64 game, I watched my female friends fall short. What could explain this?

At the time, any notion that there might be biological differences underpinning people’s experiences of computing systems was deemed heretical. Discussions of gender and computing centered on services like Purple Moon, a software company trying to entice girls into gaming and computing. And yet, what I was seeing gnawed at me.

That’s when a friend of mine stumbled over a footnote in an esoteric army report about simulator sickness in virtual environments. Sure enough, military researchers had noticed that women seemed to get sick at higher rates in simulators than men. While they seemed to be able to eventually adjust to the simulator, they would then get sick again when switching back into reality.

Being an activist and a troublemaker, I walked straight into the office of the head CAVE researcher and declared the CAVE sexist. He turned to me and said: “Prove it.”

The gender mystery

Over the next few years, I embarked on one of the strangest cross-disciplinary projects I’ve ever worked on. I ended up in a gender clinic in Utrecht, in the Netherlands, interviewing both male-to-female and female-to-male transsexuals as they began hormone therapy. Many reported experiencing strange visual side effects. Like adolescents going through puberty, they’d reach for doors—only to miss the door knob. But unlike adolescents, the length of their arms wasn’t changing—only their hormonal composition.

Scholars in the gender clinic were doing fascinating research on tasks like spatial rotation skills. They found that people taking androgens (a steroid hormone similar to testosterone) improved at tasks that required them to rotate Tetris-like shapes in their mind to determine if one shape was simply a rotation of another shape. Meanwhile, male-to-female transsexuals saw a decline in performance during their hormone replacement therapy.

Along the way, I also learned that there are more sex hormones on the retina than anywhere else in the body except the gonads. Studies on macular degeneration showed that hormone levels mattered for the retina. But why? And why would people undergoing hormonal transitions struggle with basic depth-based tasks?

Two kinds of depth perception

Back in the US, I started running visual psychology experiments. I created artificial situations where different basic depth cues—the kinds of information we pick up that tell us how far away an object is—could be put into conflict. As the work proceeded, I narrowed in on two key depth cues – “motion parallax” and “shape-from-shading.”

Motion parallax has to do with the apparent size of an object. If you put a soda can in front of you and then move it closer, it will get bigger in your visual field. Your brain assumes that the can didn’t suddenly grow and concludes that it’s just got closer to you.
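The soda-can intuition is just geometry: the visual angle an object subtends grows as its distance shrinks. A quick sketch (the function name and the rough can dimensions are mine, chosen only for illustration):

```python
# Sketch: apparent (angular) size of an object shrinks with distance.
# An object of radius r at distance d subtends an angle of 2 * atan(r / d),
# so halving the distance makes it loom larger in the visual field.

import math

def visual_angle(radius: float, distance: float) -> float:
    """Angle in radians subtended by an object of given radius at a distance."""
    return 2 * math.atan(radius / distance)

print(visual_angle(0.033, 0.60))  # a can roughly at arm's length
print(visual_angle(0.033, 0.30))  # same can, half the distance: larger angle
```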

Shape-from-shading is a bit trickier. If you stare at a point on an object in front of you and then move your head around, you’ll notice that the shading of that point changes ever so slightly depending on the lighting around you. The funny thing is that your eyes actually flicker constantly, recalculating the tiny differences in shading, and your brain uses that information to judge how far away the object is.

In the real world, both these cues work together to give you a sense of depth. But in virtual reality systems, they’re not treated equally.

The virtual-reality shortcut

When you enter a 3D immersive environment, the computer tries to calculate where your eyes are in order to show you how the scene should look from that position. Binocular systems calculate slightly different images for your right and left eyes. And really good systems, like good glasses, will assess not just where your eye is, but where your retina is, and make the computation more precise.

It’s super easy—if you determine the focal point and do your linear matrix transformations accurately, which for a computer is a piece of cake—to render motion parallax properly. Shape-from-shading is a different beast. Although techniques for shading 3D models have greatly improved over the last two decades—a computer can now render an object as if it were lit by a complex collection of light sources of all shapes and colors—what they can’t do is simulate how that tiny, constant flickering of your eyes affects the shading you perceive. As a result, 3D graphics does a terrible job of truly emulating shape-from-shading.
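The “piece of cake” claim can be made concrete with the simplest possible case, a pinhole projection (a toy helper of my own, not any particular engine’s API): projecting a point onto the image plane divides by its depth, so motion parallax falls out automatically.

```python
# Sketch: motion parallax comes "for free" from perspective projection.
# Projecting a camera-space 3D point divides by depth, so the same point
# at half the distance lands twice as far from the image center -- no
# special modeling needed. (Toy pinhole camera, not a real engine API.)

def project(x: float, y: float, z: float, focal: float = 1.0):
    """Pinhole projection of a camera-space point onto the image plane."""
    return (focal * x / z, focal * y / z)

# A point 1 unit off-axis, seen at depths 4 and 2:
far = project(1.0, 0.0, 4.0)   # (0.25, 0.0)
near = project(1.0, 0.0, 2.0)  # (0.5, 0.0) -- twice as far from center
print(far, near)
```

Shading has no comparably cheap formula: it depends on light sources, surface properties, and (as argued above) the micro-movements of the eye, which is why the two cues aren’t treated equally in VR.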

Tricks of the light

In my experiment, I tried to trick people’s brains. I created scenarios in which motion parallax suggested an object was at one distance, and shape-from-shading suggested it was further away or closer. The idea was to see which of these conflicting depth cues the brain would prioritize. (The brain prioritizes between conflicting cues all the time; for example, if you hold out your finger and stare at it through one eye and then the other, it will appear to be in different positions, but if you look at it through both eyes, it will be on the side of your “dominant” eye.)

What I found was startling (pdf). Although there was variability across the board, biological men were significantly more likely to prioritize motion parallax. Biological women relied more heavily on shape-from-shading. In other words, men are more likely to use the cues that 3D virtual reality systems rely on.

This, if broadly true, would explain why I, being a woman, vomited in the CAVE: My brain simply wasn’t picking up on signals the system was trying to send me about where objects were, and this made me disoriented.

My guess is that this has to do with the level of hormones in my system. If that’s true, someone undergoing hormone replacement therapy, like the people in the Utrecht gender clinic, would start to prioritize a different cue as their therapy progressed.

We need more research

However, I never did go back to the clinic to find out. The problem with this type of research is that you’re never really sure of your findings until they can be reproduced. A lot more work is needed to understand what I saw in those experiments. It’s quite possible that I wasn’t accounting for other variables that could explain the differences I was seeing. And there are certainly limitations to doing vision experiments with college-aged students in a field whose foundational studies are based almost exclusively on doing studies solely with college-age males. But what I saw among my friends, what I heard from transsexual individuals, and what I observed in my simple experiment led me to believe that we need to know more about this.

I’m excited to see Facebook invest in Oculus, the maker of the Rift headset. No one is better poised to implement Stephenson’s vision. But if we’re going to see serious investments in building the Metaverse, there are questions to be asked. I’d posit that the problems of nausea and simulator sickness that many people report when using VR headsets go deeper than pixel persistence and latency rates.

What I want to know, and what I hope someone will help me discover, is whether or not biology plays a fundamental role in shaping people’s experience with immersive virtual reality. In other words, are systems like Oculus fundamentally (if inadvertently) sexist in their design?

Response to Criticism

1. “Things aren’t sexist!”

Not surprisingly, most people who responded negatively to my piece were up in arms about the title. Some people directed that at Quartz which was somewhat unfair. Although they originally altered the title, they reverted to my title within a few hours. My title was intentionally, “Is the Oculus Rift sexist?” This is both a genuine question and a provocation. I’m not naive enough to not think that people would react strongly to the question, just as my advisor did when I declared VR sexist almost two decades ago. But I want people to take that question seriously precisely because more research needs to be done.

Sexism is prejudice or discrimination on the basis of sex (typically against women). For sexism to exist, there does not need to be an actor intending to discriminate. People, systems, and organizations can operate in sexist manners without realizing it. This is the basis of implicit or hidden biases. Addressing sexism starts by recognizing bias within systems and discrimination as a product of systems in society.

What was interesting about what I found, and what I want people to investigate further, is that the discrimination that I identified is not intentional by scientists or engineers, or simply the product of cultural values. It is a byproduct of a research and innovation cycle that has significant consequences as society deploys the resultant products. The discriminatory potential of deployment will be magnified if people don’t actively seek to address it, which is precisely why I dredged up this ancient work at this moment in time.

I don’t think that the creators of Oculus Rift have any intention to discriminate against women (let alone the range of people who currently get nauseous in their system, which is actually quite broad), but I think that if they don’t pay attention to the depth-cue prioritization issues that I’m highlighting, or if they fail to actively seek technological redress, they’re going to have a problem. More importantly, many of us are going to have a problem. All too often, systems get shipped with discriminatory byproducts, and people throw their hands in the air and say, “oops, we didn’t intend that.”

I think that we have a responsibility to identify and call attention to discrimination in all of its forms. Perhaps I should’ve titled the piece “Is Oculus Rift unintentionally discriminating on the basis of sex?” but, frankly, that’s nothing more than an attempt to ask the question I asked in a more politically correct manner. And the irony of this is that the people who most frequently complained to me about my titling are those who loathe political correctness in other situations.

I think it’s important to grapple with the ways in which sexism is not always intentional but at the very basis of our organizations and infrastructure, as well as our cultural practices.

2. The language of gender

I ruffled a few queer feathers by using the terms “transsexual” and “biological male.” I completely understand why contemporary transgender activists (especially in the American context) would react strongly to that language, but I also think it’s important to remember that I’m referring to a study from 1997 in a Dutch gender clinic. The term “cisgender” didn’t even exist. And at that time, in that setting, the women and men that I met adamantly deplored the “transgender” label. They wanted to make it crystal clear that they were transsexual, not transgender. To them, the latter signaled a choice.

I made a choice in this essay to use the language of my informants. When referring to men and women who had not undergone any hormonal treatment (whether they be cisgender or not), I added the label of “biological.” This was the language of my transsexually-identified informants (who, admittedly, often shortened it to “bio boys” and “bio girls”). I chose this route because the informants for my experiment identified as female and male without any awareness of the contested dynamics of these identifiers.

Finally, for those who are not enmeshed in the linguistic contestations over gender and sex, I want to clarify that I am purposefully using the language of “sex” and not “gender” because what’s at stake has to do with the biological dynamics surrounding sex, not the social construction of gender.

Get angry, but reflect and engage

Critique me, challenge me, tell me that I’m a bad human for even asking these questions. That’s fine. I want people to be provoked, to question their assumptions, and to reflect on the unintentional instantiation of discrimination. More than anything, I want those with the capacity to take what I started forward. There’s no doubt that my pilot studies are the beginning, not the end of this research. If folks really want to build the Metaverse, make sure that it’s not going to unintentionally discriminate on the basis of sex because no one thought to ask if the damn thing was sexist.

by zephoria at April 03, 2014 11:35 PM

April 01, 2014

MIMS 2004

Imagine We Had No Transaction Receipts...

So, imagine you go to the store and ask to buy a coffee. There is no cash register, no transaction receipt is given to you, but you are handed the coffee. They don't say anything. Your payment is invisible. You don't know how much it will be, but you agree to the opaque terms. If you get food poisoning later, it's going to be a huge hassle proving you were there, but it's possible. However, the authorities in charge of checking out food poisoning issues would need some proof. Maybe you threw away the cup, maybe you still have it. Maybe there is video surveillance and maybe not. No receipt for tax purposes, or proving the cost from the vendor, or your expense report, or documentation about what you purchased. No warranty or food safety proof, no date or time or place or anything. You just have a cup of coffee.

That's what it's like to go to a vendor online or on your phone, make an account, and share some data. You do get something, but you don't really know what you "paid," you have no receipt after you agreed to get the service, and you have nothing from the vendor other than maybe the confirmation email you received.

Now imagine the opposite: You go to a digital vendor, you see the service's rating from a crowdsourced or professional review of the way the company will treat your personal data, and you see a comparison of how other similar services would treat your data. You pick one, and "consent" to share your information. A consent receipt is built that shows you the vendor's TOU and Privacy Policy; the Consumer Reports-style rating and comparison as of the consent date; the date, time, and jurisdiction you are in; your identifier; your terms, such as a DNT signal; and the jurisdictional requirements for treating personal data and consent. And your receipt is sent to you and the vendor. Some statistics hit the public website, depersonalized but showing the world how vendors are doing with personal data consents.
And you have a tweet that thanks the vendors doing good with your data, and asks the ones doing poorly why they aren't doing better. That is the Open Notice and Consent Receipt system from the user perspective. Think something like this:
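The receipt described above is essentially a record with a handful of fields. A minimal sketch in Python (every field name here is my own invention for illustration; this is not a published Open Notice schema):

```python
# Sketch of a consent receipt as described above: one record, built at the
# moment of consent and sent to both the user and the vendor. All field
# names are hypothetical illustrations, not a published specification.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ConsentReceipt:
    vendor: str
    terms_of_use_url: str
    privacy_policy_url: str
    privacy_rating: str   # e.g. a Consumer Reports-style grade
    jurisdiction: str     # where the user consented
    user_identifier: str
    user_signals: dict    # e.g. {"DNT": True}
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

receipt = ConsentReceipt(
    vendor="example-coffee-app",
    terms_of_use_url="https://example.com/tou",
    privacy_policy_url="https://example.com/privacy",
    privacy_rating="B+",
    jurisdiction="US-CA",
    user_identifier="user-123",
    user_signals={"DNT": True},
)
print(receipt.vendor, receipt.privacy_rating, receipt.jurisdiction)
```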

April 01, 2014 09:51 PM

March 31, 2014

MIMS 2004

"Big Data," if Unspecific, is Ridiculous

Here is a more specific look at what Big Data means as a term. There is your data. There is "little data," where, when you share it, it's wrapped around you as the user, centralized; "Big Data" of this kind is really a large amount of "little data." Then there is Big Data that you as a user co-create with a vendor or service, that is relatable back to you but is wrapped around objects, data models, and identifiers that are first about the object and not about you. And then there is aggregated data that is depersonalized, though it may still be possible, with some detective work, to find you. My point in making this distinction is that talking about Big Data in an unspecific manner is a great opportunity to misunderstand, to miss potential solutions that apply to parts of this scale but not all, and to talk past each other when we discuss problems and solutions in the privacy arena.

March 31, 2014 07:40 PM

March 30, 2014

Ph.D. student

starting with a problem

The feedback I got on my dissertation prospectus draft when I presented it to my colleagues was that I didn’t start with a problem and then argue from there how my dissertation was going to be about a solution.

That was really great advice.

The problem of problem selection is a difficult one. “What is a problem?” is a question that basically nobody asks. Lots of significant philosophical traditions maintain that it’s the perception of problems as problems that is the problem. “Just chill out,” say Great Philosophical Traditions. This does not help one orient one’s dissertation research.

A lot of research is motivated by interest in particular problems like an engineering challenge or curing cancer. I’ve somehow managed never to acquire the kind of expertise that would allow me to address any of these specific useful problems directly. My mistake.

I’m a social scientist. There are a lot of social problems, right? Of course. However, there’s a catch: identifying any problem as a problem in the social domain immediately implicates politics.

Are there apolitical social problems? I think I’ve found some. I had a great conversation last week with Anna Salamon about Global Catastrophic Risks. Those sound terrible! It echoes the work I used to do in support of Disaster Risk Reduction, except that there is more acknowledgment in the GCR space that some of the big risks are man-made.

So there’s a problem: arguably research into the solutions to these problems is good. On the other hand, that research is complicated by the political entanglement of the researchers, especially in the university setting. It took some convincing, but OK, those politics are necessarily part of the equation. Put another way, if there wasn’t the political complexity, then the hard problems wouldn’t be such hard problems. The hard problems are hard partly because they are so political. (This difference in emphasis is not meant to preclude other reasons why these problems are hard; for example, because people aren’t smart or motivated enough.)

Given that political complexity gets in the way of solving hard problems efficiently–because these problems require collaboration across political lines, and because the inherent politics of language choice and framing create complexity that is orthogonal to the problem solution (is it?)–infrastructural solutions that manage that political complexity can be helpful.

(Counterclaim: the political complexity is not illogical complexity, rather scientific logic is partly political logic. We live in the best of all possible worlds. Just chill out. This is an empirical claim.)

The promise of computational methods for interdisciplinary collaboration is that they allow for more efficient distribution of cognitive labor across the system of investigators. Data science methodologists can build tools for investigation that work cross-disciplinarily, and the interaction between these tools can follow an apolitical logic in a way that discursive science cannot. Teleologically, we get an Internet of Scientific Things and autonomous scientific apparatus; draw your own eschatological conclusions.

An interesting consequence of algorithmically mediated communication is that you don’t actually need consensus to coordinate collective action. I suppose this is an argument Hayekians etc. have been making for a long time. However, the political maintenance of the system that ensures the appropriate incentive structures is itself prone to being hacked, and herein lies the problem. That, and the insufficiency of the total neurological market apparatus (in Hayek’s vision) to do anything like internalize the externalities of e.g. climate change, while the Bitcoin servers burn and burn and burn.

by Sebastian Benthall at March 30, 2014 08:54 PM

March 28, 2014

Ph.D. student


This article is making me doubt some of my earlier conclusions about the role of the steering media. Habermas, I’ve got to concede, is dated. As much as skeptics would like to show how social media fails to ‘democratize’ media (not in the sense of being justly won by elections, but rather in the original sense of being mob ruled), the fragmentation is real and the public is reciprocally involved in its own narration.

What can then be said of the role of new media in public discourse? Here are some hypotheses:

  • As a first order effect, new media exacerbates shocks, both endogenous and exogenous. See Didier Sornette‘s work on the application of self-excited Hawkes processes to social systems like finance and Amazon reviews. (I’m indebted to Thomas Maillart for introducing me to this research.) This changes the dynamics: rather than being Poisson distributed, new media intervention is strategically motivated.
  • As a second order effect, since new media acts strategically, it must make predictive assessments of audience receptivity. New media suppliers must anticipate and cultivate demand. But demand is driven partly by environmental factors like information availability. See these notes on Dewey’s ethical theory for how taste can be due to environmental adaptation with no truly intrinsic desire–hence, the inappropriateness of modeling these dynamics straightforwardly with ‘utility functions’–which upsets neoclassical market modeling techniques. Hence the ‘social media marketer’ position that engages regularly in communication with an audience in order to cultivate a culture that is also a media market. Microcelebrity practices achieve not merely a passively received branding but an actively nurtured communicative setting. Communication here is transmission (Shannon, etc.) and/or symbolic interaction, on which community (Carey) supervenes.
  • Though not driven by neoclassical market dynamics simpliciter, new media is nevertheless competitive. We should expect new media suppliers to be fluidly territorial. This creates a higher-order incentive for curatorial intervention to maintain and distinguish one’s audience as culture. A critical open question here is to what extent these incentives drive endogenous differentiation, vs. to what extent media fragmentation results in efficient allocation of information (analogously to efficient use of information in markets). There is no a priori reason to suppose that the ad hoc assemblage of media infrastructures and regulations minimizes negative cultural externalities. (What are examples of negative cultural externalities? Fascism, ….)
  • Different media markets will have different dialects, which will have different expressive potential because of description lengths of concepts. (Algorithmic information theoretic interpretation of weak Sapir-Whorf hypothesis.) This is unavoidable because man is mortal (cannot approach convergent limits in a lifetime.) Some consequences (which have taken me a while to come around to, but here it is):
    1. Real intersubjective agreement is only provisionally and locally attainable.
    2. Language use, as a practical effect, has implications for future computational costs and therefore is intrinsically political.
    3. The poststructuralists are right after all. ::shakes fist at sky::
    4. That’s ok, we can still hack nature and create infrastructure; technical control resonates with physical computational layers that are not subject to wetware limitations. This leaves us, disciplinarily, with post-positivist engineering; post-structuralist hermeneutics enabling only provisional consensus and collective action (which can, at best, be ‘society made durable’ via technical implementation or cultural maintenance; see above on media market making); and critical reflection (advancing social computation directly).
  • There is a challenge to Pearl/Woodward causality here, in that mechanistic causation will be insensitive to higher-order effects. A better model for social causation would be Luhmann’s autopoiesis (cf. Brier, 2008). Might ecological modeling (Ulanowicz) provide the best toolkit for showing interactions between autopoietic networks?
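The first-order claim above can be made concrete: a self-excited Hawkes process is one whose event rate jumps after each event and then decays, so a shock begets clustered follow-on activity rather than the even scatter of a Poisson process. Here is a minimal simulation sketch using Ogata's thinning algorithm; the function name and parameter values are my own illustrative choices, not taken from Sornette's papers:

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Simulate a self-exciting Hawkes process via Ogata's thinning.

    mu    : baseline event rate (with alpha=0 this is a plain Poisson process)
    alpha : jump in intensity contributed by each past event
    beta  : exponential decay rate of that excitement
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # Intensity is highest right now and only decays until the next event,
        # so the current intensity is a valid upper bound for thinning.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        t += rng.expovariate(lam_bar)  # candidate next event time
        if t >= t_max:
            return events
        lam_t = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        if rng.random() <= lam_t / lam_bar:  # accept with prob lambda(t)/lam_bar
            events.append(t)

# alpha=0 gives evenly scattered (Poisson) events; alpha>0 makes each event
# raise the short-term rate, so activity arrives in self-excited bursts.
quiet = simulate_hawkes(mu=1.0, alpha=0.0, beta=1.0, t_max=100.0)
bursty = simulate_hawkes(mu=1.0, alpha=0.8, beta=1.0, t_max=100.0)
```

The ratio alpha/beta is the branching ratio: below 1 the process is stationary, and as it approaches 1 the bursts lengthen, which is roughly the near-critical regime invoked in models of endogenous shocks.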

This is not helping me write my dissertation prospectus at all.

by Sebastian Benthall at March 28, 2014 05:23 PM

March 27, 2014

Ph.D. alumna

Parentology: The first parenting book I actually liked

As a researcher and parent, I quickly learned that I have no patience for parenting books. When I got pregnant, I started trying to read parenting books and I threw more than my fair share of them across the room. I either get angry at the presentation of the science or annoyed at the dryness of the writing. Worse, the prescriptions make me furious because anyone who tells you that there’s a formula to parenting is lying. My hatred of parenting books was really disappointing because I didn’t want to have to do a literature review whenever I wanted to know what research said about XYZ. I actually want to understand what the science says about key issues of child development, childrearing, and parenting. But I can’t stomach the tone of what I normally encounter.

So when I learned that Dalton Conley was writing a book on parenting, my eyebrows went up. I’ve always been a huge fan of his self-deprecating autobiographical book Honky because it does such a fantastic job of showcasing research on race and class. This made me wonder what he was going to do with a book on parenting.

Conley did not disappoint. His new book Parentology is the first parenting book that I’ve read that I actually enjoyed and am actively recommending to others. Conley’s willingness to detail his own failings, neuroses, and foolish logic (and to smack himself upside the head with research data in the process) showcases the trials and tribulations of parenting. Even experts make a mess of everything, but watching them do so so spectacularly lets us all off the hook. If you read this book, you will learn a lot about parenting, even if it doesn’t present the material in a how-to fashion. Instead, this book highlights the chaos that ensues when you try to implement science on the ground. Needless to say, hilarity ensues.

If you need some comedy relief, pick up this book. It’s a fantastic traversal of contemporary research presented in a fashion that will have you rolling on the floor laughing. Lesson #1: If you buy your children pet guinea pigs to increase their exposure to allergens, make sure that they’re unable to mate.

by zephoria at March 27, 2014 07:56 PM

March 22, 2014

Ph.D. student

Knight News Challenge applications

The Knight News Challenge applications are in and I find them a particularly exciting batch this year, perhaps because of a burst of activity spurred on by a handful of surveillance revelations you might have heard about. I read through all 660: below is my list of promising applications from friends and colleagues. I’m sure there are many more awesome ones, including some I already “applauded”, but I thought a starter list would still be useful. Go applaud these and add comments to help them improve.

Which are your favorites that I’ve missed? I’m keeping a running list here:

Encrypt all the things

Mailpile - secure e-mail for the masses!

Making secure email (using the OpenPGP standard) easier by developing an awesome native email client where encryption is built-in. They already have an alpha running that you might have seen on Kickstarter.

Encryption Usability Prize

Peter Eckersley, just over the Bay at EFF, wants to develop criteria for an annual prize for usable encryption software. (Noticing a theme to these encryption projects yet?) Notes SOUPS (CMU’s conference on usable security, happening this summer at Facebook) as a venue for discussion.

LEAP Encryption Access Project: Tools for Creating an Open, Federated and Secure Internet

LEAP is a project for developing a set of encryption tools, including proxies, email (with automatic key discovery) and chat, in an effort to make encryption the default for a set of at-risk users. (My colleague Harry Halpin at W3C works with them, and it all sounds very powerful.)

TextSecure: Simple Private Communication For Everyone

TextSecure is likely the most promising protocol and software project for easy-to-use widely adopted asynchronous encrypted messaging. (Android users should be using the new TextSecure already, fyi; it basically replaces your SMS app but allows for easy encryption.) Moxie (formerly of Twitter) is pretty awesome and it’s an impressive team.


Speaking of encryption, there are two proposals for standards work directly related to encryption and security.

Advancing DANE (DNS-Based Authentication of Named Entities) to Secure the Internet’s Transport Layer

This one may sound a little deep in the weeds, but DANE is a standard which promises end-to-end transport security on the Internet via DNSSEC, without relying on the brittle Certificate Authority system. Yay IETF!

Improved Privacy and Security through Web Standards

My colleagues at W3C are working on WebCrypto — a set of APIs for crypto to be implemented in the browser so that all your favorite Web applications can start implementing encryption without all making the same mistakes. Also, and this is of particular interest to me, while we’ve started to do privacy reviews of W3C specs in general via the Privacy Interest Group, this proposal suggests dedicated staff to provide privacy/security expertise to all those standards groups out there from the very beginning of their work.

Open Annotations for the Web (with lots of I School connections!) has been contributing to standards for Web annotations, so that we can all share the highlights and underlines and comments we make on web pages; they’re proposing to hire a developer to work with W3C on those standards.

Open Notice & Consent Receipts

A large handful of us I School alumni have been working in some way or another on the idea of privacy icons or standardized privacy notices. Mary Hodder proposes funding that project, to work on these notices and a “consent receipt” so you’ll know what terms you’ve accepted once you do.

Documenting practices, good and bad

Usable Security Guides for Strengthening the Internet

Joe Hall, CDT chief technologist and I School alumnus extraordinaire, has an awesome proposal for writing guides for usable security. Because it doesn’t matter how good the technology is if you don’t learn how to use it.

Transparency Reporting for Beginners: A Starter Kit and Best Practices Guide for Internet Companies, and a Readers’ Guide for Consumers, Journalists, & Advocates

Kevin Bankston (formerly CDT, formerly formerly EFF) suggests a set of best practices for transparency reports, the new hot thing in response to surveillance, but lacking standards and guidelines.

The positive projects in here naturally seem easier to build and less-likely to attract controversy, but these evaluative projects might also be important for encouraging improvement:

Ranking Digital Rights: Holding tech companies accountable on freedom of expression and privacy

@rmack on annual ranking of companies on their free expression and privacy practices.

Exposing Privacy and Security Practices: An online resource for evaluation and advocacy

CDT’s Justin Brookman on evaluating data collection and practices, particularly for news and entertainment sites.

IndieWeb and Self-Hosting

IndieWeb Fellowships for the Independent and Open Web

I’ve been following and participating in this #indieweb thing for a while now. While occasionally quixotic, I think the trend of building working interoperable tools that rely as little as possible on large centralized services is one worth applauding. This proposal from @caseorganic suggests “fellowships” to fund the indie people building these tools.

Idno: a collective storytelling platform that supports the diversity of the web

And @benwerd is one of these people building easy-to-use software for your own blog, not controlled by anyone else. Idno is sweet software and Ben and Erin are really cool.


Mail-in-a-Box

Even if you had your own domain name, would you still forward all your email through GMail or Hotmail or some free webmail service with practices you might not understand or appreciate? This project is for “a one-click, easy-to-deploy SMTP server: a mail server in a box.”

Superuser: Internet homeownership for anyone

Eric Mill (@konlone) has been working on a related project, to make it end-user easy to install self-hosted tools (like Mail-in-a-box, or personal blog software, or IFTTT) on a machine you control, so that it’s not reserved for those of us who naturally take to system administration. (Also, Eric is super cool.)

by at March 22, 2014 11:15 PM

March 21, 2014

Ph.D. alumna

Why Snapchat is Valuable: It’s All About Attention

Most people who encounter a link to this post will never read beyond this paragraph. Heck, most people who encountered a link to this post didn’t click on the link to begin with. They simply saw the headline, took note that someone over 30 thinks that maybe Snapchat is important, and moved onto the next item in their Facebook/Twitter/RSS/you-name-it stream of media. And even if they did read it, I’ll never know it because they won’t comment or retweet or favorite this in any way.

We’ve all gotten used to wading in streams of social media content. Open up Instagram or Secret on your phone and you’ll flick on through the posts in your stream, looking for a piece of content that’ll catch your eye. Maybe you don’t even bother looking at the raw stream on Twitter. You don’t have to because countless curatorial services like digg are available to tell you what was most important in your network. Facebook doesn’t even bother letting you see your raw stream; their algorithms determine what you get access to in the first place (unless, of course, someone pays to make sure their friends see their content).

Snapchat offers a different proposition. Everyone gets hung up on how the disappearance of images may (or may not) afford a new kind of privacy. Adults fret about how teens might be using this affordance to share inappropriate (read: sexy) pictures, projecting their own bad habits onto youth. But this isn’t what makes Snapchat utterly intriguing. What makes Snapchat matter has to do with how it treats attention.

When someone sends you an image/video via Snapchat, they choose how long you get to view the image/video. The underlying message is simple: You’ve got 7 seconds. PAY ATTENTION. And when people do choose to open a Snap, they actually stop what they’re doing and look.

In a digital world where everyone’s flicking through headshots, images, and text without processing any of it, Snapchat asks you to stand still and pay attention to the gift that someone in your network just gave you. As a result, I watch teens choose not to open a Snap the moment they get it because they want to wait for the moment when they can appreciate whatever is behind that closed door. And when they do, I watch them tune out everything else and just concentrate on what’s in front of them. Rather than serving as yet-another distraction, Snapchat invites focus.

Furthermore, in an ecosystem where people “favorite” or “like” content that is inherently unlikeable just to acknowledge that they’ve consumed it, Snapchat simply notifies the creator when the receiver opens it up. This is such a subtle but beautiful way of embedding recognition into the system. Sometimes, a direct response is necessary. Sometimes, we need nothing more than a simple nod, a way of signaling acknowledgement. And that’s precisely why the small little “opened” note will bring a smile to someone’s face even if the recipient never said a word.

Snapchat is a reminder that constraints have a social purpose, that there is beauty in simplicity, and that the ephemeral is valuable. There aren’t many services out there that fundamentally question the default logic of social media and, for that, I think that we all need to pay attention to and acknowledge Snapchat’s moves in this ecosystem.

(This post was originally published on LinkedIn. More comments can be found there.)

by zephoria at March 21, 2014 03:34 PM

March 20, 2014

Ph.D. student

real talk

So I am trying to write a dissertation prospectus. It is going…OK.

The dissertation is on Evaluating Data Science Environments.

But I’ve been getting very distracted by the politics of data science. I have been dealing with the politics by joking about them. But I think I’m in danger of being part of the problem, when I would rather be part of the solution.

So, where do I stand on this, really?

Here are some theses:

  • There is a sense of “data science” that is importantly different from “data analytics”, though there is plenty of abuse of the term in an industrial context. That claim is awkward because industry can easily say they “own” the term. It would be useful to lay out specifically which computational methods constitute “data science” methods and which don’t.
  • I think that it’s useful analytically to distinguish different kinds of truth claims because it sheds light on the value of different kinds of inquiry. There is definitely a place for rigorous interpretive inquiry and critical theory in addition to technical, predictive science. I think politicking around these divisions is lame and I only talk about it to make fun of the situation.
  • New computational science techniques have done and will continue to do amazing work in the physical and biological and increasingly environmental sciences. I am jealous of researchers in those fields because I think that work is awesome. For some reason I am a social scientist.
  • The questions surrounding the application of data science to social systems (which can include environmental systems) are very, very interesting. Qualitative researchers get defensive about their role in “the age of data science” but I think this is unwarranted. I think it’s the quantitative social science researchers who are likely more threatened methodologically. But since I’m not well-trained as a quantitative social scientist really, I can’t be sure of that.
  • The more I learn about research methods (which seems to be all I study these days, instead of actually doing research–I’m procrastinating), the more I’m getting a nuanced sense of how different methods are designed to address different problems. Jockeying about which method is better is useless. If there is a political battle I think is worth fighting any more, it’s the battle about whether or not transdisciplinary research is productive or possible. I hypothesize that it is. But I think this is an empirical question whose answer may be specific: how can different methods be combined effectively? I think this question gets quite deep and answering it requires getting into epistemology and statistics in a serious way.
  • What is disruptive about data science is that some people have dug down into statistics in a serious way, come up with a valid general way of analyzing things, and then automated it. That makes it in theory cheaper to pick up and apply than the quantitative techniques used by other researchers, and usable at larger scale. On the whole this is pretty good, though it is bad when people don’t understand how the tools they are using work. Automating science is a pretty good thing overall.
  • It’s really important for science, as it is automated, to be built on open tools and reproducible data, because (a) otherwise there is no reason why it should have the public trust, and (b) it will remove barriers to training new scientists.
  • All scientists are going to need to know how to program. I’m very fortunate to have a technical background. A technical background is not sufficient to do science well. One can use technical skills to assist in both qualitative (visualization) and quantitative work. The ability to use tools is orthogonal to the ability to study phenomena, despite the historic connection between mathematics and computer science.
  • People conflate programming, which is increasingly a social and trade skill, with the ability to grasp high level mathematical concepts.
  • Computers are awesome. The people that make them better deserve the credit they get.
  • Sometimes I think: should I be in a computer science department? I think I would feel better about my work if I were in CS. I like the feeling of tangible progress and problem solving. I think there are a lot of really important problems to solve, and that the solutions will likely come from computer science related work. What I think I get from being in a more interdisciplinary department is a better understanding of what problems are worth solving. I don’t mean that in a way that diminishes the hard work of problem solving, which I think is really where the rubber hits the road. It is easy to complain. I don’t work as hard as computer science students. I also really like being around women. I think they are great and there aren’t enough of them in computer science departments.
  • I’m interested in modeling and improving the cooperation around open scientific software because that’s where I see some real potential value add. I’ve been an engineer and I’ve managed engineers. Managing engineers is a lot harder than engineering, IMO. That’s because management requires navigating a social system. Social systems are really absurdly complicated compared to even individual organisms.
  • There are three reasons why it might be bad to apply data science to social systems. The first is that it could lead to extraordinarily terrible death robots. My karma is on the line. The second is that the scientific models might be too simplistic and lead to bad decisions that are insensitive to human needs. That is why it is very, very important that the existing wealth of social scientific understanding is not lost but rather translated into a more robust and reproducible form. The third reason is that social science might be in principle impossible due to its self-referential effects. This would make the whole enterprise a colossal waste of time. The first and third reasons frequently depress me. The second motivates me.
  • Infrastructure and mechanism design are powerful means of social change, perhaps the most powerful. Movements are important but civil society is so paralyzed by the steering media now that it is more valuable to analyze movements as sociotechnical organizations alongside corporations etc. than to view them in isolation from the technical substrate. There are a variety of ideological framings of this position, each with different ideological baggage. I’m less concerned with that, ultimately, than the pragmatic application of knowledge. I wish people would stop having issues with “implications for design.”
  • I said I wanted to get away from politics, but this is one other political point I actually really think is worth making, though it is generally very unpopular in academia for obvious reasons: the status differential between faculty and staff is an enormous part of the problem of the dysfunction of universities. A lot of disciplinary politics are codifications of distaste for certain kinds of labor. In many disciplines, graduate students perform labor unexpertly in service of their lab’s principal investigators; this labor is a way of paying one’s dues that has little to do with the intellectual work of their research expertise. Or is it? It’s entirely unclear, especially when what makes the difference between a good researcher and a great one are skills that have nothing to do with their intellectual pursuit, and when mastering new tools is so essential for success in one’s field. But the PIs are often not able to teach these tools. What is the work of research? Who does it? Why do we consider science to be the reserve of a specialized medieval institution, and call it something else when it is done by private industry? Do academics really have a right to complain about the rise of the university administrative class?

Sorry, that got polemical again.

by Sebastian Benthall at March 20, 2014 05:45 AM

March 17, 2014

Ph.D. alumna

TIME Magazine Op-Ed: Let Kids Run Wild Online

I wrote the following op-ed for TIME Magazine. This was published in the March 13, 2014 issue under the title “Let Kids Run Wild Online.” To my surprise and delight, the op-ed was featured on the cover of the magazine.

Trapped by helicopter parents and desperate to carve out a space of their own, teens need a place to make mistakes.

Bicycles, roller skates and skateboards are dangerous. I still have scars on my knees from my childhood run-ins with various wheeled contraptions. Jungle gyms are also dangerous; I broke my left arm falling off one. And don’t get me started on walking. Admittedly, I was a klutzy kid, but I’m glad I didn’t spend my childhood trapped in a padded room to protect me from every bump and bruise.

“That which does not kill us makes us stronger.” But parents can’t handle it when teenagers put this philosophy into practice. And now technology has become the new field for the age-old battle between adults and their freedom-craving kids.

Locked indoors, unable to get on their bicycles and hang out with their friends, teens have turned to social media and their mobile phones to gossip, flirt and socialize with their peers. What they do online often mirrors what they might otherwise do if their mobility weren’t so heavily constrained in the age of helicopter parenting. Social media and smartphone apps have become so popular in recent years because teens need a place to call their own. They want the freedom to explore their identity and the world around them. Instead of sneaking out (should we discuss the risks of climbing out of windows?), they jump online.

As teens have moved online, parents have projected their fears onto the Internet, imagining all the potential dangers that youth might face–from violent strangers to cruel peers to pictures or words that could haunt them on Google for the rest of their lives.

Rather than helping teens develop strategies for negotiating public life and the potential risks of interacting with others, fearful parents have focused on tracking, monitoring and blocking. These tactics don’t help teens develop the skills they need to manage complex social situations, assess risks and get help when they’re in trouble. Banning cell phones won’t help a teen who’s in love cope with the messy dynamics of sexting. “Protecting” kids may feel like the right thing to do, but it undermines the learning that teens need to do as they come of age in a technology-soaked world.

The key to helping youth navigate contemporary digital life isn’t more restrictions. It’s freedom–plus communication. Famed urban theorist Jane Jacobs used to argue that the safest neighborhoods were those where communities collectively took interest in and paid attention to what happened on the streets. Safety didn’t come from surveillance cameras or keeping everyone indoors but from a collective willingness to watch out for one another and be present as people struggled. The same is true online.

What makes the digital street safe is when teens and adults collectively agree to open their eyes and pay attention, communicate and collaboratively negotiate difficult situations. Teens need the freedom to wander the digital street, but they also need to know that caring adults are behind them and supporting them wherever they go. The first step is to turn off the tracking software. Then ask your kids what they’re doing when they’re online–and why it’s so important to them.

by zephoria at March 17, 2014 01:31 AM

March 16, 2014


"Slide down my cellar door"

In a 2010 NYT “On Language” column, Grant Barrett traced the claim that “cellar door” is the most beautiful phrase in English back as far as 1903. I posted on the phrase a few years ago ("The Romantic Side of Familiar Words"), suggesting that there was a reason why linguistic folklore fixed on that particular phrase, when you could make the same point with other pedestrian expressions like linoleum or oleomargarine:

…The undeniable charm of the story — the source of the enchantment that C. S. Lewis reported when he saw cellar door rendered as Selladore — lies in the sudden falling away of the repressions imposed by orthography … to reveal what Dickens called "the romantic side of familiar things." … In the world of fantasy, that role is suggested literally in the form of a rabbit hole, a wardrobe, a brick wall at platform 9¾. Cellar door is the same kind of thing, the expression people use to illustrate how civilization and literacy put the primitive sensory experience of language at a remove from conscious experience.

But that doesn't explain why the story emerged when it did. Could it have had to do with the song "Playmates," with its line "Shout down my rain barrel, slide down my cellar door"? There's no way to know for sure, but the dates correspond, and in fact those lines had an interesting life of their own…

"Playmates" was a big hit for Philip Wingate and Henry W. Petrie in 1894, in an age swilling in lachrymose sentimentality about childhood. The original lyrics were:

Say, say, oh playmate,
Come out and play with me
And bring your dollies three,
Climb up my apple tree.
Shout down my rain barrel,
Slide down my cellar door,
And we'll be jolly friends forevermore.

Wingate and Petrie followed it up in the same year with an even more popular sequel, “I Don’t Want to Play in Your Yard,” which contained the phrase “You’ll be sorry when you see me sliding down our cellar door." The song figures a couple of times in the 1981 Warren Beatty movie Reds, most unforgettably as sung by Peggy Lee.

In various forms, “slide down my cellar door” became a kind of catchphrase to suggest innocent friendship. In an 1896 letter to a friend, the poet Vaughan Moody wrote “Are n’t [sic] you going to speak to me again? Is my back-yard left irredeemably desolate? Have your rag dolls and your blue dishes said inexorable adieu to my cellar-door? The once melodious rain barrel answers hollow and despairing to my plaints….”

More generally, “You shan’t slide down my cellar door,” and the like were invoked to suggest childish truculence. Google Books and Newspaperarchive turn up numerous hits, which don’t tail off until the 1930s or so.

I would not let an operator that did not have a card, carry my lunch basket or slide down my cellar door: not to say give him a "square" or fix him for a ride over the road. ‪Trans-Communicator, 1895

Commenting on a recent press dispatch Spain has refused the customary permission to the British garrison at Gibraltar to play polo and golf on Spanish territory, the Baltimore Sun says : — " This suggests the stern retaliatory methods of childhood : ' You shan't play in my back yard, you shan't slide down my cellar door.'  National Review, 1898

If you see my friend Prince Krapotpin tell him I should be glad to have him holler down my rain barrel or slide down my cellar door any time. It is a hard thing to be a czar. Oak Park (IL) Argus, 1901

William Waldorf Astor seems to have carried into maturity the youthful feelings so beautifully expressed in ballads of the " you can't slide down my cellar door " school. Munsey’s magazine, 1901

And Greece has said to Roumania, "You can't slide down my cellar-door any more." Religious Telescope, 1906

I am not desirous of having him slide down my cellar door. So far as I am concerned he can stay in his own back-yard, his own puddle or whatever his habitat may be. Louisiana Conservation Review, 1940

The Abbe was gentle and courteous, not to say whimsical, and the very soul of cheerfulness, cordiality, and hospitality, but the blunt fact remained that he wouldn't play ball in my back lot or slide down my cellar door. Wine Journeys 1949

That’s the last instance of the phrase that I can find where it’s used that way. The song “Playmates” enjoyed a renewed popularity when it was recorded by Kay Kyser in 1940 and of course remains popular as a children’s clapping song today. (Willie Nelson recorded a version a few years ago.) Notably, Kyser substituted “look down my rain barrel” for “shout down my rain barrel,” the acoustic charms of rain barrels having faded from memory along with the containers themselves, even as sloping exterior cellar doors were becoming scarce. A 1968 article in the Lima (Ohio) News began:

“Shout down my rain barrel, Slide down my cellar door, And we'll be jolly friends forever more.”   Modern kids would have a hard time making friends that way, for gone are the rain barrels and outside cellar doors. Lima (Ohio) News 1968

Could the songs have been the immediate inspiration for the claim that “cellar door” is the most beautiful phrase in the English language? Well, the dates are suggestive, particularly given that the phrase was literally in the air when the claim first emerged, and occasionally, no doubt, mondegreenized into something else (the way later generations often transform "rain barrel" to "rainbow"). And I think it counts for something that the perception of the phrase's beauty requires a regressive capacity, as I put it in the earlier post, to "transcend not just its semantics but its orthography, to recover the pre-alphabetic innocence that comes when we let 'the years of reading fall away,' in Auden's phrase, and attune ourselves with sonorities that are hidden from the ear behind the overlay of writing"—that is, you have to assume, as the songs ask you to, a child's point of view.

But this account of the origin will have to be left speculative—unless, of course, someone digs up a pre-1894 citation for the claim, in which case the theory is toast.

by Geoff Nunberg at March 16, 2014 08:37 PM

March 11, 2014

MIMS 2012

Did you A/B test the redesigned preview tool?

A lot of people have asked me if we A/B tested the redesigned preview tool. The question comes in two flavors: did we use A/B testing to validate impersonation was worth building (a.k.a. fake door testing); and, did we A/B test the redesigned UI against the old UI? Both are good questions, but the short answer is no. In this post I’m going to dig into both and explain why.

Fake Door Testing

Fake door testing (video) is a technique to measure interest in a new feature by building a “fake door” version of it that looks like the final version (but doesn’t actually work) and measuring how many people engage with it. Trying to use the feature gives users an explanation of what it is, that it’s “Coming soon”, and usually the ability to “vote” on it or send feedback (the specifics vary depending on the context). This doesn’t need to be run as an A/B test, but setting it up as one lets you compare user behavior.

We could have added an “Impersonate” tab to the old UI, measured how many people tried to use it, and gathered feedback. This would have been cheap and easy. But we didn’t do this because the feature was inspired by our broader research around personalization. Our data pointed us towards this feature, so we were confident it would be worthwhile.

But more than that, measuring clicks and votes doesn’t tell you much. People can click out of curiosity, which doesn’t tell you if they’d actually use the feature or if it solves a real problem. Even if people send feedback saying they’d love it, what users say and what they do often differ. Actually talking to users to find pain points yields robust data that leads to richer solutions. The new impersonate functionality is one such example — no one had thought of it before, and it wasn’t on our feature request list or product roadmap.

However, not everyone has the resources to conduct user research. In that situation, fake door testing is a good way of cheaply getting feedback on a specific idea. After all, some data is better than no data.

A/B Testing the Redesigned UI

The second question is, “Did you A/B test the redesigned preview against the previous one?” We didn’t, primarily because there’s no good metric to use as a conversion goal. We added completely new functionality to the preview tool, so most metrics are the equivalent of comparing apples to oranges. For example, measuring how many people impersonate a visitor is meaningless because the old UI doesn’t even have that feature.

So at an interface level, there isn’t a good measurement. But what about at the product level? We could measure larger metrics, such as the number of targeted experiments being created, to see if the new UI has an effect. There are two problems with this. First, it will take a long time (many months) to reach a statistically significant difference, because conversion rates on most product metrics are low. Second, even if we eventually measured a difference, there’s no guarantee it was caused by adding impersonation. The broader the metric, the more factors influence it (such as other product updates). To overcome this you could freeze people in the “A” and “B” versions of the product, but given how long it takes to reach significance, this isn’t a good idea.
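To make the “many months” point concrete, here’s a rough sketch of a standard two-proportion sample-size calculation. The 2% baseline conversion rate and 10% relative lift are made-up numbers for illustration, not Optimizely’s actual metrics:

```python
from math import ceil, sqrt
from statistics import NormalDist

def samples_per_variation(baseline, lift, alpha=0.05, power=0.8):
    """Approximate per-variation sample size needed to detect a relative
    lift over a baseline conversion rate (two-sided z-test)."""
    p1 = baseline
    p2 = baseline * (1 + lift)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_b = NormalDist().inv_cdf(power)          # critical value for power
    p_bar = (p1 + p2) / 2
    n = ((z_a * sqrt(2 * p_bar * (1 - p_bar))
          + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2) / (p2 - p1) ** 2
    return ceil(n)

# A 2% baseline with a hoped-for 10% relative lift needs roughly
# 80,000 users per variation -- months of traffic for most products.
print(samples_per_variation(0.02, 0.10))
```

Note how the required sample size grows as the baseline rate or the expected lift shrinks, which is exactly why broad product metrics take so long to reach significance.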

Companies like Facebook and Google have enough traffic that they’re able to roll out new features to a small percentage of users (say, 5%) and measure the impact on their core metrics. If any of those metrics take a plunge, they revert users to the previous UI and keep iterating. At the scale of Facebook and Google, you can get significant data in a day. Unfortunately, like most companies, we don’t have that scale, so it isn’t an option.

So How Do You Know The Redesign Was Worth It?

What people are really asking is how do we know the redesign was worth the effort? Being at an A/B testing company, everyone wants to A/B test everything. But in this case, there wasn’t a place for it. Like any method, split testing has its strengths and weaknesses.

No, Really — How Do You Know The Redesign Was Worth It?

Primarily via qualitative feedback (i.e. talking to users), which at a high level has been positive (but there are some improvements we can make). We’re also measuring people’s activities in the preview tool (e.g. changing tabs, impersonating visitors, etc.). So far, those are healthy. Finally, we’re keeping an eye on some product-level metrics, like the number of targeted experiments created. These metrics are part of our long-term personalization efforts, and we hope in the long run to see them go up. But the preview tool is just one piece of that puzzle, so we don’t expect anything noticeable from that alone.

The important theme here is that gathering data is important to ensure you’re making the best use of limited resources (time, people, etc.). But there’s a whole world of data beyond A/B testing, such as user research, surveys, product analytics, and so on. It’s important to keep in mind the advantages and disadvantages of each, and use the most appropriate ones at your disposal.

by Jeff Zych at March 11, 2014 04:50 PM

March 08, 2014

MIMS 2014

5 Minutes of Fame

On the Internet, everyone gets a chance to be famous, even if it lasts for all of 5 minutes. Last year, a couple of friends from school and I worked on this visualization for a class project. Two weeks later, we were featured on LifeHacker, and we thought that was our 5 minutes of fame.

Now, almost a year later, the FlowingData blog (which I love), picked it up and featured it yet again. It has set off a domino-like reaction with multiple sites and people talking about it, and also referring to our creation as the ‘Pandora of Beers’.

What I find most interesting, however, is how there are multiple versions of our story, the most common of which is that we are Stanford students – even though we are Berkeley students who used a Stanford dataset (a fact clearly mentioned on the website). Watching this story get retweeted and republished is an interesting study of viral effects, and of how some inaccuracies get pushed far and wide across the web.

Ah, well – at the very least I can say that we did get more than our share of the 5 minutes of fame.

by muchnessofd at March 08, 2014 09:28 PM

March 03, 2014

Ph.D. alumna

What’s Behind the Free PDF of “It’s Complicated” (no, no, not malware…)

As promised, I put a free PDF copy of “It’s Complicated” on my website the day the book officially launched. But as some folks noticed, I didn’t publicize this when I did so. For those who are curious as to why, I want to explain. And I want you to understand the various issues at play for me as an author and a youth advocate.

I didn’t write this book to make money. I wrote this book to reach as wide of an audience as I possibly could. This desire to get as many people as engaged as possible drove every decision I made throughout this process. One of the things that drew me to Yale was their willingness to let me put a freely downloadable CC-licensed copy of the book online on the day the book came out. I knew that trade presses wouldn’t let a first time author pull that one off. Heck, they still get mad at Paulo Coelho for releasing his books online and he’s sold more books worldwide than anyone else!

As I prepared for publication, it became clear that I really needed other people’s help in getting the word out. I needed journalistic enterprises to cover the book. I needed booksellers to engage with the book. I needed people to collectively signal that this book was important. I needed people to be willing to take a bet on me. When one of those allies asked me to wait a week before publicizing the free book, I agreed.

If you haven’t published a book before, it’s pretty unbelievable to see all of the machinery that goes into getting the book out once the book exists in physical form. News organizations want to promote books that will be influential or spark a conversation, but they are also anxious about having their stories usurped by others. Booksellers make risky decisions about how many copies they think they can sell ahead of time and order accordingly. (And then there’s the world of paying for placement which I simply didn’t do.) Booksellers’ orders – as well as actual presales – are influential in shaping the future of a book, just like first weekend movie sales matter. For example, these sales influence bestseller and recommendation lists. These lists are key to getting broader audiences’ attention (and for getting the attention of certain highly influential journalistic enterprises). And, as an author trying to get a message out, I realized that I needed to engage with this ecosystem and I needed all of these actors to believe in my book.

The bestseller aspect of this is the part that I struggle with the most. I don’t actually care whether or not my book _sells_ a lot; I care whether or not it’s _read_ a lot. But there’s no bestread-ed list (except maybe Goodreads). And while many books that are widely sold aren’t widely read, most books that are widely read are widely sold. My desire to be widely read is why I wanted to make the book freely available from the getgo. I get that not everyone can afford to buy the book. I get that it’s not available in certain countries. I get that people want to check it out first. I get that we haven’t figured out how to implement ‘grep’ in physical books. So I really truly get the importance of making the book accessible.

But what I started to realize is that when people purchase the book, they signal to outside folks that the book is important. This is one of the reasons that I asked people who value this book to buy it. For them or for others. I love it when people buy the book and give it away to a poor grad student, struggling parent, or library. I don’t know if I’ll make any bestseller list, but the reason I decided to try is because sales rankings – especially in the first few weeks of a book’s life – really do help attract more attention which is key to getting the word out. And so I’ve begged and groveled, asking people to buy my book even though it makes me feel squeamish, solely because I know that the message that I want to offer is important. So, to be honest, if you are going to buy the book at some point, I’d really really appreciate it if you’d buy a copy. And sooner rather than later. Your purchasing decisions help me signal to the powers that be that this book is important, that the message in the book is valuable.

That said, if you don’t have the resources or simply don’t want to, don’t buy it. I’m cool with that. I’m beyond delighted to give the book away for free to anyone who wants to read it, assign it in their classes, or otherwise engage with it. If you choose to download it, thank you! I’m glad you find it valuable!

If you feel like giving back, I have a request. Please help support all of the invisible people and organizations that helped get word of my book out there. I realize that there are folks out there who want to “support the author,” but my ask of you is to help me support the whole ecosystem that made this possible.

Go buy a different book from Yale University Press to thank them for being willing to publish me. Buy a random book from an independent bookseller to say thank you (especially if you live near Harvard Book Store, Politics & Prose, or Book People). Visit The Guardian and click on their ads to thank them for running a first serial. Donate to NPR for their unbelievable support in getting the word out. Buy a copy or click on the ads of BoingBoing, Cnet, Fast Company, Financial Times, The Globe & Mail, LA Times, Salon, Slate, Technology Review, The Telegraph, USA Today, Wired, and the other journalistic venues whose articles aren’t yet out to thank them for being so willing to cover this book. Watch the ads on Bloomberg and MSNBC to send them a message of thanks. And take the time to retweet the tweets or write a comment on the blogs of the hundreds of folks who have been so kind to write about this book in order to get the word out. I can’t tell you how grateful I am to all of the amazing people and organizations who have helped me share what I’ve learned. Please shower them in love.

If you want to help me, spread the message of my book as wide as you possibly can. I wrote this book so that more people will step back, listen, and appreciate the lives of today’s teenagers. I want to start a conversation so that we can think about the society that we’re creating. I will be forever grateful for anything that you can do to get that message out, especially if you can help me encourage people to calm down and let teenagers have some semblance of freedom.

More than anything, thank *you* soooo much for your support over the years!!! I am putting this book up online as a gift to all of the amazing people who have been so great to me for so long, including you. Thank you thank you thank you.


PS: Some folks have noticed that Amazon seems to not have any books in stock. There was a hiccup but more are coming imminently. You could wait or you could support IndieBound, Powell’s, Barnes & Noble, or your local bookstore.

by zephoria at March 03, 2014 04:52 PM

March 02, 2014

MIMS 2012

When Do You Do User Testing?

In response to my preview redesign post, my uncle asked, “When does the designer go with their own analysis of a design and when do they do a usability test?” I’ve been asked this question a lot, and we often discuss it in product development meetings. It’s also something I wanted to elaborate on more in the post, but it didn’t fit. So I will take this opportunity to roughly outline when we do user testing and why.

The Ideal

In an ideal world, we would be testing all the time. Having a tight feedback loop is invaluable. By which I mean, being able to design a solution to a problem and validating that it solves said problem immediately would be amazing. And in product design, the best validation is almost always user testing.

The Reality

In reality, there’s no such thing as instant feedback. You can’t design something and receive immediate feedback. There’s no automated process that tells you if a UI is good or not. You have to talk to actual humans, which takes time and effort.

This means there’s a trade-off between getting as much feedback as possible, and actually releasing something. The more user testing you do, the longer it will take to release.

What we do at Optimizely

At Optimizely, we weigh the decision to do user testing against deadlines, how important the questions are (e.g. are they core to the experience, or an edge case), how likely we are to get actionable insights (e.g. testing colors will usually give you a bunch of conflicting opinions), and what other research we could be doing instead (i.e. opportunity cost).

With the preview redesign, we started with exploratory research to get us on the right track, but didn’t do much testing beyond that. This was mainly because we didn’t make time for it. It was clear from our generative research that adding impersonation to the preview tool would be a step in the right direction. But it’s only one part of a larger solution, and won’t be used by everyone. We didn’t want to slow down our overall progress by spending too much time trying to perfect one piece of a much larger puzzle.

So with the preview tool, I had to rely on my instincts and feedback from other designers, engineers, and product managers to make decisions. One example is my decision to hide the impersonation feature by default and have it slide out when an icon is clicked. Of this solution I said:

But it worked too well [at solving the problem of the impersonation UI being distracting]. The icon was too cryptic, and overall the impersonate functionality was too hidden for anyone to find.

As my uncle pointed out, I didn’t do any user testing to make this call. I looked at what I had created, and could tell it wasn’t a great solution (which was also confirmed by my teammates). I was confident the decision to keep iterating was right based on established usability heuristics and my own experience.

However, not all design decisions can be guided by general usability guidelines. One such example is when I went in circles designing how a person sets impersonation values. I tried using established best practices and talking to other designers, but neither method led me to an answer. At this point, user testing was my only out.

In this case, we opted for guerrilla usability testing. Recruiting actual users would have required more time and resources than we wanted to spend. So I called in two of our sales guys, who made for good proxies of semi-experienced users that are technically middle-of-the-road (i.e. they have some basic code knowledge, but aren’t developers), which covers the majority of our users. Their feedback made the decision easy, and successfully got me out of this jam.

In Summary

In a perfect world we would be testing all the time, but in reality that just isn’t feasible. So we do our best to balance the time and resources required to test a UI against the overall importance of that feature. But usability testing won’t find every flaw. Eventually you have to ship — you can’t refine forever, and no design is perfect.

by Jeff Zych at March 02, 2014 11:37 PM

February 25, 2014

Ph.D. alumna

want a signed copy of “It’s Complicated”?

Today is the official publication date of “It’s Complicated: The Social Lives of Networked Teens”. While many folks have received their pre-orders already, this is the date by which all U.S. bookstores that promised to carry the book at launch should have a copy. It’s also the day on which I officially start my book tour in Cambridge.

In many ways, I’m thinking of my book tour as a thank-you tour. I’m trying to visit as many cities as I can that have been really good to me over the years. Some are cities where I was educated. Some are field site cities. Some are places where there are people who have been angels in my life. But I sadly won’t get everywhere. Although the list of events is not complete, I have discussions underway to be in the following cities this spring: Cambridge, DC, Seattle, Austin, Nashville, Berkeley/SF, Providence, Charlottesville, and Chicago. After that, I’m going to need to take a break. But I’m really hoping to see lots of friends and allies in the cities I visit. And I want to offer a huge apology to those outside of the US who have been so amazing to me. Given my goal of seeing my young son every week amidst this crazy, I simply can’t do an international tour right now.

I know that I won’t be able to get everywhere and I know folks have been asking me for signed copies, so I want to make an offer. Buy a book this week and send it to me and I will sign it and send it back to you signed. You can either buy it from your favorite bookseller and then mail it to me or have it shipped directly from your favorite online retailer.

danah boyd
Microsoft Research
641 6th Ave, 7th Floor
New York, NY 10011

When you send me the book, include a note (“gift note” for online retailers) that includes your name, email (in case something goes wrong), snail mail address for shipment, and anything I should know when signing the book.

If you have a local bookseller that’s selling it, start there. If you’d prefer to use an online retailer, my book is now available at:

IndieBound, Powell’s, Amazon, and Barnes & Noble

Thank you soooo much for all of your amazing support over the years! I wrote this book to share all that I’ve learned over the years, in the hopes that it will prompt people to step back and appreciate teens’ lives from their perspective. My goal is to share this book as widely as possible precisely because so many teens are struggling to get their voices heard. Fingers crossed that we can get this book into the hands of many people this week and that this, in turn, will prompt folks to spread the message further!


by zephoria at February 25, 2014 02:44 PM

MIMS 2011

Review of ‘code/space: software and everyday life’

This review was published in Environment and Planning B last year. I really loved the book and think that it’s a powerful reminder of the importance of context in thinking about how code does work in the world. 

Code/space: Software and Everyday Life By Rob Kitchin and Martin Dodge; MIT Press, Cambridge, London, 2011, 290 pages, ISBN: 978-0262042482

Kitchin and Dodge’s important new book, Code/space: Software and Everyday Life, opens with the crucial phrase – “software matters”. It matters, they argue, because software increasingly mediates our everyday lives – from the digital trail that extends just that little bit further when we order our morning coffee, to the data about our electricity and gas usage that is sent to a remote location from so-called “smart meters”, and the airport security databases that determine whether we are allowed to travel or not. The power of software is its ability to make our lives easier and improve efficiency and productivity; but such efficiencies come at the cost of pervasive surveillance, a feature that is producing a society that “never forgets”.

The key premise of the book is that there are two key gaps in the way that we talk about technology and society. The first critique is aimed at social science and humanities approaches that deal too much with the technologies that software enables rather than explaining the particular code that affects activity and behavior in different contexts. This is akin, the authors argue, to looking only at the effects of ill health on society, rather than also considering ‘the specifics of different diseases, their etiology (causes, origins, evolution, and implications), and how these manifest themselves in shaping social relations’ (p13).

Software studies, on the other hand, is a nascent research field that seeks ‘to open the black box of processors and arcane algorithms to understand how software – its lines and routines of code – does work in the world by instructing various technologies how to act’ (p13). The problem, they write, is that the majority of software studies are aspatial, presuming that space is merely a neutral backdrop against which human activity occurs. Here, Kitchin and Dodge’s critique is directed at scholars such as Lawrence Lessig, whose book Code and Other Laws of Cyberspace (Lessig, 1999) refers to code (in the form of software) as having the ability to automatically regulate activity online. But code, argue Kitchin and Dodge, is not law. It is neither universal nor deterministic, but rather contingent and relational. And space is not simply a container in which things happen but rather ‘subtly evolving layers of context and practices that fold together people and things and actively shape social relations’ (p13).

Kitchin and Dodge formulate two important concepts in arguing for a spatial approach to software studies. The first is ‘code/space’: a space where code and space are mutually dependent on one another, such that the space cannot function without the code (a check-in desk at an airport, for example). The second is ‘coded spaces’: spaces where code matters but which do not entirely depend on code to function (the use of Powerpoint slides during a presentation, for example). After detailing the types of software employed in the areas of home, travel and consumption, Kitchin and Dodge conclude with a Manifesto for Software Studies that sets out an agenda for the field to produce ‘detailed case studies of how software does work in the world’ as well as ‘theoretical tools for describing how and explaining why, and the effects, of that work’ (p249). Here they propose studies comparing the effects of code in rural Ireland and urban Manchester, for example, where code is analyzed in a manner that is sensitive to place and scale, cultural histories, and modes of activity (p249).

The concept that code is not universal or immutable but contingent and contextual is powerful, but Kitchin and Dodge have a tendency to analyze code in home, travel and consumption without any reference to differences in code’s impact in different places or spaces (the use of international passenger record databases when traveling from Rio de Janeiro, or trying to buy books online from Johannesburg, for example). Although they maintain that their intention is to provide a broad field for future research, the book would have been stronger with some analysis of the different ways in which code codes differing practices – how those practices are understood and used, and the different moments of code’s instantiation in particular places and/or spaces.

Where the book succeeds, and makes its most useful contribution, is in its lucid explanation of how detailed analyses of code are essential to understanding how software is becoming ingrained into our everyday lives, and how it has both an empowering and disciplining effect. Kitchin and Dodge are charting new disciplinary territory here, bringing together the fields of computer science, social science and spatial studies in highly promising ways. Theirs is an inspiring approach to how we might come to unveil the hidden choices behind the code that governs our everyday lives, and how we might come to understand a phenomenon that has an increasingly powerful role in society.


Lessig, L, 1999 Code: And Other Laws of Cyberspace (Basic Books, New York)

by Heather Ford at February 25, 2014 02:31 PM

MIMS 2014

Of Designing for People

A friend recently shared this article with me, and I found myself agreeing with every word in it. One of my biggest reasons for moving away from advertising and into marketing was that I felt there was too much creativity for the sake of it. Trying to sell something to someone involves telling them why they need it, not pandering to a bunch of critics at an awards festival.

The same thing seems to be happening with UX now that it’s such a buzzword. It’s not uncommon to hear people use UX and UI interchangeably (a fundamental mistake, because then you are going to hire someone who is a great visual designer, but maybe not a great experience designer). Usability is defined not by beauty, but by how well something works. Some of the most usable products are also arguably ugly. Steve Jobs said it best: design is not about how something looks, but about how it works. UX cannot head down this path of pandering to beauty. It has to be based on an understanding of psychology, of human needs. We need to ensure that we make products that are usable and that do not create cognitive overhead for our users. If that means we sacrifice our artistic sensibilities, then so be it.

by muchnessofd at February 25, 2014 02:14 AM

February 24, 2014

MIMS 2014

Are Airplane Delays Contagious?


Given a publicly available dataset on airplane departures and arrivals, I had a question: are airplane delays contagious? When one plane is late, what is the effect on the next flight to leave? Will it be less late? More late? Quadratically more late? In other words, how many cinnabons can I expect to get through waiting at the gate for my plane to finally leave the airport?

Answering this question is tricky. It’s not like I can track an airport to see one late flight, then the next, and infer that the first really infected the second. What if there was a storm that made both planes late? Using this dataset, it would be difficult to figure out if and how a delayed flight infects subsequent flights at a given airport. But what about the airport where the delayed airplane lands? Would that be any easier?

It’s not a perfect solution. For one thing, there are circumstances that can affect both airports (a nationwide security scare, for example). Also, oftentimes there are conditions at the destination airport that cause a delayed airplane to be delayed in the first place. In that case, the infection is really running in reverse. The truth is–familiar to many–that with observational data, one can never control all the levers that are driving the behavior of the data. I focus on the destination airport in the hope that by and large, delays at the two airports are unrelated. By looking at a lot of airports over a long period of time (6 months), I hope that potential confounders add only noise to the results–rather than systematic bias.

I also devise a strategy to deal with the reverse infection problem. If on a given day, an airport is experiencing delays before the arrival of a delayed flight, I exclude that airport from the analysis on that day. Having prior delays is an indication that the destination airport might be having issues that could cause delays–unrelated to the delayed airplane that I’m studying. By looking only at airports without prior delays, I keep the waters as calm as I can before the ‘splash’ of the delayed aircraft.
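The exclusion rule is simple enough to sketch in a few lines. Here's a minimal Python version with made-up field names (the actual analysis was done in R on the real dataset):

```python
# Sketch of the exclusion rule, with hypothetical field names: skip an
# airport-day if any flight scheduled to depart before the studied
# arrival was already delayed.

def airport_day_is_clean(departures, arrival_time):
    """departures: list of dicts with 'sched_dep' and 'delay_minutes'.
    True if no departure scheduled before `arrival_time` was delayed."""
    return all(
        d["delay_minutes"] <= 0
        for d in departures
        if d["sched_dep"] < arrival_time
    )

departures = [
    {"sched_dep": 9, "delay_minutes": 0},    # on time, before the arrival
    {"sched_dep": 11, "delay_minutes": 25},  # delayed, but after the arrival
]
print(airport_day_is_clean(departures, arrival_time=10))  # True: waters are calm
```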

Good Ol' Scatterplot

Having decided my approach, it was time to process (i.e. wrangle in R) the data into the form required to answer the question at hand. I’ll spare you the gory details and fast-forward to the interesting part–when I could finally put the variables I was interested in into a scatter plot.  When I did that, however, I was left scratching my head…

My data looked like a strange multi-armed monster. It seemed as though there were two (even three?) trends happening at the same time–one shooting horizontally out along the x-axis,  another strange unicorn horn sticking out diagonally, and then (maybe) another trend along the y-axis. My first move was to pore over my code to make sure I did everything correctly. As far as I could tell, the code was working fine. So I was left with a puzzle–how to account for this strange pattern? The explanation turned out to be rather simple. Can you guess the answer without scrolling down for the spoiler? If you can, let’s work together next time :)

Thankfully, the dataset includes a variable for the tail number of each aircraft. At first, I didn’t think I was going to use this variable, but after a lot of time staring at the unicorn horn, I thought why not give it a try. It turned out that all the data points in that diagonal trend occur when the two flights are the same plane. In retrospect, it seems rather obvious. And I realize I definitely have a bias in my thinking about airports. Namely, I always think of airports as busy places where tons of planes are landing and taking off all the time. Maybe it’s because I’m from Chicago, which laid claim to having the busiest airport for many years (damn you, Atlanta). It appears that quite often, however, a plane lands and the next plane to take off from that airport is. the. same. plane. Mind = blown.

So the flights that involved the same plane were clearly related, but what about the ones that weren’t? I don’t think they are. I didn’t do anything terribly rigorous to prove this, but I did run several simulations where I shuffled the x and y values of the scatter plot around at random. When two random variables are independent of each other, you can do this to their values and it shouldn’t change the appearance of the plot very much. I ran nine simulations using the data, which are displayed in the graph to the left. The one on the top-left is the original data–with the diagonal unicorn horn removed.
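The shuffle trick is easy to reproduce. Here's a minimal Python sketch on made-up data (the original simulations were done in R): permuting y destroys any real relationship, so if the correlation barely changes after shuffling, the variables were plausibly independent to begin with.

```python
import random

# If x and y are independent, randomly permuting y should not change
# the look (or correlation) of the scatter plot very much.
def corr(x, y):
    """Plain Pearson correlation, no libraries needed."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

random.seed(0)
x = list(range(100))
y = [2 * v + random.gauss(0, 5) for v in x]  # clearly dependent toy data

shuffled = y[:]
random.shuffle(shuffled)

print(round(corr(x, y), 2))         # near 1: strong relationship
print(round(corr(x, shuffled), 2))  # near 0: relationship destroyed
```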

While it didn’t seem that flight delays were contagious in general, there was a clear trend when consecutive flights involved the same aircraft. Now, this is quite obvious, but what is interesting is that we can comment on the nature of the relationship and try to estimate the size of the effect. In other words, for each minute your aircraft was delayed coming out of its prior airport, how long can you expect to wait at the gate?

From the graph below, we can see that the nature of the relationship is clearly linear. That is good news for air travelers. Why? You’d prefer a linear relationship as opposed to one that was parabolic–or even worse–exponential. At least with a linear relationship, you know that for each minute the aircraft is delayed, you’re paying a constant price.


Since the relationship is linear, I fit a linear model to estimate that price. I controlled for as many factors as I could using fixed effects for airport, day, carrier, and the difference in schedule between arrival and next departure (in 5 minute time blocks). The last effect was important to account for the variation in time between the two flights. Even when the next airplane scheduled to take off from an airport is the same airplane, there are still varying lengths of time until the plane is scheduled to take off again. The other fixed effects in the model can account for idiosyncratic behavior on a particular day across all airports, at a particular airport across all days, or for a particular carrier (across all days and airports). The model I fit included no interaction effects. To cut down on the risk for confounding, I only had one observation per airport per day. As a tradeoff, I couldn’t have interaction effects, because there was no additional variation within an airport on a given day.
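The fixed-effects idea can be illustrated with dummy variables. Here's a toy Python sketch on synthetic data (the real model was fit in R, with many more effects; airports, effect sizes, and the 0.9 slope below are invented for illustration):

```python
import numpy as np

# Dummy-code each airport and regress next-departure delay on arrival
# delay plus the dummies. Synthetic data: true slope 0.9, plus an
# airport-specific fixed effect.
rng = np.random.default_rng(42)
airports = ["ORD", "ATL", "SFO"]
effects = {"ORD": 12.0, "ATL": 8.0, "SFO": 3.0}

rows = []
for _ in range(300):
    ap = airports[rng.integers(len(airports))]
    arr_delay = rng.uniform(0, 120)
    dep_delay = 0.9 * arr_delay + effects[ap] + rng.normal(0, 2)
    rows.append((ap, arr_delay, dep_delay))

# Design matrix: [arrival delay | one dummy column per airport]
X = np.array([[r[1]] + [1.0 if r[0] == a else 0.0 for a in airports]
              for r in rows])
y = np.array([r[2] for r in rows])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(coef[0], 2))  # slope estimate, close to the true 0.9
```

The dummy columns soak up each airport's idiosyncratic baseline delay, so the slope on arrival delay is estimated from within-airport variation only.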

In the end, the linear model estimated a significant effect for the variable of interest with a coefficient equal to 0.89. This again is good news for air travelers. Granted, it would be better if there was no effect (or better yet a negative effect) on the delay of the next flight. But as it stands, it’s good that the coefficient is less than 1. That means that for every minute your airplane is late, if your plane is the next plane scheduled to leave the airport, you can expect to be delayed less than a minute (53.4 seconds, to be exact). Could be worse!

One disclaimer to all of this–which I haven’t investigated further–is how generalizable these findings are to all airports. As I noted, this linear relationship holds when the next aircraft to take off at an airport is the same as the delayed plane that just landed. One might think this tends to happen more at smaller, less busy airports. A further question, then, is the extent to which intermediate flights (between the arrival and departure of the identical aircraft) might affect the subsequent delay.

If it ends up being longer than 53.4 seconds/minute, I hope your airport has plenty of power outlets.

by dgreis at February 24, 2014 09:26 PM

February 21, 2014


More Metadata Muddles on Google Books

Mark's discovery of a mistitled Google Books entry—a book on experimental theater filed as a 2009 book on management—is entertaining but not that unusual. Like the other metadata mixups at Google Books (involving authorship, genre classification and publication date, among other things) that I enumerated in a 2009 post "Google Books: A Metadata Train Wreck," there are probably thousands of cases in which the metadata for one book is associated with an entirely different work. Or at least that's what induction suggests; Paul Duguid and I have happened on quite a number of these, some as inadvertently comical as Mark's example. Clicking on the entry for a book called Tudor Historical Thought turns up the text of a book on tattoo culture, the entry for an 1832 work on the question of whether the clergy of the Church of England can receive tithes turns up a work by Trotsky, the entry for Last Year at Marienbad turns up the text of Sam Pickering's Letters to a Teacher, and so on (see more examples below the fold). What's particularly interesting about Mark's example, though, is that the work is similarly misidentified on Amazon and AbeBooks, which indicates that for many modern titles, at least, the error is likely due not to "some (perhaps algorithmic) drudge on the Google assembly line," as Mark suggests, but to one of the third-party offshore cataloguers on which Google and others rely for their metadata.


by Geoff Nunberg at February 21, 2014 09:27 PM

Ph.D. student

Data Science: It Gets the Truth!

What follows is the first draft of the introduction to my upcoming book, Data Science: It Gets the Truth! This book will be the popularized version of my dissertation, based on my experiences at the School of Information and UC Berkeley. I’m really curious to know what you think!

There are two kinds of scientists in the world: the truth haters, and the truth getters.

You can tell who is a truth hater by asking them: “With your work, are you trying to find something that’s true?”

A truth hater will tell you that there is no such thing as truth, or that the idea of truth is a problematic bourgeois masculinist social construct, or that truth is relative and so no, not exactly, they probably don’t mean the same thing as you do when you say ‘truth’.

Obviously, these people hate the truth. Hence, “truth haters.”

Then there are the truth getters. You ask a truth getter whether they are trying to discover the truth, and they will say “Hell yeah!” Or, more simply, “yes, that is correct.”

Truth getters love the truth. The truth is great; it’s the point of science. They get that. Hence, “truth getters.”

We are at an amazing, unique time in history. Here, at the dawn of the 21st century, we have very powerful computers and extraordinary networks of communication like never before. This means science is going through some unprecedented changes.

One of those changes is that scientists are realizing that they’ve been fighting about nothing for a long time. Scientists used to think they had to be different from each other in order to study different things. But now, we know that there is only one good way to study anything, and that is machine learning. Soon, all scientists are going to be data scientists, because science is discovering that all things can be represented as data and studied with machine learning.

Well, not all scientists. I should be more precise. I was just talking about the truth getters. Because machine learning is how we can discover the truth about everything, and truth getters get that.

Truth haters, on the other hand, hate how good machine learning is at discovering the truth about everything. Silly truth haters! One day, they will get their funding cut.

In this book, Data Science: It Gets the Truth! you will learn how you too can be a data scientist and learn the truth about things. Get it? Great! Let’s go!

by Sebastian Benthall at February 21, 2014 03:36 PM

February 20, 2014

Ph.D. alumna

Can someone explain WhatsApp’s valuation to me?

Unless you were off the internet yesterday, it’s old news that WhatsApp was purchased by Facebook for a gobsmacking $16B + $3B in employee payouts. And the founder got a board seat. I’ve been mulling over this since the news came out and I can’t get past my initial reaction: WTF?

Messaging apps are *huge* and there’s little doubt that WhatsApp is the premier player in this scene. Other services – GroupMe, Kik, WeChat, Line, Viber – still have huge user numbers, but nothing like WhatsApp (although some of them have even more sophisticated use cases). 450M users and growing is no joke. And I have no doubt that WhatsApp will continue on its meteoric rise, although, as Facebook knows all too well, there are only so many people on the planet and only so many of them have technology in their pockets (even if it’s a larger number than those who have bulky sized computers).

Unlike other social media genres, messaging apps emerged in response to the pure stupidity and selfishness of another genre: carrier-driven SMS. These messaging apps solve four very real problems:

  • Carriers charge a stupidly high price for text messaging (especially photo shares) and haven’t meaningfully lowered that rate in years.

  • Carriers gouge customers who want to send texts across international borders.
  • Carriers often require special packages for sending group messages and don’t inform their customers when they didn’t receive a group message.
  • Carriers have never bothered innovating around this cash cow of theirs.

So props to companies building messaging apps for seeing an opportunity to route around carrier stupidity.

I also get why Facebook would want to buy WhatsApp. They want to be the company through which consumers send all social messages, all images, all chats, etc. They want to be the central social graph. And they’ve never managed to get people as passionate about communicating through their phone app as other apps, particularly in the US. So good on them for buying Instagram and allowing its trajectory to continue skyrocketing. That acquisition made sense to me, even if the price was high, because the investment in a photo sharing app based on a stream and a social graph and mechanism for getting feedback is huge. People don’t want to lose those comments, likes, and photos.

But I must be stupid because I just can’t add up the numbers to understand the valuation of WhatsApp. The personal investment in the app isn’t nearly as high. The photos get downloaded to your phone, the historical chats don’t necessarily need to stick around (and disappear entirely if a child accidentally hard resets your phone as I learned last week). The monetization play of $.99/year after the first year is a good thing and not too onerous for most users (although I’d be curious what kind of app switching happens then for the younger set or folks from more impoverished regions). But that doesn’t add up to $19B + a board seat. I don’t see how advertising would work without driving out users to a different service. Sure, there are some e-commerce plays that would be interesting and that other services have been experimenting with. But is that enough? Or is the plan to make a play that guarantees that no VC will invest in any competitors so that all of those companies wither and die while WhatsApp sits by patiently and then makes a move when it’s clearly the only one left standing? And if that’s the play, then what about the carriers? When will they wake up and think for 5 seconds about how their greed is eroding one of their cash cows?

What am I missing? There has to be more to this play than I’m seeing. Or is Facebook just that desperate?

(Originally posted at LinkedIn. More comments there.)

by zephoria at February 20, 2014 01:50 PM

MIMS 2004

Who says kids don't value privacy? And who says they won't pay for it? WhatsApp and Privacy

One of the interesting elements for me here is that kids were okay giving WhatsApp their data, at least for now, knowing there would be no ads, because it created "parent privacy" through the app and reduced their costs for sending TXT messages through the telcos.

I pay $20 a month for a flat rate of unlimited TXT msgs, SMS, *and* unlimited free cell-to-cell calls. I did it for the calls, which are otherwise 10 cents anytime during the day. I moved my plan from the 4th-highest minutes to the lowest, because almost all my calls are to other cells. However, because I went from 500 texts (and 25 cents for each additional) to unlimited, I now use about 2k texts. But every text is listed, with time, date, and phone number, on my bill, and that's easily sortable online if you log into the cell company's website. And my telco and many other apps have access to those messages. Parents that want to track their kids just sort the calls, track the times, etc. Kids are paying $1 both to stop any additional costs for texting, and to stop the tracking. I think this is a very interesting development.

What data does WhatsApp see in your phone? Your phone has more intimate data about you than Facebook, in many ways, because it's implicit, not explicit. WhatsApp doesn't need you to tell them your favorite movies or where you live; they know through the discussions, and they know your real friends list based upon the contacts and activity in your phone.
Here is the list of the data you agree to give WhatsApp for an Android install:

  • Your SMS messages
  • Storage: contents of your USB storage
  • System tools: all shortcuts, plus modify shortcuts (including installing and uninstalling them)
  • Your location: AGPS and GPS
  • Microphone: record audio
  • Camera: take pictures and video, see your photos and video
  • Your application information: retrieve any running app, find all apps
  • Your personal information: read your own contact card
  • Your accounts: add or remove accounts, create accounts and set passwords, use accounts on the device
  • Network communications: connect and disconnect from wi-fi, full network access
  • Phone calls: direct call phone numbers, read phone status and identity of phone
  • Your social information: modify your contacts, read your contacts
  • Sync settings: read sync settings, read sync statistics, toggle sync on and off
  • System tools (second listing): modify system settings, test access to protected storage
  • Affects battery: control vibration, prevent phone from sleeping
  • Your applications information (second listing): run at startup
  • Network communications (second listing): Google Play billing service, receive data from Internet, view wi-fi connections, view network connections
  • Your accounts (second listing): find accounts on device, read Google service configuration

That's a lot of info. I would argue that this is more personal information than what you post voluntarily on FB. But I think the kids were looking for parent-privacy, not privacy from telcos, the government, or data aggregators, mostly. And WhatsApp gives it to them, and reduces the cost of text messaging on the phone to $1 a year. Brilliant, and worth every penny of the $16-19b Facebook paid. WhatsApp is reported to have 450m active users; divide that into $19b and you get about $42 a user, or about $36 a user at $16b. When Flickr was bought, Yahoo paid $111 a user, with revenue of $25 a person x 60,000 paid users. Myspace was $36. Instagram was $28. Skype was a whopping $264. See more at Statista.
I don't know how many paid users WhatsApp has, but the service is free the first year, then $.99 a year after that. I suspect we'll find out how many at the next quarterly call Facebook has, because I can't find anything with that number out there now. But WhatsApp sold for an amount that is comparable for a "consumer" service. And reasonable, even if $19b is a mind-blowing number in the scheme of things.
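For what it's worth, the per-user arithmetic above is easy to redo; a back-of-the-envelope Python check using the figures reported at the time ($19b total, ~450m active users):

```python
# Per-user price implied by the deal, using the reported figures.
deal_price = 19e9
active_users = 450e6

print(round(deal_price / active_users, 2))  # 42.22 dollars per active user
print(round(16e9 / active_users, 2))        # 35.56 using the $16B figure alone
```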

February 20, 2014 06:20 AM

February 14, 2014

Ph.D. student

Kolmogorov Complexity is fucking brilliant

Dear Everybody Who Thinks,

Have you ever heard of Kolmogorov Complexity?

It is one of the most amazing concepts ever invented. I shit you not.

But first: if you think about it, we are all basically becoming part of one big text, right? Like, that is what we mean when we say “Big Data”. It’s that we are rapidly turning everything into a string of letters or numbers (they are roughly the same) which we then reinterpret.

Sometimes those strings are people’s creation/expression. Some of them are sensed from nature.

We interpret them by processing them through a series of filters, or lenses. Or, algorithms. Or what some might call “perceptual concepts.” They result in taking action. You could call it cybernetic action. Just as the perception of pain causes through the nerves a reflexive leap away from the stimulus, the perception of spam through an algorithmic classifier can trigger an algorithmic parry.

Turns out, if everything is text that we process with computers, there’s a really amazing metric of complexity that applies to strings of characters. That is super. Because, let’s face it, most of the time when people talk about “complexity” they aren’t using the term very precisely at all. That is too bad, because everything is really complicated, but some things are more complicated than others. It could be useful–maybe absurdly useful–to have a good way to talk about how differently complicated things are.

There is a good way. A good way called Kolmogorov Complexity. This is the definition of it provided by Wolfram Mathworld:

The complexity of a pattern parameterized as the shortest algorithm required to reproduce it. Also known as bit complexity.

I am–and this is totally a distraction but whatever–totally flummoxed as to how this definition could ever have been written. Because half of it, the first sentence, is the best humanese definition I’ve ever heard, maybe, until I reread it and realized I don’t really get the sense in which they are using “parameterized.” Do you?

But essentially, a pattern’s (or text’s) Kolmogorov complexity is the length of the shortest algorithm that produces it. Length, here, as in length of that algorithm represented as another text–i.e., in software code. (Of course you need a reader/observer/language/interpreter that can understand the code and react by writing/showing/speaking the pattern, which is where it gets complicated, but…) Where was I? (Ever totally space out come back to wondering whether or not you closed a parenthesis?)

No seriously, where was I?

Ok, so here’s the thing–a pattern’s Kolmogorov complexity is always less than or equal to its length plus a small constant. So for example, this string:


is a string of 100 random characters and there is no good way to summarize it. So, the Kolmogorov complexity of it is just over 100, or the length of this string:


If you know anything about programming at all (and I’m not assuming you do) you should know that the program print("Hello World!") returns as an output the string “Hello World!”. This is traditionally the first program you learn to write in any language. If you have never written a “Hello World!” program, I strongly recommend that you do. It should probably be considered a requirement for digital literacy. Here is a tutorial on how to do this for JavaScript, the language that runs in your web browser.

Now take for example this totally different string of the same length as the first one:


You might notice something about this string of numbers. Can you guess what it is?

If you guessed, “It’s really simple,” you’re right! That is the correct intuition.

A way to formalize this intuition is to talk about the shortest length of its description. You can describe it in a lot fewer than 100 characters. In English, you could write “A hundred zeros”, which is 15 characters. In the pseudocode language I’m making up for this blog post, you could write:


Which is pretty short.
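Kolmogorov complexity itself is uncomputable, but compressed size gives a cheap upper-bound flavor of the same idea: structure compresses, randomness doesn't. A quick Python sketch using zlib:

```python
import random
import zlib

# Compressed size is a crude stand-in for Kolmogorov complexity:
# the compressed output plus the decompressor is a program that
# reproduces the string.
random.seed(0)
noise = bytes(random.getrandbits(8) for _ in range(100))  # 100 random bytes
zeros = b"0" * 100                                        # "a hundred zeros"

print(len(zlib.compress(noise)))  # around 100 or more: no good summary exists
print(len(zlib.compress(zeros)))  # a dozen or so bytes: highly regular
```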

It turns out, understanding Kolmogorov complexity is like understanding one of the fundamental secrets of the universe. If you don’t believe me, ask the guy who invented Bitcoin, whose website links to the author of the best textbook on the subject. Side note: Isn’t it incredible that the person who doxed Satoshi Nakamoto did it through automated text analysis? Yes.

If you are unconvinced that Kolmogorov complexity is one of the most amazing ideas ever, I encourage you to look further into how it plays into computational learning theory through the Minimum Description Length principle, or read this helpful introduction to Solomonoff Induction, a mathematical theory of inductive inference that is as powerful as Bayesian updating.

by Sebastian Benthall at February 14, 2014 03:56 AM

February 12, 2014

Ph.D. student

thinking about computational social science

I’m facing a challenging paradox in how to approach my research.

On the one hand, we have the trend of increasing instrumentation of society. From quantified self to the Internet of things to Netflix clicks to the fully digitized archives of every newspaper, we have more data than we’ve ever had before to ask fundamental social scientific questions.

That should make it easier to research society and infer principles about how it works. But there is a long-standing counterpoint in the social sciences that claims that all social phenomena are sui generis and historically situated. If no social phenomenon generalizes, then it shouldn’t be possible to infer anything from the available data, no matter how much of it there is.

One view is that we should only be able to infer stuff that isn’t very interesting at all. One name for this view is “punctuated equilibrium.” The national borders of countries don’t move around…until they do. Regimes don’t change…until they do. It’s the ability to predict these kinds of political events that Philip Tetlock has called “expert political judgment.” The Good Judgment Project is a test to see what properties make a person or team of people good at this kind of task.

What now seems like many years ago I wrote a book review of Tetlock’s book. In that review, I pointed out a facet of Tetlock’s research I found most compelling but underdeveloped: that the best predictors he found were algorithmic predictors that drew their conclusions from linear regressions drawn from just the top three or so salient features in the data.
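That recipe (rank features by correlation with the outcome, keep the top few, fit a plain linear regression) is simple enough to sketch in Python on toy data; this is an illustration of the idea, not Tetlock's actual procedure or data:

```python
import numpy as np

# Synthetic data: 10 candidate features, but only features 0 and 2
# actually drive the outcome.
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(0, 0.5, size=200)

# Rank features by |correlation| with the outcome, keep the top three.
corrs = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
top3 = sorted(int(j) for j in np.argsort(corrs)[-3:])

# Ordinary least squares on just those salient features.
coef, *_ = np.linalg.lstsq(X[:, top3], y, rcond=None)
print(top3)  # the two informative features, 0 and 2, should appear
```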

Six or so years later, Big Data is a powerful enough industrial and political phenomenon that academic social science feels it needs to catch up. But to a large extent industrial data science is still about using pretty basic statistical models drawn from physics (that assume that everything stands in Gaussian relations to everything else, say), or otherwise applying a broad range of modeling techniques and aggregating them under statistical boosting. This is great for edging out the competition on selling ads.

But it tells us nothing about the underlying structure of what’s going on in society. And it’s possible that the fact that we haven’t done any better is really a condemnation of the whole process of social science in general. The data we are getting, rather than making us understand what’s going on around us better, is perhaps just proving to us that it’s a complex chaotic system. If so, the better we understand it, the more we will lose our confidence in our ability to predict it.

Historically, we’ve been through all this before. The mid-20th century saw the expansion of scope of Norbert Wiener’s cybernetics from electrical engineering of homeostatic machines to modeling of the political system and the economy as complex feedback systems. Indeed, cybernetics was intended as a theory of steering systems by thinking about their communications mechanisms. (Wikipedia: “The word “cybernetics” comes from the Greek word κυβερνητική (kyverni̱tikí̱, “government”), i.e. all that are pertinent to κυβερνώ (kyvernó̱), the latter meaning to “steer,” “navigate” or “govern,” hence κυβέρνησις (kyvérni̱sis, “government”) is the government while κυβερνήτης (kyverní̱ti̱s) is the governor or the captain.”) These models were on some level interesting and intuitive, even beautiful in their ambition. But they failed in their applications because social systems did not obey the kind of regularity that systems engineered for reliable equilibria did.

The difficulty with applying these theories that acknowledge the complexity of the social system to reality is that they are only explanatory in retrospect, because of the path dependence of history. That’s pretty close to rendering them pseudoscientific.

Nevertheless, there are countless pressing societal challenges–climate change, unfair crime laws, war, political crisis, public health policy–on which social scientific research must be brought to bear, because there is a dimension to them which is a problem of predicting social action.

It is possible (I wonder if it’s necessary) that there are laws–perhaps just local laws–of social activity. Most people certainly believe there are. Business strategy, for example, depends on so much theorizing about the market and the relationships between different companies and their products. If these laws exist, they must be operationalizable and discoverable in the data itself.

But there is the problem of the researcher’s effect on the system being observed and, even more confounding, the result of the researcher’s discovery on the system itself. When a social system becomes self-aware through a particular theoretical lens, it can change its behavior. (I’ve heard that Milton Friedman’s monetarist economics are fantastically predictive of economic growth in the United States right up until he published them.)

If reflexivity contributes to social entropy, then it’s not clear what the point of any social research agenda is.

The one exception I can think of is if an empirical principle of social organization is robust under social reflection. The goal would be to define an equilibrium state worth striving for, so that the society in question can accept it harmoniously as a norm.

This looks like relevant prior work–a lucky google hit.

by Sebastian Benthall at February 12, 2014 08:08 AM

February 11, 2014

MIMS 2012

Re-Designing Optimizely's Preview Tool

We just released a major addition to Optimizely’s preview tool: the ability to impersonate visitors. This lets our customers see content they’re targeting to different audiences. Designing and developing this was a larger undertaking than any of us expected. In this post, I am going to describe our design process and the design decisions we made along the way. My hope is this post will serve as a record of the work that went into it, as well as provide a glimpse into the black box of design.

Impersonate what? Preview who?

Optimizely’s Preview tool has been around for many years as a QA tool for our customers to view draft experiments on their live site. It’s proven itself useful, but until now it wasn’t possible to view experiments targeted to users whose conditions you didn’t match, such as people in another country. The new impersonation functionality lets you do just that.

Optimizely's redesigned preview tool

Optimizely’s newly redesigned preview tool

The problem

Our broad goal was to make it easy for our customers to personalize content to their audiences. Like any good design process, we started by asking questions and doing research. We sought to answer questions such as: “Do our customers even want to do this at all? Do they do this now? Why do they do this?” We began by looking at usage metrics, which taught us that a small number of people already target content to audiences.

But why do they do this? What are their difficulties? Why aren’t more people doing this? To answer these questions, we switched into qualitative mode and spoke with customers. After a few interviews, the most consistent and actionable insight we had was that the biggest pain point is managing these targeted experiments. Among other issues, it’s especially hard to preview and test content that’s served to different audiences.

A seedling of an idea

With our research in hand, we brainstormed a bunch of different ways of tackling this problem. The most obvious improvement that everyone agreed on was augmenting the preview tool with the ability to impersonate different users. Adding this to the preview tool was a natural fit, since it’s meant to be a QA tool. It doesn’t solve every problem, but it’s a good first step.

During the brainstorming sessions we sketched a (very) rough mockup on a whiteboard (see below). The functionality of the previous preview tool would remain the same, and we’d just add the ability to view your site as a different audience member.

First sketch of the new preview

The first sketch outlining the new impersonate functionality.

Better sketches

After we had a basic idea of how this would work, it became my job to fully design the feature. This meant thinking through all the user interactions, layout, and visual design. My first step was to sketch more detailed interfaces. The original sketch relied on an overly simplistic assumption of how impersonation would work: users would select an audience to view the page as. Unfortunately, Optimizely doesn’t (yet) have the concept of an “audience” in the product. Since an audience is basically a collection of targeting criteria (location, cookies, browser, etc.), impersonating a visitor morphed into setting values for targeting attributes. These values are evaluated against each experiment’s targeting conditions, and the experiments that match are displayed on the page.

This is a pretty major change from the original concept. Instead of a simple dropdown to select a predefined audience, a person would have to set targeting attribute values to define their audience (for example, set browser to “Firefox” and location to “Bulgaria”). This change made the interface and user interactions significantly more complicated than the original sketch.
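A hypothetical sketch of that evaluation step in Python (invented names and data shapes, not Optimizely's actual code): the impersonated attribute values are checked against each experiment's targeting conditions, and only matching experiments run.

```python
# Toy model of the matching step. An experiment matches when every
# condition it sets is satisfied by the impersonated attribute values.
def matching_experiments(attrs, experiments):
    """attrs: dict like {'browser': 'Firefox', 'location': 'Bulgaria'}."""
    return [e["name"] for e in experiments
            if all(attrs.get(k) == v for k, v in e["conditions"].items())]

experiments = [
    {"name": "bg-promo", "conditions": {"location": "Bulgaria"}},
    {"name": "ff-test", "conditions": {"browser": "Firefox", "location": "Bulgaria"}},
    {"name": "us-only", "conditions": {"location": "US"}},
]
print(matching_experiments({"browser": "Firefox", "location": "Bulgaria"}, experiments))
# → ['bg-promo', 'ff-test']
```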

As a result, my early wireframes only focused on how people would set their targeting values. I threw down a bunch of different ways of doing this on a big sheet of poster paper (see picture below). My goal here was to figure out where this new UI would go, and how people would interact with it. It wasn’t meant to think through every corner of the tool. It simply got me to the point where I felt confident putting impersonation in a sidebar alongside the other panes. The form didn’t need to be very wide, and the different targeting attributes fit pretty easily in this constrained space.

More detailed preview sketches

More refined sketches of the new preview tool.

Interactive mockups

Now that I felt comfortable with the basic layout of the new preview, I moved into high fidelity interactive mockups. At this stage it was important for me to get my ideas into a form that I could actually interact with. It’s one thing to look at a static sketch, but it’s completely different to actually click through and interact with it. I could have sketched more, but I couldn’t be sure of my design decisions until I could use them.

My goal with the interactive mockup was to decide the major user interactions that shape how the product works. This would essentially act as a skeleton, to which I’d add the “skin” of visual design later. To achieve this, I coded a static HTML page that would look and act like the final product, but not actually do anything. I decided to build an HTML prototype because:

  • It’s quick and easy to make changes (for me),
  • It runs in the browser, the medium through which actual users will interact with it, and
  • Large portions of the HTML, CSS, and JS can be re-used in the final version.

While working on the interactive mockup, I proceeded in a very iterative fashion. I would identify the worst part of the UI, work on it until it was no longer the worst part of the UI, and then move on to the new worst part. Eventually, the original worst part would be the worst again, and I would improve it more. Working in this manner, nothing is ever “done”, and I’m continuously in the mindset of asking, “what can be better?”.

The other advantage of this approach is that it lets me sit with a portion of the UI in an imperfect state for a while. This lets me focus my attention elsewhere, which is helpful because it’s easy to get bogged down in details and lose objectivity. I’ve noticed that by focusing on another part of the UI, problems with what I was working on naturally pop out. And just as often, some detail I was sweating becomes a distant memory. A design decision ends up being more minor than I thought, or the solution just works in the overall context of the UI.

Although my process consisted of lots of iterations on every part of the interface, the rest of this post will focus on each part of the UI in isolation, as if I worked on each section straight from start to finish (even though I didn’t).

Warning: this post isn’t even halfway done, so if you don’t care to read through a bunch of design decisions, I recommend you skip through my work-in-progress screenshots all the way to the end.

Where to put the new impersonation feature?

The worst part that first jumped out at me was the location of the impersonation feature — it was too front and center. This is the new feature we were building, so naturally my sketches focused primarily on that. But in doing so, I lost sight of the larger picture: users will be coming from the editor to view a draft variation on their live site. Impersonating visitors is a secondary (maybe even tertiary) activity. And today, most users won’t find it useful (from our previous research, we knew only about 12% of running experiments are targeted).

My first attempt to solve this problem was to hide the impersonation sidebar by default. To open it, users would need to click a “targeting” icon. At the core of it, this approach worked. It got the impersonation pane out of people’s faces, while still being just a click away. But it worked too well. The icon was too cryptic, and overall the impersonate functionality was too hidden for anyone to find.

Screenshot of another early interactive mockup

This early screenshot shows how prominent the new impersonation feature was. It also shows one possible solution to this problem — toggling it by clicking a “targeting” icon (the cross hairs).

It also had another, more subtle problem. Because the pane sat next to the experiments list, it implied a strong relationship between the two. For a long time, I thought this relationship was important. After all, changing targeting values affects which experiments are visible. And the targeting conditions of your experiments influence the custom visitor you’re trying to impersonate. Therefore, experiments must be next to the impersonation attributes!

For a long time, that reasoning made sense to me and the larger group. I even tried to get cute and show a warning icon next to experiments that would not be visible after loading the page as the custom visitor (with an option to pre-emptively ignore the targeting conditions — see screenshot above. Foreshadowing: I recycle this idea later). But after using this UI for a while, I realized the relationship between these two is not as important as I originally thought. In general, having the list of experiments next to the impersonation UI (with icons turning on and off) was actually more distracting than helpful. It was complex and clever in a way that wasn’t benefiting users.

So I decided to simplify things by putting the impersonation feature in its own tab. This proved to be a good decision that solved a lot of problems beyond this immediate one. For one thing, it kept the user’s focus on setting targeting values, rather than splitting their attention between targeting values and experiments. And putting the UI behind a tab provided a label in the UI that hinted at its functionality (instead of a cryptic icon). Next, it allowed for some extra room to provide in-context help and an explanation of this new feature’s use. And finally, it provided a place to communicate when a person was in “impersonation mode”. I could use the tab for double duty by updating the text to say when a person was viewing the page as themselves, or impersonating a custom visitor.

Custom visitor attributes

The next issue to solve was how to set custom visitor attributes. I saw two major branches for this UI: have all of them available to set from the beginning; or, start with a blank slate and choose the attributes to set from a list. My sketches had potential solutions, but this was a question I could only answer by interacting with each interface.

There were advantages to each method. Having everything already in the UI laid all the cards on the table: all 11 possible attributes were already there, and you could see the fields you’d need to fill in before interacting with anything. You just had to set the ones relevant for the custom visitor you wanted to impersonate. But it had some serious drawbacks: it looked visually cluttered and overwhelming; some attributes would be used a lot, while others would barely be used at all; and not all users understand what all the attributes are (e.g. custom tags? query parameters?).

Screenshot of an early interactive mockup

This early interactive mockup shows one attempt at placing all the targeting attributes in the UI from the start.

The other branch — inserting attributes by selecting them from a list — was also a mixed bag. The advantages are that the interface starts with a blank slate, which puts the user in control of what’s in the UI. It also doesn’t look overwhelming, and is less distracting (since only the attributes they want to see are present). The interaction is also split into two simple, discrete steps: first, choose the attribute you want to set; and second, set its value. But this is more clicks (at least 2, versus 1). And starting with a blank slate means a person must figure out how to insert the attribute(s) they want (since they’re not already on the screen).

Then I went in circles for weeks trying out different variations of each branch. Nothing felt quite right, and I couldn’t find a creative way to minimize the disadvantages of either idea. At this point I was pretty deep in the weeds, and my usual technique of focusing on another aspect of the UI for a while wasn’t yielding any insights. So I decided I needed a fresh perspective. Using the mockup below, I ran some guerrilla usability studies with two of our salespeople to see how they reacted to the UI. The results were pretty clear: they found it intimidating. One of them said, “it looks developer-y”. And in fact, my inspiration for this version came from the Chrome Developer Tools.

Screenshot of another early interactive mockup

This screenshot illustrates one possible method of setting custom visitor attributes – have all of them available in the UI from the start.

Getting qualitative data at this point was immensely valuable. Additionally, I thought back to some of the quantitative data we had gathered, and knew that most targeted experiments use fewer than 4 targeting criteria. This means most people will only need to set between 1 and 3 attributes when impersonating a visitor. From this I felt confident that inserting attributes from a list was the best approach. And after using the version pictured below for a few days, it felt about right. (But as you can see, it still needed a lot of refinement before going live).

Almost final screenshot of adding custom visitor attributes

This version of adding custom visitor attributes is close to the final version, but still needed a lot of refinement.

Experiment activation

The original preview had “Activate” links next to each of the inactive experiments in the experiment list, which was a big source of confusion. What does “activate” mean? Does clicking it mean the experiment is now “Running”? Is it a permanent change? In reality, it just makes it visible on the page for the duration of the preview session. I knew this interaction had to improve, but it required more than just a simple name change and better visual design (both of which were also necessary).

To find a solution, I started with two key questions: first, why do people click that button; and second, why do experiments need to be activated at all? I started with the second question, which turned out to be more complicated than I expected. An experiment may need to be activated for the following reasons:

  1. It’s running on a different page (the most obvious reason, but also the least likely scenario in which a person would want to make it visible)
  2. It’s paused or draft (second most obvious, and also the most likely reason for a person to make it visible)
  3. It’s manually activated (i.e. a piece of JavaScript on the user’s site triggers the experiment to run via our JavaScript API)
  4. The user doesn’t match the experiment’s targeting conditions

To make things more confusing, an experiment may not be visible for all of these reasons, or any combination of them. For example, a paused experiment running on this page might also fail the targeting conditions, making it not run for two separate reasons. When clicking “Activate”, which action is supposed to happen? Ignore the experiment’s status, or the targeting conditions, or both?
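To make the ambiguity concrete, here is an illustrative sketch (my own, not the actual product code) of how an experiment can be invisible for several independent reasons at once. The reason names and data shapes are assumptions for illustration.

```javascript
// Collect every independent reason an experiment isn't visible in the
// preview. A single "Activate" button can't tell which one to override.
function invisibilityReasons(experiment, page, visitorAttributes) {
  var reasons = [];
  if (experiment.url !== page.url) reasons.push('different-page');
  if (experiment.status === 'paused' || experiment.status === 'draft') {
    reasons.push('not-running');
  }
  if (experiment.manualActivation) reasons.push('manual-activation');
  var targeted = experiment.conditions.every(function (c) {
    return visitorAttributes[c.attribute] === c.value;
  });
  if (!targeted) reasons.push('fails-targeting');
  return reasons;
}

// A paused experiment on this page whose targeting also fails is invisible
// for two separate reasons — which one should "Activate" override?
invisibilityReasons(
  { url: '/home', status: 'paused', manualActivation: false,
    conditions: [{ attribute: 'browser', value: 'Firefox' }] },
  { url: '/home' },
  { browser: 'Chrome' }
);
// → ['not-running', 'fails-targeting']
```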

Initially I assumed anytime “Activate” was clicked, the experiment should be visible, regardless of the reason it’s not visible. After all, the user’s intent was to make the experiment visible, right? Well, after some discussions with the larger group, the answer was “maybe” for the old UI, but “probably not” for the new one.

To understand why, we need to start with the goal of the new preview tool: give people impersonation capabilities, so they can preview targeted experiments. Put another way, it’s a testing tool to help you be confident that your targeting conditions are set correctly, and that your end-users are seeing the proper content. Viewed in this light, an experiment that’s not visible because it fails the targeting criteria becomes a lot more important. For example, let’s say you have a draft experiment that is also targeted to an audience. While impersonating a visitor, you want to make sure this experiment’s targeting conditions are set properly. If clicking “Activate” were to override everything and make the experiment visible, there would be no way to know if you actually met the targeting conditions.

At this point there was clear motivation for separating out the actions, but there were still a lot of reasons an experiment might not be visible. Luckily, having a well-defined use case helped guide me. Rather than have buttons for every possibility, I could group the user’s intent into two categories: override targeting conditions (i.e. targeted URL and visitor targeting criteria), and everything else (e.g. ignore a paused experiment’s status). This meant I only needed two buttons.
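That two-category grouping can be sketched as a simple mapping from "not visible" reasons to the override action each one needs. Again, this is my own illustration with assumed names, not product code.

```javascript
// Illustrative grouping of the many "not visible" reasons into just
// two override actions: one for targeting, one for everything else.
function requiredActions(reasons) {
  var actions = [];
  // Targeting-related reasons (wrong URL, failed visitor targeting)
  // share one override…
  if (reasons.indexOf('wrong-url') !== -1 ||
      reasons.indexOf('fails-targeting') !== -1) {
    actions.push('ignore-targeting-conditions');
  }
  // …and status-related reasons (paused/draft, manual activation)
  // share the other.
  if (reasons.indexOf('not-running') !== -1 ||
      reasons.indexOf('manual-activation') !== -1) {
    actions.push('stage-or-activate');
  }
  return actions;
}

// A paused experiment that also fails targeting needs both overrides.
requiredActions(['not-running', 'fails-targeting']);
// → ['ignore-targeting-conditions', 'stage-or-activate']
```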

Button labels

Next, I had to figure out what to label the buttons, and where to put them. The labels took a lot of discussion, and we ended up with three verbs to use (for only two buttons). The first was to use “Activate” to view manually activated experiments. Ironically, this is the same as the old UI, which I said needed to change, but in this case it’s consistent with the way we talk about manually activated experiments throughout the product (and is, in fact, the reason the original UI used “Activate” in the first place).

The action to view paused and draft experiments was more difficult. It couldn’t conflict with existing product terminology (e.g. running or active), but it needed to indicate that you’re taking a temporary, reversible action. We felt “Stage” hit both of those requirements.

For the second button, my mockups used “Ignore targeting conditions” since I needed something for development purposes, and in the end that’s what we shipped. Everyone agreed it was clear, and we felt there was no reason to overthink it. Usability studies further confirmed that these label choices made sense.

Button placement

Now the only issue remaining was where to put the buttons. An obvious approach is to put them next to each experiment in the list (as the old UI did). I didn’t like this for two reasons: first, it’s visually heavy and repetitive (the same button is repeated again and again); and second, it’s not a common enough activity to have them always present (remember, the core use case is still coming from the editor to view the variation you’re working on).

I decided to hide them in a popover that appears when you click the name of an experiment. I tried a similar idea earlier in my iterations (remember the warning icon popover?), and liked the concept of only exposing secondary functionality when the user asks for it. It also provided another advantage: a place to put contextual information about the experiment. I could state the experiment status (paused, running, draft), and describe why an experiment was visible (or not).

Finding a solution that solves multiple problems always feels good. This is also a good example of how a scrapped idea can still be valuable. Rather than being a waste of time, it came back later to inspire me.

But it wasn’t perfect yet. My first iteration had both buttons in the popover, leaving it up to the user to decide which action they wanted (activate/stage an experiment, or ignore targeting conditions). But I could tell immediately that having two buttons of equal importance was confusing. As Steve Krug says, “Don’t make me think”, and this definitely made the user think.

In thinking through this problem, I knew that not every experiment required both buttons (e.g. a paused, un-targeted experiment will never need to ignore targeting conditions). So for the complex case where an experiment needs both buttons to make it visible, I decided to make the actions sequential. First, the popover explains the experiment isn’t visible because it’s paused or draft, and to make it visible you must click “Stage”. Then, if it’s still not visible, the popover informs you why, and provides you an override button.

Although this is technically more work (i.e. more clicks), it’s less of a cognitive burden on the user. They are presented with one decision at a time. And they don’t need to figure out what’s happening – the system tells them.

After using this UI for a while, an obvious improvement popped out at me: adding a label next to experiments that are staged/activated, or ignoring targeting conditions. This information was previously only available in the popover, which made it cumbersome to access. With labels, it’s easy to see at a glance which experiments were forced to be visible. I also added a label for the experiment being previewed from the editor, which is really nice for orienting the user when they first arrive in the preview tool.

Path field interaction

One of the last things I worked on was the URL path field. This field exists to give people the ability to continue impersonating visitors and previewing experiments on other pages of their site. We considered just letting a person click through their site and have the preview tool follow them around, but this behavior isn’t always desirable (e.g. what if they navigate to a new domain?). Plus, there were technical issues with implementing this. And since it isn’t a common action, we went with giving people the ability to type new URL paths, even though it’s not ideal.

So at a minimum I knew there had to be an input field with a button to load the page. And that’s exactly what it was during the prototyping phase and most of the implementation phase. But after a while it became clear to me that it was going to need a bit more polish. The main problem was that it was too squished between the tabs and the rightmost edge. If you’re typing a really long path, you won’t be able to see the whole thing at once, which can lead to errors. Plus, it just “feels” cramped to use.

My first attempt was to move it out of the header completely — give it a dedicated space to live and breathe. And since it’s not something that will be used frequently, it’s better to put it in a less prominent spot. Unfortunately, it didn’t make sense in any of the tabs. It’s a global action, unspecific to anything in the tabs. And putting it in its own dedicated tab was overkill since it would be the only field there.

So with that option off the table, I had to find a way to make it work in the header. The only reasonable approach I came up with was to keep it small and unobtrusive until someone started interacting with it. I experimented with a few different ways of doing this, but the one that felt best was hiding the domain name and keeping the field small until the user clicked into it, at which point the field expands, the tabs disappear, and the domain name pops out. This put all of the focus on the URL field, and provided a lot of room for editing the path.
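A rough sketch of that expand-on-focus behavior in CSS follows. The class names and sizes are my own illustration, not the shipped stylesheet; and `:focus-within` is a newer CSS convenience — in practice (certainly in 2014) the show/hide step would be driven by a few lines of JavaScript toggling a class on focus and blur.

```css
/* The path field stays small until focused… */
.path-field { width: 160px; transition: width 0.2s ease; }
/* …then expands to give room for editing long paths. */
.path-field:focus { width: 480px; }

/* While the field has focus, hide the tabs and reveal the full
   domain name so all attention is on the URL. */
.preview-header:focus-within .tabs { visibility: hidden; }
.preview-header:focus-within .domain { display: inline; }
```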

Animated gif of the path interaction

When clicking into the path field, the input expands and the tabs duck down out of the way.

Ship it!

If you made it this far, congratulations! What I just described only scratches the surface of all the design decisions that were made. In writing this post, a lot of little decisions and early, incorrect assumptions came back to me. But if I went any deeper this post would be a novel, so instead I focused on the most difficult aspects of the impersonation feature. Design is often seen as a black box, and as writing this post has shown me, that’s because documenting the thought process and decision making can be exhausting. It’s amazing how some little detail you obsessed over for days can be so quickly forgotten. So this post acts as a historical record to help remind myself and the larger team of all the work that went into this feature, in addition to providing a glimpse into the process of design.

by Jeff Zych at February 11, 2014 06:14 PM

February 10, 2014

Ph.D. student

on-line actors in political theater

It is recent news (sort of, in the limited sense that the Internet reporting on something that has happened on the Internet is “news”) that Justine Tunney, a former Occupy leader who is now a Google employee, has occupied the @OccupyWallSt twitter account and is using it to advocate for anarcho-capitalism. She seems to be adopting the California ideology hook, line, and sinker, right up to its solutionist limits.

It might be sincere advocacy. It might be performance art. She might be trolling. Hard to tell the difference nowadays.

No, wait, she’s definitely trolling:

My theory is that this is designed to draw attention to the failure of political theater, including the political theater that happens on-line. She uses Yudkowsky’s anti-politics manifesto “Politics is the Mind-Killer,” not to recommend defection from politics, but rather to recommend material political action over merely rhetorical action on public opinion through ‘slacktivist’ Facebook memes.

Personally, I think this is a damn good point for 5:30am. It appears to have been missed by the press coverage of the event. This is not surprising because it is actually politically disruptive to the press rather than just pretending to be disruptive in a way that benefits the press financially by driving hits.

She’s been openly grappling with the problem of the relationship between automation and economic inequality that Peter Norvig and others have been talking up lately. Like Norvig, she doesn’t really address how intellectual property is a source of the economic contradictions that make reconciling automation with equality impossible. I don’t know if that’s because this position is truly so esoteric that I’m the only person who thinks it or if it’s because of her professionalization as a Googler. Either way, a capitalist tech institution has indirectly coopted the dormant networking infrastructure of Occupy, which makes perfect sense because the networking infrastructure of Occupy was built by capitalist tech institutions in the first place, except for the open parts.

I’ve come to the conclusion that it’s inappropriate to consider either anarchist rhetoric or anarcho-capitalist rhetoric (or any other political rhetoric) as a totalizing political theory. Rather, every available political ideology is a dimensionality reduction of the actually existing complex sociopolitical system and expresses more than anything a subnetwork’s aspirations for power. Occupy was one expression of the powerless against the powerful. It “failed,” mainly because it wasn’t very powerful. But it did generate memes that could be coopted by other more powerful political agents. The 99% turned into the 47% and was deployed by the left-wing media and democratic party against the Republican party during the 2012 election. Now the 99% is something Googlers talk about addressing with engineering solutions.

The public frustration with Tunney over the weekend misses the point. Horkheimer and Adorno, in Dialectic of Enlightenment (1987), may shed light:

The view that the leveling and standardization of people in general, on the one hand, is matched, on the other, by a heightening of individuality in the so-called leader figures, in keeping with their power, is erroneous and itself a piece of ideology. The fascist masters of today are not so much supermen as functions of their own publicity apparatus, intersections of the identical reactions of countless people. If, in the psychology of the present-day masses, the leader no longer represents the father so much as the collective, monstrously enlarged projection of the impotent ego of each individual, then the leader figures do indeed correspond to what they represent. Not by accident do they resemble hairdressers, provincial actors, and gutter journalists. A part of their moral influence lies precisely in the fact that, while in themselves as powerless as all the rest, they embody on the latter’s behalf the whole abundance of power without being anything more than the blank spaces which power has happened to occupy. It is not so much that they are exempt from the decay of individuality as that decayed individuals triumph in them and are in some way rewarded for their decay. The leaders have become fully what they always were slightly throughout the bourgeois era, actors playing leaders.

To express a political view that is not based on the expression of impotent ego but rather on personal agency and praxis is to violate the internal logic of late capitalist political opinion control. So, duly, the institutions that perform this opinion control will skewer Tunney and discredit her based on alleged hypocrisy, instead of acknowledging that her hypocrisy–the internal contradictions between her rhetorical moves and actions in her career–is a necessity given the contradictions of the society she is in and the message she is trying to convey.

It’s instructive to contrast Tunney as a deliberate provocateur triggering an implosion of political discourse with the more sustained and powerful role of the on-line political micro-celebrity. “Micro-celebrity” was coined by Terri Senft in her 2001 study of camgirls–young women who deliberately strove for attention on-line by publishing their lives through video, still images, and blogs. Senft’s contributions are for some reason uncited in more contemporary on-line musings on microcelebrity.

Perhaps ironically given the history of the term, many of today’s prominent micro-celebrities are feminists who have staked out a position on the Internet as spokespeople for critiques of more institutionally connected feminism. Michelle Goldberg’s article in The Nation documents the developments of on-line feminism in much detail, describing how it has devolved into harsh and immature practices of language policing and ritualized indignation.

Online, however, intersectionality is overwhelmingly about chastisement and rooting out individual sin. Partly, says Cooper, this comes from academic feminism, steeped as it is in a postmodern culture of critique that emphasizes the power relations embedded in language. “We actually have come to believe that how we talk about things is the best indicator of our politics,” she notes. An elaborate series of norms and rules has evolved out of that belief, generally unknown to the uninitiated, who are nevertheless hammered if they unwittingly violate them. Often, these rules began as useful insights into the way rhetorical power works but, says Cross, “have metamorphosed into something much more rigid and inflexible.” One such rule is a prohibition on what’s called “tone policing.” An insight into the way marginalized people are punished for their anger has turned into an imperative “that you can never question the efficacy of anger, especially when voiced by a person from a marginalized background.”

Similarly, there’s a norm that intention doesn’t matter—indeed, if you offend someone and then try to explain that you were misunderstood, this is seen as compounding the original injury. Again, there’s a significant insight here: people often behave in bigoted ways without meaning to, and their benign intention doesn’t make the prejudice less painful for those subjected to it. However, “that became a rule where you say intentions never matter; there is no added value to understanding the intentions of the speaker,” Cross says.

I’d argue that this behavior, which less sympathetic or politically correct observers might describe as carefucking, is the result of political identity and psychological issues being calcified by microcelebrity practice into a career move. A totalizing political ideology (in this case a vulgarized version of academic feminism) becomes a collective expression for libidinous utopianism. The collective raise a few spokespeople to the level of “decayed individuals.” As the self-branding practices of micro-celebrity transform these people from citizens to corporations, they become unable to engage authentically or credibly in public discourse, because the personal growth that comes inevitably from true discourse would destroy their brand.

Tragically, it is those that succeed at this gamified version of public discourse that eventually dominate it, which explains why in the face of racially motivated and populist anger, better situated feminist intellectuals are retreating from grassroots platforms into traditional venues of media power that are more regulated. Discussing Mikki Kendall, one of the people who started the hashtag #Solidarityisforwhitewomen, and Anna Holmes, founder of Jezebel, Goldberg writes:

The problem, as [Kendall] sees it, lies in mainstream white feminists’ expectations of how they deserve to be treated. “Feminism has a mammy problem, and mammy doesn’t live here anymore,” Kendall says. “I know The Help told you you was smart, you was important, you was special. The Help lied. You’re going to have to deal with anger, you’re going to have to deal with hurt.” And if it all gets to be too much? “Self-care comes into this. Sometimes you have to close the Internet.”

Few people are doing that, but they are disengaging from online feminism. Holmes, who left Jezebel in 2010 and is now a columnist for The New York Times Book Review, says she would never start a women’s website today. “Hell, no,” she says. The women’s blogosphere “feels like a much more insular, protective, brittle environment than it did before. It’s really depressing,” she adds. “It makes me think I got out at the right time.”

Sarah Kendzior has critiqued Goldberg, saying her article is an attack on Twitter as an activist platform by a more powerful feminist establishment that controls who gets a voice.

Social media is viewed by gatekeepers as simultaneously worthless and a serious threat. Balancing these opposing views requires a hypocrisy that can be facilitated only by the assurance of power.

Gatekeepers to mainstream feminist venues, like Jezebel founder Anna Holmes, proclaim that tweeting is not really activism. In contrast, the women behind hashtag activism argue that Twitter is one of the few outlets they have in a world that denies them opportunities.

Kendzior is right on about how dismissing your opposition as powerless while expressing anxiety about them as a threat is typical of powerful threatened people, and especially those whose power depends on their control of communication networks, brands, and audiences. In the culture industry, this is a matter of the bottom line. Competition for market share among, say, women on the Internet requires defining a kind of feminism that is both empowering and saleable as a commodity. This in turn relies on the commodification of female identity available as a collection of literature and talking points. There are of course different market segments and products. Lean In targets a particular demographic–upper middle class women in large tech corporations. Naturally, the feminist writers in the culture industry hate Lean In because it is a competing product that ultimately exposes their own vulnerability to the vicissitudes of the technology companies they depend on for distribution. By preventing women from achieving technical efficacy over their own lives through career success, they maintain their market of women who need vocalization of the unfairness that their immaterial affective labor is unrewarded by the market. By cultivating and promoting only those micro-celebrities who can be reliably counted on to attack their political and economic enemies, this “feminist” establishment reproduces itself as a cultural force.

Truly radical voices like Justine Tunney’s, which critique the cultural control of the media establishment without the dogged and unreflective monotony of micro-celebrity practice, will fail at being actors playing leaders in the political theater. That is because they are the kind of leaders who don’t want to be leaders but rather are making a call to action.

by Sebastian Benthall at February 10, 2014 01:26 AM

February 09, 2014

MIMS 2012

Response to CSS Regions Considered Harmful

Håkon Wium Lie, the father of CSS, wrote a great critique of the CSS Regions spec. His article pointed out many of the same flaws that I’ve noticed.

My biggest complaint is it requires empty, dummy HTML elements to flow content into. I can think of nothing more un-semantic than having to place empty elements in your document purely for presentational purposes. As Håkon says, “it’s not regions per se that are harmful to web semantics, it’s the fact that they are encoded as presentational HTML elements.” Exactly. HTML is supposed to provide a meaningful, semantic structure to the content. Empty elements provide no meaning or structure whatsoever.
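The pattern being objected to looks roughly like this, simplified from the spec’s examples (the flow name and class names here are my own; shipping browsers at the time required vendor prefixes on these properties):

```html
<article class="source">…the actual article content…</article>
<!-- Empty, purely presentational elements that exist only to
     receive the flowed content -->
<div class="region"></div>
<div class="region"></div>
```

```css
/* Remove the content from the normal flow into a named flow… */
.source { flow-into: article-flow; }
/* …and pour it into the empty boxes, one after another. */
.region { flow-from: article-flow; height: 12em; }
```

The `<div>`s carry no meaning at all; they are layout scaffolding living in the markup, which is exactly the separation-of-concerns violation at issue.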

Håkon goes on to say, “If we want regions on the web, we should find a way to write them in CSS and not in HTML.” Ideally, the spec would allow us to break text up into multiple regions through CSS alone. The spec, as it’s written today, breaks HTML and CSS’s separation of presentation and content.

A few people have proposed using pseudo-elements to overcome the empty-element issue. While this is a good idea, it only covers a limited number of use cases. Complex layouts, like the sample provided in the spec, are not possible with pseudo-elements alone.

Empty HTML elements are the result of another issue I have with the spec: it’s applying a desktop publishing solution to the problem of creating rich magazine layouts on the web. The spec is analogous to laying out a page in InDesign by creating text boxes and flowing content from one box to the next. This works great when you control the page size and typography, and can add and resize text boxes as necessary. Unfortunately, the web is more dynamic than this. There’s no given page size, and browsers render pages differently depending on a person’s type preferences and device. Many of the demos I’ve seen, such as this example from Sara Soueidan (pictured below), have a design that is tightly coupled to the amount of text, the typeface, and the font size. If the reader sets their type size to large, or the fallback font is loaded, this layout will not render as designed.

Complex CSS Regions example

This regions example presents a design that is tightly coupled to the typeface and font size.

Of course, as web developers part of our job is to find solutions to these problems. But I used this example to illustrate my larger point, which is that the spec awkwardly applies a desktop publishing solution to the web, the result of which is unsemantic HTML.

The web would benefit from more complex layouts, and I commend the goal of the proposal. Unfortunately, it encourages unsemantic markup, and tries to apply a solution from the static desktop publishing world to the dynamic web. We need a solution designed for the fluid nature of the web that maintains HTML and CSS’s separation of concerns.

by Jeff Zych at February 09, 2014 08:49 PM

MIMS 2004

Data Privacy Legal Hack-A-thon, Day 2: Projects

UPDATED: As we get down to the wire on presentations tonight at 5pm, the room is quiet and everyone is working hard. One of our judges, K. Waterman, is walking around, conversing with whoever has a minute. And we have settled into these project teams:

* Safe Sign-up: This will encrypt volunteer signups for events, especially protests, so that there is not one place that would have all the people at the event. Event organizers would have 5th Amendment protection for this information. By: Zaki Manian, Restore the Fourth, SF.
* Bring Your Own Chat: A secure zero-knowledge chat application using only Dropbox. The project can be found on Github. By: Daniel Roesler, Restore the Fourth, SF.
* Privacy Enhancing Toolkit: A toolkit for encrypted communications, file storage and sharing. By: Judi Clark & Jenny Fang.
* Visual Privacy Policy: Creating a culture of informed consent by visualizing privacy policies. By: Dan Garen, Puneet Kishor, Nick Doty, Lysle Buchbinder, Beth MacArthy, Herrick Fang, and yesterday, Nancy Frishberg.
* Bitcoin Privacy Documentation: Developing a framework for thinking about the privacy of financial transactions using Bitcoin. By: Alice Townes, Richard Down.
* Mobile Privacy Shield: Intercept and display all the async calls for websites using a Firefox add-on. By: @nyceane.

However, there is a chance that the Visual Privacy Policy group, which includes a browser extension, will split into two groups to present. Stay tuned. I'm working on a presentation for tonight's closing on the ON project and consent receipt: not to be judged, just to show the concept to the room.

February 09, 2014 07:28 PM

Data Privacy Legal Hack-A-thon, Day 1

We have five (5) projects going in San Francisco at the Data Privacy Legal Hackathon. After an initial round of introductions and discussion, teams broke out and are all quietly working away. We have three groups and two individuals working on projects. The largest group is interested in privacy icons, terms, and data policies. Its lead is building a privacy policy generator, plus icons representing what the resulting structured policy would mean, to make it easy for users to see what a privacy policy says and does to the user. After we talked a bit, he realized the value of the parts I'm working on, the Consent Map, the Consent Receipt, and the various tools to make those happen, like the API project for the map. We went over the whole ecosystem we all propose, and he sees the complementarity. Here is a diagram that shows some of the different products we discussed above. That group, though, is more interested in getting privacy policies structured and visualized than in the other side of the transaction, which would look at terms an individual would submit, like Do Not Track. However, they recognized the need for a consent receipt at the end of either side setting a term. There is also a bitcoin project for more private transactions and identity privacy (i.e., moving from taking things outside the financial networks, where you still have some kind of identity inside bitcoin, to taking things outside the identity systems in bitcoin). I don't totally understand it, but that's what they are talking about and trying to figure out. There is an https server project, and another individual project that I haven't yet discussed with its maker. I'm working on the consent receipt. Other groups will likely want to hook into the consent receipt when they have their pieces.

February 09, 2014 01:46 AM

February 06, 2014

Ph.D. alumna

“It’s Complicated” is dedicated to Peter Lyman

As people start to get copies of my book, I want to offer more context to the brief dedication at the front of the book.

“It’s Complicated” is dedicated to my beloved advisor, Peter Lyman, who passed away before I finished my degree at Berkeley. It was Peter who initially helped me conceive of this project and it was Peter who took my limited ethnographic training and helped me develop deep reflexive instincts. As much joy as it brings me to see this book born into the world, it also saddens me that Peter couldn’t get to see the project completed.

After I left MIT, my undergraduate mentor and friend Andy van Dam sent me to Peter, confident that we’d get along like a house on fire. I begrudgingly agreed to meet Peter but when I showed up in California, he had been sent to jury duty and was unable to make the meeting. So he invited me to his home where he was hosting a dinner for graduate students. There, I got to meet a kind, gentle soul who not only inspired me with his intellect but revealed the beauty and pleasure of nurturing junior scholars. I was sold.

Once at Berkeley, Peter and I devised all sorts of plots to collaborate and, more importantly, to play institutional good-cop, bad-cop. In my spastic, obnoxious way, I would throw a fit at some injustice, inevitably piss off someone, and then he’d intervene and negotiate a truce. When we realized how effective this could be, we started scheming.

Peter’s illness was devastating. Always a brilliant orator, his cancer ate away at his ability to communicate. Even his serene peacefulness was tried over and over again by the frustration he felt not being able to express himself. And, for all that he adored and supported his students, his love of his children – and sadness in not being able to get to know his grandchildren – were what really weighed on him.

It’s been over six years since the world lost an amazing man. Nothing that I can do can bring him back but I wrote this book in part to honor him and all that he taught me. Peter showed me that there’s more to being a scholar than producing important works. Always gracious and warm, Peter did more to cultivate others than to advance his own career. He always said that, at Berkeley, he was paid to attend meetings so that he could keep up his hobby of teaching. His brilliance emerged through those that he empowered. I can only hope to have as much impact on those around me as he did on me.

Every time I look at my book, I smile thinking about Peter’s influence on me, my work, and my sense of what it means to be a scholar and intellectual. We all build on the shoulders of giants but sometimes those giants aren’t so visible to others. Peter was a huge giant in my life and I hope that others reading this book can see his influence in it.

by zephoria at February 06, 2014 05:11 PM

January 28, 2014

Ph.D. alumna

blatant groveling: please buy my book

In less than a month, my new book – “It’s Complicated: The Social Lives of Networked Teens” – will be published.  This is the product of ten years’ worth of research into how social media has inflected American teen life.  If you’ve followed this blog for a while, you’ve seen me talk about these issues over the years. Well, this book is an attempt to synthesize all of that work into one tangible artifact.

Now I have a favor…. please consider pre-ordering a copy (or two <grin>).  Pre-sales and first week sales really matter in terms of getting people’s attention. And I’m really hoping to get people’s attention with this book. I’ve written it to be publicly accessible in the hopes that parents, educators, journalists, and policy makers will read it and reconsider their attitude towards technology and teen practices.  The book covers everything from addiction, bullying, and online safety to privacy, inequality, and the digital natives debate.

If you have the financial wherewithal to buy a copy, I’d be super grateful.  If you don’t, I *totally* understand.  Either way, I’d be super super super appreciative if you could help me get the word out about the book. I’m really hoping that this book will alter the public dialogue about teen use of social media.


You can pre-order it at:
Barnes & Noble

by zephoria at January 28, 2014 07:36 PM

MIMS 2004

The New American Radical: Upholding the Status Quo in Law (IE the Constitution)

So what does that mean... the Status Quo? What I mean by that is the body of law we count on, that we base everything on, already in place: the Constitution, the Bill of Rights (amendments 1-10) and the rest of the Constitutional Amendments. That status quo. And wanting to just maintain the Status Quo, to uphold and use it as our standard of law, as the basis for what we do in the US? Yea, supporting that is the New American Radical act amongst the New American Radicals (you can count me amongst them, as that's the system I signed up for... the one with the Constitution).

How can this be? Asking for such should be a traditionalist thing, leaving the radicals to ask for new amendments, change 'you can believe in,' yada yada, and other controversial innovations to the law. But no... it's a radical act in America these days to just ask that we uphold the Constitution, the Bill of Rights and the Amendments.

I realized this is true the other night, when I went to hear Daniel Ellsberg speak, along with Cindy Cohn of EFF, Shahid Buttar and Norman Solomon, with Bob Jaffe moderating. And yes, Ellsberg's an American Radical, but not just because he got the Pentagon Papers out 40 years ago. It's because he believes the Constitution, the Bill of Rights, and our other Amendments to be the rule of law. He had some very interesting things to share as well.

Ellsberg talked about how years ago, "Richard" Cheney (as he called him... I'm so used to "Dick") communicated a desire to change the Constitution because he thought it was wrong, and that it should be different. Ellsberg said that that's okay, but then you have to change things through the system. Instead, Cheney and Bush and others have been corrupt, because they got elected, swore an oath to "defend the Constitution of the United States against all enemies, foreign and domestic," but then subverted the rules they swore to uphold. (I knew they weren't honorable men, but I never thought about it in these terms.)
So in this case, they are the enemies, these corrupt parties who subvert the Constitution by taking "...your tax dollars, taken in secret, and spent in secret, to spy on everyone."

Ellsberg's example of a founding father who parallels the whistleblower/leaker of today is Nathan Hale, the man who was caught by the British and hanged in 1776 for trying to share information with his own countrymen, Americans, about what the British were doing. Hale's famous line is: "I only regret that I have but one life to give for my country." What if we hanged people like that today: the people who leaked the full breadth of what was happening at Abu Ghraib, instead of the public just seeing the sanitized, reduced version that claimed it was just a few isolated incidents, when in fact the torture at Abu Ghraib was huge and widespread and very shameful for us and our government? Or the Extraordinary Rendition program? Or Warrantless Wiretapping?

All these secretive activities changed when they became public. And they changed as a result of whistleblower-leakers sharing information the government didn't want to get out, with the exception of Congress legalizing warrantless wiretaps once that activity became public. And now things are changing again because of Edward Snowden and the NSA surveillance information he let out. Ellsberg said, "To have knowledge of every private communication, every location, every credit card charge, everything... to have one branch have power over the other two (executive, over legislative and judicial)... Snowden has confronted us with something that we could change.... But Obama is part of the problem. He just assures us that there is nothing to worry about. But who is to be trusted? The people who kept the secrets and lied to us? Diane Feinstein? Or do we trust Snowden? Snowden has done more to support the Constitution than any Senator, Congressman, the NSA..."

Ellsberg also talked about how when he was on trial, 40 years ago, he was out on bail and could speak freely with the press. Today, if Snowden were on trial, he'd be in a hole, like Chelsea Manning. We wouldn't hear his thoughts on the issues in the trial, because the government would stop it, in trial and outside. During Ellsberg's trial, his lawyer tried about five times to get motive into the questioning, but the prosecution kept objecting. Motive didn't matter, they said, and the judge agreed. The same thing would happen to Snowden, who would never be able to say, on the stand, why he did what he did.

Cindy Cohn, who has heroically been bringing lawsuit after lawsuit to stop some of these illegal practices, talked about how originally the FISA court started out approving targeted warrants, so at least they knew who was targeted. But things have devolved to where the FISA court is now presented with massively expanded, abstract warrants that don't even let the FISA court know who specifically is targeted. Smith v. Maryland, which ruled on the pen register method of unwarranted wiretapping of a single land line, "...doesn't even pass the giggle test" when applied to the massive surveillance we undergo now. In fact, she said that "technology is our friend, encryption is our friend." While major companies have been compromised, we need to develop technologies to help us, as much as we need to use legislative policy and the judicial system to fix this. Even companies, five large tech companies, had to get together last week and tell the government to stop hacking them, or they would lose customers and be severely affected. Cindy recommended we tell legislators to vote against the sham FISA Improvements Act, and instead support the USA Freedom Act and the Surveillance State Repeal Act, which have bi-partisan congressional support. "The days in which you can separate corporate surveillance and government surveillance are over.... The 3rd party doctrine undermines privacy, because *we all* give our data to 3rd parties." She went on to say that the tools for organizing against each type of collection are different, but the issues are similar. Lastly she noted that for 9/11, collection wasn't the gap. They knew about the guys. Sharing between agencies was the gap. Yet we haven't solved for that, but we are collecting like mad!

One other mention: Shahid Buttar spoke, but also performed a prose rap he's written, and he's running a Kickstarter to raise money (it's up Feb 6, so donate now) to do a professional video. (Reminds me a bit of Eddan Katz's Revolution is Not an AOL Keyword.)

Note also that we are doing the Data Privacy Legal Hackathon in 12 days!! Join us to work on this problem technically in SF, NYC and London, or join us online if you can't make it in person. Whether you support the artistic, legal or technical ways of addressing massive government surveillance and the subversion of the Constitution, stand up for your rights under the Constitution. Feel what it's like to be a Radical American! Because you probably are a Radical American, just like our forefathers and foremothers, if you believe in the Rule of Law and the Constitution.

January 28, 2014 03:06 AM

January 23, 2014

MIMS 2014

#141 – The Colbert Report / Conan O’Brien Show

I recently came across a hilarious article in the Atlantic about how another Chinese production studio has copied a United States intro sequence verbatim. This one is about the Colbert Report, and it’s among the best we’ve ever seen.

BEIJING — He may be an award-winning satirist in the United States, but in China, even Stephen Colbert is not beyond parody: A provincial TV channel in the country has produced a show that borrows rather liberally from the popular American program.

The Banquet, broadcast on Ningxia Satellite TV, lifted the entire opening credits and other graphics from The Colbert Report. Everything from the host’s entrance—flying down the screen as English words buzz past—to the star-spangled background is mimicked, and even the show’s theme music, the guitar riff from “Baby Mumbles” by Cheap Trick, is reproduced note for note.

When Chinese TV Rips Off The Colbert Report–

Earlier this year, Conan O’Brien had a heartwarming response to a Chinese copycat of his program, even re-designing and providing them with a new introduction for their show!


Youku video for those in China after the jump. Previously: Conan calls out Da Peng, Da Peng responds.


©2014 Stuff Asian People Like. All Rights Reserved.


by Peter at January 23, 2014 05:54 PM

January 21, 2014

Ph.D. student

Decentralization, democracy, and rationality

I got into a conversation involving @billmon1, Brian Keegan, and Nate Matias on Twitter the other day. I got so much out of it that I resolved to write up thoughts on it in a blog post. It’s appropriate that I get around to it on Martin Luther King Jr. day, since the conversation was about whether technology can advance social and democratic ideals, a topic I’ve been interested in for a long time.

Billmon (I don’t know his real name–not sure how I got into this conversation in the first place, to be honest) opened by expressing what’s becoming an increasingly popular cynicism towards those that look to social media technology for political solutions.

Now, I’m aware of the critiques of the idea that social media is inherently democratic. But I think this claim, as stated, is wrong in lots of interesting ways.

The problem is that while social media technologies can’t be guaranteed to foster liberal, democratic values, decentralization and inter-connectivity of communications infrastructure are precisely the material conditions for liberal democracy. State power depends on communications: if the state can control the means of communication, or if citizens are so disconnected that they cannot come together to deliberate, then there is no place for an independent public to legitimize the state institutions that control it.

This isn’t a novel or radical position. It’s a centrist argument about the conditions of liberal democracy. However, it’s a position that’s presently under attack from all sides. Will Wilkinson has recently written that “liberal” statist critiques of Snowden and Wikileaks, who were essentially acting in accord with interests of independent public media, are being cast as “libertarian” boogeymen for what are essentially liberal principles. Meanwhile, the left is eager to attack this position as not democratic or radical enough.

A number of these critiques came through in our conversation. And since I’m writing the blog post summarizing the conversation, naturally I won all the arguments. For example, Billmon pointed out that Facebook is itself a mega-corporation controlling and monetizing communication with political interests. But that’s just it: Facebook is not inherently democratic because it is under centralized control and does not promote inter-connectivity (EdgeRank is great for filter bubbles). Contrast this with Diaspora, and you have a technology that supports a very different politics.

Brian Keegan came through with a more conservative argument:

Keegan is way right about this. Because I’m a pedant, I pointed out that populism is in accordance with the democratic ideal. But I hate the tribalist mob as much as anybody. That’s why I’m looking for ways to design infrastructure to enable communicative rationality–the kind of principled communication that leads to legitimate consensus among its diverse constituents. Keegan pointed me to Living Voters Guide as an existing example of this. It’s a cool example, but I’m looking for something that could integrate better with communications infrastructure already used on a massive scale, like email or Twitter.

The problem with bringing up Habermasian rationality in today’s academic or quasi-academic environment is that you immediately get hit by the left-wing critique that came up in the late ’80s and early ’90s. Cue Nate Matias:

He’s right of course. He also pointed me to this excellent article by Nancy Fraser from 1990 articulating ways in which Habermas idealized bourgeois, masculinist notions of the public sphere and ignored things like the exclusion of women and working-class counterpublics.

Reading the Fraser piece, I note that she doesn’t actually dismiss the idea of communicative rationality in its entirety. Rather, she simply doesn’t want it to be used in a way that falsely “brackets” (leaves out of the conversation) status differences:

Now, there is a remarkable irony here, one that Habermas’s account of the rise of the public sphere fails fully to appreciate. A discourse of publicity touting accessibility, rationality, and the suspension of status hierarchies is itself deployed as a strategy of distinction. Of course, in and of itself, this irony does not fatally compromise the discourse of publicity; that discourse can be, indeed has been, differently deployed in different circumstances and contexts. Nevertheless, it does suggest that the relationship between publicity and status is more complex than Habermas intimates, that declaring a deliberative arena to be a space where extant status distinctions are bracketed and neutralized is not sufficient to make it so.

(This is her only use of the word “rationality” in the linked piece, though poking around I gather that she has a more comprehensive critique elsewhere.)

So there is plenty of room for a moderate position that favors decentralized communications organized under a more inclusive principle of rationality–especially if that principle of rationality allows for discussion of status differences.

I’m personally happy with the idea of keeping irrational people out of my town hall. Otherwise, as Billmon points out, you can get people using decentralized communication channels to promote bad things like ethnic violence. This is already in fact the status quo, as every major social media host invests heavily in spam prevention, effectively excluding a class of actors who are presumed to be acting in bad faith. I’ve suggested elsewhere that we should extend our definition of spam to exclude more bad actors.

This opens up some really interesting questions. If we are willing to accept that there is an appropriate middle ground between centralized control of communications on the one hand and demagogue-prone chaos on the other, where should we draw the line? And how would we want to design, distribute, and organize our communications technology and our use of it to hit that sweet spot?

by Sebastian Benthall at January 21, 2014 12:55 AM

January 16, 2014

MIMS 2004

Data Privacy Legal Hack-A-thon

This is an unprecedented year documenting our loss of privacy. Never before have we needed to stand up and team up to do something about it. In honour of Privacy Day, the Legal Hackers are leading the charge, inspiring a two-day international Data Privacy Legal Hackathon. This is no ordinary event. Instead of talking about creating privacy tools in theory, the Data Privacy Legal Hackathon is about action: a call to action for tech & legal innovators who want to make a difference!

We are happy to announce the Data Privacy Legal Hackathon and invite the Kantara Community to get involved and participate. We are hosting a Pre-Hackathon Project to create a Legal Map for consent laws across jurisdictions, and the CISWG will also be posting a project for the Consent Receipt Scenario that is posted on the ISWG wiki. The intention is to hack Open Notice with a Common Legal Map to create consent receipts that enable ‘customisers’ to control personal information.

If you would like to get involved in the hackathon, show your support, or help build the consent receipt infrastructure, please get in touch right away with Mark (dot) Lizar (at)gmail (dot) com or Hodder (at) gmail (dot) com, or join the group pages linked below.

The hackathon runs across three locations on February 8th & 9th, 2014. Get your Eventbrite tickets here:

* New York City
* London, UK
* San Francisco

This two-day event aims to mix the tech and legal scenes with people and companies that want to champion personal data privacy, connecting entrepreneurs, developers, product makers, legal scholars, lawyers, and investors. Each location will host a two-day “judged” hacking competition with a prize-awarding finale, followed by an after-party to celebrate the event.

The main themes of the hackathon are:

* Crossing the Pond Hack
* Do Not Track Hack
* Surveillance & Anti-Surveillance
* Transparency Hacks
* Privacy Policy Hack
* Revenge Porn Hack

Prizes will be awarded:

* 1st Prize: $1000
* 2nd Prize: $500
* 3rd Prize: $250

There are pre-hackathon projects and activities. Join the Hackerleague to participate in these efforts and list your hack:

* A Consent Legal Map & Schema Project to create a legal map of consent laws as a legal hackers’ tool for the event and for projects posted at the event (many volunteers needed)
* Brainstorming List of Hacks – add your ideas
* Share Tech and Links Page – share your knowledge
* Hacks (Project) Page – propose or join a project
* IRC Channel for Discussion

Sponsorship is available & needed. Any organization or company seeking to show active support for data privacy and privacy technologies is invited to get involved:

* Sponsor: prizes, food and event costs by becoming a Platinum, Gold or Silver Sponsor
* Participate: at the event by leading or joining a privacy hack project
* Mentor: projects or topics that arise for teams, and share your expertise

Contact NYC sponsorship: Phil Weiss email or @philwdjjd
Contact Bay Area sponsorship: Mary Hodder – Hodder (at) gmail (dot) com – Phone: 510 701 1975
Contact London sponsorship: Mark Lizar – Mark (dot) Lizar (at)gmail (dot) com – Phone: +44 02081237426 – @smarthart

January 16, 2014 04:36 PM

January 13, 2014

MIMS 2012

Using Data for Social Proof

I recently ran a test on Optimizely’s free trial signup page that’s a great example of using social proof to increase conversions. I’ve written about social proof before, but the gist of the idea is that telling a person other people are already using your product or service makes them more likely to also use it. Knowing other people have already taken a particular action reduces perceived risk.

The free trial signup page is a search engine marketing page that people land on when coming from an ad. The page started out as a simple signup form with an image to illustrate split testing, whose primary goal is to get people to sign up for Optimizely. I created a variation that displayed the total number of visitors tested and experiments run with Optimizely, which are live counters that tick up in real time. I wanted to see if using data to demonstrate how many people are already using Optimizely was an effective form of social proof.

Original page with no social proof

Original page with no social proof.

Variation with social proof

Variation with the total visitors tested and experiments run as social proof.

The test was a definitive win that resulted in an 8.7% lift in sign-ups. The change itself was relatively simple (i.e. easy to design, easy to implement), but the test further proved the power of social proof.
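For what it’s worth, the standard way to check that a lift like this isn’t noise is a two-proportion z-test. The sketch below uses hypothetical visitor and conversion counts (the post doesn’t give the real numbers, and a split-testing product runs this kind of calculation for you):

```python
from math import erf, sqrt

def z_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: is B's conversion rate a real improvement over A's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))  # one-sided, via the normal CDF
    return z, p_value

# Hypothetical traffic: 10,000 visitors per variation, 10% baseline conversion,
# and an 8.7% relative lift in the variation (10.87%)
z, p = z_test(1000, 10_000, 1087, 10_000)
```

With these made-up numbers the one-sided p-value comes out under 0.05, but with much less traffic the same relative lift would not be statistically significant, which is why the raw counts matter as much as the percentage.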

by Jeff Zych at January 13, 2014 12:07 AM

January 08, 2014

Ph.D. student

Dear Internet Ladies, please report harassment as spam

Dear Ladies of the Internet,

In light of Amanda Hess’s recent article about on-line harassment of women, in particular feminist writers, I have to request that you make a simple adjustment to your on-line habits.

Please start reporting threats and harassment on Twitter as spam.


No doubt many of you will immediately grasp what I’m getting at and to you nothing more need be said. I admit my judgment may be mistaken, as I am not the sort of person who receives many threats either on-line or off it and so do not have that vantage point. However, I am also privileged with modest expertise in spam detection that I believe is relevant to the problem. It is because I’d like to share that privilege with a diverse audience that I will elaborate here.

If you are unfamiliar with the technical aspects of social media, you may be skeptical that something like spam prevention can have any bearing on the social problem of on-line death threats. Hess herself frames Internet harassment as a civil rights issue, suggesting that in the United States federal legislation could be a solution. She also acknowledges the difficulty of enforcing such legislation against anonymous Internet users. Despite these difficulties, she argues that just ignoring the threats is not enough.

She is skeptical about Twitter’s ‘report abuse’ feature, noting that Twitter doesn’t take legal responsibility for harassment and that if an abuse report is successful, the offending account is shut down. When an account is shut down, it becomes difficult to recover the offending tweets for record keeping. Hess keeps records of harassing messages in case she needs to contact law enforcement, though she reports that the police she talks to don’t know what Twitter is and are unsympathetic.

To be honest, I shake my head a bit at this. I do see the problem as important and action necessary, but a simpler (though admittedly only partial) solution has always been in reach. I’m not optimistic about the prospects of passing federal legislation protecting Internet feminists any time soon. But blocking large classes of unsolicited obnoxious messages from social media is a solved problem. This is precisely what spam reporting is for. I have to take its omission from Hess’s otherwise comprehensive report as an indication that this is not widely understood.

Everyone should know that spam is a social construct. Twitter lists many of the factors used to determine whether an account should be suspended as spam. One of the most important of these factors is “If a large number of spam complaints have been filed against you.”

Twitter can list the myriad features used to identify spam because they are describing an automated, adaptive system that they have engineered. The goal of the system is to quickly learn from users’ spam reports what rapidly evolving tactics spammers are using, and squelch them before they can become a widespread nuisance. This is an interesting and active research area that I encourage you to investigate further. UC Berkeley’s Security Research Lab works on this; this paper by Thomas et al. (2011) is a great example.
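To make the idea concrete, here is a toy sketch of a filter that learns from spam reports. This is entirely my own illustration, nothing like Twitter’s actual production system: each report updates word counts, and new messages are scored with a naive Bayes-style log-likelihood ratio, so that vocabulary common in reported messages pushes the score positive.

```python
from collections import Counter
import math

class ReportTrainedFilter:
    """Toy report-driven spam filter (illustrative only)."""

    def __init__(self):
        self.spam_words = Counter()  # words seen in reported messages
        self.ham_words = Counter()   # words seen in messages marked OK

    def report_spam(self, message):
        self.spam_words.update(message.lower().split())

    def mark_ok(self, message):
        self.ham_words.update(message.lower().split())

    def spam_score(self, message):
        # Sum of log-likelihood ratios with add-one smoothing;
        # a positive score means "looks like reported spam".
        s_total = sum(self.spam_words.values()) + 1
        h_total = sum(self.ham_words.values()) + 1
        score = 0.0
        for w in message.lower().split():
            p_spam = (self.spam_words[w] + 1) / s_total
            p_ham = (self.ham_words[w] + 1) / h_total
            score += math.log(p_spam / p_ham)
        return score

f = ReportTrainedFilter()
f.report_spam("free followers click here")
f.report_spam("click here for free stuff")
f.mark_ok("great talk today thanks")
print(f.spam_score("free followers") > 0)  # → True
```

The point of the sketch is only that reports are training data: every filed report shifts the model, which is why reporting harassment as spam helps everyone, not just the reporter.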

There are still plenty of spammers on Twitter, but far fewer than if Twitter did not make a significant investment in researching and implementing anti-spam techniques. They are up against some truly diabolical opponents. The cyber-security battle between service providers and spammers is a never-ending arms race because it turns out there’s a lot of money in spam. Because it operates on massive scale, this work is driven by performance metrics derived from things like your spam reports.

What does this have to do with online harassment? Well, my educated guess is that compared to seasoned cybercriminals, the vast majority of jerks sending death and rape threats to feminist bloggers are morons. Their writing is typical of what you see scrawled on the walls of a poorly attended men’s restroom. They are probably drunk. Though I cannot be certain, I suspect they are no match for the sophisticated algorithms that find and block enormous amounts of creatively designed spam every day, as long as those algorithms know what to look for.

There are exceptions: Hess documents several cases of dedicated stalkers that could be dangerous beyond the psychological harm of receiving a lot of stupid verbal abuse. I don’t claim to know the right thing to do about them. But I think it’s worth distinguishing them from the rest, if only because they need to be handled differently. If Twitter can resolve most of the low risk cases automatically, that will help give them the bandwidth to be accountable in the more exceptional and serious cases.

In the grand scheme of things, this is way better than going to unresponsive police, trying to pass legislation in a broken Congress, or going through the legal trouble of suing somebody for every threat. The equivalent in off-line life would be a system by which anonymously reported threats and harassment train a giant robot that shoots offenders with stun lasers from the sky. What’s cool is that when you report a spam account, you simultaneously block a user, prevent them from following or replying to you, and also report the incident to Twitter, where they will use the data to stop the kinds of things that bother you. Rather than keep a private record of harassment, by filing this report you can automatically make things a little better for everybody.

All this takes is a collective decision to consider most forms of on-line harassment a kind of spam. I have decided this already for myself, and I hope you will consider it as well.

Sincerely yours,

Sebastian Benthall
PhD Student
University of California at Berkeley, School of Information

by Sebastian Benthall at January 08, 2014 07:56 PM

January 06, 2014

Ph.D. student

Bots, bespoke code, and the materiality of software platforms

This is a new article published in Information, Communication, and Society as part of their annual special issue for the Association of Internet Researchers (AoIR) conference. This year’s special issue was edited by Lee Humphreys and Tarleton Gillespie, who did a great job throughout the whole process.

Abstract: This article introduces and discusses the role of bespoke code in Wikipedia, which is code that runs alongside a platform or system, rather than being integrated into server-side codebases by individuals with privileged access to the server. Bespoke code complicates the common metaphors of platforms and sovereignty that we typically use to discuss the governance and regulation of software systems through code. Specifically, the work of automated software agents (bots) in the operation and administration of Wikipedia is examined, with a focus on the materiality of code. As bots extend and modify the functionality of sites like Wikipedia, but must be continuously operated on computers that are independent from the servers hosting the site, they involve alternative relations of power and code. Instead of taking for granted the pre-existing stability of Wikipedia as a platform, bots and other bespoke code require that we examine not only the software code itself, but also the concrete, historically contingent material conditions under which this code is run. To this end, this article weaves a series of autobiographical vignettes about the author’s experiences as a bot developer alongside more traditional academic discourse.

Official version at Information, Communication, and Society

Author’s post-print, free download [PDF, 382kb]


by stuart at January 06, 2014 09:22 PM

January 02, 2014

MIMS 2014

#140 Linsanity (The Movie)

Synopsis: In February 2012, an entire nation of basketball fans unexpectedly went ‘Linsane.’ Stuck in the mire of a disappointing season, the New York Knicks did what no other NBA team had thought about doing. They gave backup point guard Jeremy Lin an opportunity to prove himself. He took full advantage, scoring more points in his first five NBA starts than any other player in the modern era, and created a legitimate public frenzy in the process.  Linsanity is a moving and inspirational portrait of Jeremy Lin.  It chronicles his path to international stardom, the adversities he faced along the way, his struggles to overcome stereotypes and how he drew strength from his faith, family and culture.

Available on DVD: January 7, 2014 


Director: Evan Leong
Cast:  Narrator Daniel Dae Kim (Lost, Hawaii Five-0), Ming Yao, Jeremy Lin
Running Time: 88 Minutes

Rated: PG for some thematic elements and language.
Genre: Documentary
Aspect Ratio: 2.40
Audio Format: DD5.1
UPC#: 796019827485
SRP: $20.99

©2014 Stuff Asian People Like. All Rights Reserved.


by Peter at January 02, 2014 08:06 AM

December 30, 2013

MIMS 2012

Love, life


I used to think that life is complicated; it consists of unknowns, unpredictables and mysteries. I used to panic at times since unknowns are scary; unpredictables are miserable and mysteries are frightening.

I changed. This epiphany may have taken me over the course of 10 years; it may have happened overnight. It doesn’t matter. Now, I think life is simple. Life is still unknown, unpredictable and mysterious; but unknowns are exciting, unpredictables are rewarding and mysteries are navigable. What really makes this tick: love, and will.

With love and will, I get unconditional support no matter what I do; with love and will, I will be the person I want to be; with love and will, I gain the courage to overcome any obstacle that anyone could imagine; with love and will, the impossibles become possible.

Life is simple:

- A person to be

- A work to do

- Someone to love.

by admin at December 30, 2013 05:53 PM

December 19, 2013

MIMS 2014

#139 Bitcoin, Litecoin, Dogecoin, & Altcoins

China’s obsession with Bitcoin, a decentralized currency, ballooned the price of BTC up to $1,200 in a few short months.

Since the taper announcement from the Bank of China, there has been a panic-driven selloff by Chinese speculators and weaker traders, but the price currently stands at $600 USD per BTC. That won’t stop Asians from investing in Litecoin and other alternative cryptocurrencies in the near future. There’s a new wave of trusted and reliable alternative currencies springing up. One we recommend is Dogecoin, which is based on the popular meme…


Here are 5 Reasons Why You Should Purchase Some Dogecoin

Dogecoin has risen in value, attention, and adoption since its December 8th debut. Here are 10 facts you need to know about this exciting cryptocurrency, based on what is variously classified as the doge, dogge, or shibe memes. For background information on the coin, see the original article here.

1. Dogecoin Was Briefly The Most Valuable Cryptocurrency in the World

As you can see above, Dogecoin was the most valuable cryptocurrency by market capitalization earlier today. This happened because a buyer at CoinedUp placed a bid that valued Dogecoin at .5 BTC for a single Dogecoin (also known simply as doge). While the value of the trade is unknown, it’s unlikely that it was a large purchase, because the buyer would stand to lose much money at such a tremendous valuation of DOGE (currently, anyway). As of 12/16/13, 5:00PM EST, DOGE is worth a more modest .03 BTC per 100,000 dogecoin, a drop of 99.99994%.
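That drop figure checks out arithmetically; here is the back-of-the-envelope calculation using only the figures quoted above.

```python
# Back-of-the-envelope check of the quoted price drop, using the article's figures.
peak = 0.5                # BTC per DOGE implied by the CoinedUp bid
later = 0.03 / 100_000    # BTC per DOGE as of 12/16/13 (.03 BTC per 100,000 DOGE)

drop = (peak - later) / peak  # fractional decline from the peak valuation
print(f"{drop:.5%}")          # → 99.99994%
```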

2. Due to the Price Jump, The Market Capitalization of Dogecoin was Briefly 2.8 Trillion Dollars

HowStuffWorks estimates that the amount of United States dollars in circulation is around 2.5 trillion dollars, so for a brief moment, all of the cash in the United States could not have purchased all of the Dogecoin at its then-current market value.

3. Despite the Massive Fall, Dogecoin is Still Gaining Massively in Price

While it’s no surprise that Dogecoin isn’t worth 2.8 trillion dollars, the price has still risen dramatically. At the time of our first article about Dogecoin, the price was roughly $1 per 10,000 Dogecoin; now real, substantial purchases have been made at the rate of $1 per 1,000 Dogecoin, as seen in this eBay purchase:

4. Dogecoin is Also Receiving Massive Attention on Reddit and the Web

Redditor JayQuery took a screencap of other popular cryptocurrency subreddits and noted that Dogecoin was among the most popular. Since our last article published about a week ago, many pieces have been written about Dogecoin, including a Wikipedia article.

Respected tech giant The Verge has even covered Dogecoin, saying, “Bitcoin is so 2013: Dogecoin is the new cryptocurrency on the block.” Wow.

5. Dogecoin Can Now Be Easily Exchanged for Bitcoin and is Rumored to be Coming to Cryptsy

Marking a first sign of legitimacy for the cryptocurrency, Dogecoin can now be easily traded for BTC on CoinedUp, previously mentioned as the source of the trade in which Bitcoin was exchanged for Dogecoin at the rate of .5 BTC for 1 DOGE.

Momentum is also building on the Dogecoin subreddit, r/dogecoin, toward putting the cryptocurrency on Cryptsy, an exchange with many cryptocurrencies that is praised for its potential and lambasted for its often glitchy, coin-disappearing interface (but don’t worry, they will return your money).

As the above image mentions, Dogecoin is now the fourth most-mined Scrypt-based coin. Put simply, Dogecoin cannot be mined the way Bitcoin is, with specialized equipment known as ASICs. Scrypt-based currencies like Dogecoin and Litecoin can be mined using consumer hardware, and this is enhanced by using ATI (and to some extent Nvidia) graphics cards, which has caused a shortage of those cards. The significance, then, is that Dogecoin is attracting the attention of the general public and mobilizing both rookies and power-users to engage in the technical task of mining, a true testament to the coin’s hotness and approachability. Wow.

©2014 Stuff Asian People Like. All Rights Reserved.


by Peter at December 19, 2013 09:47 AM

December 16, 2013

Ph.D. student

Reflections on the Berkeley Institute for Data Science (BIDS) Launch

Last week was the launch of the Berkeley Institute for Data Science.

Whatever might actually happen as a result of the launch, what was said at the launch was epic.

Vice Chancellor for Research Graham Fleming introduced Chancellor Nicholas Dirks for the welcoming remarks. Dirks is UC Berkeley’s 10th Chancellor. He succeeded Robert Birgeneau, who resigned gracefully shortly after coming under heavy criticism for his handling of Occupy Cal, the Berkeley campus’ chapter of the Occupy movement. He was distinctly unsympathetic to the protesters, and there was a widely circulated petition declaring a lack of confidence in his leadership. Birgeneau is a physicist. Dirks is an anthropologist who has championed postcolonial approaches. Within the politics of the university, which are a microcosm of politics at large, this signalling is clear. Dirks’ appointment was meant to satisfy the left-wing protesters, most of whom have been trained in the softer social sciences themselves. Critical reflection on power dynamics and engagement in activism–which is often associated with leftist politics–are, at least formally, accepted by the university administration as legitimate. Birgeneau would subsequently receive awards for his leadership in drawing more women into the sciences and aiding undocumented students.

Dirks’ welcoming remarks were about the great accomplishments of UC Berkeley as a research institution and the vague but extraordinary potential of BIDS. He is grateful, as we all are, for the funding from the Moore and Sloan foundations. I found his remarks unspecific, and I couldn’t help but wonder what his true thoughts were about data science in the university. Surely he must have an opinion. As an anthropologist, can he consistently believe that data science, especially in the social sciences, is the future?

Vicki Chandler, Chief Program Officer from the Moore Foundation, was more lively. Pulling no punches, she explained that the purpose of BIDS is to shake up scientific culture. Having hung out in Berkeley in the ’60s and attended it as an undergraduate in the ’70s, she believes we are up for it. She spoke again and again of “revolution”. There is ambiguity in this. In my experience, faculty are divided on whether they see the proposed “open science” changes as imminent or hype, as desirable or dangerous. More and more I see faculty acknowledge that we are witnessing the collapse of the ivory tower. It is possible that the BIDS launch is a tipping point. What next? “Let the fun begin!” concluded Chandler.

Saul Perlmutter, Nobel laureate physicist and front man of the BIDS co-PI super group, gave his now-practiced and condensed pitch for the new Institute. He hit all the high points, pointing out not only the potential of data science but the importance of changing the institutions themselves. In rethinking the peer-reviewed journal from scratch, he said, we should focus more on code reuse. Software can be a valid research output. As popular as open science is among the new generation of scientists, this is a bold statement for somebody with such credibility within the university. He even said that the success of open source software is what gives us hope for the revolutionary new kind of science BIDS is beginning. Two years ago, this was a fringe idea. Perlmutter may have just made it mainstream.

Notably, he also engaged with the touchy academic politics, saying that data science could bring diversity to the sciences (though he was unspecific about the mechanism for this). He expounded on the important role of ethnography in evaluating the Institute and identifying the bottlenecks to unlocking its potential.

The man has won at physics and is undoubtedly a scientist par excellence. Perhaps Perlmutter sees the next part of his legacy as bringing the university system into the 21st century.

David Culler, Chair of the Electrical Engineering and Computer Science department, then introduced a number of academic scientists, each with impressive demonstrations about how data science could be applied to important problems like climate change and disaster reduction. Much of this research depends on using the proliferation of hand-held mobile devices as sensors. University science, I realized while watching this, is at its best when doing basic research about how to save humanity from nature or ourselves.

But for me the most interesting speakers in the first half of the launch were luminaries Peter Norvig and Tim O’Reilly, each giants in their own right and welcome guests to the university.

Culler introduced Norvig, Director of Research at Google, by crediting him as one of the inventors of the MOOC. I know his name mainly as a co-author of “Artificial Intelligence: A Modern Approach,” which I learned and taught from as an undergraduate. Amazingly, Norvig’s main message is about the economics of the digital economy. Marginal production is cheap, cost of communication is cheap, and this leads to an accumulation of wealth. Fifty percent of jobs are predicted to be automated away in the coming decades. He is worried about the 99%–freely using Occupy rhetoric. What will become of them? Norvig’s solution, perhaps stated tongue in cheek, is that everyone needs to become a data scientist. More concretely, he has high hopes for hybrid teams of people and machines, that all professions will become like this. By defining what academic data science looks like and training the next generation of researchers, BIDS will have a role in steering the balance of power between humanity and the machines–and the elite few who own them.

His remarks hit home. He touched on anxieties that are as old as the Industrial Revolution: is somebody getting immensely rich off of these transformations, but not me? What will my role be in this transformed reality? Will I find work? These are real problems and Norvig was brave to bring them up. The academics in the room were not immune from these anxieties either, as they watch the ivory tower crumble around them. This would come up again later in the day.

I admire him for bringing up the point, and I believe he is sincere. I’d heard him make the same points when he was on a panel with Neil Stephenson and Jaron Lanier a month or so earlier. I can’t help but be critical of Norvig’s remarks. Is he covering his back? Many university professors are seeing MOOCs themselves as threatening to their own careers. It is encouraging that he sees the importance of hybrid human/machine teams. If the machines are built on Google infrastructure, doesn’t this contribute to the same inequality he laments, shifting power away from teachers to the 1% at Google? Or does he foresee a MOOC-based educational boom?

He did not raise the possibility that human/machine hybridity is already the status quo–that, for example, all information workers tap away at these machines and communicate with each other through a vast technical network. If he had acknowledged that we are all cyborgs already, he would have had to admit that hybrid teams of humans and machines are as much the cause of as solution to economic inequality. Indeed, this relationship between human labor and mechanical capital is precisely the same as the one that created economic inequality in the Industrial Revolution. When the capital is privately owned, the systems of hybrid human/machine productivity favor the owner of the machines.

I have high hopes that BIDS will address through its research Norvig’s political concern. It is certainly on the mind of some of its co-PI’s, as later discussion would show. But to address the problem seriously, it will have to look at the problem in a rigorous way that doesn’t shy away from criticism of the status quo.

The next speaker, Tim O’Reilly, is a figure who fascinates me. Culler introduced him as a “God of the Open Source Field,” which is poetically accurate. Before coming to academia, I worked on Web 2.0 open source software platforms for open government. My career was defined by a string of terms invented and popularized by O’Reilly, and to a large extent I’m still a devotee of his ideas. But as a practitioner and researcher, I’ve developed a nuanced view of the field that I’ve tried to convey in the course on Open Collaboration and Peer Production I’ve co-instructed with Thomas Maillart this semester.

O’Reilly came under criticism earlier this year from Evgeny Morozov, who attacked him for marketing politically unctuous ideas while claiming to be revolutionary. Morozov focuses on O’Reilly’s promotion of ‘open source’ over and against Richard Stallman’s explicitly ethical, and therefore contentious, term ‘free software’. Morozov accuses O’Reilly of what Tom Scocca has recently defined as rhetorical smarm–dodging specific criticism by denying the appropriateness of criticism in general. O’Reilly has disputed the Morozov piece. Elsewhere he has presented his strategy as that of a ‘marketer of big ideas’, and his deliberate promotion of more business-friendly ‘open source’ rhetoric. This ideological debate is itself quite interesting. Geek anthropologist Chris Kelty observes that it is participation in this debate, more so than adherence to any particular view within it, that characterizes the larger “movement,” which he names the recursive public.

Despite his significance to me, given my open source software background, I was surprised when I heard Tim O’Reilly would be speaking at the BIDS launch. O’Reilly had promoted ‘open source’ and ‘Web 2.0’ and ‘open government’, but what did any of that have to do with ‘data science’?

So I was amused when Norvig introduced O’Reilly by saying that he didn’t know he was a data scientist until the latter wrote an article in Forbes (in November 2011) naming him one of “The World’s 7 Most Powerful Data Scientists.” Looking at the Google Trends data, we can see that November 2011 just about marks the rise of ‘data science’ from obscurity to popularity. Is Tim O’Reilly responsible for the rise of ‘data science’?

Perhaps. O’Reilly explained that he got into data science by thinking about the end game for open source. As open source software becomes commodified (which for him I think means something like ‘subject to competitive market pressure’), what becomes valuable is the data. And so he has been promoting data science in industry and government, and believes that the university can learn important lessons from those fields as well. He held up his Moto X phone and explained how it is ‘always listening’ and so can facilitate services like Google Now. All this would go towards a system with greater collective intelligence, a self-regulating system that would make regulators obsolete.

Looking at the progression of the use of maps, from paper to digital to being embedded in services and products like self-driving cars, O’Reilly agrees with Norvig about the importance of human-machine interaction. In particular, he believes that data scientists will need to know how to ask the right questions about data, and that this is the future of science. “Others will be left behind,” he said, not intending to sound foreboding.

I thought O’Reilly presented the combination of insight and boosterism I expected. To me, his presence at the BIDS launch meant that O’Reilly’s significance as a public intellectual has progressed from business through governance and now to scientific thinking itself. This is wonderful for him but means that his writings and influence should be put under the scrutiny we would have for an academic peer. It is appropriate to call him out for glossing over the privacy issues around a mobile phone that is “always listening,” or the moral implications of the obsolescence of regulators for equality and justice. Is his objectivity compromised by the fact that he runs a publishing company that sells complementary goods to the vast supply of publicly available software and data? Does his business agenda incentivize him to obscure the subtle differences between various segments of his market? Are we in the university victims of that obscurity as we grapple with multiple conflated meanings of “openness” in software and science (open to scrutiny and accountability, vs. open for appropriation by business, vs. open to meritocratic contribution)? As we ask these questions, we can be grateful to O’Reilly for getting us this far.

I’ve emphasized the talks given by Norvig and O’Reilly because they exposed what I think are some of the most interesting aspects of BIDS. One way or another, it will be revolutionary. Its funders will be very disappointed if it is not. But exactly how it is revolutionary is undetermined. The fact that BIDS is based in Berkeley, and not in Google or Microsoft or Stanford, guarantees that the revolution will not be an insipid or smarmy one which brushes aside political conflict or morality. Rather, it promises to be the site of fecund political conflict. “Let the fun begin!” said Chandler.

The opening remarks concluded and we broke for lunch and poster sessions–the Data Science Faire (named after O’Reilly’s Maker Faire).

What followed was a fascinating panel discussion moderated by astrophysicist Josh Bloom, with historian and university administrator Cathryn Carson, computer science professor and AMP Lab director Michael Franklin, and Deb Agrawal, a staff computer scientist at Lawrence Berkeley National Lab.

Bloom introduced the discussion jokingly as “just being among us scientists…and whoever is watching out there on the Internet,” perhaps nodding to the fact that the scientific community is not yet fully conscious that their expectations of privileged communication are being challenged by a world and culture of mobile devices that are “always listening.”

The conversation was about the role of people in data science.

Carson spoke as a domain scientist–a social scientist who studies scientists. Noting that social scientists tend to work in small teams led by graduate students motivated by their particular questions, she said her emphasis was on the people asking questions. Agrawal noted that the number of people needed to analyze a data set scales not with the size of the data but with its complexity–a practical point. (I’d argue that theoretically we might want to consider the “size” of data in terms of its compressibility–which would reflect its complexity. This ignores a number of operational challenges.) For Franklin, people are a computational resource that can be part of a crowd-sourced process. In that context, the number of people needed does indeed scale with the use of people as data processors and sensors.
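My parenthetical about compressibility can be made concrete with a toy sketch: two byte strings of identical raw size, one highly regular and one noisy, compress very differently, so compressed size tracks complexity where raw size does not. This is only an illustration of the idea, not a proposal for an operational metric.

```python
import random
import zlib

# Two datasets of identical raw size: one highly regular, one high-entropy.
random.seed(0)  # deterministic "noise" so the example is reproducible
redundant = b"AB" * 50_000                                    # 100,000 bytes, regular
noisy = bytes(random.randrange(256) for _ in range(100_000))  # 100,000 bytes, noisy

for name, data in (("redundant", redundant), ("noisy", noisy)):
    ratio = len(zlib.compress(data)) / len(data)
    print(f"{name}: compressed to {ratio:.1%} of original size")
```

The regular data collapses to a tiny fraction of its raw size, while the noisy data barely compresses at all, which is the sense in which compressed size reflects the complexity that (per Agrawal) drives the number of people an analysis needs.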

Perhaps to follow through on Norvig’s line of reasoning, Bloom then asked pointedly if machines would ever be able to do the asking of questions better than human beings. In effect: Would data science make data scientists obsolete?

Nobody wanted to be the first to answer this question. Bloom had to repeat it.

Agrawal took a first stab at it. The science does not come from the data; the scientist chooses models and tests them. This is the work of people. Franklin agreed and elaborated–the wrong data too early can ruin the science. Agrawal noted that computers might find spurious signals in the noise.

Personally, I find these unconvincing answers to Bloom’s question. Algorithms can generate, compare, and test alternative models against the evidence. Noise can, with enough data, be filtered away from the signal. To do so pushes the theoretical limits of computing and information theory, but if Franklin is correct in his earlier point that people are part of the computational process, then there is no reason in principle why these tasks too might not be performed if not assisted by computers.

Carson, who had been holding back her answer to listen to the others, had a bolder proposal: rather than try to predict the future of science, why not focus on the task of building that future?

In another universe, at that moment someone might have asked the one question no computer could have answered. “If we are building the new future of science, what should we build? What should it look like? And how do we get there?” But this is the sort of question disciplined scientists are trained not to ask.

Instead, Bloom brought things back to practicality: we need to predict where science will go in order to know how to educate the next generation of scientists. Should we be focusing on teaching them domain knowledge, or on techniques?

We have at the heart of BIDS the very fundamental problem of free will. Bloom suggests that if we can predict the future, then we can train students in anticipation of it. He is an astrophysicist and studies stars; he can be forgiven for the assumption that bodies travel in robust orbits. This environment is a more complex one. How we choose to train students now will undoubtedly affect how science evolves, as the process of science is at once the process of learning and of training new scientists. His descriptive question then falls back to the normative one: what science are we trying to build toward?

Carson was less heavy-handed than I would have been in her position. Instead, she asked Bloom how he got interested in data science. Bloom recalled his classical physics training, and the moment he discovered that to answer the kinds of questions he was asking, he would need new methods.

Franklin chimed in on the subject of education. He has heard it said that everyone in the next generation should learn to code. With marked humility for his discipline, he said he did not agree with this. But he said he did believe that everyone in the next generation should learn data literacy, echoing Norvig.

Bloom opened the discussion to questions from the audience.

The first was about the career paths for methodologists who write software instead of papers. How would BIDS serve them? It was a softball question which the panel hit out of the park. Bloom noted that the Moore and Sloan funders explicitly asked for the development of alternative metrics to measure the impact of methodologist contributions. Carson said that even with the development of metrics, as an administrator she knew it would be a long march through the institution to get those metrics recognized. There was much work to be done. “Universities got to change,” she rallied. “If we don’t change, Berkeley’s being great in the past won’t make it great in the future,” referring perhaps to the impressive history of research recounted by Chancellor Dirks. There was applause. Franklin pointed out that the open source community has its own metrics already. In some circles some of his students are more famous than he is for developing widely used software. Investors are often asking him when his students will graduate. The future, it seems, is bright for methodologists.

At this point I lost my Internet connection and had to stop livetweeting the panel; those tweets are the notes from which I am writing these reflections. Recalling from memory, there was one more question from Kristina Kangas, a PhD student in Integrative Biology. She cited research about how researchers interpreting data wind up reflecting back their own biases. What did this mean for data science?

Bloom gave Carson the last word. It is a social scientific fact, she said, that scientists interpret data in ways that fit their own views. So it’s possible that there is no such thing as “data literacy”. These are open questions that will need to be settled by debate. Indeed, what then is data science after all? Turning to Bloom, she said, “I told you I would be making trouble.”

by Sebastian Benthall at December 16, 2013 10:44 PM

December 13, 2013

Ph.D. alumna

Email Sabbatical: December 13-January 10

I call this year my year of triplets. Over the last few months, I had my first child, finished my book, and kickstarted a research institute.

In planning this year, one of the things that I promised myself was that when Ziv was giggly and smiley, I would take a proper holiday and get to know him better. That time has come. Ziv, Gilad, and I are off to Argentina for a month of trekking and rejuvenation.

Those who know me know that I take vacations very seriously. They’re how I find center so that I can come back refreshed enough to take things to the next level. 2014 promises to be an intense year. It’ll begin with a book tour and then I’ll transition into launching Data & Society properly.

Before I jump into the awesome intensity of what’s to come, I need a break. A real break. The kind of break where I can let go of all of my worries and appreciate the present. To do this, I’m taking one of my email sabbaticals. This means that my email will be turned off. No emails will get through and none will be waiting for me when I return. I know that this seems weird to those who don’t work with me but I’ve worked hard to close down threads and create backup plans so that I can come home without needing to wade through digital hell.

If you’re hoping to reach me, here are four options:

  1. Resend your email after January 10. Sorry for the inconvenience.
  2. If you want it waiting for me, send me a snail mail: danah boyd / Microsoft Research / 641 6th Ave, 7th Floor, NY NY 10011
  3. For Data & Society inquiries: contact Seth Young at info [at]
  4. For “It’s Complicated” questions: contact Elizabeth Pelton at lizpelton [at] gmail

The one person that I will be in touch with while on vacation is my mom. Moms worry, and that’s just not fair.

I’m deeply grateful for all of the amazing people who have made 2013 such a phenomenal year. With a bit of R&R, I hope to make 2014 just as magical. Have a fantastic holiday season! Lots of love and kisses!

by zephoria at December 13, 2013 04:35 AM

December 12, 2013

Ph.D. alumna

Data & Society: Call for Fellows

Over the last six months, I’ve been working to create the Data & Society Research Institute to address the social, technical, ethical, legal, and policy issues that are emerging because of data-centric technological development.  We’re still a few months away from launching the Institute, but we’re looking to identify the inaugural class of fellows. If you know innovative thinkers and creators who have a brilliant idea that needs a good home and are excited by the possibility of helping shape a new Institute, can you let them know about this opportunity?

The Data & Society Research Institute is a new think/do tank in New York City dedicated to addressing social, technical, ethical, legal, and policy issues that are emerging because of data-centric technological development.

Data & Society is currently looking to assemble its inaugural class of fellows. The fellowship program is intended to bring together an eclectic network of researchers, entrepreneurs, activists, policy creators, journalists, geeks, and public intellectuals who are interested in engaging one another on the key issues introduced by the increasing availability of data in society. We are looking for a diverse group of people who can see both the opportunities and challenges presented by access to data and who have a vision for a project that can inform the public or shape the future of society.

Applications for fellowships are due January 24, 2014. To learn more about this opportunity, please see our call for fellows.

On a separate, but related note, I lurve my employer; my ability to create this Institute is only possible because of a generous gift from Microsoft.

by zephoria at December 12, 2013 02:45 AM

December 08, 2013

Ph.D. alumna

how “context collapse” was coined: my recollection

Various academic folks keep writing to me asking me if I coined “context collapse” and so I went back in my records to try to figure it out. I feel the need to offer up my understanding of how this term came to be in an artifact that is more than 140 characters, since folks keep asking anew. The only thing that I know for certain is that, even if I did (help) coin the term, I didn’t mean to. I was mostly trying to help explain a phenomenon that has long existed and exists in even more complicated ways as a result of social media.

In 2002, I wrote a thesis at the MIT Media Lab called “Faceted Id/entity” that drew heavily on the works of Erving Goffman and Joshua Meyrowitz. In it, I wrote an entire section talking about “collapsed contexts” and I kept coming back to this idea (descriptively without ever properly defining it). My thesis was all about contexts and the ways of managing identity in different contexts. I was (am) absolutely in love with Meyrowitz’s book “No Sense of Place” which laid out the challenges of people navigating multiple audiences as a result of media artifacts (e.g., stories around vacation photos).

Going back through older files, I found powerpoints from various talks that I gave in 2003 and 2004 that took the concept of “collapsed contexts” to Friendster to talk about what happened when the Burners and gay men and geeks realized they were on the site together. And an early discussion of how there are physical collapsed contexts that are addressed through the consumption of alcohol. In a few of my notes in these, I swapped the term to “context collapse” when referring to the result but I mostly used “collapsed contexts.”

Articles that I was writing from 2005-2008 still referred to “collapsed contexts.” (See: Profiles as Conversation and Friends, Friendsters, and Top 8, and my dissertation.) My dissertation made “collapsed contexts” a central concept.

In 2009, Alice Marwick and I started collaborating. She was fascinated by the arguments I was making in my dissertation on collapsed contexts and imagined audiences and started challenging me on aspects of them through her work on micro-celebrity. She collected data about how Twitter users navigated audiences and we collaborated on a paper called “I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience” which was submitted in 2009 and finally published in 2011. To the best that I can tell, this is the first time that I used “context collapse” instead of “collapsed contexts” in published writing, but I have no recollection as to why we shifted from “collapsed contexts” to “context collapse.”

Meanwhile, in 2009, Michael Wesch published an article called “YouTube and You: Experiences of Self-awareness in the Context Collapse of the Recording Webcam” that goes back to Goffman. While we ran in the same circles, I’m not sure that either one of us was directly building off of the other but we were clearly building off of common roots. (Guiltily, I must admit that I didn’t know about or read this article of his until much later and long after Alice and I wrote our paper. And I have no idea whether or not he read my papers where I discussed “collapsed contexts.”)

When I refer to context collapse now, I often point back to Joshua Meyrowitz because he’s the one that helped that concept really click in my head, even if he didn’t call it “collapsed contexts” or “context collapse.” As with many academic concepts, I see the notion of “context collapse” as being produced iteratively through intellectual interaction as opposed to some isolated insight that just appeared out of nowhere. I certainly appreciate the recognition that I’ve received for helping others think about these issues, but I’m very much hand-in-hand with and standing on the shoulders of giants.

If others have more insights into how this came into being, please let me know and I will update accordingly!

by zephoria at December 08, 2013 11:33 PM

November 27, 2013

MIMS 2010

Our New Zealand Blog

If you seek our blog about our journey on the Te Araroa trail, it lies down this path.

by mlissner at November 27, 2013 06:19 PM

November 26, 2013

Ph.D. student

Reflexive data science

In anticipation of my dissertation research and in an attempt to start a conversation within the emerging data science community at Berkeley, I’m working on a series of blog posts about reflexive data science. I will update this post with an index of them and related pieces as they are published over time.

“Reflexive data science: an overview”, UC Berkeley D-Lab Blog.
Explaining how the stated goals of the Berkeley Institute of Data Science–open source, open science, alt-metrics, and empirical evaluation–imply the possibility of an iterative, scientific approach to incentivizing scientists.

by Sebastian Benthall at November 26, 2013 08:53 PM

November 19, 2013

Ph.D. alumna

Upcoming Email Sabbatical: December 13-January 10

It’s about that time of the year for me. The time when I escape from the digital world into the wilderness in order to refresh. As many of you know, I am a firm believer in the power of vacations. Not to escape work, but to enable my brain to reboot. I purposefully seek boredom so that my brain starts itching. This, for me, is the root of my creativity and ability to be productive.

2014 is going to be an intense year. I’m ecstatic that my book – “It’s Complicated: The Social Lives of Networked Teens” – will be published in February. I can’t wait to share this with y’all and I’m in the process of setting up a whirlwind tour to accompany the launch (more will be posted on my book website shortly). Additionally, I’m starting an exciting new project that I can’t wait to tell you about. But before throwing myself head first into these activities, I’m going to take some time to get my head in the game.

This post is intended to be a pre-warning that I will be offline and taking an email sabbatical from December 13-January 10. What this means is that during this period, I will not be reachable and my INBOX will be set to not receive emails. If you need anything from me during this period, now is the time to ask.

For those who aren’t familiar with my email sabbaticals, check out this post. The reason that I do sabbaticals is because I’ve found that closing down everything and starting fresh is key. Coming home to thousands of emails that require sorting through has proven to be impossible, overwhelming, and disappointing for everyone who expects a response. So I shut it all down and start fresh. During this period, you can still send me snail mail if you’d like to get it off your plate. And if it’s uber uber urgent, you can track down my mom; I’ll touch base with her every few days. But my goal will be to refresh. And that way, we can have a magically exciting 2014!

by zephoria at November 19, 2013 03:54 PM

November 11, 2013

MIMS 2012

Why we built the Optimizely Styleguide

A few months ago, the design team and I created the Optimizely Styleguide. It’s a living document of our brand standards and the visual style of our site. Designing, building, and writing the content for it took a significant amount of time, but it was worth the effort. In this post I’m going to explain why we did it.

Why build a styleguide?

A branding one-stop-shop

The original reason we wanted to build a styleguide was that we often get requests from people around the company for our logo. They also like to ask us which blue is our official blue. (It’s #1964af.) Providing one place for everyone to access this information, and explaining when and how to use our logos, saves both parties time.

To achieve this goal, we put the full Optimizely logo, the Optimizely O, and our official brand colors on the styleguide’s home page. We provided the logos in png, psd, svg, and eps formats in black, white, and Optimizely blue, which covers the widest range of use cases. Since this is the primary information that most people are looking for, it makes sense to have it be the most immediately available.

Document our visual style

The second biggest driving force was to provide developers (including myself) documentation of the modules, classes, and mixins that exist in the codebase. It’s important that we maintain a consistent visual style, and re-using classes and mixins helps ensure we’re consistent. It also speeds up development. There have been a lot of times that I’ve written a mixin or class that someone didn’t know about, and they re-implemented it in a different area of our code. Documenting these styles helps prevent this.
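As a minimal sketch of what this looks like in practice (the variable and mixin names here are hypothetical, not Optimizely’s actual code; only the hex value comes from the guide), centralizing brand colors in Sass variables and mixins is one way re-implementation gets prevented:

```scss
// Hypothetical brand variables -- only #1964af comes from the styleguide.
$brand-blue:  #1964af; // official Optimizely blue
$brand-black: #000;
$brand-white: #fff;

// A documented mixin means nobody re-implements the accent styling
// in a different corner of the codebase.
@mixin brand-accent {
  color: $brand-blue;
  border-color: $brand-blue;
}

.button--primary {
  @include brand-accent;
}
```

The point isn’t the specific names; it’s that once a variable or mixin is documented in the styleguide, developers can discover it instead of rewriting it.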

Furthermore, complex widgets that require a combination of HTML, CSS, and JS to function properly (such as popovers and dialog boxes) are hard to figure out just from reading source code. Explaining how to add these modules to a page, and all of their styling and JS options, is invaluable.
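For instance (a hypothetical sketch, not Optimizely’s actual API), a styleguide entry for a widget like a popover can pair the live example with the exact options a developer would document:

```javascript
// Hypothetical popover module of the kind a styleguide entry would document:
// the JS options and the generated markup live together in one runnable example.
function createPopover(options) {
  // Documented options, with defaults spelled out for the reader.
  const settings = {
    title: options.title || "",
    body: options.body || "",
    placement: options.placement || "top", // one of: top | bottom | left | right
  };
  return {
    settings: settings,
    // The markup a developer would otherwise have to reverse-engineer
    // from the CSS source.
    html:
      '<div class="popover popover--' + settings.placement + '">' +
      "<h4>" + settings.title + "</h4>" +
      "<p>" + settings.body + "</p>" +
      "</div>",
  };
}
```

A styleguide page would render the returned markup live next to this snippet, so readers can see the widget behave and copy the code in one place.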

In addition to explaining how something works, we also wanted to document when it’s appropriate to use these various modules, which is something else you can’t get from reading code alone. This helps both engineers and designers understand when to use a popover instead of a tooltip, for example.

This content is the meat of the site, and the part that changes most frequently (in fact, much of it still needs to be written). We decided the most effective way to document these elements was to create real, working examples with HTML, CSS, and JS, rather than just provide static screenshots. This allows developers to interact with the widgets, see how they behave, and inspect the code. It also makes it easier to communicate complex modules, such as popovers and dialog boxes. We took a lot of inspiration from Twitter’s Bootstrap and Zurb’s Foundation, which are both fantastic examples of using working code as documentation.

Code conventions

Along with documenting the various modules and classes we have available, we also wanted a place to explain our frontend code conventions (e.g. how to name classes, how files are organized, etc.). This is especially useful for new developers who are getting up to speed, but is also beneficial as a reference for all developers.

Shake out inconsistencies

Finally, a secondary goal and benefit of documenting our styles is that it brings us face to face with inconsistencies that need to be ironed out. There have been numerous times that writing down how something works made these inconsistencies obvious. This acts as a great forcing function to get us to have consistent styles (although we haven’t had time to fix all of them).


In the months since the guide was released, it has been successful at achieving each of these goals. Various people around the company consult it regularly for our logos and colors; developers refer to it when implementing common modules; new developers have learned our coding conventions on their own; and we’ve found many ways to improve our visual style and code. Taking the time to build and document our brand guidelines and frontend code was well worth the effort, and I recommend anyone else who works on a moderately complex site build their own styleguide.

by Jeff Zych at November 11, 2013 08:16 PM

November 02, 2013

Ph.D. alumna

Keeping Teens ‘Private’ on Facebook Won’t Protect Them

(Originally written for TIME Magazine)

We’re afraid of and afraid for teenagers. And nothing brings out this dualism more than discussions of how and when teens should be allowed to participate in public life.

Last week, Facebook made changes to teens’ content-sharing options. They introduced the opportunity for those ages 13 to 17 to share their updates and images with everyone and not just with their friends. Until this change, teens could not post their content publicly even though adults could. When minors choose to make their content public, they are given a notice and a reminder in order to make it very clear to them that this material will be shared publicly. “Public” is never the default for teens; they must choose to make their content public, and they must affirm that this is what they intended at the point at which they choose to publish.

Representatives of parenting organizations have responded to this change negatively, arguing that this puts children more at risk. And even though the Pew Internet & American Life Project has found that teens are quite attentive to their privacy, and many other popular sites allow teens to post publicly (e.g. Twitter, YouTube, Tumblr), privacy advocates are arguing that Facebook’s decision to give teens choices suggests that the company is undermining teens’ privacy.

But why should youth not be allowed to participate in public life? Do paternalistic, age-specific technology barriers really protect or benefit teens?

One of the most crucial aspects of coming of age is learning how to navigate public life. The teenage years are precisely when people transition from being a child to being an adult. There is no magic serum that teens can drink on their 18th birthday to immediately mature and understand the world around them. Instead, adolescents must be exposed to — and allowed to participate in — public life while surrounded by adults who can help them navigate complex situations with grace. They must learn to be a part of society, and to do so, they must be allowed to participate.

Most teens no longer see Facebook as a private place. They befriend anyone they’ve ever met, from summer-camp pals to coaches at universities they wish to attend. Yet because Facebook doesn’t allow youth to contribute to public discourse through the site, there’s an assumption that the site is more private than it is. Facebook’s decision to allow teens to participate in public isn’t about suddenly exposing youth; it’s about giving them an option to treat the site as being as public as it often is in practice.

Rather than trying to protect teens from all fears and risks that we can imagine, let’s instead imagine ways of integrating them constructively into public life. The key to doing so is not to create technologies that reinforce limitations but to provide teens and parents with the mechanisms and information needed to make healthy decisions. Some young people may be ready to start navigating broad audiences at 13; others are not ready until they are much older. But it should not be up to technology companies to determine when teens are old enough to have their voices heard publicly. Parents should be allowed to work with their children to help them navigate public spaces as they see fit. And all of us should be working hard to inform our younger citizens about the responsibilities and challenges of being a part of public life. I commend Facebook for giving teens the option and working hard to inform them of the significance of their choices.

(Originally written for TIME Magazine)

by zephoria at November 02, 2013 11:11 PM

October 30, 2013

MIMS 2012

How I spent my weekend


As my last post mentioned, I spent last weekend at the 2013 DevelopHer hackathon.

What I didn’t mention was I ended up taking second place, despite being a solo team!

Me and the judging panel

My project was a web app called anniedoc, which I now finally have running on Heroku instead of just locally. (Note that there’s no restrictions on commenting right now, and I’ll probably have to wipe the db periodically as a result. Be gentle!)

Castle Kronborg--the real life Elsinore

anniedoc is a prototype for collaboratively annotating documents inline via the web. Inline annotation has been tried before by e.g. Commentpress, but was generally a flop. But the state of front-end dev has advanced a lot since that attempt, and now we see inline-type commenting from Medium and RapGenius!

anniedoc also tries to balance encouraging viewer engagement with providing a calm, focused, uncluttered reading environment–there are a lot of good interfaces out there for talking, and some good interfaces for reading, but few manage to do both.

hack hack hack

I was not expecting to win. I wasn’t expecting to even make it past the first round of judging! You see, anniedoc was more or less an interview exercise for a job I was interviewing for/contracting with. Said job wanted to see results by Monday, and the hackathon was Friday and Saturday, so I was basically like “welp, looks like I’ll be spending the hackathon doing work instead. :/” Under other circumstances I would have preferred to hack with a team, but I didn’t want to make anyone else work on work stuff!

It all seems to have worked out for the best, though–people seemed to like my presentation, potential job was happy with my work as well, and a five-person team (including my Pyladies friend) won first place, forcing LinkedIn to shell out for five Macbook Airs (the top prize) instead of just one! Mwa ha ha ha.

(These hackathon photos, and many more, can be found here–licensed CC-BY-NC)

by Karen at October 30, 2013 03:52 AM

October 29, 2013

MIMS 2012

Hackathons and Minimum Viable Prototypes

There’s been some prominent blog posts recently questioning the usefulness of hackathon events. Some focus on the cultural issues associated with many hackathons–that by default they appeal to a very homogeneous subset of tech workers (aka young white male coders who enjoy subsisting on beer and pizza). This can be mitigated by thoughtful event organizers–advertising your event in places where a diverse crowd will see it, explicitly inviting beginning developers and non-developers (designers, product managers, community members), having a code of conduct, providing child care, serving real food, etc. I attended LinkedIn’s DevelopHer hackathon last weekend, which was 100% female; they got these and many other things right, and I had a fantastic time!

A deeper criticism of hackathons is that, although they may be fun, since nearly all hackathon projects are abandoned after the event is over, they’re no good for creating startups, real useful products, or social change. All they might be good for is networking. Thus, these events are oversold. I think there is a point here, but I don’t think you can conclude from it that hackathons aren’t worthwhile to run or attend. Rather, attendees and observers should modify their expectations.

First, a pet peeve. During the presentations at hackathons I’ve been to, I often see groups presenting their work as if they’re pitching a real live product in front of investors or potential users. It’s hard to pinpoint what exactly bothers me about this convention–the overly-polished marketing speak, the inauthenticity of talking about a userbase that doesn’t [yet] exist… But the overall effect is that what the presenter is saying oversells what they actually built, so by the time you get to the demo it’s usually at least a little disappointing. I get that pitching is a valuable skill, and perhaps pretending that your hackathon project is real is how people practice this. Depending on the hackathon, the judges may participate in that illusion. But they usually don’t buy in as much as you’d like, and I, as an audience member, don’t at all.

Your hackathon project is not a product, so don’t talk about it as if it is. People don’t build products at hackathons. You can’t build a product when you’re shut in a room for 24 (or 48) hours, no matter how many “10X” (ugh) coders are on your team. Why? It’s not a product until after you get it in front of users, or at least stakeholders, and at a hackathon there just isn’t time for that.

Most hackathon participants understand this, partially. Good teams know that starting small, and building from there, is the way to go; inexperienced teams often fail to let go of their grand vision in the face of the time limit. Most teams have heard of the concept of “minimum viable product”, and as they make plans at the start of the hackathon they (hopefully!) brainstorm, prioritize, and cut features down to the minimum they see necessary for a product.

But remember what I said: you can’t make a freaking product–even a “minimum viable” one–in 24 hours!

The “minimum viable product” verbiage comes from the book The Lean Startup, which is very clueful in its advice (though it bears balancing with this article). I wish the book had popularized a more precise term for this concept, though. At a hackathon, you’re not building a “minimum viable product”, because it’s not a product (yet). You’re building a minimum viable prototype.

A concept that is at the root of any hackathon idea–besides projects that are purely for practice in some tech stack–is an assumption: that the world would be better if X existed, or that some population of people want to know Y or to be able to do Z. Even projects that scratch one’s own itch proceed under the assumption that what you intend to build will, in fact, solve your problem. Eric Ries would call the implicit assumption within the “why” of your project idea a “hypothesis”, which after being tested becomes “validated learning”–and argues that every organization that wants to grow and innovate ought to be testing new hypotheses all the time. Peter Thiel calls these assumptions, if they turn out to be correct, “secrets”–and spoke at length about how startups are an institution for discovering secrets and that every successful startup has at least one secret behind it.

What a hackathon is good for is to distill the core “why” of your idea–the hypothesis, or hypotheses, that make you believe your idea is useful–and build a prototype that will best tell you if your hypotheses are true. Science!

What that means: you probably don’t need social network integration–or any user accounts at all–unless your hypothesis specifically requires potential users to view information based on their own profile or network. You might not need a real database–just some sample fake data that refreshes itself every so often, or even on every reload. Email notifications? Edit or delete buttons? Logout functionality? Cross-browser compatibility? Search? “Like this on Facebook”? Unless there’s a good reason why it’s needed to test your hypothesis, cut it.

(If you get to hour 16 and you’ve got your core prototype nice and polished, then’s the time to start adding things like delete buttons and actual user auth. But you should start from a place where that which can be faked, should be!)
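Here’s a minimal sketch of what “fake data instead of a real database” can look like in practice (all names here are hypothetical; JavaScript since that’s the usual hackathon front-end language):

```javascript
// Hypothetical fake-data layer for a hackathon prototype:
// no database, just sample records regenerated on each page load.
const SAMPLE_NAMES = ["Ada", "Grace", "Alan", "Edsger"];

function fakeComment(id) {
  return {
    id: id,
    author: SAMPLE_NAMES[id % SAMPLE_NAMES.length],
    text: "Sample annotation #" + id,
    createdAt: new Date().toISOString(),
  };
}

// Regenerate the whole "database" on every reload -- nothing persists,
// which is fine, because persistence isn't the hypothesis being tested.
function loadFakeComments(count) {
  const comments = [];
  for (let i = 0; i < count; i++) {
    comments.push(fakeComment(i));
  }
  return comments;
}
```

A dozen lines like these stand in for user accounts, migrations, and an ORM, and free the remaining hours for the part of the prototype that actually tests your hypothesis.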

Then, when you present your work, it will be easy to focus on demoing what’s cool about your app and the reasons why you built what you built. Instead of pretending you’re a Real Startup pitching to VCs, you can be honest about your motivations and what you did and didn’t build; your insight, and the quality of what you built to test that insight, should be impressive enough. You can also avoid getting bogged down in extraneous “look how many different auth systems we support!”-type features or the details of your tech stack. You’ve only got a minute or two and the audience mostly doesn’t care about that stuff unless you used something really unusual (“we wrote the frontend using C++ and asm.js, because we’re insane!”).

The other common hackathon pitfall I see is how teams answer the judges’ perennial question, “What are the next steps for your project?” Nearly every team I see answers with a list of technical features that they didn’t get to during the hackathon. This is not only probably-dishonest (odds are, you’re never gonna touch this code again), it’s generally the wrong answer. The right answer will have something to do with putting your thing in front of people–publicizing your project and seeing if people sign up for the mailing list, posting the code with some sort of free software license and seeing if people write patches or feature requests, sitting down users in front of your prototype and watching what they do with it and listening to what they say. No battle plan survives contact with the enemy, and whatever the heck you scrawled on a whiteboard 24 sleep-deprived hours ago will most likely only bear slight resemblance to what actual user feedback tells you your project needs.

You built this prototype*: time to use it to test your hypothesis! Some of that test is your presentation itself–if the judges and audience respond well (and if they’re representative of your target audience) that might be a sign you’re on the right track. If your hackathon project is something you care about, you’ll also put it in front of people in other venues to learn enough to know whether or not it’s an idea worth moving forward with.

That’s what hackathons are good for–besides meeting tech people, practicing skills, consuming inordinate amounts of caffeine, or “generating buzz” (whatever), hackathons are pretty good for generating prototypes. Think in those terms, and you are likely to be more satisfied with your results!

* (on rock ‘n roll)

by Karen at October 29, 2013 03:57 AM

October 27, 2013

Ph.D. student

notes on innovation in journalism

I’ve spent the better part of the past week thinking hard about journalism. This is due largely to two projects: further investigation into Weird Twitter, and consulting work I’ve been doing with the Center for Investigative Reporting. Journalism, the trope goes, is a presently disrupted industry, and it’s fair to say it’s a growing research interest for me. So here’s the rundown on where things seem to be at.

Probably the most rewarding thing to come out of the fundamentally pointless task of studying Weird Twitter, besides hilarity, is getting a better sense of the digital journalism community. I’ve owed Ethnography Matters a part 2 for a while, and it seems like the meatiest bone to pick is still on the subject of attention economy. The @horse_ebooks/Buzzfeed connection drives that nail in deeper.

I find content farming pretty depressing and only got more depressed reading Dylan Love’s review of MobileWorks that he crowdsourced to crowdworkers using MobileWorks. I mean, can you think of a more dystopian world than one in which the press is dominated by mercenary crowdworkers pulling together plausible-sounding articles out of nowhere for the highest bidder?

I was feeling like the world was going to hell until somebody told me about Oximity, which is a citizen journalist platform, as opposed to a viral advertising platform. Naturally, this has a different flavor to it, though it is less monetized/usable/populated. Hmm.

I spend too much time on the Internet. That was obvious when attending CIR’s Dissection:Impact events on Wednesday and Thursday. CIR is a foundation-funded non-profit that actually goes and investigates things like prisons, migrant farm workers, and rehab clinics. The people there really turned my view of things around, as I realized that there are still people out there dedicated to using journalism to do good in the world.

There were three interesting presentations with divergent themes.

One was a presentation of ConText, a natural language and network processing toolkit for analyzing the discussion around media. It was led by Jana Diesner at the I School at Urbana-Champaign. Her dissertation work was on covert network analysis to detect white-collar criminals. They have a thoroughly researched impact model; the software is currently unusable by humans but combines best practices in text and network analysis. They intend to release it as an academic tool for researchers, open source.

Another was a presentation by Harmony Institute, which has high-profile clients like MTV. Their lead designer walked us through a series of compelling mockups of ImpactSpace, an impact analysis tool that shows the discussion around an issue as “constellations” through different “solar systems” of ideas. Their project promises to identify how one can frame a story to target swing viewers. But they were not specific about how they would get and process the data. They intend to make demos of their service available online, and market it as a product.

The third presentation was by CIR itself, which has hired a political science post-doc to come up with an analysis framework. They focused on a story, “Rape in the Fields”, about sexual abuse of migrant farm workers. These people tend not to be on Twitter, but the story was a huge success on Univision. Drawing mainly on qualitative data, it considers “micro”, “meso”, and “macro” impact. Micro interactions might be eager calls to the original journalist for more information, or powerful anecdotes of how somebody who had been hurt felt healed when they were able to tell their story to the world.

Each team has its disciplinary bias and its own strengths and weaknesses. But they are tackling the same problem: trying to evaluate the effectiveness of media. They know that data is powerful: CIR uses it all the time to find stories. They will sift through a large data set, look for anomalies, and then carefully investigate. But even when collaborative science, including “data science” components, is effectively used to do external-facing research, the story gets more difficult, intellectually and politically, when it turns that kind of thinking reflexively on itself.

I think this story sounds a lot like the story of what’s happening in Berkeley. A disrupted research organization struggles to understand its role in a changing world under pressure to adapt to data that seems both ubiquitous and impoverished.

Does this make you buy into the connection between universities and journalism?

If it does, then I can tell you another story about how software ties in. If not, then I’ve got deeper problems.

There is an operational tie: D-Lab and CIR have been in conversation about how to join forces. With the dissolution of disciplines, investigative reporting is looking more and more like social science. But it's the journalists who are masters of distribution and engagement. What can we learn about the impact of social science research from journalists? And how might the two be better operationally linked?

The New School sent some folks to the Dissection event to talk about the Open Journalism program they are starting soon.

I asked somebody at CIR what he thought about Buzzfeed. He explained that it’s the same business model as HuffPo–funding real journalism with the revenue from the crappy clickbait. I hope that’s true. I wonder if they would suffer as a business if they only put out clickbait. Is good journalism anything other than clickbait for the narrow segment of the population that has expensive taste in news?

The most interesting conversation I had was with Mike Corey at CIR, who explained that there are always lots of great stories, but that the problem is that newspapers don't have space to run them all; they are an information bottleneck. I found this striking because I don't get my media from newspapers any more, and it revealed that the shifting of the journalism ecosystem is still underway. Thinking this through…

In the old model, a newspaper (or radio show, or TV show) had a limited budget to distribute information, and so competed for prestige with creativity and curatorial prowess. Naturally they targeted different audiences, but there was more at stake in deciding what to and what not to report. (The unintentional past tense here just goes to show where I am in time, I guess.)

With web publishing, everybody can blog or tweet. What's newsworthy is what gets sifted through and picked up. Moreover, this sifting can be done experimentally on a larger scale than…ah, interesting. Ok, so individual reporters wind up building a social media presence that is effectively a mini-newspaper and…oh dear.

One of the interesting phrases that came out of the discussion at the Dissection event was “self-commodification”–the tendency of journalists to need to brand themselves as products, artists, performers. Watching journalists on Twitter is striking partly because of how these constraints affect their behavior.

Putting it another way: what if newspapers had unlimited paper on which to print things? How would they decide to sort and distribute information? This is effectively what Gawker, Buzzfeed, Techcrunch, and all the rest of the web press are up to. Hell, it's what the Wall Street Journal is up to, as older, more prestigious brands are pressured to compete. This causes the much-lamented decline in the quality of journalism.

Ok, ok, so what does any of this mean? For society, for business? What is the equilibrium state?

by Sebastian Benthall at October 27, 2013 08:05 PM

October 25, 2013

Ph.D. student

because you can is reason enough to do something

I feel that our noise list has become too predictable and not nearly noisy enough. To address that, below I've included an attempt at an ethnographic and interpretive review of a short talk from a well-known security expert at an SF hacker space. —npd

Seeing Jacob Appelbaum speak at Noisebridge is something of an event, and I think it's perhaps uniquely appreciated as I saw it, with the company of an entirely non-computer-geek friend.

"5 Minutes of Fame" is a version of "lightning" or "ignite" talks, held about one evening a month at Noisebridge, a hacker collective space in the Mission District in San Francisco. (Both of those concepts bear a little explanation.) Lightning talks are kind of the opposite of lectures — they are dramatically short, generally five minutes. Different emcees will manage them differently, but in many cases you revel in the short, tight timeframe: slides might be set to auto-advance, speakers may be forced from the stage at the end of their time limit and so they rush to get everything in, and the Q&A period common to academic conferences is generally ignored; you can talk over beer later. (Noisebridge is pretty casual about theirs; scheduled for 8pm to 10pm, I left in the middle in order to catch the midnight BART train.)

And what is a hacker space? Noisebridge is one of the most well-known of these; it's generally a warehouse space with tools and supplies for mechanical and electrical tinkering. Noisebridge sports classes for knitting and for soldering and has a pretty strong software bent as well, including running Tor nodes (software and infrastructure for Internet anonymity) off of their servers and Internet connection. Members pay some regular dues, can come and go as they please, use the tools and the space for storing stuff and building things. As you might imagine, Noisebridge (and I imagine this is true of hacker spaces in general) is really more about the community of people than it is about the shared physical resources; this is less a tool-lending library and more a makers' clubhouse.

And so that's where M. and I see Jacob Appelbaum talk. Except, he's not on the agenda (a few of these talks are planned well in advance, and others are practically spontaneous — it doesn't, after all, require all that much preparation to give a five minute talk; though, that's not to minimize the act itself, since talking for exactly five minutes to a group of your peers can be a nerve-wracking experience), just listed as "Person of Mystery" whom the emcee doesn't even know. But then he appears from the back of the room and there's a kind of roar of appreciation from the crowd, the kind of audience reaction I might imagine for a rock star, except the audience is 50 or 60 nerds of various types and ages. He is introduced initially just as Jake (I recognize him from one or two previous events we've attended together) and then formally as Jacob Appelbaum, though at that point it's clearly unnecessary for the audience, with no background or affiliation given; Jake was a founding member of Noisebridge, back in the day. I try to whisper to M. to explain what's going on before he dives immediately into his talk about this official trip he's just gotten back from to Burma and all of his illegal antics while there; I say, "he's, like, a security guy who works for Wikileaks" and recognize that this is inadequate and kind of incorrect, but what else can I say?

Noisebridge, like many of these informal collectives, is architected to be open and welcoming. Once we navigate the old fashioned elevator (the kind with two gates that you have to open and close, the kind that I've only ever seen in movies and this is the first time that I've actually had to operate one), there is nothing at all holding us back from entering and we are only most casually asked for cash donations during the evening itself. People are loud and friendly and, while many of us are awkward, all are pretty willing to talk to strangers. But at the same time, it can be incredibly insular, with acronyms, handles and inside jokes thrown around without the slightest pause. I give up on the futility of whispering explanations to M. and many of the scifi jokes fly over my head too. (M. notes later how unnerving it is to be invited by strangers to join in on conversations when participating inevitably reveals your lack of the shared expertise.)

That level of mystery around his introduction is clearly indulged by Jake and enthralling to the Noisebridge crowd. "Oh, that's not supposed to be in there," he says, unconvincingly, about the very first photo in his set, which just happens to be a photo of the Dalai Lama from this official visit he's just come back from. And "indulgent" really seems like the right word at times: Jake enjoys cursing and insulting prominent figures and entire governments, and gets great appreciation from the audience whenever he cheekily disavows having conducted some illicit probe of a technical system he encountered. He does pause a bit on realizing that a young girl is among the audience (Noisebridge really does bring people of all ages and all types, a more diverse crowd than I'm accustomed to among technical groups) but ultimately doesn't self-censor much. She ends up asking some of the best questions after his talk and we are all simultaneously impressed by and proud of her.

Appelbaum's five minutes of fame go on for maybe 45 minutes, with no pretense of anyone trying to cut him off. The topic itself is fascinating: a report from Burma on the steps necessary for obtaining a cellphone and using one to access the Internet; results from attempts to penetrate Internet censorship and other insecure technical systems within the country; recounting conversations with activists (political or technical) about how they handle living under such a regime. His talk is half "look at the cool stuff I did" and half "isn't this unacceptable and horrifying"; Jake may be irreverent, but he's also deadly serious about human rights abuses and the murders and imprisonments connected to government surveillance. He ends by saying he has to leave straight for the airport to fly to another state ("fly to another state" he repeats again — are we meant to be impressed or to wonder which state it might be?).

M. asks me later what GSM is (just one piece of the alphabet soup), which has been the main topic of his talk, and I struggle to explain the technologies of our and other cellphones, a technical area that I find I really don't know much about. But the history of phones in this sort of technical community, what I would now call the Internet community, is so canonically well-known that I can recite stories about Captain Crunch and blue boxes just like anyone else can. And the way that @ioerror talks about Burmese cellphone restrictions, freedom of access to information and the conditions for hacking are really quite reminiscent of that irreverent mode of making free long-distance telephone calls to foreign countries, even when at many times it's unclear why they wanted to call Vatican City anyway.

I find myself listening to RadioLab a couple of days later, in my new mostly empty apartment without any Internet access (it gets to me; I honestly feel cut off from the world), to the story of a blind kid in Southern Florida who found he could whistle phone calls.* Was this the youth of Captain Crunch? No, I find, this is Joybubbles**, a part of this story that I haven't heard. Apparently news coverage of Joe's getting caught phreaking at college is what reveals to phone phreakers around the country that they're not alone in doing this, and then whole communities spring up that find they can call in to certain broken telephone numbers that act as a kind of party line where they can talk to one another for free — an early chatroom/message board, I suppose.

I actually went to Noisebridge that night primarily because of a talk that was pre-listed, one from Schuyler Erle, a cartographer I know through some geo folks. (Neogeographers are that certain subset of the Internet community that loves doing cool stuff with maps and the latest, often informal, Web technologies. I would like to count myself among them, at least to the extent that I've been going to WhereCamp, the geo unconference, for 4 years straight.) Schuyler doesn't actually talk about cartography at all really, but about science fiction, about strategy board games and about space travel. This talk (I won't try to repeat it; hey, the slides and notes — and this talk is more immaculately prepared than most — are online***) goes into some detail about the challenges of future space travel (special relativity, cryogenesis, etc.) and colonization and war. And though it's late in the evening, this is the one moment of the night that surpasses even the enthusiasm of Jake Appelbaum giving a surprise talk at Noisebridge — Schuyler rhetorically asks, why, given all the challenges and hardships involved in near-speed-of-light space travel, would we ever bother to colonize and conquer the stars? Because they're there! Because we can!

And that moment of united enthusiasm (which, I imagine, includes M. and myself and all the Noisebridge regulars) explains, I think, a lot of what a hacker space is, what hacking is, why Jake Appelbaum is so revered here. Whether it's building your own X-ray machine, devising a one-meal-a-day diet, making free phone calls to a foreign embassy or circumventing Internet censorship in Burma — *because you can* is reason enough to do something.

* The last act of the episode "Escape!", February 20, 2012.

** Perhaps because of his own troubled childhood, Joe eventually changes his name to Joybubbles, declares himself to be five years old again, and dedicates himself to "Stories and Stuff", a call-in service for kids to call and listen to. Children and phones! Another sort of connection to Steve Wozniak: when I saw him speak at Microsoft he was clearly most fascinated by the call-in joke line, and by having a cellphone number that was all repeated digits so that kids would accidentally call him when they first played with a phone. How can there be so many of these internal connections?

*** Ha, and what a domain name, as I've just noticed. Having your Internet address be "Iconoclast" is an excellent micro-explanation of the ethic of Noisebridge.

by at October 25, 2013 09:02 PM

October 24, 2013

MIMS 2012

EdTech LumaScape

There is a common belief that the education industry will be disrupted very soon.


A flurry of EdTech start-ups is emerging. In fact, there are so many that it is difficult to understand the industry as a whole. Inspired by the LumaScape, I created the following Education LumaScape. It is by no means exhaustive or always up-to-date, but I welcome feedback for making it better :) .

by admin at October 24, 2013 06:13 PM