School of Information Blogs

January 17, 2017

Ph.D. student

economy of control

We call it a “crisis” when the predictions of our trusted elites are violated in one way or another. We expect, for good reason, things to more or less continue as they are. They’ve evolved to be this way, haven’t they? The older the institution, the more robust to change it must be.

I’ve gotten comfortable in my short life with the global institutions that appeared to be the apex of societal organization. Under these conditions, I found James Beniger’s work to be particularly appealing, as it predicts the growth of information processing apparatuses (some combination of information worker and information technology) as formerly independent components of society integrate. I’m of the class of people that benefits from this kind of centralization of control, so I was happy to believe that this was an inevitable outcome according to physical law.

Now I’m not so sure.

I am not sure I’ve really changed my mind fundamentally. This extreme Beniger view is too much like Nick Bostrom’s superintelligence argument in form, and I’ve already thought hard about why that argument is not good. That reasoning stopped at the point of noting how superintelligence “takeoff” is limited by data collection. But I did not go to the next and probably more important step, which is the problem of aleatoric uncertainty in a world with multiple agents. We’re far more likely to get into a situation with multi-polar large intelligences that are themselves fraught with principal-agent problems, because that’s actually the status quo.

I’ve been prodded to revisit The Black Box Society, which I’ve dealt with inadequately. Its beefier chapters deal with a lot of the specific economic and regulatory recent history of the information economy of the United States, which is a good complement to Beniger and a good resource for the study of competing intelligences within a single economy, though I find this data a bit clouded by the polemical writing.

“Economy” is the key word here. Pure, Arendtian politics and technics have not blended easily, but what they’ve turned into is a self-regulatory system with structure and agency. More than that, the structure is for sale, and so is the agency. What is interesting about the information economy (and I guess I’m trying to coin a phrase here) is that it is an economy of control. The “good” being produced, sold, and bought, is control.

There’s a lot of interesting research about information goods. But I’ve never heard of a “control good.” Yet this is what we are talking about when we talk about software, data collection, managerial labor, and the conflicts and compromises they create.

I have a few intuitions about where this goes, but not as many as I’d like. I think this is because the economy of control is quite messy and hard to reason about.

by Sebastian Benthall at January 17, 2017 12:10 AM

January 13, 2017

Ph.D. student

habitus and citizenship

Just a quick thought… So in Bourdieu’s Science of Science and Reflexivity, he describes the habitus of the scientist. Being a scientist demands a certain adherence to the rules of the scientific game, certain training, etc. He winds up constructing a sociological explanation for the epistemic authority of science. The rules of the game are the conditions for objectivity.

When I was working on a now defunct dissertation, I was comparing this formulation of science with a formulation of democracy and the way it depends on publics. Habermasian publics, Fraserian publics, you get the idea. Within this theory, what was once a robust theory of collective rationality as the basis for democracy has deteriorated under what might be broadly construed as “postmodern” critiques of this rationality. One could argue that pluralistic multiculturalism, not collective reason, became the primary ideology for American democracy in the past eight years.

Pretty sure this backfired with e.g. the Alt-Right.

So what now? I propose that those interested in functioning democracy reconsider the habitus of citizenship and how it can be maintained through the education system and other civic institutions. It’s a bit old-school. But if the Alt-Right wanted a reversion to historical authoritarian forms of Western governance, we may be getting there. Suppose history moves in a spiral. It might be best to try to move forward, not back.

by Sebastian Benthall at January 13, 2017 12:29 AM

January 10, 2017

Ph.D. student

Loving Tetlock’s Superforecasting: The Art and Science of Prediction

I was a big fan of Philip Tetlock’s Expert Political Judgment (EPJ). I read it thoroughly; in fact a book review of it was my first academic publication. It was very influential on me.

EPJ is a book that is troubling to many political experts because it basically says that most so-called political expertise is bogus, and that what isn’t bogus is fairly limited. It makes this argument with far more meticulous data collection and argumentation than I can do justice to here. I found it completely persuasive and inspiring. It wasn’t until I got to Berkeley that I met people who had vivid negative emotional reactions to this work. They seem mainly to have been political experts who did not want their expertise assessed in terms of its predictive power.

Superforecasting: The Art and Science of Prediction (2016) is a much more accessible book that summarizes the main points from EPJ and then discusses the results of Tetlock’s Good Judgment Project, which was his answer to an IARPA challenge in forecasting political events.

Much of the book is an interesting history of the United States Intelligence Community (IC) and the way its attitudes towards political forecasting have evolved. In particular, the shock of the failure of the predictions around Weapons of Mass Destruction that led to the Iraq War was a direct cause of IARPA’s interest in forecasting and its funding of the Good Judgment Project, despite the possibility that the project’s results would be politically challenging. IARPA comes out looking like a very interesting and intellectually honest organization solving real problems for the people of the United States.

Reading this has been timely for me because: (a) I’m now doing what could be broadly construed as “cybersecurity” work, professionally, (b) my funding is coming from U.S. military and intelligence organizations, and (c) the relationship between U.S. intelligence organizations and cybersecurity has been in the news a lot lately in a very politicized way because of the DNC hacking aftermath.

Since so much of Tetlock’s work is really just about applying mathematical statistics to the psychological and sociological problem of developing teams of forecasters, I see the root of it as the same mathematical theory one would use for any scientific inference. Cybersecurity research, to the extent that it uses sound scientific principles (which it must, since it’s all about the interaction between society, scientifically designed technology, and risk), is grounded in these same principles. And at its best the U.S. intelligence community lives up to this logic in its public service.
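The statistics here are not exotic. Tetlock grades forecasters with the Brier score, the mean squared error between forecast probabilities and what actually happened. A minimal sketch (binary-event version; the function name and the sample numbers are mine, for illustration only):

```python
# Minimal sketch of the Brier score, the metric used in Tetlock's
# Good Judgment Project to grade probabilistic forecasts.
# A forecast is a probability p in [0, 1]; the outcome is 1 or 0.

def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and outcomes.
    0.0 is a perfect record; constant 50/50 hedging earns 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A confident, well-calibrated forecaster beats a hedger:
confident = brier_score([0.9, 0.8, 0.1], [1, 1, 0])  # 0.02
hedger = brier_score([0.5, 0.5, 0.5], [1, 1, 0])     # 0.25
print(confident, hedger)
```

The scoring rule is "proper": a forecaster minimizes their expected score by reporting their honest probability, which is what makes it usable for assessing expertise rather than rhetoric.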

The needs of the intelligence community with respect to cybersecurity can be summed up in one word: rationality. Tetlock’s work is a wonderful empirical study in rationality that’s a must-read for anybody interested in cybersecurity policy today.

by Sebastian Benthall at January 10, 2017 10:54 PM

Ph.D. alumna

Why America is Self-Segregating

The United States has always been a diverse but segregated country. This has shaped American politics profoundly. Yet, throughout history, Americans have had to grapple with divergent views and opinions, political ideologies, and experiences in order to function as a country. Many of the institutions that underpin American democracy force people in the United States to encounter difference. This does not inherently produce tolerance or result in healthy resolution. Hell, the history of the United States is fraught with countless examples of people enslaving and oppressing other people on the basis of difference. This isn’t about our past; this is about our present. And today’s battles over laws and culture are nothing new.

Ironically, in a world in which we have countless tools to connect, we are also watching fragmentation, polarization, and de-diversification happen en masse. The American public is self-segregating, and this is tearing at the social fabric of the country.

Many in the tech world imagined that the Internet would connect people in unprecedented ways, allow for divisions to be bridged and wounds to heal. It was the kumbaya dream. Today, those same dreamers find it quite unsettling to watch as the tools that were designed to bring people together are used by people to magnify divisions and undermine social solidarity. These tools were built in a bubble, and that bubble has burst.

Nowhere is this more acute than with Facebook. Naive as hell, Mark Zuckerberg dreamed he could build the tools that would connect people at unprecedented scale, both domestically and internationally. I actually feel bad for him as he clings to that hope while facing increasing attacks from people around the world about the role that Facebook is playing in magnifying social divisions. Although critics love to paint him as only motivated by money, he genuinely wants to make the world a better place and sees Facebook as a tool to connect people, not empower them to self-segregate.

The problem is not simply the “filter bubble,” Eli Pariser’s notion that personalization-driven algorithmic systems help silo people into segregated content streams. Facebook’s claim that content personalization plays a small role in shaping what people see compared to their own choices is accurate. And they have every right to be annoyed. I couldn’t imagine TimeWarner being blamed for who watches Duck Dynasty vs. Modern Family. And yet, what Facebook does do is mirror and magnify a trend that’s been unfolding in the United States for the last twenty years, a trend of self-segregation that is enabled by technology in all sorts of complicated ways.

The United States can only function as a healthy democracy if we find a healthy way to diversify our social connections, if we find a way to weave together a strong social fabric that bridges ties across difference.

Yet, we are moving in the opposite direction with serious consequences. To understand this, let’s talk about two contemporary trend lines and then think about the implications going forward.

Privatizing the Military

The voluntary US military is, in many ways, a social engineering project. The public understands the military as a service organization, dedicated to protecting the country’s interests. Yet, when recruits sign up, they are promised training and job opportunities. Individual motivations vary tremendously, but many are enticed by the opportunity to travel the world, participate in a cause with a purpose, and get the heck out of Dodge. Everyone expects basic training to be physically hard, but few recognize that some of the most grueling aspects of signing up have to do with the diversification project that is central to the formation of the American military.

When a soldier is in combat, she must trust her fellow soldiers with her life. And she must be willing to do what it takes to protect the rest of her unit. In order to make that possible, the military must wage war on prejudice. This is not an easy task. Plenty of generals fought hard against racial desegregation and sought to limit the role of women in combat. Yet, the US military was desegregated in 1948, six years before Brown v. Board forced desegregation of schools. And the Supreme Court ruled that LGB individuals could openly serve in the military before they could legally marry.

CC BY 2.0-licensed photo by The U.S. Army.

Morale is often raised as the main reason that soldiers should not be forced to entrust their lives to people who are different than them. Yet, time and again, this justification collapses under broader interests to grow the military. As a result, commanders are forced to find ways to build up morale across difference, to actively and intentionally seek to break down barriers to teamwork, and to find a way to gel a group of people whose demographics, values, politics, and ideologies are as varied as the country’s.

In the process, they build one of the most crucial social infrastructures of the country. They build the diverse social fabric that underpins democracy.

Tons of money was poured into defense after 9/11, but the number of people serving in the US military today is far lower than it was throughout the 1980s. Why? Starting in the 1990s and accelerating after 9/11, the US privatized huge chunks of the military. This means that private contractors and their employees play critical roles in everything from providing food services to equipment maintenance to military housing. The impact of this on the role of the military in society is significant. For example, it undermines recruits’ ability to get training to develop critical skills that will be essential for them in civilian life. Instead, while serving on active duty, they spend a much greater share of their time on the front lines and in high-risk battle, increasing the likelihood that they will be physically or psychologically harmed. The impact on skills development and job opportunities is tremendous, but so is the impact on the diversification of the social fabric.

Private vendors are not engaged in the same social engineering project as the military and, as a result, tend to hire and fire people based on their ability to work effectively as a team. Like many companies, they have little incentive to invest in helping diverse teams learn to work together as effectively as possible. Building diverse teams — especially ones in which members depend on each other for their survival — is extremely hard, time-consuming, and emotionally exhausting. As a result, private companies focus on “culture fit,” emphasize teams that get along, and look for people who already have the necessary skills, all of which helps reinforce existing segregation patterns.

The end result is that, in the last 20 years, we’ve watched one of our major structures for diversification collapse without anyone taking notice. And because of how it’s happened, it’s also connected to job opportunities and economic opportunity for many working- and middle-class individuals, seeding resentment and hatred.

A Self-Segregated College Life

If you ask a college admissions officer at an elite institution to describe how they build a class of incoming freshmen, you will quickly realize that the American college system is a diversification project. Unlike colleges in most parts of the world, the vast majority of freshmen at top tier universities in the United States live on campus with roommates who are assigned to them. Colleges approach housing assignments as an opportunity to pair diverse strangers with one another to build social ties. This makes sense given how many friendships emerge out of freshman dorms. By pairing middle class kids with students from wealthier families, elite institutions help diversify the elites of the future.

This diversification project produces a tremendous amount of conflict. Although plenty of people adore their college roommates and relish the opportunity to get to know people from different walks of life as part of their college experience, there is an amazing amount of angst about dorm assignments and the troubles that brew once folks try to live together in close quarters. At many universities, residential life is often in the business of student therapy as students complain about their roommates and dormmates. Yet, just like in the military, learning how to negotiate conflict and diversity in close quarters can be tremendously effective in sewing the social fabric.

CC BY-NC-ND 2.0-licensed photo by Ilya Khurosvili.

In the spring of 2006, I was doing fieldwork with teenagers at a time when they had just received acceptances to college. I giggled at how many of them immediately wrote to the college in which they intended to enroll, begging for a campus email address so that they could join that school’s Facebook (before Facebook was broadly available). The year before, I had watched the prior class look up roommate assignments on MySpace, so I was prepared for the fact that they’d use Facebook to do the same. What I wasn’t prepared for was how quickly they would all get on Facebook, map the incoming freshman class, and use this information to ask for a roommate switch. Before they even arrived on campus in August/September of 2006, they had self-segregated as much as possible.

A few years later, I watched another trend hit: cell phones. While these were touted as tools that allowed students to stay connected to parents (which prompted many faculty to complain about “helicopter parents” arriving on campus), they really ended up serving as a crutch to address homesickness, as incoming students focused on maintaining ties to high school friends rather than building new relationships.

Students go to elite universities to “get an education.” Few realize that the real product that elite colleges in the US have historically offered is social network diversification. Even when it comes to job acquisition, sociologists have long known that diverse social networks (“weak ties”) are what increase job prospects. By self-segregating on campus, students undermine their own potential while also helping fragment the diversity of the broader social fabric.

Diversity is Hard

Diversity is often touted as highly desirable. Indeed, in professional contexts, we know that more diverse teams often outperform homogeneous teams. Diversity also increases cognitive development, both intellectually and socially. And yet, actually encountering and working through diverse viewpoints, experiences, and perspectives is hard work. It’s uncomfortable. It’s emotionally exhausting. It can be downright frustrating.

Thus, given the opportunity, people typically revert to situations where they can be in homogeneous environments. They look for “safe spaces” and “culture fit.” And systems that are “personalized” are highly desirable. Most people aren’t looking to self-segregate, but they do it anyway. And, increasingly, the technologies and tools around us allow us to self-segregate with ease. Is your uncle annoying you with his political rants? Mute him. Tired of getting ads for irrelevant products? Reveal your preferences. Want your search engine to remember the things that matter to you? Let it capture data. Want to watch a TV show that appeals to your senses? Here are some recommendations.

Any company whose business model is based on advertising revenue and attention is incentivized to engage you by giving you what you want. And what you want in theory is different than what you want in practice.

Consider, for example, what Netflix encountered when it launched its streaming service. Users didn’t watch the movies that they had placed into their queue. Those movies were the movies they thought they wanted, movies that reflected their ideal self — 12 Years a Slave, for example. What they watched when they could stream whatever they were in the mood for at that moment was the equivalent of junk food — reruns of Friends, for example. (This completely undid Netflix’s recommendation infrastructure, which had been trained on people’s idealistic self-images.)

The divisions are not just happening through commercialism though. School choice has led people to self-segregate from childhood on up. The structures of American work life mean that fewer people work alongside others from different socioeconomic backgrounds. Our contemporary culture of retail and service labor means that there’s a huge cultural gap between workers and customers with little opportunity to truly get to know one another. Even many religious institutions are increasingly fragmented such that people have fewer interactions across diverse lines. (Just think about how there are now “family services” and “traditional services” which age-segregate.) In so many parts of public, civic, and professional life, we are self-segregating and the opportunities for doing so are increasing every day.

By and large, the American public wants to have strong connections across divisions. They see the value politically and socially. But they’re not going to work for it. And given the option, they’re going to renew their license remotely, try to get out of jury duty, and use available data to seek out housing and schools that are filled with people like them. This is the conundrum we now face.

Many pundits remarked that, during the 2016 election season, very few Americans were regularly exposed to people whose political ideology conflicted with their own. This is true. But it cannot be fixed by Facebook or news media. Exposing people to content that challenges their perspective doesn’t actually make them more empathetic to those values and perspectives. To the contrary, it polarizes them. What makes people willing to hear difference is knowing and trusting people whose worldview differs from their own. Exposure to content cannot make up for self-segregation.

If we want to develop a healthy democracy, we need a diverse and highly connected social fabric. This requires creating contexts in which the American public voluntarily struggles with the challenges of diversity to build bonds that will last a lifetime. We have been systematically undoing this, and the public has used new technological advances to make their lives easier by self-segregating. This has increased polarization, and we’re going to pay a heavy price for this going forward. Rather than focusing on what media enterprises can and should do, we need to focus instead on building new infrastructures for connection where people have a purpose for coming together across divisions. We need that social infrastructure just as much as we need bridges and roads.

This piece was originally published as part of a series on media, accountability, and the public sphere.

by zephoria at January 10, 2017 01:15 PM

January 09, 2017

MIMS 2018

Trump and the Strategy of Irrationality

I wrote this piece in November 2016 and sat on it for a while, unsure whether or not I wanted to publish it. Since then, the Washington Post and the Boston Globe have had great pieces making similar points to the one I made here: that Donald Trump’s unpredictability may, in certain situations, give him leverage in negotiations. The world has changed a lot in these two short months but many points I make here still stand. So please, enjoy.

Source: Wikimedia Commons

Donald Trump is not just the most controversial President-Elect in recent American history — he is also the most unpredictable. His lack of political experience, inconsistent views, and tendency towards outbursts leave even his most ardent supporters unsure of what a President Trump might do in a given situation. Yet counterintuitively, his unpredictability may help him in the international arena.

The reason is a basic tenet of game theory. In a conflict, a person’s bargaining power depends on their perceived willingness to go through with a threat, even at a cost to themselves. If an opponent sees a threatener as irrational, they will also see them as more willing to go through with a costly threat, either because they do not know or do not care about the consequences. Thus, the opponent is more likely to yield.
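The bargaining logic above can be made concrete with a toy expected-value calculation (the function name and all the payoff numbers below are invented for illustration; this is a sketch of the idea, not a model from Schelling):

```python
# Toy model of threat credibility. An opponent deciding whether to
# yield compares the known cost of yielding against the *expected*
# cost of resisting, which scales with the probability they assign
# to the threat actually being carried out.

def opponent_resists(cost_of_yielding, cost_if_threat_executed,
                     p_threat_carried_out):
    """Opponent resists only if resisting looks cheaper in expectation."""
    expected_cost_of_resisting = p_threat_carried_out * cost_if_threat_executed
    return expected_cost_of_resisting < cost_of_yielding

# Same threat, same stakes; only the perceived resolve differs.
print(opponent_resists(10, 100, 0.05))  # seen as rational: True (resist)
print(opponent_resists(10, 100, 0.30))  # seen as "mad": False (yield)
```

The threatener’s actual intentions never enter the calculation; only the opponent’s estimate of them does. That is why appearing irrational, rather than being irrational, is what carries the leverage.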

This is where the irrationality of Trump shines.

For example, he may have an advantage over traditional politicians in renegotiating foreign trade deals because he is viewed as unstable enough to scrap them, even if it would hurt the American economy. A politician who has shown more nuanced views of America’s trade relations and economic interests would not have this same leverage.

Thomas Schelling. Source: Harvard Gazette

This strategy of irrationality is not new. It was popularized in 1960 by the Nobel Prize-winning economist Thomas Schelling in his book The Strategy of Conflict. It was used in the Cold War by both American presidents and Soviet general secretaries. Even Voltaire said, “Be sure that your madness corresponds with the turn and temper of your age…and forget not to be excessively opinionated and obstinate.”

Of all the US presidents, Richard Nixon put the most faith in what he called the “madman strategy.” He tried to appear “mad” enough to use nuclear weapons in order to bring North Vietnam to the negotiation table. In a private conversation, Nixon told his Chief of Staff the following:

I want the North Vietnamese to believe I’ve reached the point where I might do anything to stop the war. We’ll just slip the word to them that “for God’s sake, you know Nixon is obsessed about Communism. We can’t restrain him when he’s angry — and he has his hand on the nuclear button.”

After four years, Nixon’s “madman strategy” failed to end the war. He could only apply it intermittently; his “madness” for flying planes strapped with nuclear weapons over North Vietnam was tempered by his sanity in negotiations with Russia and China. Additionally, the repercussions of using nuclear weapons were so drastic that it was difficult to convince anyone he was willing to use them, especially after Russia achieved nuclear parity with the US.

President Richard Nixon. Source: Flickr

President Trump may have more success in applying the “madman strategy” because many people already see him as mad. Unlike Nixon, who tried to shift his perception from sane to insane, Trump has cultivated his unstable persona over almost a year and a half of campaigning and decades in the public eye. His perceived lack of knowledge regarding everything political may also cause opponents to see him as incapable of making rational decisions.

The strategy of irrationality is contingent on a number of assumptions. It assumes a somewhat rational opponent and a centralized decision making authority, neither of which apply to America’s most virulent enemy, ISIS. It also assumes a medium of communication to send threats over, which may be more difficult in dealings with countries with whom the US lacks diplomatic relations, like Iran and North Korea.

The utility of the strategy of irrationality is further complicated by the fact that most relationships the United States has with other countries are simultaneously oppositional and collaborative. For example, President Trump may consider France an opponent in environmental and NATO negotiations but an ally in trading. His perceived instability could give him leverage in negotiations but harm mutually beneficial relations with France.

The strategy also depends on whether President Trump is as unpredictable as candidate Trump. President-Elect Trump has already backed off from some of his more outlandish campaign trail promises. Global views of Trump are constantly shifting, especially as news comes out about his cabinet, and a method to his madness may become apparent as he makes more executive decisions.

The unpredictability of Donald Trump has brought about sleepless nights for many Americans. His perceived irrationality may damage allegiances within and without the country, but it may also give him leverage in future international conflicts. Donald Trump has always said he is a dealmaker and he might just be crazy enough to be right.

by Gabe Nicholas at January 09, 2017 04:50 PM

Ph.D. alumna

Did Media Literacy Backfire?

Anxious about the widespread consumption and spread of propaganda and fake news during this year’s election cycle, many progressives are calling for an increased commitment to media literacy programs. Others are clamoring for solutions that focus on expert fact-checking and labeling. Both of these approaches are likely to fail — not because they are bad ideas, but because they fail to take into consideration the cultural context of information consumption that we’ve created over the last thirty years. The problem on our hands is a lot bigger than most folks appreciate.

CC BY 2.0-licensed photo by CEA+ | Artist: Nam June Paik, “Electronic Superhighway. Continental US, Alaska & Hawaii” (1995).

What Are Your Sources?

I remember a casual conversation that I had with a teen girl in the midwest while I was doing research. I knew her school approached sex ed through an abstinence-only education approach, but I don’t remember how the topic of pregnancy came up. What I do remember is her telling me that she and her friends talked a lot about pregnancy and “diseases” she could get through sex. As I probed further, she matter-of-factly explained a variety of “facts” she had heard that were completely inaccurate. You couldn’t get pregnant until you were 16. AIDS spreads through kissing. Etc. I asked her if she’d talked to her doctor about any of this, and she looked at me as though I had horns. She explained that she and her friends had done the research themselves, by which she meant that they’d identified websites online that “proved” their beliefs.

For years, that casual conversation has stuck with me as one of the reasons that we needed better Internet-based media literacy. As I detailed in my book It’s Complicated: The Social Lives of Networked Teens, too many students I met were being told that Wikipedia was untrustworthy and were, instead, being encouraged to do research. As a result, the message that many had taken home was to turn to Google and use whatever came up first. They heard that Google was trustworthy and Wikipedia was not.

Understanding what sources to trust is a basic tenet of media literacy education. When educators encourage students to focus on sourcing quality information, they encourage them to critically ask who is publishing the content. Is the venue a respected outlet? What biases might the author have? The underlying assumption in all of this is that there’s universal agreement that major news outlets like the New York Times, scientific journal publications, and experts with advanced degrees are all highly trustworthy.

Think about how this might play out in communities where the “liberal media” is viewed with disdain as an untrustworthy source of information…or in those where science is seen as contradicting the knowledge of religious people…or where degrees are viewed as a weapon of the elite to justify oppression of working people. Needless to say, not everyone agrees on what makes a trusted source.

Students are also encouraged to reflect on economic and political incentives that might bias reporting. Follow the money, they are told. Now watch what happens when they are given a list of names of major power players in the East Coast news media whose names are all clearly Jewish. Welcome to an opening for anti-Semitic ideology.

Empowered Individuals…with Guns

We’ve been telling young people that they are the smartest snowflakes in the world. From the self-esteem movement in the 1980s to the normative logic of contemporary parenting, young people are told that they are lovable and capable and that they should trust their gut to make wise decisions. This sets them up for another great American ideal: personal responsibility.

In the United States, we believe that worthy people lift themselves up by their bootstraps. This is our idea of freedom. What it means in practice is that every individual is supposed to understand finance so well that they can effectively manage their own retirement funds. And every individual is expected to understand their health risks well enough to make their own decisions about insurance. To take away the power of individuals to control their own destiny is viewed as anti-American by so much of this country. You are your own master.

Children are indoctrinated into this cultural logic early, even as their parents restrict their mobility and limit their access to social situations. But when it comes to information, they are taught that they are the sole proprietors of knowledge. All they have to do is “do the research” for themselves and they will know better than anyone what is real.

Combine this with a deep distrust of media sources. If the media is reporting on something, and you don’t trust the media, then it is your responsibility to question their authority, to doubt the information you are being given. If they expend tremendous effort bringing on “experts” to argue that something is false, there must be something there to investigate.

Now think about what this means for #Pizzagate. Across this country, major news outlets went to great effort to challenge conspiracy reports that linked John Podesta and Hillary Clinton to a child trafficking ring supposedly run out of a pizza shop in Washington, DC. Most people never heard the conspiracy stories, but their ears perked up when the mainstream press went nuts trying to debunk these stories. For many people who distrust “liberal” media and were already primed not to trust Clinton, the abundant reporting suggested that there was something to investigate.

Most people who showed up to the Comet Ping Pong pizzeria to see for their own eyes went undetected. But then a guy with a gun decided he “wanted to do some good” and “rescue the children.” He was the first to admit that “the intel wasn’t 100%,” but what he was doing was something that we’ve taught people to do — question the information they’re receiving and find out the truth for themselves.

Experience Over Expertise

Many marginalized groups are justifiably angry about the ways in which their stories have been dismissed by mainstream media for decades. This is most acutely felt in communities of color. And this isn’t just about the past. It took five days for major news outlets to cover Ferguson. It took months and a lot of celebrities for journalists to start discussing the Dakota Pipeline. But feeling marginalized from news media isn’t just about people of color. For many Americans who have watched their local newspaper disappear, major urban news reporting appears disconnected from reality. The issues and topics that they feel affect their lives are often ignored.

For decades, civil rights leaders have been arguing for the importance of respecting experience over expertise, highlighting the need to hear the voices of people of color who are so often ignored by experts. This message has taken hold more broadly, particularly among lower and middle class whites who feel as though they are ignored by the establishment. Whites also want their experiences to be recognized, and they too have been pushing for the need to understand and respect the experiences of “the common man.” They see “liberal” “urban” “coastal” news outlets as antithetical to their interests because they quote from experts, use cleaned-up pundits to debate issues, and turn everyday people (e.g., “red sweater guy”) into spectacles for mass enjoyment.

Consider what’s happening in medicine. Many people used to have a family doctor whom they knew for decades and trusted as individuals even more than as experts. Today, many people see doctors as arrogant and condescending, overly expensive and inattentive to their needs. Doctors lack the time to spend more than a few minutes with patients, and many people doubt that the treatment they’re getting is in their best interest. People feel duped into paying obscene costs for procedures that they don’t understand. Many economists can’t understand why so many people would be against the Affordable Care Act because they don’t recognize that this “socialized” medicine is perceived as experts over experience by people who don’t trust politicians who tell them what’s in their best interest any more than they trust doctors. And public trust in doctors is declining sharply.

Why should we be surprised that most people are getting medical information from their personal social network and the Internet? It’s a lot cheaper than seeing a doctor, and both friends and strangers on the Internet are willing to listen, empathize, and compare notes. Why trust experts when you have at your fingertips a crowd of knowledgeable people who may have had the same experience as you and can help you out?

Consider this dynamic in light of discussions around autism and vaccinations. First, an expert-produced journal article was published linking autism to vaccinations. This resonated with many parents’ experience. Then, other experts debunked the first report, challenged the motivations of the researcher, and engaged in a mainstream media campaign to “prove” that there was no link. What unfolded felt like a war on experience, and a network of parents coordinated to counter this new batch of experts who were widely seen as ignorant, moneyed, and condescending. The more that the media focused on waving away these networks of parents through scientific language, the more the public felt sympathetic to the arguments being made by anti-vaxxers.

Keep in mind that anti-vaxxers aren’t arguing that vaccinations definitively cause autism. They are arguing that we don’t know. They are arguing that experts are forcing children to be vaccinated against their will, which sounds like oppression. What they want is choice — the choice to not vaccinate. And they want information about the risks of vaccination, which they feel are not being given to them. In essence, they are doing what we taught them to do: questioning information sources and raising doubts about the incentives of those who are pushing a single message. Doubt has become a tool.

Grappling with “Fake News”

Since the election, everyone has been obsessed with fake news, as experts blame “stupid” people for not understanding what is “real.” The solutionism around this has been condescending at best. More experts are needed to label fake content. More media literacy is needed to teach people how not to be duped. And if we just push Facebook to curb the spread of fake news, all will be solved.

I can’t help but laugh at the irony of folks screaming up and down about fake news and pointing to the story about how the Pope backs Trump. The reason so many progressives know this story is because it was spread wildly among liberal circles who were citing it as appalling and fake. From what I can gather, it seems as though liberals were far more likely to spread this story than conservatives. What more could you want if you ran a fake news site whose goal was to make money by getting people to spread misinformation? Getting doubters to click on clickbait is far more profitable than getting believers because they’re far more likely to spread the content in an effort to debunk it. Win!

CC BY 2.0-licensed photo by Denis Dervisevic.

People believe in information that confirms their priors. In fact, if you present them with data that contradicts their beliefs, they will double down on their beliefs rather than integrate the new knowledge into their understanding. This is why first impressions matter. It’s also why asking Facebook to show content that contradicts people’s views will not only increase their hatred of Facebook but increase polarization among the network. And it’s precisely why so many liberals spread “fake news” stories in ways that reinforce their belief that Trump supporters are stupid and backwards.

Labeling the Pope story as fake wouldn’t have stopped people from believing that story if they were conditioned to believe it. Let’s not forget that the public may find Facebook valuable, but it doesn’t necessarily trust the company. So their “expertise” doesn’t mean squat to most people. Of course, it would be an interesting experiment to run; I do wonder how many liberals wouldn’t have forwarded it along if it had been clearly identified as fake. Would they have not felt the need to warn everyone in their network that conservatives were insane? Would they have not helped fuel a money-making fake news machine? Maybe.

I suspect that labeling would reinforce polarization, even though it would feel like something had been done. Nonbelievers would use the label to reinforce their view that the information is fake (and minimize the spread, which is probably a good thing), while believers would simply ignore the label. But does that really get us to where we want to go?

Addressing so-called fake news is going to require a lot more than labeling. It’s going to require a cultural change about how we make sense of information, whom we trust, and how we understand our own role in grappling with information. Quick and easy solutions may make the controversy go away, but they won’t address the underlying problems.

What Is Truth?

As a huge proponent of media literacy for over a decade, I’m struggling with the ways in which I missed the mark. The reality is that my assumptions and beliefs do not align with those of most Americans. Because of my privilege as a scholar, I get to see how expert knowledge and information is produced and have a deep respect for the strengths and limitations of scientific inquiry. Surrounded by journalists and people working to distribute information, I get to see how incentives shape information production and dissemination and the fault lines of that process. I believe that information intermediaries are important, that honed expertise matters, and that no one can ever be fully informed. As a result, I have long believed that we have to outsource certain matters and to trust others to do right by us as individuals and society as a whole. This is what it means to live in a democracy, but, more importantly, it’s what it means to live in a society.

In the United States, we’re moving towards tribalism, and we’re undoing the social fabric of our country through polarization, distrust, and self-segregation. And whether we like it or not, our culture of doubt and critique, experience over expertise, and personal responsibility is pushing us further down this path.

Media literacy asks people to raise questions and be wary of information that they’re receiving. People are. Unfortunately, that’s exactly why we’re talking past one another.

The path forward is hazy. We need to enable people to hear different perspectives and make sense of a very complicated — and in many ways, overwhelming — information landscape. We cannot fall back on standard educational approaches because the societal context has shifted. We also cannot simply assume that information intermediaries can fix the problem for us, whether they be traditional news media or social media. We need to get creative and build the social infrastructure necessary for people to meaningfully and substantively engage across existing structural lines. This won’t be easy or quick, but if we want to address issues like propaganda, hate speech, fake news, and biased content, we need to focus on the underlying issues at play. No simple band-aid will work.

Special thanks to Amanda Lenhart, Claire Fontaine, Mary Madden, and Monica Bulger for their feedback!

This post was first published as part of a series on media, accountability, and the public sphere.

by zephoria at January 09, 2017 01:13 PM

January 08, 2017

MIMS 2012

Sol LeWitt - Wall Drawing

I recently saw Sol LeWitt’s Wall Drawing #273 at the SF MOMA, and it really stayed with me after I left the museum. In particular, I like that it wasn’t drawn by the artist himself; rather, he wrote instructions for draftspeople to draw this piece directly on the walls of the museum, thus embracing some amount of variability. From the museum’s description:

As his works are executed over and over again in different locations, they expand or contract according to the dimensions of the space in which they are displayed and respond to ambient light and the surfaces on which they are drawn. In some instances, as in this work, those involved in the installation make decisions impacting the final composition.

Sol LeWitt’s Wall Drawing #273

This embrace of variability reminds me of the web. People browse the web on different devices that have different sizes and capabilities. We can’t control how people will experience our websites. Since LeWitt left instructions for creating his pieces, I realized I could translate those instructions into code, and embrace the variability of the web in the process. The result is this CodePen.

See the Pen Sol LeWitt – Wall Drawing #273 by Jeff (@jlzych) on CodePen.

LeWitt left the following instructions:

A six-inch (15 cm) grid covering the walls. Lines from corners, sides, and center of the walls to random points on the grid.

1st wall: Red lines from the midpoints of four sides;

2nd wall: Blue lines from four corners;

3rd wall: Yellow lines from the center;

4th wall: Red lines from the midpoints of four sides, blue lines from four corners;

5th wall: Red lines from the midpoints of four sides, yellow lines from the center;

6th wall: Blue lines from four corners, yellow lines from the center;

7th wall: Red lines from the midpoints of four sides, blue lines from four corners, yellow lines from the center.

Each wall has an equal number of lines. (The number of lines and their length are determined by the draftsman.)

As indicated in the instructions, there are 7 separate walls with an equal number of lines, the number and length of which are determined by the draftsperson. To simulate the decisions the draftspeople make, I included controls to let people set how many lines should be drawn, and toggle which walls to see. I let each color be toggleable, as opposed to listing out walls 1-7, since each wall is just a different combination of the red, blue, and yellow lines.
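LeWitt’s rules translate naturally into a small generative procedure. Here’s a rough sketch in JavaScript (the language of the CodePen environment) of how the line endpoints could be computed; all names below are my own for illustration, not taken from the actual pen. Each color contributes a fixed set of source points, and each line runs from a random source to a random intersection on the grid:

```javascript
// Sketch of the Wall Drawing #273 rules, not the original CodePen source.
const GRID = 15; // grid cell size in px, standing in for the six-inch grid

// Anchor points for each color on a w×h wall, per LeWitt's instructions.
function sources(color, w, h) {
  if (color === 'red')    return [[w / 2, 0], [w, h / 2], [w / 2, h], [0, h / 2]]; // midpoints of four sides
  if (color === 'blue')   return [[0, 0], [w, 0], [w, h], [0, h]];                 // four corners
  if (color === 'yellow') return [[w / 2, h / 2]];                                 // center
  throw new Error('unknown color: ' + color);
}

// Snap a random point to the nearest grid intersection.
function randomGridPoint(w, h, rand = Math.random) {
  const x = Math.round((rand() * w) / GRID) * GRID;
  const y = Math.round((rand() * h) / GRID) * GRID;
  return [x, y];
}

// Generate n lines of a given color, each from a random source point
// for that color to a random point on the grid.
function wallLines(color, n, w, h, rand = Math.random) {
  const pts = sources(color, w, h);
  return Array.from({ length: n }, () => {
    const from = pts[Math.floor(rand() * pts.length)];
    return { color, from, to: randomGridPoint(w, h, rand) };
  });
}
```

Wall 7, for instance, would combine `wallLines` output for all three colors, keeping the total line count equal across walls as the instructions require; drawing each segment to a canvas is then straightforward.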

The end result fits right in with how human draftspeople have turned these instructions into art. The most notable difference I see between a human and a program is the degree of randomness in the final drawing. From comparing the output of the program to versions done by people, the ones drawn by people seem less “random.” I get the sense that people have a tendency to more evenly distribute the lines to points throughout the grid, whereas the program can create clusters and lines that are really close to each other which a person would consider unappealing and not draw.

It makes me wonder how LeWitt would respond to programmatic versions of his art. Is he okay with computers making art? Were his instructions specifically for people, or would he have embraced using machines to generate his work had the technology existed in his time? How “random” did he want people to make these drawings? Does he like that a program is more “random,” or did he expect and want people to make his wall drawings in a way that they would find visually pleasing? We’ll never know, but it was fun to interpret his work through the lens of today’s technology.

by Jeff Zych at January 08, 2017 11:10 PM

January 06, 2017

Ph.D. alumna

Hacking the Attention Economy

For most non-technical folks, “hacking” evokes the notion of using sophisticated technical skills to break through the security of a corporate or government system for illicit purposes. Of course, most folks who were engaged in cracking security systems weren’t necessarily in it for espionage and cruelty. In the 1990s, I grew up among teenage hackers who wanted to break into the computer systems of major institutions that were part of the security establishment, just to show that they could. The goal here was to feel a sense of power in a world where they felt pretty powerless. The rush was in being able to do something and feel smarter than the so-called powerful. It was fun and games. At least until they started getting arrested.

Hacking has always been about leveraging skills to push the boundaries of systems. Keep in mind that one early definition of a hacker (from the Jargon File) was “A person who enjoys learning the details of programming systems and how to stretch their capabilities, as opposed to most users who prefer to learn only the minimum necessary.” In another early definition (RFC 1392), a hacker is defined as “A person who delights in having an intimate understanding of the internal workings of a system, computers and computer networks in particular.” Both of these definitions highlight something important: violating the security of a technical system isn’t necessarily the primary objective.

Indeed, over the last 15 years, I’ve watched as countless hacker-minded folks have started leveraging a mix of technical and social engineering skills to reconfigure networks of power. Some are in it for the fun. Some see dollar signs. Some have a much more ideological agenda. But above all, what’s fascinating is how many people have learned to play the game. And in some worlds, those skills are coming home to roost in unexpected ways, especially as groups are seeking to mess with information intermediaries in an effort to hack the attention economy.

CC BY-NC 2.0-licensed photo by artgraff.

It all began with memes… (and porn…)

In 2003, a 15-year-old named Chris Poole started 4chan, an image board site modeled on a Japanese trend. His goal was not political. Rather, like many of his male teenage peers, he simply wanted a place to share pornography and anime. But as his site’s popularity grew, he ran into a different problem — he couldn’t manage the traffic while storing all of the content. So he decided to delete older content as newer content came in. Users were frustrated that their favorite images disappeared so they reposted them, often with slight modifications. This gave birth to a phenomenon now understood as “meme culture.” Lolcats are an example. These are images of cats captioned with a specific font and a consistent grammar for entertainment.

Those who produced meme-like images quickly realized that they could spread like wildfire thanks to new types of social media (as well as older tools like blogging). People began producing memes just for fun. But for a group of hacker-minded teenagers who were born a decade after I was, a new practice emerged. Rather than trying to hack the security infrastructure, they wanted to attack the emergent attention economy. They wanted to show that they could manipulate the media narrative, just to show that they could. This was happening at a moment when social media sites were skyrocketing, YouTube and blogs were challenging mainstream media, and pundits were pushing the idea that anyone could control the narrative by being their own media channel. Hell, “You” was TIME Magazine’s person of the year in 2006.

Taking a humorous approach, campaigns emerged within 4chan to “hack” mainstream media. For example, many inside 4chan felt that widespread anxieties about pedophilia were exaggerated and sensationalized. They decided to target Oprah Winfrey, who, they felt, was amplifying this fear-mongering. Trolling her online message board, they got her to talk on live TV about how “over 9,000 penises” were raping children. Amused by this success, they then created a broader campaign around a fake character known as Pedobear. In a different campaign, 4chan “b-tards” focused on gaming the TIME 100 list of “the world’s most influential people” by arranging it such that the first letter of each name on the list spelled out “Marblecake also the game,” which is a known in-joke in this community. Many other campaigns emerged to troll major media and other cultural leaders. And frankly, it was hard not to laugh when everyone started scratching their heads about why Rick Astley’s 1987 song “Never Gonna Give You Up” suddenly became a phenomenon again.

By engaging in these campaigns, participants learned how to shape information within a networked ecosystem. They learned how to design information for it to spread across social media.

They also learned how to game social media, manipulate its algorithms, and mess with the incentive structure of both old and new media enterprises. They weren’t alone. I watched teenagers throw brand names and Buzzfeed links into their Facebook posts to increase the likelihood that their friends would see their posts in their News Feed. Consultants started working for companies to produce catchy content that would get traction and clicks. Justin Bieber fans ran campaign after campaign to keep Bieber-related topics in Twitter Trending Topics. And the activist group Invisible Children leveraged knowledge of how social media worked to architect the #Kony2012 campaign. All of this was seen as legitimate “social media marketing,” making it hard to detect where the boundaries were between those who were hacking for fun and those who were hacking for profit or other “serious” ends.

Running campaigns to shape what the public could see was nothing new, but social media created new pathways for people and organizations to get information out to wide audiences. Marketers discussed it as the future of marketing. Activists talked about it as the next frontier for activism. Political consultants talked about it as the future of political campaigns. And a new form of propaganda emerged.

The political side to the lulz

In her phenomenal account of Anonymous — “Hacker, Hoaxer, Whistleblower, Spy” — Gabriella Coleman describes the interplay between different networks of people playing similar hacker-esque games for different motivations. She describes the goofy nature of those “Anons” who created a campaign to expose Scientology, which many believed to be a farcical religion with too much power and political sway. But she also highlights how the issues became more political and serious as WikiLeaks emerged, law enforcement started going after hackers, and the Arab Spring began.

CC BY-SA 3.0-licensed photo by Essam Sharaf via Wikimedia Commons.

Anonymous was birthed out of 4chan, but because of the emergent ideological agendas of many Anons, the norms and tactics started shifting. Some folks were in it for fun and games, but the “lulz” started getting darker and those seeking vigilante justice started using techniques like “doxing” to expose people who were seen as deserving of punishment. Targets changed over time, showcasing the divergent political agendas in play.

Perhaps the most notable turn involved “#GamerGate” when issues of sexism in the gaming industry emerged into a campaign of harassment targeted at a group of women. Doxing began being used to enable “swatting” — in which false reports called in by perpetrators would result in SWAT teams sent to targets’ homes. The strategies and tactics that had been used to enable decentralized but coordinated campaigns were now being used by those seeking to use the tools of media and attention to do serious reputational, psychological, economic, and social harm to targets. Although 4chan had long been an “anything goes” environment (with notable exceptions), #GamerGate became taboo there for stepping over the lines.

As #GamerGate unfolded, men’s rights activists began using the situation to push forward a long-standing political agenda to counter feminist ideology, pushing for #GamerGate to be framed as a serious debate as opposed to being seen as a campaign of hate and harassment. In some ways, the resultant media campaign was quite successful: major conferences and journalistic enterprises felt the need to “hear both sides” as though there was a debate unfolding. Watching this, I couldn’t help but think of the work of Frank Luntz, a remarkably effective conservative political consultant known for reframing issues using politicized language.

As doxing and swatting have become more commonplace, another type of harassment has also started to emerge en masse: gaslighting. The term comes from the 1944 Ingrid Bergman film “Gaslight” (based on the 1938 play “Gas Light”). The film depicts psychological abuse in a domestic violence context, where the victim starts to doubt reality because of the various actions of the abuser. It is a form of psychological warfare that can work tremendously well in an information ecosystem, especially one where it’s possible to put up information in a distributed way to make it very unclear what is legitimate, what is fake, and what is propaganda. More importantly, as many autocratic regimes have learned, this tactic is fantastic for seeding the public’s doubt in institutions and information intermediaries.

The democratization of manipulation

In the early days of blogging, many of my fellow bloggers imagined that our practice could disrupt mainstream media. For many progressive activists, social media could be a tool that could circumvent institutionalized censorship and enable a plethora of diverse voices to speak out and have their say. Civic-minded scholars were excited by “smart mobs” who leveraged new communications platforms to coordinate in a decentralized way to speak truth to power. Arab Spring. Occupy Wall Street. Black Lives Matter. These energized progressives as “proof” that social technologies could make a new form of civil life possible.

I spent 15 years watching teenagers play games with powerful media outlets and attempt to achieve control over their own ecosystem. They messed with algorithms, coordinated information campaigns, and resisted attempts to curtail their speech. Like Chinese activists, they learned to hide their traces when it was to their advantage to do so. They encoded their ideas such that access to content didn’t mean access to meaning.

Of course, it wasn’t just progressive activists and teenagers who were learning how to mess with the media ecosystem that has emerged since social media unfolded. We’ve also seen the political establishment, law enforcement, marketers, and hate groups build capacity at manipulating the media landscape. Very little of what’s happening is truly illegal, but there’s no widespread agreement about which of these practices are socially and morally acceptable or not.

The techniques that are unfolding are hard to manage and combat. Some of them look like harassment, prompting people to self-censor out of fear. Others look like “fake news”, highlighting the messiness surrounding bias, misinformation, disinformation, and propaganda. There is hate speech that is explicit, but there’s also suggestive content that prompts people to frame the world in particular ways. Dog whistle politics have emerged in a new form of encoded content, where you have to be in the know to understand what’s happening. Companies who built tools to help people communicate are finding it hard to combat the ways their tools are being used by networks looking to skirt the edges of the law and content policies. Institutions and legal instruments designed to stop abuse are finding themselves ill-equipped to function in light of networked dynamics.

The Internet has long been used for gaslighting, and trolls have long targeted adversaries. What has shifted recently is the scale of the operation, the coordination of the attacks, and the strategic agenda of some of the players.

For many who are learning these techniques, it’s no longer simply about fun, nor is it even about the lulz. It has now become about acquiring power.

A new form of information manipulation is unfolding in front of our eyes. It is political. It is global. And it is populist in nature. The news media is being played like a fiddle, while decentralized networks of people are leveraging the ever-evolving networked tools around them to hack the attention economy.

I only wish I knew what happens next.

This post was first published as part of a series on media, accountability, and the public sphere.

by zephoria at January 06, 2017 09:12 AM

January 04, 2017

MIMS 2012

Books I Read in 2016

In 2016, I read 22 books. Only 3 of those 22 were fiction. I had a consistent clip of 1-3 per month, and managed to finish at least one book each month.

Highlights include:

  • The Laws of Simplicity by John Maeda: the first book I read this year was super interesting. In it, Maeda offers 10 laws for balancing simplicity and complexity in business, technology, and design. By the end, he simplifies the book down to one law: “Simplicity is about subtracting the obvious, and adding the meaningful.”
  • David Whitaker Painting by Matthew Sturgis: I had never heard of the artist David Whitaker until I stumbled on this book at Half Price Books in Berkeley. He makes abstract paintings that combine lines and colors and gradients in fantastic ways. The cover sucked me in, and after flipping through a few pages I fell in love with his work and immediately bought the book. Check out his work on his portfolio.
  • Libra by Don DeLillo: a fascinating account of all the forces (including internal ones) that pushed Lee Harvey Oswald into assassinating JFK. The book is fiction and includes plenty of embellishments from the author (especially internal dialog), but is based on real facts from Oswald’s life and the assassination.
  • NOFX: The Hepatitis Bathtub and Other Stories by NOFX: a thoroughly entertaining history of the SoCal pop-punk band NOFX as told through various ridiculous stories from the members of the band themselves. It was perfect poolside reading in Cabo.
  • Org Design for Design Orgs by Peter Merholz & Kristin Skinner: This is basically a handbook for what I should be doing as the Head of Design at Optimizely. I can’t overstate how useful this has been to me in my job. If you’re doing any type of design leadership, I highly recommend it.
  • The Gift, by Lewis Hyde: a very thought-provoking read about creativity and the tension between art and commerce. So thought-provoking that it provoked me into writing down my thoughts in my last blog post.

Full List of Books Read

  • The Laws of Simplicity by John Maeda (1/3/16)
  • Although of Course You End up Becoming Yourself by David Lipsky (1/24/16)
  • Practical Empathy by Indi Young (2/1/16)
  • Time Out of Joint by Philip K. Dick (2/8/16)
  • A Wild Sheep Chase by Haruki Murakami (3/5/16)
  • Radical Focus: Achieving Your Most Important Goals with Objectives and Key Results by Christina Wodtke (3/21/16)
  • The Elements of Style by William Strunk Jr. and E.B. White (3/23/16)
  • Sprint: How to solve big problems and test new ideas in just 5 days by Jake Knapp, with John Zeratsky & Braden Kowitz (4/8/16)
  • David Whitaker Painting by Matthew Sturgis (4/18/16)
  • Show Your Work by Austin Kleon (5/8/16)
  • Nicely Said by Kate Kiefer Lee and Nicole Fenton (6/5/16)
  • The Unsplash Book by Jory MacKay (6/27/16)
  • Words Without Music: A Memoir by Philip Glass (July)
  • Libra by Don DeLillo (8/21/16)
  • How To Visit an Art Museum by Johan Idema (8/23/16)
  • 101 Things I Learned in Architecture School by Matthew Frederick (9/5/16)
  • Intercom on Jobs-to-be-Done by Intercom (9/17/16)
  • Org Design for Design Orgs by Peter Merholz & Kristin Skinner (9/26/16)
  • NOFX: The Hepatitis Bathtub and Other Stories by NOFX with Jeff Alulis (10/23/16)
  • The User’s Journey: Storymapping Products That People Love by Donna Lichaw (11/10/16)
  • Sharpie Art Workshop Book by Timothy Goodman (11/13/16)
  • The Gift by Lewis Hyde (12/29/16)

by Jeff Zych at January 04, 2017 05:08 AM

December 31, 2016

MIMS 2012

Thoughts on “The Gift”

I finally finished “The Gift,” by Lewis Hyde, after reading it on and off for at least the last 4 months (probably more). Overall I really enjoyed it and found it very thought-provoking. At its core it’s about creativity, the arts, and the tension between art and commerce — topics which are fascinating to me. It explores the question, how do artists make a living in a market-based economy? (I say “explores the question” instead of “answers” because it doesn’t try to definitively answer the question, although some solutions are provided).

It took me a while to finish, though, because the book skews academic at times, which made some sections a slog to get through. The first half goes pretty deep into topics including the theory of gifts, history of gift-giving, folklores about gifts, and how gift-based economies function; the latter half uses Walt Whitman and Ezra Pound as real-life examples of the theory-based first half. Both of these sections felt like they could have been edited down to be much more succinct, while still preserving the main points being made. This would have made the book easier to get through, and the book’s main points easier to parse and more impactful.

There’s a sweet spot in the middle, however, which is a thought-provoking account of the creative process and how artists describe their work. If I were to re-read the book I’d probably just read Chapter 8, “The Commerce of the Creative Spirit.”

The book makes a lot of interesting points about gifts and gift-giving, market economies, artists and the creative process, how artists can survive in a market economy, and the Cold War’s effect on art in America, which I summarize below.

On Gifts and Gift-Giving

  • Gifts need to be used or given away to have any value. Value comes from the gift’s use. They can’t be sold or stay with an individual. If they do, they’re wasted. This is true of both actual objects and talent.
  • Gift giving is a river that needs to stay in motion, whereas markets are an equilibrium that seeks balance.
  • Giving a gift creates a bond between the giver and recipient. Commerce leaves no connection between people. Gifts foster community, whereas commerce fosters freedom and individuals. Gifts are agents of social cohesion.
  • Gifts are given with no expectation of a return gift. By giving something to a member of the community, or the community itself, you trust that the gift will eventually return to you in some other form from the community.
  • Converting a gift to money, e.g. by selling it on the open market, undermines the group’s cohesion, fragments the group, and could destroy it if it becomes the norm.
  • Gift economies don’t scale, though. Once it grows beyond the point that each member knows each other to some degree it will collapse.

On Market Economies

  • Market economies are good for dealing with strangers, i.e. people who aren’t part of a group, people who you won’t see again. There’s a fair value to exchange goods and services with people outside the group, and no bond is created between people.
  • Markets serve to divide, classify, quantify. Gifts and art are a way of unifying people.

On Artists and the Creative Process

  • Artists typically don’t know where their work comes from. They produce something, then evaluate it and think, “Did I do that?”
  • To produce art, you have to turn off the part of your brain that quantifies, edits, judges. Some artists step away from their work, go on retreats, travel, see new things, have new experiences, take drugs, isolate themselves, and so on. The act of judging and evaluating kills the creative process. Only after a work of art is created can an artist quantify it and judge it and (maybe) sell it.
  • Art is a gift that is given to the world, and that gift has the power to awaken new artists (see above, gifts must keep moving). That is, an artist is initially inspired by a master’s work of art to produce their own. In this way, art is further given back to the world, and the cycle of gift-giving continues.
  • Each piece of work an artist produces is a gift given to them by an unknown external agent, and in turn a gift they pass on to the world.
  • Artists “receive” their work – it’s an invocation of something (e.g. “muse”, “genius”, etc.). The initial spark comes to them from a source they do not control. Only after this initial raw “materia” appears does the work of evaluation, clarification, revision begin. Refining an idea, and bringing it into the world, comes after that initial spark is provided to them by an external source.
    • Artists can’t control the initial spark, or will it to appear. The artist is the servant of the initial spark.
    • Evaluation kills creativity – it must be laid aside until after raw material is created.
  • The act of creation does not empty the wellspring that provided that initial spark; rather, the act of creation assures the flow continues and that the wellspring will never empty. Only if it’s not used does it go dry.
  • Imagination can assemble our fragmented experiences into a coherent whole. An artist’s work, once produced, can then reproduce the same spirit or gift initially given to them in the audience.
  • This binds people by being a shared “gift” for all who are able to receive it. This widens one’s sense of self.
  • The spirit of a people can be given form in art. This is how art comes to represent groups.
  • The primary commerce of art is gift exchange, not market exchange.

How Artists Can Survive in a Market Economy

The pattern for artists to survive is that they need to be able to create their work in a protected gift sphere, free of evaluation and judgment and quantification. Only then, after the work has been made real, can they evaluate it and bring it to market. By bringing it to the market they can convert their gift into traditional forms of wealth, which they can re-invest back in their gift. But artists can’t start in the market economy, because that isn’t art. It’s “commercial art,” i.e. creating work to satisfy an external market demand, rather than giving an internal gift to the world.

There are 3 ways of doing this:

  1. Sell the work itself on the market — but only after it’s been created. Artists need to be careful to keep the two separate.
  2. Patronage model. A king, or grants, or other body pays for the artist to create work.
  3. Work a job to pay the bills, and create your work outside of that. This frees artists from having to subsist on their work alone, and frees them to say what they want to say. This is, in a sense, self-patronage.
  4. Bonus way: arts support the arts. This means the community of artists creates a fund, or trust, that is invested in new artists. The fund’s money comes by taking a percentage of the profits from established artists. This is another form of patronage.

But even using these models, Hyde is careful to point out that this isn’t a way to become rich – it’s a way to “survive.” And even within these models there are still pitfalls.

The Soviet Union’s Effect on Art in America

In the 25th Anniversary edition afterword, Hyde makes the connection that the Cold War spurred America to increase funding to the arts and sciences to demonstrate the culture and freedom of expression that a free market supports. A communist society, on the other hand, doesn’t value art and science since they don’t typically have direct economic benefit, and thus doesn’t have the same level of expression as a free market. The end of the Cold War, unfortunately, saw a decrease in funding since the external threat was removed. This was an interesting connection that I hadn’t thought about before.

Final Thoughts

All in all, a very thought-provoking book that I’m glad I read.

by Jeff Zych at December 31, 2016 08:47 PM

December 27, 2016

Ph.D. student

the impossibility of conveying experience under conditions of media saturation

I had a conversation today with somebody who does not yet use a smart phone. She asked me how my day had gone so far and what I had done.

Roughly speaking, the answer to the question was that I had spent the better part of the day electronically chatting with somebody else, who had recommended to me an article that I had found interesting, but then when attempting to share it with friends on social media I was so distracted by a troubling post by somebody else that I lost all resolve.

Not having either article available at my fingertips for the conversation, and not wanting to relay the entirety of my electronic chat, I had to answer with dissatisfying and terse statements to the effect that the person I had spoken with was just fine, the article I read had been interesting, and that something had reminded me of something else, which put me in a bad mood.

The person I was speaking with is very verbal, and answers like these are disappointing for her. To her, not being able to articulate something is a sign that one is not thinking about it sufficiently clearly. To be inarticulate is to be uncomprehending.

What I was facing, on the contrary, was a situation where I had been subject to articulation and nothing but it for the better part of the day. My life is so saturated by media that the amount of information I’m absorbing in the average waking hour or two is just more than can be compressed into a conversation. The same text can’t occur twice, and the alternative perspective of the interlocutor makes it almost impossible to relay what the media meant to me, even if I were able to reproduce it literally for her. Add to this the complexity of my own reactions to the stimuli, which oscillate with my own thoughts on the matter, and you can see how I’d come to the conclusion at the end of the day that there is no way to convey one’s lived experience accurately in writing when one’s life is so saturated by media that such a conveyance would devolve into an endlessly nested system of quotations.

I’ve spent the past five years in graduate school. There’s a sense in graduate school that writing still matters. One may be expected to produce publications, even go to such lengths as writing a book. But when so much of what used to be considered conversation is now writing, one wonders whether the book, or the published article, has lost its prestige. The vague mechanics of publication no longer serve as a gatekeeper for what can and cannot be read. Rather, ‘publication’ serves some other function, perhaps a signal of provenance or a promise that something will be preserved.

The recent panic over “fake news” recalls a past when publication was a source of quality control and accountability. There was something comprehensible about a world where the official narrative was written and verified by an institution. Centralized media was a condition for modernism. Now what is our media ecosystem? Not even the awesome power of the search engine is able to tame the jungle of social media. Media is available all but immediately to the appetite of the consumer, and the millennial citizen bathes daily in the broth of those appetites. Words are no longer a mode of communication; they are a mode of consciousness. And the more of them that confront the mind, the more they resemble mere sensations, not the kinds of ideas one would assemble into a phrase and utter to another with import.

There is no going back. Our media chaos is, despite its newness, primordial. Old patterns of authority are obsolete. The questions we must ask are: what now? And, how can we tell anybody?

by Sebastian Benthall at December 27, 2016 12:01 AM

December 23, 2016

Ph.D. student

notes about natural gas and energy policy

I’m interested in energy (in the sense of the economy and ecology of energy as it powers society) but know nothing about it.

I feel like the last time I really paid attention to energy, it was still a question of oil (and its industrial analog, Big Oil) and alternative, renewable energy.

But now energy production in the U.S. has shifted from oil to natural gas. I asked a friend about why, and I’ve filled in a big gap in my understanding of What’s Going On. What I filled it in with might be wrong, but here’s what it is so far:

  • At some point natural gas became a viable alternative to oil because the energy companies discovered it was cheaper to collect natural gas than to drill for oil.
  • The use of natural gas for energy has less of a carbon footprint than oil does. That makes it environmentally friendly relative to the current regulatory environment.
  • The problem (there must be a problem) is that the natural gas collection process has lots of downsides. These downsides are mainly because the process is very messy, involving smashing into some pocket of natural gas under lots of rock and trying to collect the good stuff. Lots of weird gases go everywhere. That has downsides, including:
    • Making the areas where this is happening unlivable. Because it’s harder to breathe? Because the water can be set on fire? It’s terrible.
    • It releases a lot of methane into the environment, which may be as bad for climate change as carbon dioxide, if not worse. Who knows how bad it really is? Unclear.
  • Here’s the point (totally unconfirmed): The shift from oil to natural gas as an energy source has been partly due to a public awareness and regulatory gap about the side effects. There’s now lots of political pressure and science around carbon. But methane? I thought that was an energy source (because of Mad Max Beyond Thunderdome). I guess I was wrong.
  • Meanwhile, OPEC and non-OPEC producers have teamed up to restrict oil sales to hike up oil prices. Sucks for energy consumers, but that’s actually good for the environment.
  • Also, in response to the apparent reversal of U.S. federal interest in renewable energy, philanthropy-plus-market has stepped in with Breakthrough Energy Ventures. Since venture capital investors with technical backgrounds, unlike the U.S. government, tend to be long on science, this is just great.
  • So what: The critical focus for those interested in the environment now should be on the environmental and social impact of natural gas production, as oil has been taken care of and heavy hitters are backing sustainable energy in a way that will fix the problem if it can truly be fixed. We just have to not boil the oceans and poison all the children before they can get to it.

      If that doesn’t work, I guess at the end of the day, there’s always pigs.

by Sebastian Benthall at December 23, 2016 08:16 PM

MIMS 2016

Well, I did not know about Fractals. I will check it out.

And thanks for reading! :)

by nikhil at December 23, 2016 04:23 AM

December 18, 2016

Ph.D. alumna

Heads Up: Upcoming Parental Leave

There’s a joke out there that when you’re having your first child, you tell everyone personally and update your family and friends about every detail throughout the pregnancy. With Baby #2, there’s an abbreviated notice that goes out about the new addition, all focused on how Baby #1 is excited to have a new sibling. And with Baby #3, you forget to tell people.

I’m a living instantiation of that. If all goes well, I will have my third child in early March and I’ve apparently forgotten to tell anyone since folks are increasingly shocked when I indicate that I can’t help out with XYZ because of an upcoming parental leave. Oops. Sorry!

As noted when I gave a heads up with Baby #1 and Baby #2, I plan on taking parental leave in stride. I don’t know what I’m in for. Each child is different and each recovery is different. What I know for certain is that I don’t want to screw over collaborators or my other baby – Data & Society. As a result, I will not be taking on new commitments and I will be actively working to prioritize my collaborators and team over the next six months. In the weeks following birth, my response rates may get sporadic and I will probably not respond to non-mission-critical email. I won’t go completely offline in March (mostly for my own sanity), but I am fairly certain that I will take an email sabbatical in June when my family takes some serious time off** to be with one another and travel.

A change in family configuration is fundamentally walking into the abyss. For as much as our culture around maternity leave focuses on planning, so much is unknown. After my first was born, I got a lot of work done in the first few weeks afterwards because he was sleeping all the time and then things got crazy just as I was supposedly going back to work. That was less true with #2, but with #2 I was going seriously stir crazy being home in the cold winter and so all I wanted was to go to lectures with him to get out of bed and soak up random ideas. Who knows what’s coming down the pike. I’m fortunate enough to have the flexibility to roll with it and I intend to do precisely that.

What’s tricky about being a parent in this ecosystem is that you’re kinda damned if you do, damned if you don’t. Women are pushed to go back to work immediately to prove that they’re serious about their work – or to take serious time off to prove that they’re serious about their kids. Male executives are increasingly publicly talking about taking time off, while they work from home.  The stark reality is that I love what I do. And I love my children. Life is always about balancing different commitments and passions within the constraints of reality (time, money, etc.).  And there’s nothing like a new child to make that balancing act visible.

So if you need something from me, let me know!  And please understand and respect that I will be navigating a lot of unknown and doing my best to achieve a state of balance in the upcoming months of uncertainty.


** June 2017 vacation. After a baby is born, the entire focus of a family is on adjustment. For the birthing parent, it’s also on recovery because babies kinda wreck your body no matter how they come out. Finding rhythms for sleep and food become key for survival. Folks talk about this time as precious because it can enable bonding. That hasn’t been my experience and so I’ve relished the opportunity with each new addition to schedule some full-family bonding time a few months after birth where we can do what our family likes best – travel and explore as a family. If all goes well in March, we hope to take a long vacation in mid-June where I intend to be completely offline and focused on family. More on that once we meet the new addition.

by zephoria at December 18, 2016 05:35 AM

December 15, 2016

Ph.D. student

Protected: What’s going on?

This post is password protected. You must visit the website and enter the password to continue reading.

by Sebastian Benthall at December 15, 2016 04:38 AM

December 12, 2016

Ph.D. student

energy, not technology

I’m still trying to understand what’s happening in the world and specifically in the U.S. with the 2016 election. I was so wrong about it that I think I need to take seriously the prospect that I’ve been way off in my thinking about what’s important.

In my last post, I argued that the media isn’t as politically relevant as we’ve been told. If underlying demographic and economic variables were essentially as predictive as anything of voter behavior, then media mishandling of polling data or biased coverage just isn’t what’s accounting for the recent political shift.

Part of the problem with media determinist accounts of the election is that because they deal with the minutia of reporting within the United States, they don’t explain how Brexit foreshadowed Trump’s election, as anybody paying attention has been pointing out for months.

So what happens if we take seriously the explanation that what’s really happening is a reaction against globalization? That’s globalization in the form of a centralized EU government, or globalization in the form of U.S. foreign policy and multiculturalism. If the United States under Obama was trying to make itself out to be a welcoming place for global intellectual talent to come and contribute to the economy through Silicon Valley jobs, then arguably the election was the backfire.

An insulated focus on “the tech industry” and its political relevance has been a theme in my media bubble for the past couple of years. Arguably, that’s just because people thought the tech industry was where the power and the money was. So of course the media should scrutinize that, because everyone trying to get to the top of that pile wants to know what’s going on there.

Now it’s not clear who is in power any more (I’ll admit I’m just thinking about power as a sloppy aggregate of political and economic power. Let’s assume that political power backing an industry leads to a favorable regulatory environment for that industry’s growth, and it’s not a bad model). It doesn’t seem like it’s Silicon Valley any more. Probably it’s the energy industry.

There’s a lot going on in the energy industry! I know basically diddly about it but I’ve started doing some research.

One interesting thing that’s happening is that Russia and OPEC are teaming up to cut oil production. This is unprecedented. It also, to me, creates a confusing narrative. I thought Obama’s Clean Power Plan, focusing on renewable energy, and efforts to build international consensus around climate change were the best bets for saving the world from high emissions. But since cutting oil production leads to cutting oil consumption, what if the thing that really can cut carbon dioxide emissions is an oligopolistic price hike on oil?

That said, oil prices may not necessarily dictate energy prices in the U.S., because a lot of the energy used is natural gas. Shale gas, in particular, is apparently a growing percentage of natural gas used in the U.S. It’s apparently better than oil in terms of CO2 emissions. Though it’s mined through fracking, which disgusts a lot of people!

Related: I was pretty pissed when I heard about Rex Tillerson, CEO of Exxon Mobil, being tapped for Secretary of State. Because that’s the same old oil companies that have messed things up so much before, right? Maybe not. Apparently Exxon Mobil also invests heavily in natural gas. As their website will tell you, the gas industry uses a lot of human labor. Which is obviously a plus in this political climate.

What’s interesting to me about all this is that it all seems very important but it has absolutely nothing to do with social media or even on-line marketplaces. It’s all about stuff way, way upstream on the supply chain.

It is certainly humbling to feel like your area of expertise doesn’t really matter. But I’m not sure what to even do as a citizen now that I realize how little I understand. I think there’s been something very broken about my theory about society and the world.

The next few posts may continue to have this tone of “huh”. I expect I’ll be stating what’s obvious to a lot of people. But whatever. I just need to sort some things out.

by Sebastian Benthall at December 12, 2016 02:55 AM

December 07, 2016

Ph.D. student

post-election updates

Like a lot of people, I was completely surprised by the results of the 2016 election.

Rationally, one has to take these surprises as an opportunity to update one’s point of view. As it’s been almost a month, there’s been lots of opportunity to process what’s going on.

For my own sake, more than for any reader, I’d like to note my updates here.

The first point has been best articulated by Jon Stewart:

Stewart rejected the idea that better news coverage would have changed the outcome of the election. “The idea that if [the media] had done a better job this country would have made another choice is fake,” he said. He cited Brexit as an example of an unfortunate outcome that occurred despite its lead-up being appropriately covered by outlets like the BBC, which offered a much more balanced view than CNN, for example. “Trump didn’t happen because CNN sucks—CNN just sucks,” he said.

Satire and comedy also couldn’t have stood in the way of Trump winning, Stewart said. If this election has taught us anything, he said, it’s that “controlling the culture does not equate to holding the power.”

I once cared a lot about “money in politics” at the level of campaign donations. After a little critical thinking, this leads naturally to a concern about the role of the media more generally in elections. Centralized media in particular will never put themselves behind a serious bid for campaign finance reform because those media institutions cash out every election. This is what it means for a problem to be “systemic”: it is caused by a tightly reinforcing feedback loop that makes it into a kind of social structural knot.

But with the 2016 presidential election, we’ve learned that Because of the Internet, media are so fragmented that even controlled media are not in control. People will read what they want to read, one way or another. Whatever narrative suits a person best, they will be able to find it on the Internet.

A perhaps unhelpful way to say this is that the Internet has set the Bourdieusian habitus free from media control.

But if the media doesn’t determine habitus, what does?

While there is a lot of consternation about the failure of polling (which is interesting), and while that could have negatively impacted Democratic campaign strategy (didn’t it?), the more insightful sounding commentary has recognized that the demographic fundamentals were in favor of Trump all along because of what he stood for economically and socially. Michael Moore predicted the election result; logically, because he was right, we should update towards his perspective; he makes essentially this point about Midwestern voters, angry men, depressed progressives, and the appeal of oddball voting all working against Hillary. But none of these conditions have as much to do with media as they do with the preexisting population conditions.

There’s a tremendous bias among those who “study the Internet” to assign tremendous political importance to the things we have expertise on: the media, algorithms, etc. My biggest update this election was that I now think that these are eclipsed in political relevance compared to macro-economic issues like globalization. At best changes to, say, the design of social media platforms are going to change things for a few people at the margins. But larger structural forces are both more effective and more consequential in politics. I bet that a prediction of the 2016 election based primarily on the demographic distribution of winners and losers according to each candidate’s energy policy, for example, would have been more valuable than all the rest of the polling and punditry combined. I suppose I was leaning this way throughout 2016, but the election sealed the deal for me.

This is a relief for me because it has revealed to me just how much of my internalization and anxieties about politics have been irrelevant. There is something very freeing in discovering that many things that you once thought were the most important issues in the world really just aren’t. If all those anxieties were proven to just be in my head, then it’s easier to let them go. Now I can start wondering about what really matters.

by Sebastian Benthall at December 07, 2016 07:17 AM

December 03, 2016

Ph.D. student

directions to migrate your WebFaction site to HTTPS

Hiya friends using WebFaction,

Securing the Web, even our little websites, is important — to set a good example, to maintain the confidentiality and integrity of our visitors, to get the best Google search ranking. While secure Web connections had been difficult and/or costly in the past, more recently, migrating a site to HTTPS has become fairly straightforward and costs $0 a year. It may get even easier in the future, but for now, the following steps should do the trick.

Hope this helps, and please let me know if you have any issues,

P.S. Yes, other friends, I recommend WebFaction as a host; I’ve been very happy with them. Services are reasonably priced and easy to use and I can SSH into a server and install stuff. Sign up via this affiliate link and maybe I get a discount on my service or something.

P.S. And really, let me know if and when you have issues. Encrypting access to your website has gotten easier, but it needs to become much easier still, and one part of that is knowing which parts of the process prove to be the most cumbersome. I’ll make sure your feedback gets to the appropriate people who can, for realsies, make changes as necessary to standards and implementations.

Updated 2 December 2016: to use new letsencrypt-webfaction design, which uses WebFaction's API and doesn't require emails and waiting for manual certificate installation.

Updated 16 July 2016: to fix the cron job command, which may not have always worked depending on environment variables

One day soon I hope WebFaction will make more of these steps unnecessary, but the configuring and testing will be something you have to do manually in pretty much any case. You should be able to complete all of this in an hour some evening.

Create a secure version of your website in the WebFaction Control Panel

Log in to the WebFaction Control Panel, choose the “DOMAINS/WEBSITES” tab and then click “Websites”.

“Add new website”, one that will correspond to one of your existing websites. I suggest choosing a name like existingname-secure. Choose “Encrypted website (https)”. For Domains, testing will be easiest if you choose both your custom domain and a subdomain of (If you don’t have one of those subdomains set up, switch to the Domains tab and add it real quick.) So, for my site, I chose and

Finally, for “Contents”, click “Re-use an existing application” and select whatever application (or multiple applications) you’re currently using for your http:// site.

Click “Save” and this step is done. This shouldn’t affect your existing site one whit.

Test to make sure your site works over HTTPS

Now you can test how your site works over HTTPS, even before you’ve created any certificates, by going to in your browser. Hopefully everything will load smoothly, but it’s reasonably likely that you’ll have some mixed content issues. The debug console of your browser should show them to you: that’s Apple-Option-K in Firefox or Apple-Option-J in Chrome. You may see some warnings like this, telling you that an image, a stylesheet or a script is being requested over HTTP instead of HTTPS:

Mixed Content: The page at ‘’ was loaded over HTTPS, but requested an insecure image ‘’. This content should also be served over HTTPS.

Change these URLs so that they point to (you could also use a scheme-relative URL, like // and update the files on the webserver and re-test.
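Rather than clicking through every page to find the remaining insecure references, you can grep your site's files on the server. A minimal sketch: the throwaway directory below stands in for your real webapp directory (e.g. /home/yourusername/webapps/sitename/), and http://www.example.com is a placeholder URL.

```shell
# Stand-in for your webapp directory: one throwaway file containing an
# insecure reference.
site_dir=$(mktemp -d)
printf '<img src="http://www.example.com/logo.png">\n' > "$site_dir/index.html"

# Each matching line is a URL to rewrite as https:// (or as a
# scheme-relative //) before re-testing in the browser.
grep -rn 'http://' "$site_dir" --include='*.html' --include='*.css' --include='*.js'
```

Pointed at your actual webapp directory instead of the temporary one, the same grep turns up all the mixed-content culprits in one pass.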

Good job! Now, should work just fine, but shows a really scary message. You need a proper certificate.

Get a free certificate for your domain

Let’s Encrypt is a new, free, automated certificate authority from a bunch of wonderful people. But to get it to set up certificates on WebFaction is a little tricky, so we’ll use the letsencrypt-webfaction utility — thanks will-in-wi!

SSH into the server with ssh

To install, run this command:

GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib gem2.2 install letsencrypt_webfaction

(Run the same command to upgrade; necessary if you followed these instructions before Fall 2016.)

For convenience, you can add this as a function to make it easier to call. Edit ~/.bash_profile to include:

function letsencrypt_webfaction {
    GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib PATH=$PATH:$GEM_HOME/bin ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction $*
}

Now, let’s test the certificate creation process. You’ll need your email address, the domain you're getting a certificate for, the path to the files for the root of your website on the server, e.g. /home/yourusername/webapps/sitename/ and the WebFaction username and password you use to log in. Filling those in as appropriate, run this command:

letsencrypt_webfaction --letsencrypt_account_email --domains --public /home/yourusername/webapps/sitename/ --username webfaction_username --password webfaction_password

If all went well, you’ll see nothing on the command line. To confirm that the certificate was created successfully, check the SSL certificates tab on the WebFaction Control Panel. ("Aren't these more properly called TLS certificates?" Yes. So it goes.) You should see a certificate listed that is valid for your domain; click on it and you can see the expiry date and a bunch of gobbledygook which actually is the contents of the certificate.
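If you prefer checking from the command line over the Control Panel, openssl can decode a certificate's validity window. A sketch using a throwaway self-signed certificate; against your live site you would instead pipe `echo | openssl s_client -connect www.yourdomain.com:443 -servername www.yourdomain.com` into the same `openssl x509` command (www.yourdomain.com being a placeholder for your own domain).

```shell
# Generate a throwaway 90-day self-signed certificate purely for
# demonstration (90 days matches Let's Encrypt's maximum lifetime).
certdir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=localhost" \
  -keyout "$certdir/key.pem" -out "$certdir/cert.pem" -days 90 2>/dev/null

# Print the notBefore/notAfter bounds -- the same expiry information
# you'd otherwise read off the Control Panel.
openssl x509 -noout -dates -in "$certdir/cert.pem"
```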

To actually apply that certificate, head back to the Websites tab, select the -secure version of your website from the list and in the Security section, choose the certificate you just created from the dropdown menu.

Test your website over HTTPS

This time you get to test it for real. Load the https:// version of your site in your browser. (You may need to force refresh to get the new certificate.) Hopefully it loads smoothly and without any mixed content warnings. Congrats, your site is available over HTTPS!

You are not done. You might think you are done, but if you think so, you are wrong.

Set up automatic renewal of your certificates

Certificates from Let’s Encrypt expire in no more than 90 days. (Why? There are two good reasons.) Your certificates aren’t truly set up until you’ve set them up to renew automatically. You do not want to do this manually every few months; you will forget, I promise.

Cron lets us run code on WebFaction’s server automatically on a regular schedule. If you haven’t set up a cron job before, it’s just a fancy way of editing a special text file. Run this command:

EDITOR=nano crontab -e

If you haven’t done this before, this file will be empty, and you’ll want to test it to see how it works. Paste the following line of code exactly, and then hit Ctrl-O and Ctrl-X to save and exit.

* * * * * echo "cron is running" >> $HOME/logs/user/cron.log 2>&1

This will output to that log every single minute; not a good cron job to have in general, but a handy test. Wait a few minutes and check ~/logs/user/cron.log to make sure it’s working.

Rather than including our username and password in our cron job, we'll set up a configuration file with those details. Create a file config.yml, perhaps at the location ~/le_certs. (If necessary, mkdir le_certs, touch le_certs/config.yml, nano le_certs/config.yml.) In this file, paste the following, and then customize with your details:

letsencrypt_account_email: ''
api_url: ''
username: 'webfaction_username'
password: 'webfaction_password'

(Ctrl-O and Ctrl-X to save and close it.) Now, let’s edit the crontab to remove the test line and add the renewal line, being sure to fill in your domain name, the path to your website’s directory, and the path to the configuration file you just created:

0 4 15 */2 * PATH=$PATH:$GEM_HOME/bin GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib /usr/local/bin/ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction --domains yourdomain.com --public /home/yourusername/webapps/sitename/ --config /home/yourusername/le_certs/config.yml >> $HOME/logs/user/cron.log 2>&1

You’ll probably want to create the line in a text editor on your computer and then copy and paste it to make sure you get all the substitutions right. Paths must be fully specified as above; don't use ~ for your home directory. Ctrl-O and Ctrl-X to save and close it. Check with crontab -l that it looks correct. As a test to make sure the config file setup is correct, you can run the command part directly; if it works, you shouldn't see any error messages on the command line. (Copy and paste the line below, making the same substitutions as you just did for the crontab.)

PATH=$PATH:$GEM_HOME/bin GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib /usr/local/bin/ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction --domains yourdomain.com --public /home/yourusername/webapps/sitename/ --config /home/yourusername/le_certs/config.yml

With that cron job configured, you'll automatically get a new certificate at 4am on the 15th of alternating months (January, March, May, July, September, November). New certificates every two months is fine, though one day in the future we might change this to get a new certificate every few days; before then WebFaction will have taken over the renewal process anyway. Debugging cron jobs can be tricky (I've had to update the command in this post once already); I recommend adding an alert to your calendar for the day after the first time this renewal is supposed to happen, to remind yourself to confirm that it worked. If it didn't work, any error messages should be stored in the cron.log file.
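If you want to double-check that reading of the schedule, cron’s */2 in the month field steps through the range 1–12 starting from the low end, so it matches months 1, 3, 5, 7, 9, and 11. A tiny shell sketch reproduces the stepping rule:

```shell
# Which months does "*/2" in the month field match? Cron steps through the
# range starting at its low end (1 = January), taking every second value.
months=""
for m in $(seq 1 12); do
  if [ $(( (m - 1) % 2 )) -eq 0 ]; then
    months="$months $m"
  fi
done
echo "job runs in months:$months"
# prints: job runs in months: 1 3 5 7 9 11
```

That’s January, March, May, July, September, and November, matching the schedule described above.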

Redirect your HTTP site (optional, but recommended)

Now you’re serving your website in parallel via http:// and https://. You can keep doing that for a while, but everyone who follows old links to the HTTP site won’t get the added security, so it’s best to start permanently re-directing the HTTP version to HTTPS.

WebFaction has very good documentation on how to do this, and I won’t duplicate it all here. In short, you’ll create a new static application named “redirect”, which just has a .htaccess file with, for example, the following:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
RewriteCond %{HTTP:X-Forwarded-SSL} !on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

This particular variation will both redirect any URLs that have www to the “naked” domain and make all requests HTTPS. And in the Control Panel, make the redirect application the only one on the HTTP version of your site. You can re-use the “redirect” application for different domains.

Test to make sure it’s working! The http://, http://www., and https://www. versions of your site should all end up at the https:// version of the bare domain. (You may need to force refresh a couple of times.)

by at December 03, 2016 12:55 AM

November 30, 2016

adjunct professor

Dept. of Commerce’s Privacy Shield Checklist

Practitioner friends, the Department of Commerce just released their checklist for Privacy Shield applicants. More on this later.

by web at November 30, 2016 05:42 PM

November 27, 2016

Ph.D. student

reflexive control

A theory I wish I had more time to study in depth these days is the Soviet field of reflexive control (see for example this paper by Timothy Thomas on the subject).

Reflexive control is defined as a means of conveying to a partner or an opponent specially prepared information to incline him to voluntarily make the predetermined decision desired by the initiator of the action. Even though the theory was developed long ago in Russia, it is still undergoing further refinement. Recent proof of this is the development in February 2001, of a new Russian journal known as Reflexive Processes and Control. The journal is not simply the product of a group of scientists but, as the editorial council suggests, the product of some of Russia’s leading national security institutes, and boasts a few foreign members as well.

While the paper describes the theory in broad strokes, I’m interested in how one would formalize and operationalize reflexive control. My intuitions thus far are like this: traditional control theory assumes that the controlled system is inanimate or at least not autonomous. The controlled system is steered, often dynamically, to some optimal state. But in reflexive control, the assumption is that the controlled system is autonomous and has a decision-making process or intelligence. Therefore reflexive control is a theory of influence, perhaps deception. Going beyond mere propaganda, it seems like reflexive control can be highly reactive, taking into account the reaction time of other agents in the field.
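One naive way to write the contrast down (my own sketch; I haven’t found Lefebvre’s actual formalism, and every symbol below is my own choice):

```latex
% Classical control: steer a non-autonomous plant x toward an optimal state
% by choosing the input u against known dynamics.
\min_{u} \; J(x, u) \quad \text{subject to} \quad \dot{x} = f(x, u)

% Reflexive control (sketch): the initiator A chooses a message m
% (specially prepared information, possibly disinformation); the opponent B
% autonomously best-responds given what m leads B to believe about the
% state of the world \theta; A anticipates that response.
a^{*}(m) = \arg\max_{a} \; \mathbb{E}\left[\, U_{B}(a, \theta) \mid m \,\right],
\qquad
m^{*} = \arg\max_{m} \; U_{A}\big(a^{*}(m)\big)
```

On this reading the "controlled system" is the opponent’s decision procedure itself, which is why the theory resembles a signaling game more than classical control theory.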

There are many examples, from a Russian perspective, of the use of reflexive control theory during conflicts. One of the most recent and memorable was the bombing of the market square in Sarajevo in 1995. Within minutes of the bombing, CNN and other news outlets were reporting that a Serbian mortar attack had killed many innocent people in the square. Later, crater analysis of the shells that impacted in the square, along with other supporting evidence, indicated that the incident did not happen as originally reported. This evidence also threw into doubt the identities of the perpetrators of the attack. One individual close to the investigation, Russian Colonel Andrei Demurenko, Chief of Staff of Sector Sarajevo at the time, stated, “I am not saying the Serbs didn’t commit this atrocity. I am saying that it didn’t happen the way it was originally reported.” A US and Canadian officer soon backed this position. Demurenko believed that the incident was an excellent example of reflexive control, in that the incident was made to look like it had happened in a certain way to confuse decision-makers.

Thomas’s article points out that the notable expert in reflexive control in the United States is V. A. Lefebvre, a Soviet ex-pat and mathematical psychologist at UC Irvine. He is listed on a faculty listing but doesn’t seem to have a personal home page. His Wikipedia page says that reflexive theory is like the Soviet alternative to game theory. That makes sense. Reflexive theory has been used by Lefebvre to articulate a mathematical ethics, which is surely relevant to questions of machine ethics today.

Beyond its fascinating relevance to many open research questions in my field, it is interesting to see in Thomas’s article how “reflexive control” seems to capture so much of what is considered “cybersecurity” today.

One of the most complex ways to influence a state’s information resources is by use of reflexive control measures against the state’s decision-making processes. This aim is best accomplished by formulating certain information or disinformation designed to affect a specific information resource best. In this context an information resource is defined as:

  • information and transmitters of information, to include the method or technology of obtaining, conveying, gathering, accumulating, processing, storing, and exploiting that information;
  • infrastructure, including information centers, means for automating information processes, switchboard communications, and data
    transfer networks;
  • programming and mathematical means for managing information;
  • administrative and organizational bodies that manage information processes, scientific personnel, creators of data bases and knowledge, as well as personnel who service the means of informatizatsiya [informatization].

Unlike many people, I don’t think “cybersecurity” is very hard to define at all. The prefix “cyber-” clearly refers to the information-based control structures of a system, and “security” is just the assurance of something against threats. So we might consider “reflexive control” to be essentially equivalent to “cybersecurity”, except with an emphasis on the offensive rather than defensive aspects of cybernetic control.

I have yet to find something describing the mathematical specifics of the theory. I’d love to find something and see how it compares to other research in similar fields. It would be fascinating to see where Soviet and Anglophone research on these topics is convergent, and where it diverges.

by Sebastian Benthall at November 27, 2016 04:00 AM

November 21, 2016

Ph.D. student

For “Comments on Haraway”, see my “Philosophy of Computational Social Science”

One of my most frequently visited blog posts is titled “Comments on Haraway: Situated knowledge, bias, and code”.  I have decided to password protect it.

If you are looking for a reference with the most important ideas from that blog post, I refer you to my paper, “Philosophy of Computational Social Science”. In particular, its section on “situated epistemology” discusses how I think computational social scientists should think about feminist epistemology.

I have decided to hide the original post for a number of reasons.

  • I wrote it pointedly. I think all the points have now been made better elsewhere, either by me or by the greater political zeitgeist.
  • Because it was written pointedly (even a little trollishly), I am worried that it may be easy to misread my intention in writing it. I’m trying to clean up my act :)
  • I don’t know who keeps reading it, though it seems to consistently get around thirty or more hits a week. Who are these people? They won’t tell me! I think it matters who is reading it.

I’m willing to share the password with anybody who contacts me about it.

by Sebastian Benthall at November 21, 2016 06:20 AM

November 18, 2016

MIDS student

EU-US Privacy Shield: Effects on Data companies

An overview of the framework and its impact on companies dealing with data


The EU Commission adopted a new framework for the protection of transatlantic personal data transfers of anyone in the EU to the US. This was finalised on 21 July 2016 and, needless to say, has since been facing some legal challenges – the latest by a French privacy advocacy group (here). 500 companies have already signed up to the Privacy Shield, and a larger number of applications are being processed by the US Department of Commerce. Only time will tell whether this framework finally puts all anxieties to rest or whether it will meet the same fate as the Safe Harbour Privacy Principles (a swift and potentially untimely end – here).

Privacy Shield FAQs (Ref)

  • What is the need for Privacy Shield? To provide high levels of privacy protection for personal data collected in the EU and transferred to the US for processing
  • Who will be covered? Personal data of any individual (EU citizen or not) collected in the EU for transfer to the US and beyond (as discussed below)
  • Is Privacy Shield the only framework under which one can transfer data? There are many tools that can be used here, like standard contractual clauses (SCCs), binding corporate rules, and the Privacy Shield. However, each has its own challenges. For example, SCCs are also under a legal challenge from the Irish Data Protection Authority (DPA), and the current execution method makes them painfully slow and burdensome. (details here)
  • Which companies does it apply to? Any company that is the recipient of personal data transferred from the EU must first sign up to this framework with the US Department of Commerce. If they pass the data to any other US or non-EU agent, the company will need to sign a contract with the agent to ensure that the agent upholds the same level of privacy protection requirements.
  • What are the salient features of the Privacy Shield?
    • Transparency
      • Easy access to website for list of companies signed up under Privacy Shield
      • Logical order of steps that could be taken to address any suspected violations by a company with clear guidelines of time frame within which an update/decision must be provided
    • Privacy principles upheld such as
      • Individual has the right to be informed about the type of and reason for information collected by a company, the reason for transfer of data elsewhere, etc.
      • Limitations to use of the data for a purpose different from the original purpose
      • Obligation to minimize data collection and to store it only for the time required
      • Obligation to secure data
      • Allow access to individual for correction of data
  • What are the shortfalls? Besides introducing operational hurdles, Privacy Shield fails to provide clarity on US government surveillance of personal data. Additionally, this framework does not apply to within-EU transfers of data (to be covered under the General Data Protection Regulation from 2018)

How does this impact data related companies?

Given the amount of investment (Microsoft and Facebook together invested in creating an undersea transatlantic cable network) and potential trade under threat, companies have heaved a sigh of relief with this new framework coming into effect and have started signing up enthusiastically. However, there are quite a few issues that could have an impact going forward, such as:

  • The framework clearly states that the use of the collected data for purposes unrelated to the original is prohibited. This would affect the current practice of sharing data with other companies to show “relevant” ads on websites
  • The obligation to sign contracts with agents to ensure same level of privacy protection as under the Privacy Shield can be quite a taxing exercise and may affect some long time partnerships
  • A number of operationally intensive steps have been introduced such as a variety of opt-in/opt-out possibilities, consent process, ability for individuals to access their data.
  • It is difficult to justify the appropriate time frame for storing data, especially geolocation data
  • The ongoing legal cases bring into question the longevity of the framework
  • Potential mismatch between the requirements for within EU and outside EU data transfers

by arvinsahni at November 18, 2016 05:48 AM

Ph.D. student

Protected: I study privacy now

This post is password protected. You must visit the website and enter the password to continue reading.

by Sebastian Benthall at November 18, 2016 05:20 AM

November 16, 2016

adjunct professor

On Kenneth Rogoff’s The Curse of Cash

Professor Kenneth Rogoff’s Curse of Cash convincingly argues that we pay a high price for our commitment to cash: Over a trillion dollars of it is circulating outside of US banks, enough for every American to be holding $4,200. Eighty percent of US currency is in hundred dollar bills, yet few of us actually carry large bills around (except perhaps in the Bay Area, where the ATMs do dispense 100s…). So where is all this money? Rogoff’s careful evidence gathering points to the hands of criminals and tax evaders. Perhaps more importantly, the availability of cash makes it impossible for central banks to pursue negative interest rate policies—because we can just hoard our money as cash and have an effective zero interest rate.

What to do about this? Rogoff does not argue for a cashless economy, but rather a less-cash economy. Eliminate large bills, particularly the $100 (interesting fact: $1 million in 100s weighs just 22 pounds), and then moving large amounts of value around illegally becomes much more difficult. Proxies for cash are not very good—they are illiquid, heavy, or easily detectable. And what about Bitcoin?—not as anonymous as people think. Think Rogoff’s plan is impossible? Well, Indian Prime Minister Modi just implemented a version of it, eliminating the 500 and 1,000 rupee notes.

As you might imagine, Rogoff’s proposal angers many privacy advocates and libertarians. His well written, well informed, and well argued book deserves more than its 2 stars on Amazon.

My critique is a bit different from the discontents on Amazon. I think Rogoff’s proposal offers a good opportunity to think through what consumer protection in payments systems might look like in a less-cash world—this is a world I think we are entering. Yet, Rogoff’s discussion shows a real lack of engagement in the payments and especially the privacy literature. For Rogoff’s proposal to be taken seriously, we need to revamp payments to address the problems of fees, cybersecurity, consumer protection, and other pathologies that electronic payments exacerbate.

The Problem of Fees

One immediately apparent problem is that as much as cash contributes to crime and tax evasion, electronic payments contribute to waste as well, in different ways. The least obvious is the cartel-like fees imposed by electronic payments providers. All consumers—including cash users—subsidize the cost of electronic payments, and the price tag is massive. In the case of credit cards, fees can be as high as 3.5% of the transaction. I know from practice that startups’ business models are sometimes shaped around the problem of such fees. Fees may even be responsible for the absence of a viable micropayment system for online content.

Fees represent a hidden tax that a less-cash society will pay more of, unless users are transitioned to payment alternatives that draw directly from their bank accounts. Rogoff seems to implicitly assume that consumers will choose that alternative, but it is not clear to me that consumers perceive the fee difference between standard credit card accounts and use of debit or ACH-linked systems. For many consumers, especially more affluent ones, the obvious choice is to choose a credit card, pay the balance monthly, and enjoy the perks. Rogoff’s policy then means more free perks for the rich, subsidized by poorer consumers.

Taking Cybercrime Seriously

Here’s a more obvious crime problem—while Rogoff is quick to observe that cash means that cashiers will skim, there is less attention paid to the kinds of fraud that electronic payments enable. Electronic payment creates new planes of attack for different actors who are not in proximity to the victims. A cashier will skim a few dollars a night, but can be fired. Cybercriminals will bust out for much larger sums from safe havens elsewhere in the world.

The Problem of Impulsive Spending and Improvidence

Consumers also spend more when they use electronic payments. And so a less cash society means that you’ll have…less money! Cash itself is an abstract representation of value, but digital cash is both an abstraction and immaterial. One doesn’t feel the “sting” of parting with electronic cash. In fact, there is even a company making a device to simulate parting with cash to deter frivolous spending.

The Problem of Cyberattack

Rogoff imagines threats to electronic payment as power outages and the like. That’s just the beginning. There are cybercriminals who are economically motivated, but then there are those who just want to create instability or make a political statement. We should expect attacks on payments to affect confidentiality, integrity, and availability of services, and these attacks will come both from economically-motivated actors, to nation states, to terrorists simply wanting to put a thumb in the eye of commerce. The worst attacks will not be power-outage-like events, but rather attacks on integrity that undermine trust in the payment system.

Moving From Regulation Z to E

The consumer protection landscape tilts in the move from credit cards to debit and ACH. Credit cards are wonderful because the defaults protect consumers from fraud almost absolutely. ACH and debit payments place far more risk of loss onto the consumer, theoretically, more risk than even cash presents. For instance, if a business swindles a cash-paying customer, that customer only loses the cash actually transferred. In a debit transaction, the risk of loss is theoretically unlimited unless it is noticed by the consumer within 60 days. Many scammers operate today and make millions by effectuating small, unnoticed charges against consumers’ electronic accounts.

The Illiberal State; Strong Arm Robbery

Much of Rogoff’s argument depends on other assumptions, ones that we might not accept so willingly anymore. We currently live in a society committed to small-l liberal values. We have generally honest government officials. What if that were to change? In societies plagued with corruption and the need to bribe officials, mobile payments become a way to extract more money from the individual than she would ordinarily carry. Such systems make it impossible to hide how much money one has from officials or in a strong-arm robbery.

Paying Fast and Slow

Time matters, and Rogoff is wrong about the relative speed of payment in a cash versus electronic transaction. Rogoff cites a 2008 study showing that debit and cash transactions take the same amount of time. This is a central issue for retailers, and large ones such as Wal-Mart know to the second what is holding up a line, because these seconds literally add up to millions of dollars in lost sales. Retailers mindful of time kept credit card transactions quick, but with the advent of chip transactions, cash clearly is the quickest method of payment. It is quite aggravating to wait for so many people charging small purchases nowadays.

Mobile might change these dynamics, but not anytime soon. Bluetooth basically does not work. To use mobile payments safely, one should keep one's phone locked. So when you add up the time of 1) unlocking the phone, 2) finding the payment app, 3) futzing with it, and 4) waiting for the network to approve the transaction, cash is going to be quicker. These transaction costs could be lowered, but the winner is going to be the platform-provided approaches (Apple or Android) and not competitive apps.

Privacy 101

Privacy is a final area where Rogoff does not identify the literature or the issues involved. And this is too bad because electronic payments need not eliminate privacy. In fact, our current credit card system segments information such that it gives consumers some privacy: Merchants have problems identifying consumers because names are not unique and because some credit card networks prohibit retailers from using cardholder data for marketing. The credit card network is a kind of ISP and knows almost nothing about the transaction details. And the issuing and acquiring banks know how much was spent and where, but not the SKU-level data of purchases.

The problem is that almost all new electronic payments systems are designed to collect as much data as possible and to spread it around to everyone involved. This fact is hidden from the consumer, who might already falsely assume that there’s no privacy in credit transactions.

The privacy differential has real consequences that Rogoff never really contemplates or addresses. It ranges from customer profiling to the problem that you can never just buy a pack of gum without telling the retailer who you are. You indeed may have “nothing to hide” about your gum, but consider this—once the retailer identifies you, you have an “established business relationship” with that retailer. The retailer then has the legal and technical ability to send you spam, telemarketing calls, and even junk fax messages! This is why Jan Whittington and I characterized personal information transfers as “continuous” transactions—exchanges where payment doesn’t sever the link between the parties. Such continuous transactions have many more costs than the consumer can perceive.


Professor Rogoff’s book describes in detail how cash enables more crime and tax evasion, and how it keeps our government from implementing more aggressive monetary policy. But the problem is that the proposed remedy suffers from a series of pathologies that will increase costs to consumers in other ways, perhaps dramatically. So yes, there is a curse of cash, but there are dangerous and wasteful curses associated with electronic payment, particularly credit.

The critiques I write here are well established in the legal literature. Merely using the Google would have turned up the various problems explained here. And this makes me want to raise another point that is more general about academic economists. I have written elsewhere that economists’ disciplinarity is a serious problem, leading to scholarship out of touch with the realities of the very businesses that economists claim to study. I find surprisingly naive works by economists in privacy who seem immune to the idea that smart people exist outside the discipline and may have contemplated the same thoughts (often decades earlier). Making matters worse, the group agreement to observe disciplinary borders creates a kind of Dunning–Kruger effect, because peer review also misses relevant literature outside the discipline. Until academic economists look beyond the borders of their discipline, their work will always be a bit irrelevant, a bit out of step. And the industry will not correct these misperceptions because works such as these benefit banks’ policy goals.

by web at November 16, 2016 07:59 PM

November 10, 2016

Ph.D. alumna

Put an End to Reporting on Election Polls

We now know that the US election polls were wrong. Just like they were in Brexit. Over the last few months, I’ve told numerous reporters and people in the media industry that they should be wary of the polling data they’re seeing, but I was generally ignored and dismissed. I wasn’t alone — two computer scientists whom I deeply respect — Jenn Wortman Vaughan and Hanna Wallach — were trying to get an op-ed on prediction and uncertainty into major newspapers, but were repeatedly told that the outcome was obvious. It was not. And election polls will be increasingly problematic if we continue to approach them the way we currently do.

It’s now time for the media to put a moratorium on reporting on election polls and fancy visualizations of statistical data. And for data scientists and pollsters to stop feeding the media hype cycle with statistics that they know have flaws or will be misinterpreted as fact.

Why Political Polling Will Never Be Right Again

Polling and survey research has a beautiful history, one that most people who obsess over the numbers don’t know. In The Averaged American, Sarah Igo documents three survey projects that unfolded in the mid-20th century that set the stage for contemporary polling: the Middletown studies, Gallup, and Kinsey. As a researcher, it’s mind-blowing to see just how naive folks were about statistics and data collection in the early development of this field, and how much the field has learned and developed since. But there’s another striking message in this book: Americans were willing to contribute to these kinds of studies at unparalleled levels compared to their peers worldwide because they saw themselves as contributing to the making of public life. They were willing to reveal their thoughts, beliefs, and ideas because they saw doing so as productive for them individually and collectively.

As folks unpack the inaccuracies of contemporary polling data, they’re going to focus on technical limitations. Some of these are real. Cell phones have changed polling — many people don’t pick up unknown numbers. The FCC’s late-2015 ruling limiting robocalls to protect consumers meant that this year’s sampling process got skewed, that polling became more expensive, and that pollsters took shortcuts. We’ve heard about how efforts to extrapolate representativeness from small samples mess with the data — such as the NYTimes report on a single person distorting national polling averages.

But there’s a more insidious problem with the polling data that is often unacknowledged. Everyone and their mother wants to collect data from the public. And the public is tired of being asked, which they perceive as being nagged. In swing states, registered voters were overwhelmed with calls from real pollsters, fake pollsters, political campaigns, fundraising groups, special interest groups, and their neighbors. We know that people often lie to pollsters (social desirability bias), but when people don’t trust information collection processes, normal respondent bias becomes downright deceptive. You cannot collect reasonable data when the public doesn’t believe in the data collection project. And political pollsters have pretty much killed off their ability to do reasonable polling because they’ve undermined trust. It’s like what happens when you plant the same crop over and over again until the land can no longer sustain that crop.

Election polling is dead, and we need to accept that.

Why Reporting on Election Polling Is Dangerous

To most people, even those who know better, statistics look like facts. And polling results look like truth serum, even when pollsters responsibly report margin of error information. It’s just so reassuring or motivating to see stark numbers because you feel like you can do something about those numbers, and then, when the numbers change, you feel good. This plays into basic human psychology. And this is why we use numbers as an incentive in both education and the workplace.

Political campaigns use numbers to drive actions on their teams. They push people to go to particular geographies, they use numbers to galvanize supporters. And this is important, which is why campaigns invest in pollsters and polling processes.

Unfortunately, this psychology and logic gets messed up when you’re talking about reporting on election polls in the public. When the numbers look like your team is winning, you relax and stop fretting, often into complacency. When the numbers look like your team is losing, you feel more motivated to take steps and do something. This is part of why the media likes the horse race — they push people to action by reporting on numbers, which in effect pushes different groups to take action. They like the attention that they get as the mood swings across the country in a hotly contested race.

But there is number burnout and exhaustion. As people feel pushed and swayed, as the horse race goes on and on, they get more and more disenchanted. Rather than galvanizing people to act, reporting on political polling over a long period of time with flashy visuals and constantly shifting needles prompts people to disengage from the process. In short, when it comes to the election, this prompts people to not show up to vote. Or to be so disgusted that voting practices become emotionally negative actions rather than productively informed ones.

This is a terrible outcome. The media’s responsibility is to inform the public and contribute to a productive democratic process. By covering political polls as though they are facts in an obsessive way, they are not only being statistically irresponsible, but they are also being psychologically irresponsible.

The news media are trying to create an addictive product through their news coverage, and, in doing so, they are pushing people into a state of overdose.

Yesterday, I wrote about how the media is being gamed and not taking moral responsibility for its participation in the spectacle of this year’s election. One of its major flaws is how it’s covering data and engaging in polling coverage. This is, in many ways, the easiest part of the process to fix. So I call on the news media to put a moratorium on political polling coverage, to radically reduce the frequency with which they reference polls during an election season, and to be super critical of the data that they receive. If they want to be a check to power, they need to have the structures in place to be a check to math.

(This was first posted on Points.)

by zephoria at November 10, 2016 07:53 PM

November 09, 2016

Center for Technology, Society & Policy

Un-Pitch Day success & project opportunities

Our Social Impact Un-Pitch Day event back in October was a great success — held in conjunction with the Information Management Student Association at the School of Information, organizational attendees received help scoping potential technology projects, while scores of Berkeley students offered advice with project design and also learned more about opportunities both to help the attending organizations and CTSP funding.

A key outcome of the event was a list of potential projects developed by 10 organizations, from social service non-profits such as the Berkeley Food Pantry to technology advocacy groups such as the ACLU of Northern California and the Center for Democracy and Technology (just to name a few!).

We are providing a list of the projects (with contact information) with the goal both of generating interest in these groups’ work as well as providing potential project ideas and matches for CTSP applicants. Please note that we cannot guarantee funding for these projects should you choose to “adopt” a project and work with one of these organizations. Even if a project match doesn’t result in a CTSP fellowship, we hope we can match technologists with these organizations to help promote tech policy for the public interest regardless.

Please check out the list and consider contacting one of these organizations ASAP if their project fits your interests or skill sets! As a reminder, the deadline to apply to CTSP for this funding cycle is November 28, 2016.

by Jennifer King at November 09, 2016 11:19 PM

Ph.D. alumna

I blame the media. Reality check time.

For months I have been concerned about how what I was seeing on the ground and in various networks was not at all aligned with what pundits were saying. I knew the polling infrastructure had broken, but whenever I told people about the problems with the sampling structure, they looked at me like an alien and told me to stop worrying. Over the last week, I started to accept that I was wrong. I wasn’t.

And I blame the media.

The media is supposed to be a check to power, but, for years now, it has basked in becoming power in its own right. What worries me right now is that, as it continues to report out the spectacle, it has no structure for self-reflection, for understanding its weaknesses, its potential for manipulation.

I believe in data, but data itself has become spectacle. I cannot believe that it has become acceptable for media entities to throw around polling data without any critique of the limits of that data, to produce fancy visualizations which suggest that numbers are magical information. Every pollster got it wrong. And there’s a reason. They weren’t paying attention to the various structural forces that made their sample flawed, the various reasons why a disgusted nation wasn’t going to contribute useful information to inform a media spectacle. This abuse of data has to stop. We need data to be responsible, not entertainment.

This election has been a spectacle because the media has enjoyed making it as such. And in doing so, they showcased just how easily they could be gamed. I refer to the sector as a whole because individual journalists and editors are operating within a structural frame, unmotivated to change the status quo even as they see similar structural problems to the ones I do. They feel as though they “have” to tell a story because others are doing so, because their readers can’t resist reading. They live in the world pressured by clicks and other elements of the attention economy. They need attention in order to survive financially. And they need a spectacle, a close race.

We all know that story. It’s not new. What is new is that they got played.
Over the last year, I’ve watched as a wide variety of decentralized pro-Trump actors first focused on getting the media to play into his candidacy as spectacle, feeding their desire for a show. In the last four months, I watched those same networks focus on depressing turnout, using the media to trigger the populace to feel so disgusted and frustrated as to disengage. It really wasn’t hard because the media was so easy to mess with. And they were more than happy to spend a ridiculous amount of digital ink circling round and round into a frenzy.

Around the world, people have been looking at us in a state of confusion and shock, unsure how we turned our democracy into a new media spectacle. What hath 24/7 news, reality TV, and social media wrought? They were right to ask. We were irresponsible to ignore.

In the tech sector, we imagined that decentralized networks would bring people together for a healthier democracy. We hung onto this belief even as we saw that this wasn’t playing out. We built the structures for hate to flow along the same pathways as knowledge, but we kept hoping that this wasn’t really what was happening. We aided and abetted the media’s suicide.
The red pill is here. And it ain’t pretty.

We live in a world shaped by fear and hype, not because it has to be that way, but because this is the obvious paradigm that can fuel the capitalist information architectures we have produced.

Many critics think that the answer is to tear down capitalism, make communal information systems, or get rid of social media. I disagree. But I do think that we need to actively work to understand complexity, respectfully engage people where they’re at, and build the infrastructure to enable people to hear and appreciate different perspectives. This is what it means to be truly informed.

There are many reasons why we’ve fragmented as a country. From the privatization of the military (which undermined the development of diverse social networks) to our information architectures, we live in a moment where people do not know how to hear or understand one another. And our obsession with quantitative data means that we think we understand when we hear numbers in polls, which we use to judge people whose views are different than our own. This is not productive.

Most people are not apathetic, but they are disgusted and exhausted. We have unprecedented levels of anxiety and fear in our country. The feelings of insecurity and inequality cannot be written off by economists who want to say that the world is better today than it ever was. It doesn’t feel that way. And it doesn’t feel that way because, all around us, the story is one of disenfranchisement, difference, and uncertainty.

All of us who work in the production and dissemination of information need to engage in a serious reality check.

The media industry needs to take responsibility for its role in producing spectacle for selfish purposes. There is a reason that the public doesn’t trust institutions in this country. And what the media has chosen to do is far from producing information. It has chosen to produce anxiety in the hopes that we will obsessively come back for more. That is unhealthy. And it’s making us an unhealthy country.

Spectacle has a cost. It always has. And we are about to see what that cost will be.

(This was first posted at Points.)

by zephoria at November 09, 2016 04:47 PM

MIMS 2016

All of these were created by actual people to emphasize bad form design.

All of these were created by actual people to emphasize bad form design. Here are their actual sources if you are interested in appropriate attribution:

by nikhil at November 09, 2016 12:20 AM

November 02, 2016

Center for Technology, Society & Policy

Now accepting 2016-2017 fellow applications!

Our website is now updated with both the fellowship application and the application upload page. As a reminder, we are accepting applications for the 2016-2017 cycle through Monday, November 28 at 11:59am PT.

by Jennifer King at November 02, 2016 10:47 PM

MIMS 2011

How Wikipedia’s silent coup ousted our traditional sources of knowledge

[Reposted from The Conversation, 15 January 2016]

As Wikipedia turns 15, volunteer editors worldwide will be celebrating with themed cakes and edit-a-thons aimed at filling holes in poorly covered topics. It’s remarkable that an encyclopedia project that allows anyone to edit has got this far, especially as the website is kept afloat through donations and the efforts of thousands of volunteers. But Wikipedia hasn’t just become an important and heavily relied-upon source of facts: it has become an authority on those facts.

Through six years of studying Wikipedia I’ve learned that we are witnessing a largely silent coup, in which traditional sources of authority have been usurped. Rather than discovering what the capital of Israel is by consulting paper copies of Encyclopedia Britannica or geographical reference books, we source our information online. Instead of learning about thermonuclear warfare from university professors, we can now watch a YouTube video about it.

The ability to publish online cheaply has led to an explosion in the number and range of people putting across facts and opinions, far beyond what was traditionally delivered through largely academic publishers. But rather than this leading to an increase in the diversity of knowledge and the democratisation of expertise, the result has actually been greater consolidation in the number of knowledge sources considered authoritative. Wikipedia, particularly in terms of its alliance with Google and other search engines, now plays a central role.

From outsider to authority

Once ridiculed for allowing anyone to edit it, Wikipedia is now the seventh most visited website in the world, and the most popular reference source among them. Wikipedia articles feature at the top of the majority of searches conducted on Google, Bing, and other search engines. In 2012, Google announced the Knowledge Graph which moved Google from providing possible answers to a user’s questions in the search results it offers, to providing an authoritative answer in the form of a fact box with content drawn from Wikipedia articles about people, places and things.

Perhaps the clearest indication of Wikipedia’s new authority is demonstrated by who uses it and regards its content as credible. Whereas governments, corporations and celebrities couldn’t have cared less whether they had a Wikipedia page in 2001, now tales of politicians, celebrities, governments or corporations (or their PR firms) ham-fistedly trying to edit Wikipedia articles on them to remove negative statements or criticism regularly appear in the news.

Happy 15th birthday Wikipedia! Beko, CC BY-SA 

Wisdom of crowds

How exactly did Wikipedia become so authoritative? Two complementary explanations stand out from many. First is the rise of the idea that crowds are wise and the logic that open systems produce better quality results than closed ones. Second is the decline in the authority accorded to scientific knowledge, and the sense that scientific authorities are no longer always seen as objective or even reliable. As the authority of named experts housed in institutions has waned, Wikipedia, as a site that the majority of users believe is contributed to by unaffiliated and therefore unbiased individuals, has risen triumphant.

The realignment of expertise and authority is not new; changes to whom or what society deems credible sources of information have been a feature of the modern age. Authors in the field of the sociology of knowledge have written for decades about the struggles of particular fields of knowledge to gain credibility. Some have been more successful than others.

What makes today’s realignment different is the ways in which sources like Wikipedia are governed and controlled. Instead of the known, visible heads of academic and scientific institutions, sources like Wikipedia are largely controlled by nebulous, often anonymous individuals and collectives. Instead of transparent policies and methods, Wikipedia’s policies are so complex and numerous that they have become obscure, especially to newcomers. Instead of a few visible gatekeepers, the internet’s architecture means that those in control are often widely distributed and difficult to call to account.

Wikipedia is not neutral. Its platform has enabled certain languages, topics and regions to dominate others. Despite the difficulty of holding our new authorities of knowledge to account, it’s a challenge that’s critical to the future of an equitable and truly global internet. There are new powers in town, so alongside the birthday cake and celebrations there should be some reflection on who will watch Wikipedia and where we go from here.

by Heather Ford at November 02, 2016 10:15 AM

November 01, 2016

MIDS student

First blog post – Terms and Conditions apply !!

If you are like me, you would have never read a single terms and conditions document (lovingly called T&Cs), whether it’s for a website, an app, or even a contract. I am guilty of signing job contracts after just checking name, $$$ and location, assuming that it’s all in good faith.

I have recently learnt that the concept of good faith is heavily skewed against people like you and me.

As part of a class assignment, we were asked to read the T&C doc of any app/website of our choice. I approached this with a sinking feeling, expecting the next hour to be wasted fighting sleep while trying to make sense of this soporific exercise. I selected a data science related website that I had been following for a while. Needless to say, I had signed up without reading the T&Cs.

The first sentence of the T&C had me hooked. It stated in CAPS: “IF YOU DO NOT AGREE TO ALL OF THE FOLLOWING, YOU MAY NOT USE OR ACCESS THE SERVICES IN ANY MANNER.” Somehow that felt like someone had just threatened me.
Surprisingly, the document used very simple and easy-to-understand language, i.e. I did not need to hire a lawyer to read a menagerie of comma- and semicolon-infused sentences. Bravo on that!

The simplicity also meant that, for the first time, I realised the imbalance of power between users and service providers. My notion of buyers having the upper hand was turned on its head, although “buyers” is not technically the correct term, as the website in question provides a free service.

As you read through the document, it is quite evident that they have access to a lot more than you can imagine. God forbid you are a competition winner: you are essentially giving them a royalty-free, global, timeless license to use your material. To be fair to them, they do state that ownership is still yours, though I am not sure what one does with ownership when someone else can use and distribute the material freely.

The scary or comforting thing (depending on which camp you come from) is that this is not a lone case. My classmates analysed many apps in varying fields like social media, fitness, transport, finance, etc., and found similar issues. For example, you realise that if you have accounts with certain websites, each and every movement of yours on that website, its sister websites, and any website that may be using its services is being tracked because, of course, they want to show you better ads! But what if I don’t care about ads? Well, sometimes you can escape by just upgrading to premium (no guarantee of not being tracked even then) or, as in most cases, make your peace with the fact that privacy may not be so private after all. Besides a high annoyance factor, ads can also be misleading at times. One very popular social media site states the following in its T&C: “You give us permission to use your name, profile picture, content, and information in connection with commercial, sponsored, or related content (such as a brand you like) served or enhanced by us. ….. You understand that we may not always identify paid services and communications as such.”

Another interesting trend that was pointed out was the exponential increase in the length of the T&C document with each passing year of existence - most probably due to lawsuits and/or changes in privacy, corporate, or other laws.

At least after this exercise, I try to skim through the T&Cs of most new websites that I sign up for. However, each time I end up with a queasy feeling that I should not click the “Agree” button, but the attraction of that new functionality, especially when it’s free, is too difficult to overcome.

(CONFESSION: I have not yet read the T&C for WordPress)

by arvinsahni at November 01, 2016 11:32 PM

October 30, 2016

adjunct professor

October 25, 2016

MIMS 2016

Bots As A Prototyping Tool

A means to an end and not the end itself.

I wanted to talk about a particular way that I think bots can be used but aren’t being used right now: as a prototyping tool. A few months back, I built my first bot (on Slack) as a prototype for usability testing. Let me tell you how I created a bot to use as a prototype in the design process.

Prototyping Using Conversation

Meet TriviaBot.

The Problem

Back in Spring 2016, I was in the final semester of my Master’s degree program at UC Berkeley. For the capstone project, we were building a natural language processing API called the Gadfly Project. Specifically, one that would automate the process of question generation. Given input text, our API would generate questions based on it.

To test our work and prioritize efforts, we needed to validate the quality of the questions being generated by our API. For this, we needed to get people to use the API. Our constraints were clear — use questions generated by our algorithm and keep the distractions to a minimum so we get the data we want.

Why Slack? Why Bot?

The whole product development process so far had been focused on: What is the quality of questions that we can generate using algorithms? This meant that our prototype to test would need to access the API and the algorithms. This can take more time (to reach a level of acceptability) but that was the kind of feedback we needed.

In addition, we wanted to create a natural context for testing the quality of machine generated questions. What is the natural context for testing a checkout flow? Maybe a shopping website. What is the natural context for testing questions? Maybe a conversation. A bot provided a much more natural conversational workflow that would be better for testing question quality.

Bots also provide a great opportunity for distribution thanks to the reach Slack has. Our school community had early-adopted (free) Slack as a collaboration tool. Through this, we had access to around 300 people and a place where we could control the testing.

Also, we could use two Slack features to improve the test experience.

  1. We were able to allow for feedback using Slack reactions to let people know if they liked or disliked a question. This was simple quantitative feedback we could use to inform our work going forward.
  2. People could interact with the bot by playing news based trivia in a dedicated Slack channel. Our initial iteration relied on people initiating the interaction but we wanted to reduce that friction. So, we used Slack’s /remind feature to automate the test.
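As a sketch of how reaction-based feedback like this can be tallied, here is a hypothetical helper; the emoji-to-choice mapping and the payload shape (modeled loosely on the list of reactions Slack’s API attaches to a message) are assumptions for illustration, not TriviaBot’s actual code:

```python
# Hypothetical: map Slack reaction emoji to multiple-choice answers,
# then count votes from a reactions list like the one Slack's API
# attaches to a message ({'name': emoji, 'count': n} per reaction).
CHOICE_EMOJI = {'one': 'A', 'two': 'B', 'three': 'C', 'four': 'D'}

def tally_votes(reactions):
    """Count answer votes per choice, ignoring unrelated reactions."""
    votes = {}
    for r in reactions:
        choice = CHOICE_EMOJI.get(r['name'])
        if choice is not None:
            votes[choice] = votes.get(choice, 0) + r['count']
    return votes

print(tally_votes([{'name': 'one', 'count': 3},
                   {'name': 'two', 'count': 1},
                   {'name': 'eyes', 'count': 2}]))  # {'A': 3, 'B': 1}
```

Keeping the tallying logic separate from the Slack API calls like this also makes the prototype easy to unit-test between usability sessions.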

Here is what we learned from testing with the bot:
  1. For this prototype, thanks to the conversational aspect, “thinking aloud” came more naturally to our participants. Thinking aloud as a method helps you find out how users interpret design elements and why they find some hard to use.
  2. Our usability tests quickly showed some immediate flaws that needed to be fixed. People complained that the questions being asked were too specific and “nitpicky”. They also expressed confusion with how to proceed when they didn’t know the answer.
  3. People requested hints. We could use the bot itself to provide hints upon request but that would have made it more work for people. So, we chose to instead implement multiple choice questions that immediately led to a better experience.
  4. It was interesting to see participants fabricate a personality for the bot based on their interactions. An initial iteration of the bot icon was considered “judgmental” because the bot asks questions to see if you know something. The activity your bot performs (reporting, documenting, retrieval) can influence the attitude users have towards it, based on their previous experiences and mental models.
  5. People’s attitude towards the testing platform can interfere with your test. Make sure you recruit participants correctly. In our case, one participant did not understand how to use reactions. In fact, they didn’t know what reactions meant on Slack. Reactions representing each answer choice were added by the bot so that there would be no ambiguity about how the users should answer the question.

The conversational interface prototype led to clear and immediate improvements to the algorithm (introduced multiple choice questions) and also to how we structured interactions with the user.

In this article, I covered my experience with prototyping using a conversational interface. I think bots can be a genuine addition to the design process. For prototyping, they can be used where a conversational element is a key part of the experience. If you are currently designing bots, here are a few things that helped me during prototyping:

Throughout the process, it is important to focus on the main question: what are you trying to learn?

Test your assumptions before the actual test. You don’t need to create the perfect interaction but make sure you aren’t impeding the conversation.

It is possible to iterate rapidly between tests. The beauty of being a bot designer/developer is that you can quickly build on what you learnt. If you are a designer, you can always involve the developer in the tests to get buy-in.

That said, often conversations take unexpected routes so be open to learn.

A while back, I put together principles that I used for designing bots. Hit me up on twitter if you are working on a bot or want to talk user experience or bot building or the importance of spaceship corridors in science fiction world building.

Bots As A Prototyping Tool was originally published in Chatbot’s Life on Medium, where people are continuing the conversation by highlighting and responding to this story.

by nikhil at October 25, 2016 11:24 PM

October 16, 2016

Ph.D. student

Are national election betting markets efficient with respect to state-level prediction markets?

PredictIt has several prediction markets for the general election. However, PredictIt also has a market for the election outcome in every state. If we took each state's prediction market, and used it to simulate the general election, what would we find? If the market is roughly efficient, we would expect the mean of simulated election outcomes to be about the same as the national election markets' share price.

Surprise! This is not the case. Simulated national odds are higher than actual national odds. I suspect some state-level markets are too bullish on Clinton (Georgia, for example, trades as high as 30 cents). Discussion follows. (Update, Nov 3: the estimated price is now lower than the national price. Again, some states seem too bearish on Clinton; Colorado is trading as low as 60 cents for her shares.)


This naive approach yields a 97% chance of a Clinton win. This estimate is significantly higher than the PredictIt general election market, and higher than Rothschild's debiased aggregate. However, in each simulated election, state outcomes should be highly correlated with one another (thanks to David Rothschild for pointing this out).
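The gap between the naive estimate and the market can be illustrated with a toy simulation (all numbers hypothetical: 51 equal-weight states, each giving one candidate a 60% chance of winning). Treating states as independent lets per-state noise average out, inflating the national win probability well above the per-state probability; a single shared shock brings it back down:

```python
import random

def simulate(p, n_states=51, trials=20000, correlated=False, seed=0):
    """Estimate the chance that a candidate with per-state win
    probability p carries a majority of n_states equal-weight states."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        if correlated:
            # one shared shock: every state swings together
            carried = n_states if rng.random() < p else 0
        else:
            # independent states: per-state noise averages out
            carried = sum(rng.random() < p for _ in range(n_states))
        if carried > n_states // 2:
            wins += 1
    return float(wins) / trials

print(simulate(0.6))                   # independent: well above 0.6
print(simulate(0.6, correlated=True))  # fully correlated: about 0.6
```

The real correlation structure is somewhere in between, which is what the election-level 'temperature' below tries to capture.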

After adding an election-level 'temperature' to each trial, and using that to jitter state-level election results, I get a predicted share price of 85¢ and predicted odds of a Clinton win of 92%, more in line with FiveThirtyEight and PredictWise, respectively, but still higher than PredictIt's national market, where Clinton shares are trading for about 80¢.


In any case, it does not seem that national betting markets are totally efficient with respect to their state-level counterparts. While discrepancies between obviously equivalent shares (e.g. "Will a woman be the president?") tend to get arbitraged away, the same does not seem to be true for odds on the national betting market versus the state betting markets.

So, are the general markets not bullish enough on Clinton, or are the state markets too bullish on Clinton? I suspect both. A few states -- specifically, Georgia, Alaska, Arizona and Montana -- may have overpriced Clinton shares. The betting odds in Georgia, e.g., are 62/27 Trump. I bet -- and I probably didn't need to program anything to know this -- that Trump wins in Georgia.

On the other hand, FiveThirtyEight and PredictWise both have odds above what one would expect from the national market odds. Now I can add my own forecast to that list. So perhaps Clinton prices are undervalued in the national betting markets.

All of this said, I am not a trained economist, statistician or political scientist. AND YOU SHOULD NEVER BET! Also, this work suffers from a lack of historical data on PredictIt markets, which prevents me from properly turning my own share data into probabilities. I am relying on David Rothschild's aggregated market probability.

Much thanks to: Daniel Griffin, Alex Stokes, and David Rothschild for feedback. If you have any comments on this work, or if you have historical data from a prediction market, please contact me:

ffff [at] berkeley [] edu

Simulating the election with market-derived probabilities

All the code for replicating these results is available here, as a Jupyter notebook.

First, we'll get a prediction market from each state on PredictIt.

PredictIt has a Republican and Democratic market for each state election, with each market having its own yes/no. The odds on an outcome may be slightly different between markets (more on this below).

!pip install requests
%pylab inline
import requests 
import json

def request_json (url):
    # NB: the request headers were elided in the original post
    return json.loads(
        requests.get(url, headers={}).content)

my_url = ''

def predictit (ticker):
    return request_json('' + ticker)

def market (state_abbreviation, party):
    api_res = predictit('DEM.'+state_abbreviation+'.USPREZ16')
    contracts = api_res['Contracts']
    contract = filter(lambda c: c['ShortName']==party, contracts)[0]
    return contract

# market('CA', 'Republican')

We need to turn the prediction market prices into probabilities. Following Rothschild, 2009:

First, I take the average of the bid and ask for the stock that pays out if the Democrat wins on Election Day. If the bid-ask spread is greater than five points, I take the last sale price. If there are no active offers and no sales in the last two weeks of the race, I drop the race.

We can do this separately for both Democratic and Republican markets. We'll focus on the Clinton win outcome, corresponding to a yes in the Democratic markets and a No in the Republican markets.

We should also debias this probability. Following Leigh et al. (2007), Is There a Favorite-Longshot Bias in Election Markets?, Rothschild (2009) suggests Pr = θ(1.64 · θ⁻¹(price)), where θ is the standard normal CDF.
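Reading the θ in that formula as the standard normal CDF, the debiasing step can be sketched as follows; the 1.64 coefficient is taken from the quoted formula, and prices are assumed to already be expressed as probabilities in (0, 1):

```python
from statistics import NormalDist

_PHI = NormalDist()  # standard normal distribution

def debias(price):
    """Favorite-longshot debiasing: Pr = Phi(1.64 * Phi^-1(price)).
    Since 1.64 > 1, this pushes prices away from 0.5, correcting the
    market tendency to overprice longshots and underprice favorites."""
    return _PHI.cdf(1.64 * _PHI.inv_cdf(price))

# e.g. debias(0.5) stays at 0.5, while debias(0.8) is pulled above 0.8
```

This is only a sketch of the cited formula, not the author's notebook code; as noted below, the actual analysis leaves prices undebiased for lack of historical data.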

Limitations: I don't have programmatic access to historical trade data, so I cannot find the date of last sale. Consequently, no races are dropped here. Without historical trade data, I also can't fit a debiasing coefficient of my own, so these values are not debiased. If anyone has access to historical PredictIt data, or historical data from any prediction market, please contact me:

ffff [at] berkeley [] edu

In the meantime, we'll make share prices into probabilities as best we can:

def probability (state_abbreviation, party=None):
    # Average both party markets by default
    if (party is None):
       return (probability(state_abbreviation, 'Democratic') +
              probability(state_abbreviation, 'Republican'))/2
    mkt = market(state_abbreviation, party)
    # For Republican markets, get the No cost
    if (party=='Republican'):
       sell = mkt['BestSellNoCost']
       buy = mkt['BestBuyNoCost']
    # For Democratic markets, get the Yes cost
    elif (party=='Democratic'):
       sell = mkt['BestSellYesCost']
       buy = mkt['BestBuyYesCost']
    # If the bid-ask spread is > 5, just use the last trade price
    spread = buy-sell
    if (spread > 5):
       return mkt['LastTradePrice']
    return (sell+buy)/2.0



To sanity check, we will also pull Rothschild's de-biased, market-derived probabilities.

def predictwise_states ():
    table = json.loads(requests.get('').content)
    def predictwise_state (row):
        return {'name': row[0],
                'probability': float(int(row[2].split(' ')[0]))/100.0,
                'delegates': int(row[-1])}
    return map(predictwise_state, table['table'])

pw_states = predictwise_states()

Now, we'll construct a list of all states, where each state has a probability of a Clinton win, and a number of delegates.

def state (abbrev, delegates):
    return {"abbreviation": abbrev,
           "delegates": delegates,
           "probability": probability(abbrev),}

states_delegates = {'AL':9, 'AK':3, 'AZ':11, 'AR':6, 'CA':55, 'CO':9, 'CT':7, 'DC':3, 'DE':3, 'FL':29, 'GA':16, 'HI':4, 'ID':4, 'IL':20, 'IN':11, 'IA':6, 'KS':6, 'KY':8, 'LA':8, 'ME':4, 'MD':10, 'MA':11, 'MI':16, 'MN':10, 'MS':6, 'MO':10, 'MT':3, 'NE':5, 'NV':6, 'NH':4, 'NJ':14, 'NM':5, 'NY':29, 'NC':15, 'ND':3, 'OH':18, 'OK':7, 'OR':7, 'PA':20, 'RI':4, 'SC':9, 'SD':3, 'TN':11, 'TX':38, 'UT':6, 'VT':3, 'VA':13, 'WA':12, 'WV':5, 'WI':10, 'WY':3,} 
print 'sanity check - delegates add up to 538?', 538 == sum([val for key, val in states_delegates.iteritems()])

states = [state(key,val) for key, val in states_delegates.iteritems()]

sanity check - delegates add up to 538? True

Update 10-16-16 As David Rothschild pointed out, outcomes in each state are heavily correlated with one another:

1) Impact of events are heavily correlated through all states. 2) Election Day polling error is reasonably correlated. I work off a correlation matrix that is unique to each pairwise group, but is roughly 75% correlation on average.

TODO How are election day polling errors correlated? How do we find them?

As far as correlated outcomes across states, I pick some election wide temperature (a random variable chosen from a normal distribution with a mean of 0), then, for each state, I use that temperature to generate a probability_offset unique to that state (a random variable chosen from a normal distribution with a mean of temperature).

def normal (center, scale):
    return random.normal(center,scale)

def bound (probability):
    if (probability>1):
        return 1
    elif (probability<0):
        return 0
    return probability
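Because every state's offset shares the same election-wide temperature, the model above induces correlation between states: for any two states, corr(offset_i, offset_j) = σ_t² / (σ_t² + σ_s²), where σ_t is the temperature stdev and σ_s is the per-state offset stdev. A quick sanity check of this (a sketch, separate from the notebook code, using only the standard library's random module):

```python
import random

temperature_stdev = 0.1   # same hyperparameters as the simulation below
state_offset_stdev = 0.01

# Closed form: two states' offsets share the temperature term, so
# corr = var(temperature) / (var(temperature) + var(state offset))
theoretical = temperature_stdev ** 2 / (temperature_stdev ** 2 + state_offset_stdev ** 2)

# Monte Carlo check: draw paired offsets for two states, measure correlation
n = 100000
xs, ys = [], []
for _ in range(n):
    t = random.gauss(0, temperature_stdev)
    xs.append(t + random.gauss(0, state_offset_stdev))
    ys.append(t + random.gauss(0, state_offset_stdev))

mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n
var_y = sum((y - my) ** 2 for y in ys) / n
empirical = cov / (var_x * var_y) ** 0.5

print(theoretical)  # ~0.99
print(empirical)
```

With these values the implied pairwise correlation is about 99%, noticeably above the ~75% average Rothschild quotes; raising state_offset_stdev relative to temperature_stdev would bring it down.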

Finally we can start to simulate elections. We'll use each probability to allocate delegates, or not, to Clinton. We'll do this some number of times, producing a list of Clinton delegates elected in each simulation. Now we can calculate how many of those elections Clinton won. We will count only majority wins. Electoral deadlocks will count as a loss.

# for each election, select a temperature from a distribution
temperature_mean = 0
temperature_stdev = 0.1
# for each state in an election, offset is drawn from a distribution
# where the mean of distribution is the election temperature
state_offset_stdev = 0.01

def decide (probability):
    return random.random()<probability

def election_holder (temperature):
    """Returns fn (state), which simulates an election in that state,
    and returns a number of delegates allocated to Dems (0 if loss)."""
    def hold_election (state):
        probability_offset = normal(temperature, state_offset_stdev)
        probability = bound(state['probability'] + probability_offset)
        return state['delegates'] * decide(probability)
    return hold_election

def simulate_election (states):
    """Return number of delegates for Clinton."""
    temperature = normal(temperature_mean, temperature_stdev)
    return sum(map(election_holder(temperature), states))

def simulate_elections (trials, states):
    ts = [states for i in range(trials)]
    return map(simulate_election, ts)

def percent_winning (simulations):
    # 270 of 538 delegates is a majority; a 269-269 deadlock counts as a loss
    winning = lambda delegates: delegates >= 270
    return float(len(filter(winning, simulations)))/float(len(simulations))

num_trials = 100000
simulations = simulate_elections(num_trials, states)
print 'predicted market price:',  percent_winning(simulations), '¢'

simulations = simulate_elections(num_trials, pw_states)
print 'predicted chance of Clinton win:',  percent_winning(simulations)*100, '%'
predicted market price: 0.85465 ¢
predicted chance of Clinton win: 93.083 %

Something doesn't add up. Where are the biases?

Our estimated 85¢ share price is significantly higher than the PredictIt general election market (which has about 80¢ for Clinton shares as of 10-16-16). Are the general markets not bullish enough on Clinton, or are the state markets too bullish on Clinton?

Let's assume that the national election markets, and poll-based forecasts, are roughly correct. What would it mean for the state markets to overvalue a Clinton win? Well, let's take a look at those states where market prices have an estimated 20% to 50% chance of a Clinton win.

filter(lambda s: s['probability']<0.5 and s['probability']>0.2, states)
[{'abbreviation': 'GA', 'delegates': 16, 'probability': 0.275},
 {'abbreviation': 'AK', 'delegates': 3, 'probability': 0.245},
 {'abbreviation': 'AZ', 'delegates': 11, 'probability': 0.45999999999999996},
 {'abbreviation': 'MO', 'delegates': 10, 'probability': 0.2025}]

Excluding Utah, which has a non-negligible chance of a 3rd-party win, these other states may have overvalued Clinton shares.

Of course, my derived value is also greater than the national poll value. So, maybe Clinton shares are undervalued in the national market.

All of this said: YOU SHOULD NEVER BET! I am not a professional, and this is not betting advice. Don't do it!

October 16, 2016 09:59 PM

October 14, 2016

MIMS 2018

October 10, 2016

Ph.D. alumna

Columbus Day!?!? What the f* are we celebrating?

Today is Columbus Day, a celebration of colonialism wrapped up under the guise of exploration. Children around the US are taught that European settlers came in 1492 and found a whole new land magically free for occupation. In November, they will be told that there were small and dispersed savage populations who opened their arms to white settlers fleeing oppression. Some of those students may eventually learn on their own about the violence, genocide, infection, containment, relocation, humiliation, family separation, and cultural devaluation that millions of Native peoples experienced over centuries.

Hello, cultural appropriation!

Later this month, when everyone is excited about goblins and ghosts, thousands of sexy Indian costumes will be sold, prompting young Native Americans to cringe at the depictions of their culture and community. Part of the problem is that most young Americans think that Indians are dead or fictitious. Schools don’t help — children are taught to build teepees and wear headdresses as though this is a story of the past, not a living culture. And racist attitudes towards Native people are baked into every aspect of our culture. Why is it OK for Washington’s football team to be named the Redskins? Can you imagine a football team being named after the N-word?

Historically, Native people sit out Columbus Day in silence. This year, I hope you join me and thousands of others by making a more active protest to change what people learn!

In 2004, the Smithsonian’s National Museum of the American Indian was opened on the Mall in Washington, DC as a cultural heritage institution to celebrate Native peoples and tell their stories. I’m a proud trustee of this esteemed institution. I’m even more excited by upcoming projects that are focused on educating the public more holistically about the lives and experiences of Native peoples.

As a country, we’re struggling with racism and prejudice, hate that is woven deep into our cultural fabric. Injustice is at the core of our country’s creation, whether we’re talking about the original sin of slavery or the genocide of Native peoples. Addressing inequities in the present requires us to come to terms with our past. We need to educate ourselves about the limits of our understanding about our own country’s history. And we need to stop creating myths for our children that justify contemporary prejudice.

On this day, a day that we should not be celebrating, I have an ask for you. Please help me and NMAI build an educational effort that will change the next generation’s thinking about Native culture, past and present. Please donate a multiple of $14.91 to NMAI, in honor of how much life existed on these lands before colonialist expansion. Help Indian nations achieve their rightful place of respect among the world’s nations and communities.

by zephoria at October 10, 2016 11:22 AM

October 08, 2016

MIMS 2016

Thinking About Bots

Before actually building bots.

Over the past few weeks, I have been working at the Sutardja Center at UC Berkeley on a program for students interested in working on the latest trend in human computer interaction — conversational interfaces or chatbots. This is a condensed version of a presentation/talk I gave at the kick-off event.

After talking to other students, I realized that many were not sure how to begin the process. How do you even think about building bots?

So, here’s an attempt at highlighting simple principles that I have used and you can use to think about building bots.

Define Your Bot’s Purpose

Why does this bot exist?

Define a use case for your bot and work out why anyone in this world would want to use it. Find a target user of your bot, someone who is not working on your bot, and find out how they carry out the use case currently. When you do this, you can work out the interactions your bot can offer.

Bonus: Act out the use case with another person. You, THE BOT, are just minding your own business. Enter USER who wants to get job done. Work out how the user can be successful in doing what they are trying to do and what could prevent them from being successful.

Context Makes Bots Less Annoying

Alright, you know what it is that your bot will be doing. What’s next?

How do you build bots that are not annoying? Well, from my experience, context is the most important thing. Most importantly, the bot needs to maintain state in some way. Maybe it records a person’s reply and replies to them based on it. Maybe it records everything a user says to provide intelligent recommendations? If your bot is interacting directly the context is one-on-one but in a group the bot needs to keep context with several individuals while supporting group level interactions. E.g. It would need to orient new members about how to use the bot. Maybe your bot interacts with external services or gets data from external services. How would user privacy work in this context?
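To make that concrete, here’s a minimal, hypothetical sketch of a context store that keeps per-user state within each channel and orients new members; the class and method names are illustrative, not any bot framework’s API:

```python
class BotContext:
    """Minimal in-memory context store: per-user state inside each channel.

    Hypothetical sketch -- the structure is illustrative, not a real
    framework's API. A production bot would persist this somewhere.
    """
    def __init__(self):
        self.channels = {}  # channel_id -> {user_id -> list of messages}

    def is_new_member(self, channel_id, user_id):
        # A user the bot has never seen in this channel gets an orientation
        return user_id not in self.channels.get(channel_id, {})

    def record(self, channel_id, user_id, message):
        users = self.channels.setdefault(channel_id, {})
        users.setdefault(user_id, []).append(message)

    def reply(self, channel_id, user_id, message):
        if self.is_new_member(channel_id, user_id):
            self.record(channel_id, user_id, message)
            return "Welcome! Here's how to talk to me: ..."
        self.record(channel_id, user_id, message)
        # Context-aware reply: the bot knows its history with this user
        return "Thanks, noted (message %d from you here)." % \
            len(self.channels[channel_id][user_id])

ctx = BotContext()
print(ctx.reply("general", "alice", "hi"))   # orientation for a new member
print(ctx.reply("general", "alice", "hi"))   # context-aware follow-up
```

The same store extends to group-level context: keying on the channel lets the bot track several individuals at once while still supporting channel-wide behavior.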

There are several layers of context to get through and I can’t list them all but understanding and working it out is a great beginning. Just not this kind of context.

Set Expectations With The People Using Your Bot

A bot is not a person. Don’t pretend that this isn’t software.

I cannot stress enough the importance of starting off your interactions by communicating clearly and succinctly with the person using the bot. The end-user just wants to know what the bot can do and doesn’t care about how it does those things, so have the bot state its capabilities up front; that makes for a more natural experience. Erring on the side of setting expectations lower leads to less disappointment and a more positive experience.

The risk here is that of leading users to believe that the bot is worse than it actually is. I have some thoughts about a progressive enhancement strategy for building bots but that is not in the scope of this article.

Explore What Messaging Platforms Offer

Messaging is the medium.

For a lot of modern bots, messaging platforms are the medium of conversation. Text is the primary mode of information exchange on these platforms. However, people tend to focus on the information as well as the structure that delivers it. Each platform affords distinct interactions and consequently creates different mental models for people. Your user’s mental model of your bot is shaped by the platform where the bot exists. The way people use your bot depends on how they use the underlying platform. Understanding this is crucial to making the best use of the platform’s capabilities and ultimately providing the best experience.

Imagine usability testing a Slack bot for someone who doesn’t know what Slack channels are. You need to think about how people use the platform where your bot exists.

This leads to my next point.

You Can Facilitate Conversation By Adding Structure To It

Conversations can go out of scope in limitless ways.

Your nifty pattern matching function provides different greetings to people but does it handle multiple languages? I am asking because mine didn’t.

Human language is not a great tool for conveying information. Ironic, I know. If you have ever struggled to provide directions to someone, you know what I mean. It is often easier to point vaguely in the general direction than reply verbally. People aren’t adequately equipped to understand what other people are saying.

It is quite unlikely that your bot can understand and reply satisfyingly to everything a person can say. So, you provide structured inputs. These can be formatted replies or text buttons. It makes it easier for you as well as the user. By providing a limited amount of conversational pathways or limiting interactions you can create something that gets work done. Ultimately, isn’t that what people want?
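For example, a bot can offer a numbered menu and fall back to re-presenting the options instead of guessing at free-form text. A hypothetical sketch (the menu items and handler are illustrative, not any platform’s API):

```python
# Constrain the conversation to a few structured pathways
# instead of parsing arbitrary free-form text.
MENU = {
    "1": ("Check order status", lambda: "Your order shipped yesterday."),
    "2": ("Talk to a human", lambda: "Connecting you to support..."),
}

def handle(user_input):
    choice = MENU.get(user_input.strip())
    if choice:
        label, action = choice
        return action()
    # Fallback: re-present the structured options rather than guessing
    options = "\n".join("%s) %s" % (k, label)
                        for k, (label, _) in sorted(MENU.items()))
    return "Sorry, I didn't get that. Please pick an option:\n" + options

print(handle("1"))
print(handle("what's the weather like???"))
```

Every unrecognized input funnels the user back to a pathway the bot can actually complete, which is friendlier than a dead-end error.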

Finally, remember to —


Build, test, improve and then build, test and improve some more.

I remember the first time someone tried to use a bot I made: it took them 10 tries to get started. I remember when a user’s typing pattern was so odd that their phrases could not trigger any behavior. I remember a user who did not know how to add an emoji reaction in Slack, kept replying to the bot with the emoji as a message, and ultimately apologized for “not getting it”. People shouldn’t have to say sorry for using your product, right?

It can be hard to prototype a bot because bits and pieces of the functionality don’t make sense out of context. You can carry out Wizard of Oz tests with target users: the “wizard” observes user inputs and simulates the system’s responses in real time. You can carry out this test with a friend over any messaging app to save time and test ideas quickly.

So, these are some principles that have helped me in thinking about building bots. This is an un-ordered list because all of these principles are important and the process is not necessarily linear.

What other principles have you used?

I am currently leading the Collider challenge, in which 4 teams of 3 students each are building bots to create and facilitate interesting conversations with people. The bot community is pretty Wild West right now, so I wanted to document what has worked for me in the past.

If you’re looking to talk about design, data, and engineering or just want to discuss Rick & Morty, tweet at me!


Thinking About Bots was originally published in Chatbots Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

by nikhil at October 08, 2016 09:44 PM

Ph.D. student

moved BigBang core repository to DATACTIVE organization

I made a small change this evening which I feel really, really good about.

I transferred the BigBang project from my personal GitHub account to the datactive organization.

I’m very grateful for DATACTIVE‘s interest in BigBang and am excited to turn over the project infrastructure to their stewardship.

by Sebastian Benthall at October 08, 2016 01:47 AM

October 05, 2016

Center for Technology, Society & Policy

Join CTSP for social impact Un-Pitch Day on October 21st

Are you a local non-profit or community organization that has a pressing challenge that you think technology might be able to solve, but you don’t know where to start? Or, are you a Berkeley graduate or undergraduate student seeking an opportunity to put your technical skills to use for the public good?

If so, join us for Un-Pitch Day on October 21st from 3 – 7pm, where Berkeley graduate students will offer their technical expertise to help solve your organization’s pressing technology challenges. During the event, non-profits and community organizations will workshop their challenge(s) and desired impact. Organizations will then be partnered with graduate student technology mentors to define and scope potential solutions.

All attending non-profits and community organizations will have the opportunity to pitch their pressing challenge(s) to student attendees with the goal of potentially matching with a student project group to adopt their project. By attending Un-Pitch day, organizations will gain a more defined sense of how to solve their technology challenges, and potentially, a team of students interested in working with your organization to develop a prototype or project to solve it.

The goal of this event is to help School of Information master’s students (and other UCB grad students) identify potential projects they can adopt for the 2016-2017 academic year (ending in May). Working in collaboration with your organization, our students can help develop a technology-focused project or conduct research to aid your organization.

There is also the possibility of qualifying for funding ($2000 per project team member) for technology projects with distinct public interest/public policy goals through the Center for Technology, Society & Policy (funding requires submitting an application to the Center, due in late November). Please note that we cannot guarantee that each project presented at Un-Pitch Day will match with an interested team.

Event Agenda

Friday, October 21st from 3 – 7pm at South Hall on the UC Berkeley campus

Light food & drinks will be provided for registered attendees.

Registration is required for this event; click here to register.

3:00 – 4:00pm: Non-profit/Community Organization Attendees

Workshop with Caravan Studios, a division of TechSoup, to frame your problem definition and impact goals.

4:00 – 5:00pm: Non-profit/Community Organization Attendees & I-School students

Team up with student technology mentors to define and scope potential solutions and create a simple visual artifact outlining the idea.

5:00 – 5:30pm: Open to the public

CTSP will present details about public interest project funding opportunities and deadlines.

5:30 – 6:00pm: Open to the public

Attendee organizations present short project pitches (2-3 mins) to the room.

6:00 – 7:00pm: Open to the public

Open house for students and organizations to mingle and connect over potential projects. Appetizers and refreshments provided by CTSP.

by Jennifer King at October 05, 2016 09:04 PM

September 21, 2016

MIMS 2018

Teenage Thoughts on the Afterlife

Photo from Flickr.

If you just want to see what I thought the afterlife would be like when I was a teenager, jump to below the line.

I got the rare treat of hearing poetry read out loud at Berkeley Lunch Poems the other week. There, I heard a professor read “The Afterlife” by Billy Collins. The poem really stuck with me. Here is how it starts:

“The Afterlife” by Billy Collins

The poem goes on to describe different people living out the afterlives they thought were awaiting them. It is quite a touching poem.

Unexpectedly, “The Afterlife” reawakened a torrent of thoughts I had as a child and young adolescent about what happens after death. It was a topic frequently stepped around in Hebrew School. Judaism has very little to say about the afterlife — it more or less confesses to not knowing — and as budding rhetoricians, we saw this as a clear weak spot in the faith and constantly harangued teachers about it. Perhaps we were hoping to be rewarded for our good deeds, like the Christians were; perhaps we just wanted to be reassured about the fate that awaited us. The teachers offered us no such assurances — “What do you think will happen?” they would ask, dodging the question.

This left a lot of space for rumination. Did I want to live forever on some spectral plane? Had I lived a good enough life to risk being rewarded or punished for my deeds? During the boring, unending days of Hebrew School, I crafted and refined a view of the afterlife that I thought balanced promoting good deeds and avoiding boredom. I refined it over my early teenage years after the death of my grandmother, and then, considering the problem solved, promptly forgot about it.

That is, until Billy Collins’ poem forced me to reckon with my own thoughts on the afterlife nearly a decade later. Now, dear reader, I am writing it down in hopes of capturing it before it drifts back into obscurity.

You wake up in a studio apartment. There is no bed here, but there is a living room with a large television and an amply stocked kitchen. Now that you are dead, you do not need to eat, but it will certainly make watching TV feel more familiar. The television has a remote and a VCR with a single tape.

Photo from Flickr.

Once you pop the tape in, it is clear what is going on: this is the tape of your life. Or really, it is a tape of the whole world and you are the main character. The remote does not allow you to fast forward or rewind (none of that “Click” garbage going on here), but it does allow you to pause and jump around in space. This latter feature will be useful. After a couple of days of watching in awe as your parents bungle through your first days on Earth, you will most likely get bored. Besides, there are some exciting world events to catch, and even better, your friends are having their childhoods too.

That additional personal perspective makes some moments particularly painful. You can see how much happier the weird kid from summer camp would have been if you hadn’t ignored him, and how heartless you were to your mom, who was dealing with a litany of financial problems, when she left work early to bring you to the dentist. Some scenes that felt monumental at the time are too boring to watch, and some that felt boring then feel monumental now. For the most part, though, the movie is mundane yet vaguely pleasant. That’s why the snacks are key (yes, they vary day to day).

As the years roll into decades, the movie gets less interesting. Your on screen self has gotten too introspective and spends too much time in uninteresting conversations. You use the remote to travel more, even dipping into movie theaters and reading books over people’s shoulders. You have little interest in watching your character watch TV shows that you have already seen. The best content is watching your children and grandchildren make all the same mistakes you did.

And after a lifetime of watching, there you are on your deathbed. You see yourself wasting precious mental energy reflecting on your life, unaware that you will soon have far too much time to do just that. As you close your eyes for the last time, the VCR goes black and the tape ejects.

A door appears on what used to be an empty wall. Even though there is no signage, you know what is on the other side: nothing. The true emptiness your empirical self always warned you about. As you look around the apartment, you see that you are not leaving much behind. Even if you can play the VHS again, there is nothing else you could hope to get out of it. So you grab a handful of pretzels from the kitchen, open the door, and walk through.

Photo from Flickr.

by Gabe Nicholas at September 21, 2016 07:10 PM

MIMS 2012

Managing Design work with Discovery Kanban at Optimizely

In my previous article, I described how we transformed our Agile development process at Optimizely to include research and design by implementing our own flavor of Discovery kanban. This process gives designers and researchers space to understand our customer’s problems and explore solutions before we commit to shipping new features. In this article, I’ll go a level deeper to describe all the stages research and design work moves through during the discovery process.

Updating work on the Discovery kanban board


The Discovery kanban board is based on the double-diamond model of design, which visualizes the design process as two connected diamonds. The first diamond represents understanding the problem, and the second diamond represents solving the problem. Diamonds are used to communicate that work alternates between divergent phases, in which ideas are broadly explored, and convergent phases, in which ideas are filtered and refined. Peter Merholz, a founding partner of UX consulting company Adaptive Path, has a nice overview of double diamond design.

Double Diamond design diagram


The diamonds represent different kinds of work, with defined inputs and outputs, so I adapted each diamond into a kanban board. A kanban board is simply a series of stages, represented as columns, that work moves through, with acceptance criteria that dictate when items move into each stage.

The first kanban board is Problem Understanding, where we gather information to understand and frame the problem. The second kanban board is Solution Exploration, where we take the output of Problem Understanding and solve the specific problem we identified. Together, they make up Optimizely’s Discovery kanban process.

Overview of Discovery kanban at Optimizely

Problem Understanding

The goal of the Problem Understanding phase is to understand, refine, and frame the problem so it’s concrete and solvable. The input is a problem area to study further, and the output is a deeper understanding of the problem (often in the form of a report or presentation). The output either feeds directly into the Solution Exploration backlog, or it’s general knowledge that benefits the company (e.g. helps us make a strategic decision).

Work on this board moves through 5 columns: Backlog, To Do, Researching, Synthesizing, and Socializing.

Problem Understanding kanban board

Backlog

“Backlog” is just a bucket of problem areas and pain points. It isn’t prioritized, and anyone can put a card up on the board. There’s no promise that we’ll work on it, but it’s at least captured and we can see what people are thinking about.

The cards themselves are typically written in the form of a question to answer, e.g. “What metrics do people want to track?” But sometimes we just write a problem area to study, e.g. “New metrics & metrics builder.”

To Do

Each week, we take items out of Backlog and move them into “To Do.” We typically prioritize items based on how big of a problem they are for customers (with input from the product vision, roadmap, and company priorities), but this isn’t a strict rule. For example, we may need some data to make a strategic decision, even if it isn’t the “biggest” problem.

After an item has been prioritized, we write up a research plan to answer the question. This includes choosing a research method(s) that’s best suited to answering this question. There’s no prescribed format for the research plan, but at a minimum it specifies what question we’re answering and the research method to use.

The research method could be a qualitative method, such as customer interviews or contextual inquiry, or a quantitative method, such as sending a survey, analyzing server logs, or querying the data warehouse. For some problems we use a combination of different methods.

Researching

Once a research plan has been agreed upon, work moves into the “Researching” stage. This means we execute the research plan — customer interviews are scheduled, a survey is sent out, analytics are analyzed, and so on. This stage is divergent — we go broad to gather as much data related to the problem as we can.

Synthesizing

After gathering data in the Researching stage, work moves into the “Synthesizing” stage. We analyze the data we’ve gathered for insights and, hopefully, a deeper understanding of the problem. This stage is convergent — we are filtering and focusing the data into concrete takeaways and problems to be solved.

Socializing

Once the research has been synthesized, it moves into the “Socializing” phase. The bulk of the work is done, but the outcome is being shared with the team or company. This takes the form of meetings, a presentation, a written report, opportunity assessment, or whatever format is appropriate for the problem being studied. At the very least, a link to the research plan and the data captured is shared with the team.

Sometimes we learn that a problem is especially complicated and the research plan wasn’t sufficient to understand it. If so, we may do more research (if this problem is important enough), or just make the best decision we can based on the data we do have.

After studying a particular problem, the work either ends there (in the case of things like personas or data to inform product strategy), or it turns into a problem for design to solve. In the latter case, work moves into the Solution Exploration backlog.

Solution Exploration

Once we understand the problem, we can start designing solutions to it. We track this work on the Solution Exploration board (a.k.a. the second diamond in double diamond design). A problem may have UX challenges to solve, technical challenges to solve (i.e. is this feasible to build?), or both. The output of the Solution Exploration phase is a finished solution to build. The solution is either a complete UI, such as mockups or a prototype, or a technical solution, such as a technical design doc or a technical prototype, which can be implemented in a production environment.

In this phase, work also moves through 5 columns: Backlog, To Do, Exploring & Thinking, Iterating & Deciding, and Ready to be Built.

Solution Exploration kanban board

Backlog

Just like in Problem Understanding, work starts life in the “Backlog.” This is a big pool of problems to solve, and anyone can add items to it. But once again, just because it’s in the backlog doesn’t mean we’ll work on it — it still needs to be prioritized.

To Do

Each week we prioritize items in the Backlog and move them into “To Do.” This means we have a well-formed problem, informed by data (typically generated in Problem Understanding, though it may come from other sources), that the team will design a solution to. For most problems, we formally write out what the problem is, why it’s worth solving, who we’re solving it for, and the scope. (View the template on Google Drive).

Writing out the problem before jumping into solutions ensures the team is aligned on what we’re solving, which prevents wasted work and changing scope after a project begins. It also surfaces assumptions or incomplete data we may have, which would mean we need to go back to Problem Understanding to gather more data.

Exploring & Thinking

After a problem definition has been written, work moves into the “Exploring & Thinking” stage. This means we’re exploring solutions to the problem. This could be sketching, prototyping, building technical proof-of-concepts, researching potential libraries or frameworks, writing technical design docs, or whatever method is best for exploring possible solutions. This stage is divergent — a broad range of possible solutions are being explored, but we’re not filtering or refining any particular solution yet.

Iterating & Deciding

Once a broad range of solutions have been explored, we move into the “Iterating & Deciding” stage. This means potential solutions are being evaluated and refined. Tradeoffs and approaches are being discussed with relevant groups. This stage is convergent — ideas are being refined into a single, finished solution.

There aren’t strict criteria for whether an item is in “Exploring & Thinking” or “Iterating & Deciding.” The creative process is somewhat chaotic and people often switch between an “exploring” and “refining” mindset throughout a project. But there’s typically an inflection point at which we’re doing more refining than exploring. The exact column work is in is up to the person doing it.

Having 2 columns may sound unnecessary, but it’s useful for two reasons. First, it helps communicate where in the creative process work is (especially early stage vs. late stage). This helps designers, engineers, and PMs have better discussions about the solutions being explored. Second, it encourages people to separate divergent thinking from convergent thinking. It’s easy to just go with the first idea, but that’s rarely the best solution. By encouraging exploration, we increase the odds that we’ll choose the best solution.

Ready to be Built

After we’ve iterated enough we’re left with one finished solution that’s ready to be built and shipped to customers. A “finished solution” is one for which all edge cases have been thought through, it’s been vetted by relevant parties, and there are no major questions open that would slow down development. Finished solutions are in the form of mockups or a prototype, if a UX solution, or a technical design doc or proof-of-concept, if a technical feasibility solution. The finished solution then moves into engineering’s backlog, where it gets prioritized against all the other potential work to be done.

Together, the Problem Understanding and Solution Exploration kanban boards make up Discovery kanban. The definition of each column may sound prescriptive, but it’s actually a suggested process to follow. Work doesn’t always move linearly through each column, and we aren’t sticklers about meeting exact acceptance criteria to move work forward. The important thing is that we’re working together to solve our customer’s biggest problems, ultimately increasing customer value and business value.

Since implementing this process we’ve gotten better at prioritizing the biggest customer problems to understand, have space to explore solutions to those problems, and are regularly shipping work to customers. No process is perfect, but we’re always trying to improve how we work in an effort to consistently ship great products to our customers.

This post originally appeared on the Design @ Optimizely blog

by Jeff Zych at September 21, 2016 03:21 AM

September 20, 2016

Ph.D. alumna

There was a bomb on my block.

I live in Manhattan, in Chelsea, on 27th Street between 6th and 7th, the same block in which the second IED was found. It was a surreal weekend, but it is increasingly becoming depressing as the media moves from providing information to stoking fear, the exact response that makes these events so effective. I’m not afraid of bombs. I’m afraid of cars. And I’m increasingly becoming afraid of American media.

After hearing the bomb go off on 23rd and getting flooded with texts on Saturday night, I decided to send a few notes that I was OK and turn off my phone. My partner is Israeli. We’ve been there for two wars and he’s been there through countless bombs. We both knew that getting riled up was of no help to anyone. So we went to sleep. I woke up on Sunday, opened my blinds, and was surprised to see an obscene number of men in black with identical body types, identical haircuts, and identical cars. It looked like the weirdest casting call I’ve ever seen. And no one else. No cars, no people. As always, Twitter had an explanation so we settled into our PJs and realized it was going to be a strange day.

Flickr / Sean MacEntree

As other people woke up, one thing became quickly apparent — because folks knew we were in the middle of it, they wanted to reach out to us because they were worried, and scared. We kept shrugging everything off, focusing on getting back to normal and reading the news for updates about how we could maneuver our neighborhood. But ever since a suspect was identified, the coverage has gone into hyperventilation mode. And I just want to scream in frustration.

The worst part about having statistical training is that it’s hard to hear people get anxious about fears without putting them into perspective. ~100 people die every day in car crashes in the United States. That’s 33,804 deaths in a year. Thousands of people are injured every day by cars. Cars terrify me. And anyone who says that you have control over a car accident is full of shit; most car deaths and injuries are not the harmed person’s fault.

The worst part about being a parent is having to cope with the uncontrollable, irrational, everyday fears that creep up, unwarranted, just to plague a moment of happiness. Will he choke on that food? What if he runs away and gets hit by a car? What if he topples over that chair? The best that I can do is breathe in, breathe out, and remind myself to find my center, washing away those fears with each breath.

And the worst part about being a social scientist is understanding where others’ fears come from, understanding the power of those fears, and understanding the cost of those fears on the well-being of a society. And this is where I get angry because this is where control and power lies.

Traditional news media has a lot of say in what it publishes. This is one of the major things that distinguishes it from social media, which propagates the fears and anxieties of the public. And yet, time and time again, news media shows itself to be irresponsible, motivated more by the attention and money that it can obtain by stoking people’s fears than by a moral responsibility to help ground an anxious public.

I grew up on the internet. I grew up with the mantra “don’t feed the trolls.” I always saw this as a healthy meditation for navigating the internet, for focusing on the parts of the internet that are empowering and delightful. Increasingly, I keep thinking that this is a meditation that needs to be injected into the news ecosystem. We all know that the whole concept of terrorism is to provoke fear in the public. So why are we not holding news media accountable for opportunistically aiding and abetting terroristic acts? Our cultural obsession with reading news that makes us afraid parallels our cultural obsession with crises.

There’s a reason that hate is growing in this country. And, in moments like this, I’m painfully reminded that we’re all contributing to the culture of hate. When we turn events like what happened this weekend in NY/NJ into spectacle, when we encourage media to write stories about how afraid people are, when we read the stories of how the suspect was an average person until something changed, we give the news media license to stoke up fear. And when they are encouraged to stoke fear, they help turn our election cycle into reality TV and enable candidates to spew hate for public entertainment. We need to stop blaming what’s happening on other people and start taking responsibility.

In short, we all need to stop feeding the trolls.

by zephoria at September 20, 2016 07:04 PM

September 17, 2016

adjunct professor

FTC PL&P Reviewed in ICON

I am honored and delighted to have my book reviewed by EUI’s Bilyana Petkova, who wrote in part:

…the work of Hoofnagle stands out by offering both a welcome description of the applicable law and a broad contextual framework…Chris J. Hoofnagle takes over fifteen years of experience in American consumer protection, information, and privacy law and converts them into an absorbing, in-depth institutional analysis of the agency.
Overall, Chris Hoofnagle’s Federal Trade Commission Privacy Law and Policy is a fascinating read and a treasure trove of useful references for further research.

The full cite is: Bilyana Petkova, Book Review: Federal Trade Commission Privacy Law and Policy, 14(3) Int J Constitutional Law 781–783 (2016) doi:10.1093/icon/mow053

by web at September 17, 2016 09:53 PM

September 08, 2016

adjunct professor

"Perform a sex act"

How to be circumspect and explicit at the same time, from the Washington Post, Sept. 5: "Metro Transit Police arrested a man Monday afternoon whom they say exposed himself to a woman on an Orange Line train and tried to force her to perform a sex act." My mind isn't exactly racing: there aren't a whole lot of she-on-he sex acts that are introduced with the verb perform.

by Geoff Nunberg at September 08, 2016 05:14 PM

September 02, 2016

adjunct professor

LifeLock’s Non-Public Initial Assessment

In LifeLock, the FTC alleged that the company “failed to establish and maintain a comprehensive information security program…” as required by a 2010 order. LifeLock settled the case for over $100M, despite the fact that the company claimed it had a clean bill of health from a reputable third party PCI assessor, and according to Commissioner Ohlhausen, LifeLock suffered no breach. Much of LifeLock was sealed, and so the case is a bit of a puzzle–how could it be the case that a company that receives a clean PCI-DSS assessment could also fail to establish a security program?

I hear we’re going to learn more specific details on the case soon, but in the meantime, the FTC just released to me LifeLock’s initial (2010) assessment. It contains a comical “public version” which is completely redacted and a largely unredacted “non-public” version.

More to come soon, but bear in mind that the FTC gave Wyndham a kind of safe harbor if the company obtains a clean PCI assessment. If other respondents ask for similar treatment, these assessments are going to become more important than ever.

by web at September 02, 2016 08:56 PM

On Cathy O’Neil’s Weapons of Math Destruction

Few have shed as much light on data science as Cathy O’Neil. The former Barnard math professor, author of Doing Data Science, and hedge fund quant has now published Weapons of Math Destruction (Crown 2016).

Weapons of Math Destruction (WMDs) are perversions of data science that increasingly influence our lives. O’Neil shows how sloppy mathematical processes, designed for efficiency and lacking any consideration of fairness, are being used to sort people. Why is this a problem? WMDs are focused on the poor, while the rich get to rely on old-school methods of reputation and decisionmaking: the letter of recommendation, the personal interview, and so on. Why are WMDs worse than ordinary human decisionmaking, with all of its foibles? O’Neil argues that WMDs lack feedback loops and that WMD users are much more concerned about doing things well enough rather than correctly. To demonstrate these points, O’Neil walks the reader through anecdotes including the scoring of teachers based on student exam performance, the pathologies that have arisen from U.S. News & World Report’s rankings of colleges, the online advertising that leads people to subprime loans and for-profit colleges, the use of algorithms to sentence criminals, the use of predictive policing to allocate cops on the beat, the use of information to set personalized insurance rates, and Facebook’s potential to influence our mood and votes.

Our livelihoods increasingly depend on our ability to make our case to machines

O’Neil points out time and again that people learn to game the algorithm. So, why isn’t that enough to solve the problems that O’Neil elucidates? The gaming creates perverse incentives and gross outcomes. Teachers help their students cheat in order to perform well on test-score-based algorithms; the honest teachers who do not cheat get fired. Colleges “hire” highly-cited professors on a part-time basis only to list them on their website in order to improve the school’s ranking.

In other cases, individuals cannot game the system and they suffer for it. Poor neighborhoods with nuisance crimes get more and more police attention, and in turn, more arrests, which feeds into other systems that predict that the poor are more likely to be recidivists. People who do not comparison shop are identified and charged more because companies can. And finally, we face the risk that Facebook will use its platform to shape how we view the world, to encourage us to vote or not, and so on.
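The predictive-policing dynamic described above can be made concrete with a toy simulation (all numbers hypothetical, not drawn from the book): patrols are allocated where past arrests occurred, and more patrols produce more recorded arrests, so an initial disparity never gets corrected even when the underlying crime rates are identical.

```python
def allocate_patrols(arrest_history, total_patrols=100):
    """Allocate patrols to neighborhoods in proportion to recorded arrests."""
    total = sum(arrest_history.values())
    return {hood: total_patrols * n / total for hood, n in arrest_history.items()}

def observe_arrests(patrols, true_crime_rate):
    """Recorded arrests scale with patrol presence, not just underlying crime."""
    return {hood: patrols[hood] * true_crime_rate[hood] for hood in patrols}

# Two neighborhoods with identical true crime rates, but a small initial
# disparity in recorded arrests.
true_rate = {"A": 0.5, "B": 0.5}
arrests = {"A": 12.0, "B": 10.0}

for _ in range(5):
    arrests = observe_arrests(allocate_patrols(arrests), true_rate)

# Neighborhood A keeps receiving more patrols and recording more arrests
# than B in every round, even though the true rates are equal: the system
# only "sees" the arrests its own allocation generates, so the 12:10
# disparity persists indefinitely.
print(arrests["A"] / arrests["B"])
```

The point of the sketch is that nothing in the loop ever compares recorded arrests against the true rates, which is exactly the missing feedback O’Neil identifies.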

O’Neil discusses baseball data science extensively, showing how the reams of information from games can be used to make interesting predictions and alter gameplay strategy. According to O’Neil, baseball analytics are fair for several reasons: the system does not use proxies (indirect measures of player skill) for performance and instead relies on direct evidence such as the number of runs scored. The inputs of baseball analytics are transparent: anyone can see and record them. In addition, whatever machine learning is occurring is relatively transparent as well. Finally, the system can incorporate lessons from its predictions and adjust accordingly.

How would one improve on O’Neil’s assessments of WMDs? I would suggest several additional factors. First, taking baseball as an example, there are no pathological incentive conflicts in such analyses. Baseball teams want their players to perform well. O’Neil’s book details just how conflicting incentives are in other contexts, such as when your bank analyzes your value as a customer. Second, to the extent there are conflicts (after all, the baseball teams are in competition), they are in rough equilibrium. Two well-resourced teams are competing, and we can assume that each can anticipate and react to the predictive power of the other. On the other hand, I am at a total disadvantage with respect to my bank’s analytics. In theory, competition among banks protects me, but in reality, transaction costs in switching, the erosion of fiduciary duty to customers, and so on make our relationship with banks a form of competitive conflict.

Weapons of Math Destruction makes a good case that efficiency is not an unqualified good. O’Neil shows how, increasingly, WMDs are used to create efficiencies for companies that come at the cost of our dignity and the fairness of our society. She suggests several interventions that would deepen the responsibility the data scientist has for the data subject. But under my suggested framework above (incentive conflict and equilibrium), I think a needed solution is to bring back competition—radical competition. The challenge is to bring back this competition in a society that has bought into the platform, a society that can say with a straight face that ISPs are competitive, and that ignores the obvious transaction costs involved in putatively competitive markets. If we could get away from using a single company for search, email, social networking, online videos, ecommerce, advertising, a browser, an app market, and an operating system, there would be one less company that could so deeply evaluate us and control how we experience the world.

by web at September 02, 2016 12:52 PM

August 25, 2016

Ph.D. student

Critical data studies track at the 2016 4S/EASST Annual Conference

This is an updated schedule of track 113, Critical Data Studies, at the 2016 Annual Meeting of the Society for Social Studies of Science (4S) and the European Association for the Study of Science and Technology (EASST). Please contact Stuart Geiger if you have any questions.


  • Charlotte Cabasse-Mazel (University of California, Berkeley)
  • Stuart Geiger (UC-Berkeley)
  • Laura Noren (New York University)
  • Brittany Fiore‐Gartland (University of Washington)
  • Gretchen Gano (University of California Berkeley)
  • Massimo Mazzotti (University of California, Berkeley)


This track brings together Science and Technology Studies scholars who are investigating data-driven techniques in academic research and analytic industries. Computational methods with large datasets are becoming more common across disciplines in academia (including the social sciences) and analytic industries. However, the sprawling and ambiguous boundaries of “big data” make it difficult to research. The papers in this track investigate the relationship between theories, instruments, methods, practices, and infrastructures in data science research. How are such practices transforming the processes of knowledge creation and validation, as well as our understanding of empiricism and the scientific method?

Many of the papers in this track are case studies that focus on one particular fieldsite where data-intensive research is taking place. Other papers explore connections between emerging theory, machinery, methods, and practices. These papers examine a wide variety of data-collection instruments, software, inscription devices, packages, algorithms, disciplines, institutions, and many focus on how a broad sociotechnical system is used to produce, analyze, share, and validate knowledge. In looking at the way these knowledge forms are objectified, classified, imagined, and contested, this track looks critically on the maturing practices of quantification and their historical, social, cultural, political, ideological, economical, scientific, and ecological impacts.

When we say “critical,” we are drawing on a long lineage from Immanuel Kant to critical theory, investigating the conditions in which thinking and reasoning take place. To take a critical approach to a field like data science is not to universally disapprove or reject it; it is more about looking at a broad range of social factors and impacts in and around data science. The papers in this track ask questions such as: How are new practices and approaches changing the way science is done? What does the organization of “big science” look like in an era of “big data”? What are the historical antecedents of today’s cutting-edge technologies and practices? How are institutions like hospitals, governments, schools, and cultural industries using data-driven practices to change how they operate? How are labor and management practices changing as data-intensive research is increasingly a standard part of major organizations? What are the conditions in which people are able to sufficiently understand and contest someone else’s data analysis? What happens when data analysts and data scientists are put in the position of keeping their colleagues accountable to various metrics, discovering what music genres are ‘hot’, or evaluating the impacts of public policy proposals? And how ought we change our own concepts, theories, approaches, and methods in Science and Technology Studies given these changes we are witnessing?

Schedule (without abstracts)

T113.1: Data and scientific practice

Sat Sept 3rd, 09:00-10:30am; Room 116

Chairs: Charlotte Cabasse-Mazel and Stuart Geiger

  • Scientific Open Data: Questions of Labor and Public Benefit
    • Irene Pasquetto (UCLA)
    • Ashley E. Sands (UCLA)
  • Condensing Data into Images, Uncovering ‘the Higgs’
    • Martina Merz (Alpen‐Adria‐University Klagenfurt / Wien / Graz)
  • Data Pedagogy: Learning to Make Sense of Algorithmic Numbers
    • Samir Passi (Cornell University)
  • Big Data or Big Codata? Flows in Historical and Contemporary Data Practices
    • Michael Castelle (University of Chicago)

T113.2: Data and expertise

Sat Sept 3rd, 11:00am-12:30pm; Room 116

Chair: Nick Seaver

  • It’s the context, stupid: Reproducibility as a scientific communication problem [note: previously scheduled in 9am panel]
    • Brittany Fiore‐Gartland (University of Washington)
    • Anissa Tanweer (University of Washington)
  • Emerging Practices of Data‐Driven Accountability in Healthcare: Individual Attribution of C-Sections
    • Kathleen Pine (ASU)
  • The (in)credibility of data science methods to non‐experts
    • Daan Kolkman (University of Surrey)
  • Big data and the mythology of algorithms
    • Howard Rosenbaum (Indiana University)

T113.3: Learning, pedagogy, and practice

Sat Sept 3rd, 14:00-15:30; Room 116

Chair: TBD

  • Infrastructuring data analysis in Digital methods with digital data and tools
    • Klara Benda (IT University of Copenhagen)
  • “An afternoon hack” Enabling data driven scientific computing in the open
    • Charlotte Mazel‐Cabasse (University of California, Berkeley)
  • Playing with educational data: the Learning Analytics Report Card (LARC)
    • Jeremy Knox (The University of Edinburgh)
  • Data science / science studies
    • Cathryn Carson (University of California, Berkeley)

T113.4: Data, theory, and looking forward

Sat Sept 3rd, 16:00-17:30; Room 116

Chairs: Stuart Geiger and Charlotte Cabasse-Mazel

  • Critical Information Practice
    • Yanni Loukissas (Georgia Tech); Matt Ratto (University of Toronto); Gabby Resch (University of Toronto)
  • Actor‐Network VS Network Analysis VS Digital Networks Are We Talking About the Same Networks?
    • Tommaso Venturini (King’s College); Anders Kristian Munk (University of Aalborg); Mathieu Jacomy (Sciences Po)
  • The Navigators
    • Nicholas Seaver (Tufts University)
  • Broad discussion on lessons learned and next steps
    • Everyone!

Schedule with abstracts

T113.1: Data and scientific practice

Sat Sept 3rd, 09:00-10:30am; Room 116

Chairs: Charlotte Cabasse-Mazel and Stuart Geiger

  • Scientific Open Data: Questions of Labor and Public Benefit
    • Irene Pasquetto (UCLA) and Ashley E. Sands (UCLA)

      Openness of publicly funded scientific data is policy enforced, and its benefits are normally taken for granted: increasing scientific trustworthiness, enabling replication and reproducibility, and preventing duplication of efforts.

      However, when public data are made open, a series of social costs arise. In some fields, such as biomedicine, scientific data have great economic value, and new business models based on the reuse of public data are emerging. In this session we critically analyze the relationship between the potential benefits and social costs of opening scientific data, which translate into changes in the workforce and challenges for current science funding models. We conducted two case studies, one medium-scale collaboration in biomedicine (FaceBase II Consortium) and one large-scale collaboration in astronomy (Sloan Digital Sky Server). We have conducted ethnographic participant observations and semi-structured interviews of SDSS since 2010 and FaceBase since 2015. Analyzing two domains sharpened our focus on each by enabling comparisons and contrasts. The discussion is also based on extensive document analysis.

      Our goal is to unpack open data rhetoric by highlighting its relation to the emergence of new mixed private and public funding models for science and changes in workforce dynamics. We show (1) how open data are made open “in practice” and by whom; (2) how public data are reused in private industry; (3) who benefits from their reuse and how. This paper contributes to the Critical Data Studies field for its analysis of the connections between big data approaches to science, social power structures, and the policy rhetoric of open data.

  • Condensing Data into Images, Uncovering ‘the Higgs’
    • Martina Merz (Alpen‐Adria‐University Klagenfurt / Wien / Graz)

      Contemporary experimental particle physics is amongst the most data-intensive sciences and thus provides an interesting test case for critical data studies. Approximately 30 petabytes of data produced at CERN’s Large Hadron Collider (LHC) annually need to be controlled and processed in multiple ways before physicists are ready to claim novel results: data are filtered, stored, distributed, analyzed, reconstructed, synthesized, etc. involving collaborations of 3000 scientists and heavily distributed work. Adopting a science-as-practice approach, this paper focuses on the associated challenges of data analysis using as an example the recent Higgs search at the LHC, based on a long-term qualitative study. In particle physics, data analysis relies on statistical reasoning. Physicists thus use a variety of standard and advanced statistical tools and procedures.

      I will emphasize that, and show how, the computational practice of data analysis is inextricably tied to the production and use of specific visual representations. These “statistical images” constitute “the Higgs” (or its absence) in the sense of making it “observable” and intelligible. The paper puts forward two main theses: (1) that images are constitutive of the prime analysis results due to the direct visual grasp of the data that they afford within large-scale collaborations and (2) that data analysis decisively relies on the computational and pictorial juxtaposition of “real” and “simulated data”, based on multiple models of different kind. In data-intensive sciences such as particle physics images thus become essential sites for evidential exploration and debate through procedures of black-boxing, synthesis, and contrasting.

  • Data Pedagogy: Learning to Make Sense of Algorithmic Numbers
    • Samir Passi (Cornell University)

      This paper conceptualizes data analytics as a situated process: one that necessitates iterative decisions to adapt prior knowledge, code, contingent data, and algorithmic output to each other. Learning to master such forms of iteration, adaption, and discretion then is an integral part of being a data analyst.

      In this paper, I focus on the pedagogy of data analytics to demonstrate how students learn to make sense of algorithmic output in relation to underlying data and algorithmic code. While data analysis is often understood as the work of mechanized tools, I focus instead on the discretionary human work required to organize and interpret the world algorithmically, explicitly drawing out the relation between human and machine understanding of numbers especially in the ways in which this relationship is enacted through class exercises, examples, and demonstrations. In a learning environment, there is an explicit focus on demonstrating established methods, tools, and theories to students. Focusing on data analytic pedagogy, then, helps us to not only better understand foundational data analytic practices, but also explore how and why certain forms of standardized data sensemaking processes come to be.

      To make my argument, I draw on two sets of empirics: participant-observation of (a) two semester long senior/graduate-level data analytic courses, and (b) a series of three data analytic training workshops taught/organized at a major U.S. East Coast university. Conceptually, this paper draws on research in STS on social studies of algorithms,sociology of scientific knowledge, sociology of numbers, and professional vision.

  • Big Data or Big Codata? Flows in Historical and Contemporary Data Practices
    • Michael Castelle (University of Chicago)

      Presently existing theorizations of “big data” practices conflate observed aspects of both “volume” and “velocity” (Kitchin 2014). The practical management of these two qualities, however, has a comparably disjunct, if interwoven, computational history: on one side, the use of large (relational and non-relational) database systems, and on the other, the handling of real-time flows (the world of dataflow languages, stream and event processing, and message queues). While the commercial data practices of the late 20th century were predicated on an assumption of comparably static archival (the site-specific “mining” of data “warehouses”), much of the novelty and value of contemporary “big data” sociotechnics is in fact predicated on the harnessing/processing of vast flows of events generated by the conceptually-centralized/physically-distributed datastores of Google, Facebook, LinkedIn, etc.

      These latter processes—which I refer to as “big codata”—have their origins in IBM’s mainframe updating of teletype message switching, were adapted for Wall Street trading firms in the 1980s, and have a contemporary manifestation in distributed “streaming” databases and message queues like Kafka and StormMQ, in which one differentially “subscribes” to brokered event streams for real-time visualization and analysis. Through ethnographic interviews with data science practitioners in various commercial startup and academic environments, I will contrast these technologies and techniques with those of traditional social-scientific methods—which may begin with empirically observed and transcribed “codata”, but typically subject the resultant inert “dataset” to a far less real-time sequence of material and textual transformations (Latour 1987).

T113.2: Data and expertise

Sat Sept 3rd, 11:00am-12:30pm; Room 116

Chair: Nick Seaver

  • It’s the context, stupid: Reproducibility as a scientific communication problem [note: previously scheduled in 9am panel]
    • Brittany Fiore‐Gartland (University of Washington) and Anissa Tanweer (University of Washington)

      Reproducibility has long been considered integral to scientific research and increasingly must be adapted to highly computational, data-intensive practices. Central to reproducibility is the sharing of data across varied settings. Many scholars note that reproducible research necessitates thorough documentation and communication of the context in which scientific data and code are generated and transformed. Yet there has been some pushback against the generic use of the term context (Nicolini, 2012); for, as Seaver puts it, “the nice thing about context is everyone has it” (2015). Dourish (2004) articulates two approaches to context: representational and interactional. The representational perspective sees context as stable, delineable information; in terms of reproducibility, this is the sort of context that can be captured and communicated with metadata, such as location, time, and size. An interactional perspective, on the other hand, views context not as static information but as a relational and dynamic property arising from activity; something that is much harder to capture and convey using metadata or any other technological fix.

      In two years of ethnographic research with scientists negotiating reproducibility in their own data-intensive work, we found “context” being marshalled in multiple ways to mean different things within scientific practice and discourses of reproducibility advocates. Finding gaps in perspectives on context across stakeholders, we reframe reproducibility as a scientific communication problem, a move that recognizes the limits of representational context for the purpose of reproducible research and underscores the importance of developing cultures and practices for conveying interactional context.

  • Emerging Practices of Data‐Driven Accountability in Healthcare: Individual Attribution of C-Sections
    • Kathleen Pine (ASU)

      This paper examines the implementation and consequences of data science in a specific domain: evaluation and regulation of healthcare delivery. Recent iterations of data-driven management expand the dimensions along which organizations are evaluated and utilize a growing array of non-financial measures to audit performance (i.e. adherence to best practices). Abstract values such as “quality” and “effectiveness” are operationalized through the design and implementation of certain performance measurements—it is not just outcomes that demonstrate the quality of service provision, but the particular practices engaged in during service delivery.

      Recent years have seen the growth of a controversial new form of data-driven accountability in healthcare: application of performance measurements to the work of individual clinicians. Fine-grained performance measurements of individual providers were once far too resource intensive to undertake, but expanded digital capacities have made provider-level analyses feasible. Such measurements are being deployed as part of larger efforts to move from “volume-based” to “value- based” or “pay for performance” payment models.

      Evaluating individual providers, and deploying pay for performance at the individual (rather than the organizational) level is a controversial idea. Critics argue that the measurements reflect a tiny sliver of any clinician’s “quality,” and that such algorithmic management schemes will lead professionals to focus on only a small number of measured activities. Despite these and other concerns, such measurements are on the horizon. I will discuss early ethnographic findings on implementation of provider-level cesarean section measurements, describing tensions between professional discretion and accountability and rising stakes of data quality in healthcare.

  • The (in)credibility of data science methods to non‐experts
    • Daan Kolkman (University of Surrey)

      The rapid development and dissemination of data science methods, tools, and libraries allows for the development of ever more intricate models and algorithms. Such digital objects are simultaneously the vehicle and outcome of quantification practices and may embody a particular world-view with associated norms and values. More often than not, a set of specific technical skills is required to create, use or interpret these digital objects. As a result, the mechanics of the model or algorithm may be virtually incomprehensible to non-experts.

      This is of consequence for the process of knowledge creation because it may introduce power asymmetries and because successful implementation of models and algorithms in an organizational context requires that all those involved have faith in the model or algorithm. This paper contributes to the sociology of quantification by exploring the practices through which non-experts ascertain the quality and credibility of digital objects as myths or fictions. By considering digital objects as myths or fictions, the codified nature of these objects comes into focus.

      This permits the illustration of the practices through which experts and non-experts develop, maintain, question or contest such myths. The paper draws on fieldwork conducted in government and analytic industry in the form of interviews, observations and documents to illustrate and contrast the practices which are available to non-experts and experts in bringing about the credibility or incredibility of such myths or fictions. It presents a detailed account of how digital objects become embedded in the organisations that use them.

  • Big data and the mythology of algorithms
    • Howard Rosenbaum (Indiana University)

      There are no big data without algorithms. Algorithms are sociotechnical constructions and reflect the social, cultural, technical and other values embedded in their contexts of design, development, and use. The utopian “mythology” (boyd and Crawford 2011) about big data rests, in part, on the depiction of algorithms as objective and unbiased tools operating quietly in the background. As reliable technical participants in the routines of life, their impartiality provides legitimacy for the results of their work. This becomes more significant as algorithms become more deeply entangled in our online and offline lives, where we generate the data they analyze. They create “algorithmic identities,” profiles of us based on our digital traces that are “shadow bodies,” emphasizing some aspects and ignoring others (Gillespie 2012). They are powerful tools that use these identities to dynamically shape the information flows on which we depend, in response to our actions and to decisions made by their owners.

      Because this perspective tends to dominate the discourse about big data, thereby shaping public and scientific understandings of the phenomenon, it is necessary to subject it to critical review as an instance of critical data studies. This paper interrogates algorithms as human constructions and products of choices that have a range of consequences for their users and owners. Issues explored include: the epistemological implications of big data algorithms; the impacts of these algorithms on our social and organizational lives; the extent to which they encode power and the ways in which this power is exercised; and the possibility of algorithmic accountability.

T113.3: Learning, pedagogy, and practice

Sat Sept 3rd, 14:00-15:30; Room 116

Chair: TBD

  • Infrastructuring data analysis in Digital methods with digital data and tools
    • Klara Benda (IT University of Copenhagen)

      The Digital methods approach seeks the strategic appropriation of digital resources on the web for social research. I apply grounded theory to theorize how data practices in Digital methods are entangled with the web as a socio-technical phenomenon. My account draws on public sources of Digital methods and ethnographic research of semester-long student projects based on observations, interviews and project reports. It is inspired by Hutchins’s call for understanding how people “create their cognitive powers by creating the environments in which they exercise those powers”. The analysis draws on the lens of infrastructuring to show that making environments for creativity in Digital methods is a distributed process, which takes place on local and community levels with distinct temporalities. Digital methods is predicated on creating its local knowledge space for social analysis by pulling together digital data and tools from the web, and this quick local infrastructuring is supported by layers of slower community infrastructures which mediate the digital resources of the web for a Digital methods-style analysis by means of translation and curation.

      Overall, the socially distributed, infrastructural style of data practice is made possible by the web as a socio-technical phenomenon predicated on openness, sharing and reuse. On the web, new digital resources are readily available to be incorporated into the local knowledge space, making way for an iterative, exploratory style of analysis, which oscillates between infrastructuring and inhabiting a local knowledge space. The web also serves as a socio-technical platform for community practices of infrastructuring.

  • “An afternoon hack” Enabling data driven scientific computing in the open
    • Charlotte Mazel‐Cabasse (University of California, Berkeley)

      Scientific computing, or e-science, has enabled the development of large data-driven scientific initiatives. A significant part of these projects relies on the software infrastructures and tool stacks that make it possible to collect, clean and compute very large data sets.

      Based on anthropological research among a community of open-source developers and/or scientists contributing to SciPy, the open-source Python library used by scientists to enable the development of technologies for big data, the research focuses on the socio-technical conditions of the development of free and reproducible computational scientific tools and the system of values that supports it.

      Entering the SciPy community for the first time is entering a community of learners. People who are convinced that for each problem there is a function (and if there is not, one should actually create one), who think that everybody can (and probably should) code, who have been living between at least two worlds (sometimes more) for a long time: academia and the open software community, and for some, different versions of the corporate world.

      Looking at the personal trajectories of these scientists turned open-source software developers, this paper will investigate the way in which a relatively small group of dedicated people has been advancing a new agenda for science, defined as open and reproducible, through carefully designed data infrastructures, workflows and pipelines.

  • Playing with educational data: the Learning Analytics Report Card (LARC)
    • Jeremy Knox (The University of Edinburgh)

      Education has become an important site for computational data analysis, and the burgeoning field of ‘learning analytics’ is gaining significant traction, motivated by the proliferation of online courses and large enrolment numbers. However, while this ‘big data’ and its analysis continue to be hyped across academic, government and corporate research agendas, critical and interdisciplinary approaches to educational data analysis are in short supply. Driven by narrow disciplinary areas in computer science, learning analytics is not only ‘blackboxed’, with a propensity to ‘focus only on its inputs and outputs and not on its internal complexity’ (Latour 1999, p. 304), but also abstracted and distanced from the activities of education itself. This methodological estrangement may be particularly problematic in an educational context where the fostering of critical awareness is valued.

      The first half of this paper will describe three ways in which we can understand this ‘distancing’, and how it is implicated in enactments of power within the material conditions of education: the institutional surveilling of student activity; the mythologizing of empirical objectivity; and the privileging of prediction. The second half of the paper will describe the development of a small scale and experimental learning analytics project undertaken at the University of Edinburgh that sought to explore some of these issues. Entitled the Learning Analytics Report Card (LARC), the project investigated playful ways of offering student choice in the analytics process, and the fostering of critical awareness of issues related to data analysis in education.

  • Data science / science studies
    • Cathryn Carson (University of California, Berkeley)

      Inside universities, data science is practically co-located with science studies. How can we use that proximity to shape how data science gets done? Drawing on theorizations of collaboration as a research strategy, embedded ethnography, critical technical practice, and design intervention, this paper reports on experiments in data science research and organizational/strategic design. It presents intellectual tools for working on data science (conceptual distinctions such as data science as specialty, platform, and surround; temporal narratives that capture practitioners’ conjoint sense of prospect and dread) and explores modes of using these tools in ways that get uptake and do work. Finally, it draws out possible consequences of the by now sometimes well-anchored situation of science studies/STS inside universities, including having science studies scholars in positions of institutional leverage.

T113.4: Data, theory, and looking forward

Sat Sept 3rd, 16:00-17:30; Room 116

Chairs: Stuart Geiger and Charlotte Cabasse-Mazel

  • Critical Information Practice
    • Yanni Loukissas (Georgia Tech); Matt Ratto (University of Toronto); Gabby Resch (University of Toronto)

      Big Data has been described as a death knell for the scientific method (Anderson, 2008), a catalyst for new epistemologies (Floridi, 2012), a harbinger for the death of politics (Morozov, 2014), and “a disruptor that waits for no one” (Maycotte, 2014). Contending with Big Data, as well as the platitudes that surround it, necessitates a new kind of data literacy. Current pedagogical models, exemplified by data science and data visualization, too often introduce students to data through sanitized examples, black-boxed algorithms, and standardized templates for graphical display (Tufte, 2001; Fry, 2008; Heer, 2011). Meanwhile, these models overlook the social and political implications of data in areas like healthcare, journalism and city governance. Scholarship in critical data studies (boyd and Crawford, 2012; Dalton and Thatcher, 2014) and critical visualization (Hall, 2008; Drucker 2011) has established the necessary foundations for an alternative to purely technical approaches to data literacy.

      In this paper, we explain a pedagogical model grounded in interpretive learning experiences: collecting data from messy sources, processing data with an eye towards what algorithms occlude, and presenting data through creative forms like narrative and sculpture. Building on earlier work by the authors in the area of ‘critical making’ (Ratto), this approach—which we call critical information practice—offers a counterpoint for students seeking reflexive and materially-engaged modes of learning about the phenomenon of Big Data.

  • Actor‐Network VS Network Analysis VS Digital Networks: Are We Talking About the Same Networks?
    • Tommaso Venturini (King’s College); Anders Kristian Munk (University of Aalborg); Mathieu Jacomy (Sciences Po)

      In the last few decades, the idea of ‘network’ has slowly but steadily colonized broad strands of STS research. This colonization started with the advent of actor-network theory, which provided a convenient set of notions to describe the construction of socio-technical phenomena. Then came network analysis, and scholars who imported into STS the techniques of investigation and visualization developed in the tradition of social network analysis and scientometrics. Finally, with the increasing ‘computerization’ of STS, scholars turned their attention to digital networks as a way of tracing collective life.

      Many researchers have more or less explicitly tried to link these three movements in one coherent set of digital methods for STS, betting on the idea that actor-network theory can be operationalized through network analysis thanks to the data provided by digital networks. Yet, to be honest, little proves the continuity among these three objects besides the homonymy of the word ‘network’. Are we sure that we are talking about the same networks?

  • The Navigators
    • Nicholas Seaver (Tufts University)

      Data scientists summon space into existence. Through gestures in the air, visualizations on screen, and loops in code, they locate data in spaces amenable to navigation. Typically, these spaces embody a Euro-American common sense: things near each other are similar to each other. This principle is evident in the work of algorithmic recommendation, for instance, where users are imagined to navigate a landscape composed of items arranged by similarity. If you like this hill, you might like the adjacent valley. Yet the topographies conceived by data scientists also pose challenges to this spatial common sense. They are constantly reconfigured by new data and the whims of their minders, subject to dramatic tectonic shifts, and they can be more than 3-dimensional. In high-dimensional spaces, data scientists encounter the “curse of dimensionality,” by which human intuitions about distance fail as dimensions accumulate. Work in critical data studies has conventionally focused on the biases that shape these spaces.

      In this paper, I propose that critical data studies should not only attend to how representative data spaces are, but also to the techniques data scientists use to navigate them. Drawing on fieldwork with the developers of algorithmic music recommender systems, I describe a set of navigational practices that negotiate with the shifting, biased topographies of data space. Recalling a classic archetype from STS and anthropology, these practices complicate the image of the data scientist as rationalizing, European map-maker, resembling more closely the situated interactions of the ideal-typical Micronesian navigator.
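      The “curse of dimensionality” invoked in this abstract is easy to demonstrate numerically. The following is a minimal sketch (not from the paper; the function name and parameters are illustrative) showing how the contrast between the nearest and farthest pairwise distances among random points collapses as dimensions accumulate:

```python
import math
import random

def distance_spread(dim, n_points=100, seed=0):
    """Relative spread, (max - min) / min, of pairwise Euclidean
    distances among random points in the unit cube [0, 1]^dim."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [
        math.dist(pts[i], pts[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)

# As dim grows, the spread shrinks: every point becomes roughly
# equidistant from every other, and "near" stops meaning "similar".
for dim in (2, 10, 100, 1000):
    print(dim, round(distance_spread(dim), 2))
```

      Once the spread nears zero, nearest-neighbor reasoning of the “if you like this hill, you might like the adjacent valley” kind loses its discriminating power, which is one reason recommender systems typically reduce dimensionality before measuring similarity.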

by R. Stuart Geiger at August 25, 2016 07:00 AM

August 22, 2016

Ph.D. student

becoming a #seriousacademic

I’ve decided to make a small change to my on-line identity.

For some time now, my Twitter account has been listed under a pseudonym, “Gnaeus Rafinesque”, and has had a picture of a cat. Today I’m changing it to my full name (“Sebastian Benthall”) and a picture of my face.

Gnaeus Rafinesque

Serious academic

I chose to use a pseudonym on Twitter for a number of reasons. One reason was that I was interested in participant observation in an Internet subculture, Weird Twitter, that generally didn’t use real names because most of their activity on Twitter was very silly.

But another reason was because I was afraid of being taken seriously myself. As a student, even a graduate student, I felt it was my job to experiment, fail, be silly, and test the limits of the media I was working (and playing) within. I learned a lot from this process.

Because I often would not intend to be taken seriously on Twitter, I was reluctant to have my tweets associated with my real name. I deliberately did not try to sever all ties between my Twitter account and my “real” identity, which is reflected elsewhere on the Internet (LinkedIn, GitHub, etc.) because…well, it would have been a lot of futile work. But I think using a pseudonym and a cat picture succeeded in signalling that I wasn’t putting the full weight of my identity, with the accountability entailed by that, into my tweets.

I’m now entering a different phase of my career. Probably the most significant marker of that phase change is that I am now working as a cybersecurity professional in addition to being a graduate student. I’m back in the working world and so in a sense back to reality.

Another marker is that I’ve realized that I’ve got serious things worth saying and paying attention to, and that projecting an inconsequential, silly attitude on Twitter was undermining my ability to say those things.

It’s a little scary shifting to my real name and face on Twitter. I’m likely to censor myself much more now. Perhaps that’s as it should be.

I wonder what other platforms are out there in which I could be more ridiculous.

by Sebastian Benthall at August 22, 2016 03:12 PM

August 10, 2016

Center for Technology, Society & Policy

Data Science and Expanding Our Sources of Ethical Inspiration

By Luke Stark & Anna Lauren Hoffmann, CTSP Fellows | Permalink

Steam rising from nuclear reactors

Photo by Mark Goebel

Recent public controversies regarding the collection, analysis, and publication of data sets about sensitive topics—from identity and sexuality to suicide and emotion—have helped push conversations around data ethics to the fore. In popular articles and emerging scholarly work (some of it supported by our backers at CTSP), scholars, practitioners and policymakers have begun to flesh out the longstanding conceptual and practical tensions expressed not only in the notion of “data ethics,” but in related categories such as “data science,” “big data,” and even plain old “data” itself.

Against this uncertain and controversial backdrop, what kind of ethical commitments might bind those who work with data—for example, researchers, analysts, and (of course) data scientists? One impulse might be to claim that the unprecedented size, scope, and attendant possibilities of so-called “big data” sets require a wholly new kind of ethics, one built with digital data’s particular affordances in mind from the start. Another impulse might be to suggest that even though “Big Data” seems new or even revolutionary, its ethical problems are not—after all, we’ve been dealing with issues like digital privacy for quite some time.

This tension between new technological capabilities and established ethical ideas recalls the “uniqueness debate” in computer ethics begun in the late 1970s. Following this precedent, we think that contemporary data ethics requires a bi-directional approach—that is, one that is critical of both “data” (and the tools, technologies, and humans that support its production, analysis, and dissemination) and “ethics” (the frameworks and ideas available for moral analysis). With this in mind, our project has been investigating the relevance and viability of historical codes of professional and research ethics in computing, information science, and data analytics for thinking about data ethics today.

The question of data ethics itself isn’t just an abstract or academic problem. Codes of professional ethics can serve a range of valuable functions, such as educating professionals in a given field, instilling positive group norms, and setting a benchmark against which negative or unethical behavior may be censured. Moreover, the most effective or influential codes have not arisen in theoretical or conceptual vacuums; instead, as Jacob Metcalf writes, they represent “hard-won responses to major disruptions, especially medical and behavioral research scandals. Such disruptions re-open questions of responsibility, trust and institutional legitimacy, and thus call for codification of new social and political arrangements.”

The initial ambit of our research was to determine how these codes are lacking in light of contemporary technical and social developments, and to make concrete recommendations on how to change these codes for the better in light of our brave new ubiquitously digital world. The immediate challenge, however, is in identifying those codes that might be most relevant to the emerging, often contested, and frequently indistinct field of “data science.”

Metcalf notes three areas of applied ethics that ought to guide our thinking about data ethics. The first and perhaps most obvious of these areas is computing ethics, especially as represented by major ethics codes developed by groups like the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE). The second area—bioethics—helps show how ethical concerns for data scientists must also include attention to the ethics of conducting and disseminating research on or about human subjects. Finally, Metcalf turns his attention to journalism ethics and its strong commitment to professional identity and moral character. To this list, we’d also add ethical codes developed by statisticians as well as professional survey and market research associations.

However, if professional codes of ethics are often “hard-won responses to major disruptions,” we also argue that it is important to pay attention to the nature of the “disruption” in question. Doing so may point towards previously underappreciated or overlooked ethical domains—in this case, domains that might help us better come to terms with the rise of “big data” and an epistemological shift in how we produce, understand, and act on knowledge about the world.

But how do we identify additional domains of ethical consideration that might be relevant to data ethics, especially if they’re not immediately obvious or intuitive?

Through our initial thinking and research, we suggest that one answer to this question lies in the metaphors we use to talk about data science and “big data” today. As Sarah Watson writes in her essay “Data is the New _____,” metaphors have always been integral to “introducing new technologies in our everyday lives and finding ways to familiarize ourselves with the novelty.” Indeed, there has been a great deal of discussion around the language we use to talk about data today (Microsoft Research’s Social Media Collective has put together a useful reading list on the topic.)

From data “floods” to data “mining,” from data as “the new oil” to data as “nuclear waste,” our discussions of data today invoke a host of concerns that aren’t adequately captured in the usual host of ethical codes and professional domains.

For example, if much of data work involves “cleaning”—to the point that some professionals might even describe themselves, however snarkily, as data “janitors”—what might the professional ethical or other commitments around preservation, stewardship, or custodianship tell us about the role of data scientists today? Or, if data really is “toxic,” how might the ethics of those professionals that regularly work with nuclear or hazardous materials help inform our understanding of the ethical handling of data?

Discourses around cleanliness, toxicity, and other environmental metaphors for big data and data ethics are among the chief emerging themes of our ongoing analysis of ethics codes. Categorizing and unpacking these conceptual frameworks will, we believe, help both scholars and technologists fundamentally rethink their relationship and responsibilities to ethical practice in computation, without needing to reinvent wheels or throw babies out with bathwater.

We’ll be presenting our full findings publicly in the fall – so stay tuned!

See also: the I School Pledge, and posts describing and critiquing that proposed code of conduct.

by Nick Doty at August 10, 2016 08:10 PM

August 09, 2016

adjunct professor

15 Cromulent Neologisms From Joshua Cohen’s Book of Numbers

I am so delighted with Joshua Cohen’s Book of Numbers that here I’ve picked out my favorite neologisms from the work (earlier post on the book here).

  • Adverks sales: The industrial activity of advertising—onvertising, online advertising
  • Recs, rectards, rectarded, recy: One of the most colorful and widely used descriptors in the book. Techs are sophisticated users, and then there’s recs, recreational ones. For instance, Principal describes the company’s new New York office as being filled with “Divisions requiring minimal intelligence. Minimal skill. Not techs but recs.” And Principal’s father as having subscribed to a “cruft of rectarded netservices whose chief goal was to keep their users within the walled garden by providing a sense of community, along with local news and weather, only so as like not to lose them to the wilds of the web…”
  • Lusers: Loser users
  • Plastiwicker: Those cheap plastic chairs formed to look like wicker
  • Laptopped: Your probable current condition, dear reader
  • Fannypackers: Wearers of fanny packs
  • Acqhires: Workers “hired” through acquisition of their company
  • Lotused: Something that Steve Jobs might do
  • Comptrasting: To both compare and contrast
  • Octalfortied: Forgotten
  • Concentives: The name Cohen gave to a mystery shopper company. Seems perfect for a social media marketing company
  • Crustaceate: A crabwalk. To index internet sites like a crab, compare with “spider” or “crawl”
  • Glomars: Presumably a reference to the Glomar Explorer—a project so secret that one cannot disclose its existence or non-existence. We learn from the book that Tetration is spying on its users and perhaps framing them for crimes by suggesting content.
  • Lynchrims: “…situations in which one human hangs lynched without clothes from a tree while another human stands just below and rims their anus.”
  • Compocalypse: Computer related disaster

Useful words I learned from Book of Numbers

Verbigeration 🙂
by web at August 09, 2016 03:07 AM

August 07, 2016

adjunct professor

On the “Influencers”–Nothing New Under the Sun

Bloomberg reports, FTC to Crack Down on Paid Celebrity Posts That Aren’t Clear Ads. Yes, the FTC is saber-rattling on this issue, with its native ads workshop, statements on the issue, and enforcement actions. And the media coverage runs into the same old arguments.  First, “we didn’t intend to mislead.”

“We’re venturing into a little bit of ridiculous territory with the FTC saying these things because influencers really want to follow the rules,” Pomponi said. “They want to do a good job — they want to be seen as useful to brands and don’t want to do anything that would jeopardize their relationships.”

That’s great and all, but as an advertiser, you hold the duty to ensure that your messaging is not misleading. You are in control of it. You draft it. You have to anticipate how a reasonable consumer might interpret it. FTCA liability does not require an intent to deceive. The issue is whether endorsements are likely to mislead, even if the deception was an unintentional mistake.

There’s a basic tension here. The point of endorsements, like native advertising, is to create a friendly engagement with the product. However, that friendly engagement may disarm the consumer. When the consumer recognizes material as advertising, it causes the consumer to more skeptically evaluate (or avoid) an advertising claim. Thus, the benefits of secret endorsement are in tension with the goal of enabling consumers to be self-reliant in recognizing commercial persuasion.

Second, there’s something new and different about influencers and ads:

Some advertisers say influencer posts don’t deserve such careful disclosure, because they are not the same thing as a traditional ad. Lauren Diamond Kushner, a partner at Kettle, a creative agency in New York, has worked on influencer campaigns with brands including Sunglass Hut. She said the Instagram stars and YouTubers often only work with the brands that they genuinely like and use.

Wrong! So, before the internet, there was this thing called TV. And on TV, there were celebrities who did ads. Those celebrities too screened products and only did endorsements that were not too embarrassing. (In many cases, real celebrities limit ads so that they only appear outside the US!). And before the TV, there was this thing called radio. And so on.

The “genuinely like and use” argument is baloney. What happens if the influencer changes her mind and stops using it? Do the tweets get deleted?

by web at August 07, 2016 01:28 PM

August 05, 2016

adjunct professor

What if We Are All Just Rectards? On Joshua Cohen’s Book of Numbers

I have a few friends working at major technology companies who share a similar story—they describe meetings with the founder, always an eccentric, Delphic creature who gives feedback that is rushed and difficult to understand. The group huddles after meeting with the oracle, attempting to decode his meaning. Some hilarity ensues. Someone among the group is a founder whisperer, with a track record of properly decoding his pronouncements.

Joshua Cohen’s Book of Numbers captures this dynamic and puts the reader in touch with an unforgettable character. The founder of Tetration, a Google-like company, hires Joshua Cohen, a failed writer, to ghostwrite his biography. The founder, also named Joshua Cohen, has some traits of Steve Jobs, and is singularly focused on Tetration’s goal of “equaliz[ing] ourselves with data and data with ourselves.”

Early in the novel, ghostwriter Cohen interviews Tetration founder Cohen (“Principal”), who gives a lurid account of his attitude toward users of the search engine and the world more generally.

Cohen: “How’s it treating you, NY?” I said.


Principal: “Whatever the thing to say is, write it.”

Cohen: “I take it you don’t have a great opinion of the press?”

Principal: “The same questions are always asked: Power color? HTML White, #FFFFFF. Favorite food? Antioxidants. Favorite drink? Yuen yeung, kefir, feni lassi, kombucha. Preferred way to relax? Going around NY lying to journalists about ever having time to relax. They have become unavoidable. The questions, the answers, the journalists. But it is not the lying we hate. We hate anything unavoidable.”

Cohen: “We? Meaning you or Tetration itself?”

Principal: “No difference. We are the business and the business is us. Selfsame. Our mission is our mission.”

Cohen: “Which is?”

Principal: “The end of search—”

Cohen: “—the beginning of find: yes, I got the memo. Change the world. Be the change. Tetrate the world in your image.”

Principal: “If the moguls of the old generation talked that way, it was only to the media. But the moguls of the new generation talk that way to themselves. We, though, are from the middle. Unable to deceive or be deceived.”

Cohen: “I want to get serious for a moment,” I said. “It’s 2004, four years after everything burst, and I want to know what you’re thinking. Is this reinvestment we’re getting back in NY just another bubble rising? Why does Silicon Valley even need a Silicon Alley — isn’t bicoastalism or whatever just the analog economy?”

Principal blinked, openshut mouth, nosebreathed.

Principal: “You — what attracted us to NY was you, was access. Also the tax breaks, utility incentives. Multiple offices are the analog economy, but the office itself is a dead economy. Its only function might be social, though whatever benefits result when employees compete in person are doubled in costs when employees fuck, get pregnant, infect everyone with viruses, sending everyone home on leave and fucking with the deliverables.”

Cohen: “Do the people who work for you know your feelings on this? If not, how do you think they’d react?”

Principal: “Do not ask us — ask NY. This office will be tasked with Adverks sales, personnel ops/recruitment, policy/advocacy, media relations. Divisions requiring minimal intelligence. Minimal skill. Not techs but recs. Rectards. Lusers. Loser users. Ad people. All staff will be hired locally.”

Cohen: “You realize this is for publication — you’re sure you want to go on the record?”

Principal: “We want the scalp of the head of the team responsible for this wallpaper.”

This is just one excerpt of Cohen’s fantastic book. I cannot do it justice. It is uncompromising, yet rewarding. People comment on its epic length, but what is more important is its Joycean ambitions and what it says about humanity and technology. Cohen perfectly captures the flat affect and arrested development of this breed of uber technologist. Why should you care? To them, we’re all rectards.

by web at August 05, 2016 07:03 PM

How to Read FTCPL&P Without Paying for It

Much of my book is available free or at low cost.

The lowest cost option for purchasing is Google Play, which sells it for only $20.

You can read the print book free at over 115 academic libraries.

If you are a faculty member or student at an academic institution, you might have a subscription to Cambridge Books Online, which enables you to download a PDF of the entire book.

The introduction, which gives an overview of the entire book, is available free on SSRN.

Now that it is six months from publication, I have posted chapter 6 (online privacy), which is probably the book’s most useful chapter, on SSRN.

An essay on the FTC’s early history (drawn from Chapter 1) is here.

An essay on the Bureau of Economics is on SSRN.

An essay on assessments is on SSRN.

by web at August 05, 2016 02:11 PM

August 04, 2016

Center for Technology, Society & Policy

SSRN and open access for non-institutional scholars

By Tiffany Li, Fellow, Internet Law & Policy Foundry | Permalink

Academics and open access advocates expressed concern when Elsevier acquired SSRN, the previously free and open social sciences research network. It appears now that those fears may have come true, as recent copyright takedowns by SSRN indicate a shift away from open access. The loss of a well-established open access research network will be deeply felt in the social sciences research community, including in my own field of law and public policy.

SSRN was an open access research network that allowed for free posting of academic research. In May of this year, academic publisher Elsevier purchased SSRN, to the concerns of many. In June, Authors Alliance, an open access non-profit, shared with Elsevier/SSRN a list of open access principles to which they might commit; all were rejected by Elsevier. Recently, SSRN has begun taking down papers without notice for copyright reasons – reasons that do not align with the general open access principles. (SSRN has subsequently described these removals as mistakes, rather than a change in policy.)

As a practitioner not employed by a university or academic research institution, I am especially disappointed by this news. Being able to share research on SSRN was important for researchers not associated with traditional academic institutions, which at the very least can often host papers for their professors and other staff. SSRN also provided an easy method for independent researchers, and the general public at large, to access timely and relevant social science research.

Many have noted that the high price of academic journal access can make research cost-prohibitive for anyone not formally affiliated with an academic institution. Losing SSRN will make it more difficult for practitioners and others outside traditional academia both to access and to contribute to the exchange of ideas that eventually drives business and policy decisions. Losing those voices and restricting information from the non-academic public would shrink our marketplace of ideas.

Furthermore, openly accessible academic research can benefit the public at large, especially in fields like law and public policy. Free information on laws and legislation is necessary for an informed democracy. SSRN was only one small provider of open information, but it was an important one, and its loss will be deeply felt.

I’ve been speaking of SSRN in the past tense, but the website is still functional. It is still possible that SSRN may change its policies to reflect open access ideals. However, it does not appear as if this is likely to occur. It may be naïve to expect that any corporation without a knowledge-based social mission would ever dedicate resources to an open access research network. There are some alternatives, including the new research network SocArXiv, which officially launched last month. It is unclear whether researchers will migrate to SocArXiv or any of the other alternatives to SSRN.

The (presumed) demise of SSRN is a reminder of the importance of open access to information generally. From a technology policy standpoint, there are a number of ways to safeguard open access and free information, from copyright reform to net neutrality. For more information on those topics, you can read the freely accessible articles on SSRN – for as long as they last.

Ed.: The University of California has an open access policy for research conducted by faculty at all UC campuses. Berkeley’s Open Access Initiative may be able to answer your questions. eScholarship serves as an institutional repository for UC scholars. While those institutional policies and repositories cannot directly help researchers without university affiliations who are trying to publish research, making our research openly available through those means can provide access for others to read.

Tiffany Li is Commercial Counsel at General Assembly, the global education institution. She is a Fellow with the Internet Law & Policy Foundry and an Affiliate with the Princeton Center for Information Technology Policy. She holds a J.D. from Georgetown University Law Center, where she was a Global Law Scholar, and a B.A. from University of California Los Angeles, where she was a Norma J. Ehrlich Alumni Scholar. Opinions expressed in this article are those of the author and do not necessarily reflect those of her employer or any affiliated organizations.

by Nick Doty at August 04, 2016 05:19 PM

July 18, 2016

MIMS 2012

Discovery Kanban at Optimizely

Discovery work planning meeting

Like most tech startups, Optimizely uses an Agile process to build software. Agile has been great at helping us plan what we’re building, coordinate across teams, and ship iteratively. Agile’s focus, however, is on shipping working software, which meant design and research were awkwardly shoehorned into the process.

Scrum, according to HBO’s Silicon Valley

Practically speaking, this meant we got into a habit of building a feature while it was being designed. This put pressure on designers to produce mockups of a feature in order to “unblock” engineers, which didn’t give them room to understand the problem and iterate on solutions. And it rarely provided space for our researchers to gather data to understand the problem, or do usability testing.

This caused problems on the engineering side, too: since engineers were building a feature while it was being designed, the requirements were shifting as we explored solutions. This slowed down the development process since they would have questions about what to build, and features would have to be rewritten as the designs evolved. It also made it hard to estimate how long development would take, which led to missed deadlines, cutting corners (such as not writing unit tests or doing formal QA), and low morale.

In short, our process didn’t have high velocity or produce high quality solutions.

To improve this situation, we split our product development process into two phases: Discovery and Delivery. In the Discovery phase, the goal is to understand and solve our customers’ problems. The output is finished designs that solve for all use cases. After designs are done, work moves into the Delivery phase, whose goal is to ship the finished solution to customers using our standard Agile process to plan and manage the work.

The key to this system is that design and research are an explicit part of the process, and there are acceptance criteria that must be met before engineers write any code.

Diagram of Optimizely’s product development process

To help with organizational buy-in and planning, Discovery work is tracked on a physical kanban board in our office. The board is split into two parts: Problem Understanding (Research), and Solution Exploration (Design). In the Problem Understanding phase we gather data to help us understand and frame the problem. Data includes both qualitative and quantitative data, such as customer interviews, surveys, support tickets, sales call feedback, product usage, and NPS feedback. That data either becomes general company knowledge, such as our user personas, or feeds directly into the Solution Exploration phase.

In Solution Exploration, designers use the data gathered during Problem Understanding to frame the problem to be solved. Designers explicitly write down what the problem is, use cases, target users, and anything that’s out of scope. After getting buy-in from the PM and team, they explore solutions by creating sketches, wireframes, and prototypes. Engineers are looped in to provide feedback on technical feasibility. Researchers do usability testing in this phase as well. Finally, the output is finished designs that are fully thought through. This means there are no open questions about what a feature does that would slow down an engineer during development.

Optimizely’s Discovery kanban board

Is this waterfall?

This process is more structured than our previous Agile process, but not as rigid as a typical waterfall process. We don’t “throw work over the wall” to each other, stop requirements from changing, or rely on documentation to communicate across teams.

Additionally, designers still sit with scrum teams and attend standups. This keeps the whole team involved throughout the entire process. Although engineers aren’t building anything during the Discovery phase, they are aware of what problem we’re solving, why we’re solving it, and the proposed solutions. And designers are included in the Delivery phase to make sure the finished feature matches what they designed.

Since rolling out this system across our scrum teams, our design and development process has been much smoother. Researchers have a stronger voice in product development, designers have space to iterate, and engineers are shipping faster. By giving designers and researchers an explicit place in our product development process, we’ve improved our planning, increased coordination and alignment across teams, and upped our velocity and quality.

Further Reading

These posts all informed my thinking about why I implemented Discovery Kanban at Optimizely:

This article originally appeared on Medium

by Jeff Zych at July 18, 2016 04:47 AM

July 16, 2016

Ph.D. student

directions to migrate your WebFaction site to HTTPS

Hiya friends using WebFaction,

Securing the Web, even our little websites, is important — to set a good example, to maintain the confidentiality and integrity of our visitors’ traffic, and to get the best Google search ranking. While secure Web connections had been difficult and/or costly in the past, migrating a site to HTTPS has recently become fairly straightforward and costs $0 a year. It may get even easier in the future, but for now, the following steps should do the trick.

Hope this helps, and please let me know if you have any issues,

P.S. Yes, other friends, I recommend WebFaction as a host; I’ve been very happy with them. Services are reasonably priced and easy to use and I can SSH into a server and install stuff. Sign up via this affiliate link and maybe I get a discount on my service or something.

P.S. And really, let me know if and when you have issues. Encrypting access to your website has gotten easier, but it needs to become much easier still, and one part of that is knowing which parts of the process prove to be the most cumbersome. I’ll make sure your feedback gets to the appropriate people who can, for realsies, make changes as necessary to standards and implementations.

Updated 16 July 2016: to fix the cron job command, which may not have always worked depending on environment variables

One day soon I hope WebFaction will make many of these steps unnecessary, but the configuring and testing will be something you have to do manually in pretty much any case. You should be able to complete all of this in an hour some evening. You might have to wait a bit on WebFaction installing your certificate and the last two parts can be done on the following day if you like.

Create a secure version of your website in the WebFaction Control Panel

Log in to the WebFaction Control Panel, choose the “DOMAINS/WEBSITES” tab and then click “Websites”.

Click “Add new website” to create one that will correspond to one of your existing websites. I suggest choosing a name like existingname-secure. Choose “Encrypted website (https)”. For Domains, testing will be easiest if you choose both your custom domain and a subdomain of (If you don’t have one of those subdomains set up, switch to the Domains tab and add it real quick.) So, for my site, I chose and

Finally, for “Contents”, click “Re-use an existing application” and select whatever application (or multiple applications) you’re currently using for your http:// site.

Click “Save” and this step is done. This shouldn’t affect your existing site one whit.

Test to make sure your site works over HTTPS

Now you can test how your site works over HTTPS, even before you’ve created any certificates, by going to in your browser. Hopefully everything will load smoothly, but it’s reasonably likely that you’ll have some mixed content issues. The debug console of your browser should show them to you: that’s Apple-Option-K in Firefox or Apple-Option-J in Chrome. You may see some warnings like this, telling you that an image, a stylesheet or a script is being requested over HTTP instead of HTTPS:

Mixed Content: The page at ‘’ was loaded over HTTPS, but requested an insecure image ‘’. This content should also be served over HTTPS.

Change these URLs so that they point to (you could also use a scheme-relative URL, like // and update the files on the webserver and re-test.

Good job! Now, should work just fine, but shows a really scary message. You need a proper certificate.

Get a free certificate for your domain

Let’s Encrypt is a new, free, automated certificate authority from a bunch of wonderful people. But getting it to set up certificates on WebFaction is a little tricky, so we’ll use the letsencrypt-webfaction utility (thanks, will-in-wi!).

SSH into the server with ssh

To install, run this command:

GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib gem2.2 install letsencrypt_webfaction

For convenience, you can add this as a function to make it easier to call. Edit ~/.bash_profile to include:

function letsencrypt_webfaction {
    PATH=$PATH:$GEM_HOME/bin GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction $*
}

Now, let’s test the certificate creation process. You’ll need your email address (preferably not GMail, which has longer instructions), e.g. and the path to the files for the root of your website on the server, e.g. /home/yourusername/webapps/sitename/. Filling those in as appropriate, run this command:

letsencrypt_webfaction --account_email --support_email --domains --public /home/yourusername/webapps/sitename/

It’s important to use your email address for both --account_email and --support_email so that for this test, you’ll get the emails rather than sending them to the WebFaction support staff.

If all went well, you’ll see a new directory in your home directory called le_certs, and inside that a directory with the name of your custom domain (and inside that, a directory named with a timestamp, which has a bunch of cryptographic keys in it that we don’t care much about). You should also have received a couple of emails with appropriate instructions, e.g.:

LetsEncrypt Webfaction has generated a new certificate for The certificates have been placed in /home/yourusername/le_certs/ WebFaction support has been contacted with the following message:

Please apply the new certificate in /home/yourusername/le_certs/ to Thanks!

Now, run the same command again but without the --support_email parameter and this time the email will get sent directly to the WebFaction staff. One of the friendly staff will need to manually copy your certificates to the right spot, so you may need to wait a while. You’ll get a support notification once it’s done.

Test your website over HTTPS

This time you get to test it for real. Load in your browser. (You may need to force refresh to get the new certificate.) Hopefully it loads smoothly and without any mixed content warnings. Congrats, your site is available over HTTPS!

You are not done. You might think you are done, but if you think so, you are wrong.

Set up automatic renewal of your certificates

Certificates from Let’s Encrypt expire in no more than 90 days. (Why? There are two good reasons.) Your certificates aren’t truly set up until you’ve set them up to renew automatically. You do not want to do this manually every few months; you would forget, I promise.

Cron lets us run code on WebFaction’s server automatically on a regular schedule. If you haven’t set up a cron job before, it’s just a fancy way of editing a special text file. Run this command:

EDITOR=nano crontab -e

If you haven’t done this before, this file will be empty, and you’ll want to test it to see how it works. Paste the following line of code exactly, and then hit Ctrl-O and Ctrl-X to save and exit.

* * * * * echo "cron is running" >> $HOME/logs/user/cron.log 2>&1

This will output to that log every single minute; not a good cron job to have in general, but a handy test. Wait a few minutes and check ~/logs/user/cron.log to make sure it’s working. Now, let’s remove that test and add the renewal line, being sure to fill in your email address, domain name and the path to your website’s directory, as you did above:

0 4 15 */2 * PATH=$PATH:$GEM_HOME/bin GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib /usr/local/bin/ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction --account_email --domains --public /home/yourusername/webapps/sitename/ >> $HOME/logs/user/cron.log 2>&1

You’ll probably want to create the line in a text editor on your computer and then copy and paste it to make sure you get all the substitutions right. Ctrl-O and Ctrl-X to save and close it. Check with crontab -l that it looks correct.

That will create a new certificate at 4am on the 15th of alternating months (January, March, May, July, September, November) and ask WebFaction to install it. New certificates every two months is fine, though one day in the future we might change this to get a new certificate every few days; before then WebFaction will have taken over the renewal process anyway. Debugging cron jobs can be a little tricky (I've had to update the command in this post once already); I recommend adding an alert to your calendar for the day after the first time this renewal is supposed to happen, to remind yourself to confirm that it worked. If it didn't work, any error messages should be stored in the cron.log file.

Redirect your HTTP site (optional, but recommended)

Now you’re serving your website in parallel via http:// and https://. You can keep doing that for a while, but everyone who follows old links to the HTTP site won’t get the added security, so it’s best to start permanently re-directing the HTTP version to HTTPS.

WebFaction has very good documentation on how to do this, and I won’t duplicate it all here. In short, you’ll create a new static application named “redirect”, which just has a .htaccess file with, for example, the following:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
RewriteCond %{HTTP:X-Forwarded-SSL} !on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

This particular variation will both redirect any URLs that have www to the “naked” domain and make all requests HTTPS. And in the Control Panel, make the redirect application the only one on the HTTP version of your site. You can re-use the “redirect” application for different domains.

Test to make sure it’s working!,, and should all end up at (You may need to force refresh a couple of times.)
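You can also spot-check the redirects from a terminal by looking at just the response headers. A minimal sketch (check_redirect and filter_headers are hypothetical helper names, and stands in for your own domain):

```shell
# Keep only the status line and Location header from an HTTP response;
# a working redirect shows a 301 status and an https:// Location.
filter_headers() {
  grep -i -E '^(HTTP|Location)'
}

# Fetch headers only (-s silent, -I headers-only request) and filter them.
check_redirect() {
  curl -sI "$1" | filter_headers
}

# Example usage (substitute your own domain):
# check_redirect
# check_redirect
```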

by at July 16, 2016 09:25 PM

MIMS 2010

"Unboxing the iPad Data," Deconstructed

Yesterday John Gruber linked to an infographic, “Unboxing the iPad Data” by John Kumahara and Johnathan Bonnell. In terms of graphic design it’s visually pleasing, but it falls short in a few areas and highlights common challenges in designing infographics. These are problems that occur all the time in visualizations, so let’s see what we can learn from this example.

I know it’s easy to misinterpret criticism, so I want to emphasize that these are comments about this particular graphic, not about the authors’ skills or ability.


People understand numbers. So when you are evaluating a visualization, one of the most important questions is whether the graphic is better than just the numbers. The most successful visualizations show trends, outliers, and insights that a table of numbers wouldn’t.

In some places “Unboxing the iPad Data” clearly shows the numbers of interest. In others, it obscures them for the sake of design.

Display of numbers: 3,122 apps in the store, 300,000 devices sold, 1 million apps sold

The fact that Apple sold 300,000 devices and 1,000,000 applications in the first weekend is a big deal—so these should be big numbers. Instead you have to read the fine print to see that 1 actually means 1 million.

Equally large are numbers that few people care about, like the number of respondents or the specifics of the Likert scale used.

7 point scale, 2,176 audience polled

When the numbers speak for themselves, rely on them without decoration. Clearly show what is important.


Certain aspects of color are a subjective matter. One designer thinks this shade of red is the right choice; another thinks it’s ugly. But there is a science to the perception of color. We know that some colors catch people’s eyes more than others. I would argue that these pie charts would be more readily perceptible if the colors were swapped.

Small pie chart

The intense saturation of the light blue makes it look like it is highlighting something. Here the portion of interest is the small white wedge representing 15%, but the white is overpowered by the blue.

(There is the separate question of whether these pie charts help us understand the difference between the 8% and 15% range represented in the left-most column. The small pie charts are attractive, but does this small multiples grid of pie charts help the viewer understand this dataset better than a table of these numbers alone?)

A similar issue affects the bar chart. Here the viewer must compare likely (represented by white) and unlikely (represented by blue) responses. Again, the blue stands out and draws the user’s attention.

Number of tweets about the iPad

A minor detail in the bar chart is the orientation of the text. In the U.S., we are more comfortable turning our heads right to read things instead of left. Think of a bookshelf—how do you turn your head to read the books’ titles? My preference (preference!) would be to rotate the labels on this bar chart 180°.


Designers must be careful that their infographics accurately depict meaningful information. Here, for example, we see that the peak rate of tweets about the iPad was 26,668 in an hour.

Number of tweets about the iPad

The depiction juxtaposes this number against a timeline that suggests the peak occurred between 11:00am and 12:00pm. If this is the case, then the segment should be labeled so that the viewer can learn this readily. On the other hand, if we don’t know the time of the peak, then this illustration is misleading because it implies a fact where there is ambiguity.

The segment of this infographic that depicts the cost of apps for the iPhone and iPad is less clear still.

The accompanying text reads:

The other notable difference between the iPad and the iPhone, are the app prices. The average price of the initial iPad apps ran around $4.99 (according to Mobclix) while the iPhone apps averaged a steady $1.99.

I’ve looked at this pie chart for some time and I can’t figure out what it is showing. The ratio of average app price to the total average app price? Even if that were the case, 5/7 is 71% and this chart is split into 60% and 40% segments.


There are a variety of visual variables a designer can use to encode a given set of data. Among these are length, position, angle, area, color, lightness, and others. Some of these are better suited to certain kinds of data than others, and some are more readily perceptible than others (see Cleveland and McGill’s “Graphical Perception” for extensive details and experiments).

Sales map

Look at this area comparison of percentage of iPads sold. Before we even consider the accuracy, look at these two circles and ask yourself: how much bigger is circle B than circle A? Go ahead, make a guess.

Two circles


The area of the circle on the left is 1075 square pixels (d = 37, r = 18.5) and the area of the circle on the right is 7390 square pixels (d = 97, r = 48.5). That’s 6.8 times bigger.
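You don't even need π to check that ratio: since both areas share the factor π·(1/2)², the ratio of the areas reduces to the square of the ratio of the diameters. A one-line sanity check in awk, using the two measured diameters:

```shell
# Area ratio of two circles from their diameters (37 px and 97 px).
# area = pi * (d/2)^2, so the pi and the halving cancel: ratio = (97/37)^2.
awk 'BEGIN { printf "%.2f\n", (97/37)^2 }'   # prints 6.87
```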

People are much better at comparing length and position than comparing area. This is a very common mistake (one I’ve made myself). Before you represent a variable with area, you should consider that you may be handicapping people’s ability to compare the data.

Area is hard for viewers to understand, and it’s hard for designers to get right as well. Consider the legend for this map:

Area Legend

Are these circles the right size? Let’s construct a table and find out:

Nominal Size    Diameter    Radius    Area     × 1%
1%              22          11        380      1
5%              70          35        3848     10
10%             102         51        8171     21.5
20%             160         80        20106    53

The rightmost column shows the area of the circle compared with the area of the 1% circle. It turns out that the area of the 20% circle is 53 times bigger than the area of the 1% circle—more than 2.5 times bigger than it should be. Comparing areas is hard; it’s harder with an inaccurate legend. The difficulty of accurately representing area is another reason to avoid using it.
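The same diameter-squared shortcut recomputes the whole × 1% column from the measured diameters in the table above (a small sketch; the diameters are the pixel values measured from the legend):

```shell
# Ratio of each legend circle's area to the 1% circle's area.
# Area scales with diameter squared, so each ratio is (d / d_1pct)^2.
awk 'BEGIN {
  n = split("22 70 102 160", d, " ")
  for (i = 1; i <= n; i++)
    printf "d=%d: %.1f times the 1%% circle\n", d[i], (d[i] / d[1])^2
}'
```

The 20% circle comes out around 52.9 times the area of the 1% circle, when a faithful legend would make it 20 times.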


Maps are such a common form of visualization that we use them even when they are not helpful. On top of that, they’re hard to get right. Maps’ static nature makes it hard to show important dimensions of data.

The size of maps is fixed, which can be at odds with what you are trying to communicate. In the map shown above, much of the interesting data is crammed into the northeast because that’s where those states are located. Meanwhile, a bunch of sparsely populated states in the northwest of the map use up space without communicating anything.

Data on maps is constrained by realities the map itself cannot express. There’s no convenient way to show population data. Does California have a giant circle because Californians buy more iPads than their counterparts across the country? Or is it because California has more people than any other state?

Here the map isn’t clearly better than just the numbers. A table listing the sales numbers for each state, possibly along with per capita sales, would express the point more clearly and succinctly. In the end, that’s the goal, right?

by Ryan at July 16, 2016 03:43 PM

Seen through the lens

I took this photo of a red panda at the San Diego zoo last month. Notice anything funny about it?

Red Panda at San Diego Zoo

Every person looking at the panda, this photographer included, is seeing it through the lens of a camera. Or on a smartphone LCD screen. It makes you wonder why we are so intent on capturing a moment for posterity that we may never have really seen in person.

by Ryan at July 16, 2016 03:41 PM

FBI seizes MegaUpload, loses opportunity

Last week the FBI seized the file exchange site MegaUpload through a court order. Previously, users could exchange files too large to email through the service. Now visitors to the site see this message:

Screenshot of FBI notice at

The significance of a website seized by law enforcement is heightened in light of the controversial SOPA and PIPA legislation currently being considered in Congress. Given the high stakes—the open exchange of information in a free society, government interference with the Internet—I feel compelled to let the people at the FBI know what I think.

Guys, this is embarrassing. Really amateur hour. Seriously, look at this again:

Close-up of FBI takedown notice

Where have I seen a better looking image? I mean, other than every movie ever made that shows a computer screen?

I don’t even know where to start. Fine, the seals. Any one of these seals by itself is like starting the drive from your own five-yard line. The three together is handing the ball to the other team and carrying them into the end zone. You’re not into football metaphors? OK, a seal crammed full of text like “National Intellectual Rights Coordination Center” is the design equivalent of dividing by zero. All three is taking the limit of bad design as it approaches zero and it evaluates to negative infinity. Math isn’t your thing? No sweat—what I said doesn’t make sense anyway. The point is, the seals are ugly.

But they’re your logos, you say? I feel badly saying this, but they look like someone just slapped them together at 4PM on Friday after a lunchtime happy hour. Take the right one. It says, “Protection is our trademark.” I’m not an IP genius, but it seems to me like if protection really is your trademark, and you want people to take it seriously, you need to use that symbol. Like “Protection is our trademark™” or maybe “PROTECTION™”. But since you’re not actually selling anything or engaging in trade, maybe it would be more accurate to say that protection is your service mark. You don’t see that little SM enough.

As if the seals weren’t texty enough already, someone put “FBI ANTI-PIRACY WARNING” on top of the middle one. Is that part of the seal? Operating under the dubious assumption that there’s any design merit to this logo in the first place, the last thing you want to do is cover up your logo. Can you imagine Nike labeling clothes with its swoosh but then covering half of it up with “GARMENT CARE INSTRUCTIONS”?

Who picked the color scheme for the background? Had this person eaten a hot dog recently? That’s the only way I can figure it out. You can’t even read the complete word “seized” once in this tiled background.

The cited list of alleged crimes at the bottom is a nice touch, but, guys, what are we doing with the typography here? A big block of bold, italic, centered text. I read the first line and I think, “This is heavy stuff—they’re being charged with Conspiracy to Commit” and then I get confused until I realize that it’s Conspiracy to Commit … Copyright Infringement (18 U.S.C. § 371). I know how to continue reading on the next line, but you’re taking some serious liberties with awkward line breaks.

Let’s check out the source of the page:

<img src="banner.jpg"/>

No JavaScript. No AJAX. No CSS. Not even any tables. The image doesn’t have an alt attribute. Maybe you’re not worried about Google indexing this page, or visually impaired people being able to read it, but I hope you realize you are just flushing the last 8 years of the Internet down the toilet. Interestingly, you went with the trailing slash that closes empty elements in XHTML but the DOCTYPE is…nothing. Whatever—this stuff is for nerds.

What we need to focus on is what a colossal missed opportunity this is for you. MegaUpload is down and the notice on the site is getting tons of exposure and when you go there it’s like you’re stuck watching the beginning of a movie forever, or at least that’s what it seems like for those people who paid for the movie and have to watch the FBI reminder to pay for the movie.

You must plan these operations, right? I mean, it’s not like you just randomly seize private property on a whim. This is a failure of project management. You can’t just bring in a designer at the last minute and expect them to polish your design turd. This is your chance to shine. Go wild. Animation, maybe a Matrix-style flow of numbers in the background. Ominous type. Here are some ideas:

  • The user goes to MegaUpload. The site looks normal. Suddenly, the eagles from your logos swoop in and the cool one with the arrows in its feet starts attacking the site while the other one hangs a banner over it that says “Seized by the FBI” and then jail bars drop down over the entire site.
  • The user goes to MegaUpload. The screen is filled with static like an old television. Then it looks like the screen is flipping through different TV channels. They’re all static. Finally, you get to a channel with a retro-looking message: “Seized by the FBI”. The retro part here probably plays to your design strengths.
  • The user goes to MegaUpload. The site is covered with sheets of brushed aluminum that look very heavy duty. Etched into the aluminum is the message: “Seized by the FBI”.
  • The user goes to MegaUpload. It says “HTTP Error 460” (this doesn’t exist—you would be making it up): “Seized by the FBI”.
  • The user goes to MegaUpload. A video of Rick Astley singing “Never Gonna Give You Up” starts playing. When the video finishes, it fades out and is replaced by the text “Seized by the FBI”.
  • The user goes to MegaUpload. Suddenly, a S.W.A.T. truck drives onto the screen. Fighter jets fly overhead. Missiles, bombs—BOOM—the screen explodes. DOM elements lie in a heap at the bottom of the screen. Smoke rises from the ashes and all of a sudden you realize it’s forming words: “Seized by the F.B.I.”

There are probably jQuery plugins that do all these effects already, and you could use those as a framework to build on. So dust off your copy of Photoshop. Use the mosaic filter. Add some lens flares. Watch Sneakers and Hackers and The Net and The Matrix and Tron and WarGames. Stay away from Papyrus. Then go and take down MegaUpload and put up something amazing. This is your moment: seize it.

by Ryan at July 16, 2016 03:41 PM

3.0 kilotweets

In 2007 I created my Twitter account in an Internet cafe in Santiago, Chile because I read on some blog that it was a big deal at SXSW. I spent some time deliberating between the username @ryangreenberg (which I use on a number of other services) and @greenberg. Eventually I decided on @greenberg because it seemed like short usernames were a big deal on Twitter. Just a few minutes ago I posted my 3,000th tweet on Twitter. Nearly five years and a few thousand tweets later, not only am I still tweeting, but along the way I somehow ended up working for the company. What a ride.

Profile card on Twitter at 3,000 tweets

Here’s a Harper’s Index-style look at my first 3,000 tweets:

  • Number of tweets I sent between July 10, 2007 and February 27, 2012: 3,000
  • Number of words written: 53,066
  • Kilobytes of text: 302
  • Median time between tweets: 6 hours, 43 minutes
  • Average time between tweets: 13 hours, 32 minutes
  • Longest time between two tweets: 84 days between tweet #1 (“Finally signing up with Twitter.”) and tweet #2 (“Wondering if there is something ironic about Superman bandaids.”)
  • Most tweets in a single day: 13 on January 2, 2010, when I posted a top-ten list of the best years of the decade
  • Retrospectively, do I wish I sounded less whiny sometimes: a little.
  • Number of URLs posted: 571
  • Number of hashtags used in tweets: 155
  • Number of @mentions used in tweets: 768
  • Most frequently mentioned people: @MIGreenberg (40), @npdoty (36), @caitearly (21), @michaelischool (20), and @kevinweil (16).
  • Number of OHs and quotes: 211
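Numbers like the median and average gap above are easy to recompute from a downloaded tweet archive. Here’s a minimal Python sketch, assuming the archive yields ISO-8601 timestamps; the four sample values below are made up for illustration:

```python
from datetime import datetime
from statistics import median

# Hypothetical timestamps standing in for a real tweet archive export.
timestamps = [
    "2007-07-10T12:00:00",
    "2007-10-02T09:30:00",
    "2007-10-02T15:45:00",
    "2008-01-15T08:00:00",
]

# Sort chronologically, then measure the gap between each consecutive pair.
times = sorted(datetime.fromisoformat(t) for t in timestamps)
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

avg_hours = sum(gaps) / len(gaps) / 3600
med_hours = median(gaps) / 3600
print(f"average gap: {avg_hours:.1f} h, median gap: {med_hours:.1f} h")
```

The same `gaps` list would give the longest-silence number with a `max()`, and a per-date tally of the timestamps would give the busiest day.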

Tweet Length

Graph of distribution of tweet length
  • Number of tweets that are exactly 140 chars: 133 (about 4% of them)

Punctuation

  • Periods: 4,705
  • Single quotes, apostrophes: 1,839
  • Double quotes: 1,618
  • Commas: 1,560
  • Colons: 1,421
  • Ellipses: 143
  • Em dashes: 110
  • Semicolons: 71
  • En dashes: 14


  • Tweets that mention the New Yorker: 18
  • Tweets that mention Apple or OS X: 47
  • Tweets that mention Twitter: 102

And here are a few of my favorites.

My shortest tweet—four characters—is how I let friends and family know my then-girlfriend’s response when I asked her to marry me:

And the next year when we tied the knot:

A couple graduations:

And starting work at Twitter:

I’m looking forward to the next kilotweet. If you are too, follow @greenberg over on Twitter.

by Ryan at July 16, 2016 03:41 PM