School of Information Blogs

December 15, 2017

Ph.D. student

Net neutrality

What do I think of net neutrality?

I think ending it is bad for my personal self-interest. I am, economically, a part of the newer tech economy of software and data. I believe this economy benefits from net neutrality. I also am somebody who loves The Web as a consumer. I’ve grown up with it. It’s shaped my values.

From a broader perspective, I think ending net neutrality will revitalize U.S. telecom and give it leverage over the ‘tech giants’–Google, Facebook, Apple, Amazon–that have been rewarded by net neutrality policies. Telecom is a platform, but it had been turned into a utility platform. Now it can be a full-featured market player. This gives it an opportunity for platform envelopment, moving into the markets of other companies and bundling them in with ISP services.

Since this will introduce competition into the market and other players are very well-established, this could actually be good for consumers because it breaks up an oligopoly in the services that are most user-facing. On the other hand, since ISPs are monopolists in most places, we could also expect Internet-based service experience quality to deteriorate in general.

What this might encourage is a proliferation of alternatives to cable ISPs, which would be interesting. Ending net neutrality creates a much larger design space in products that provision network access. Mobile companies are in this space already. So we could see this regulation as a move in favor of the cell phone companies, not just the ISPs. This too could draw surplus away from the big four.

This probably means the end of “The Web”. But we’d already seen the end of “The Web” with the proliferation of apps as a replacement for Internet browsing. IoT provides yet another alternative to “The Web”. I loved the Web as a free, creative place where everyone could make their own website about their cat. It had a great moment. But it’s safe to say that it isn’t what it used to be. In fifteen years it may be that most people no longer visit web sites. They just use connected devices and apps. Ending net neutrality means that the connectivity necessary for these services can be bundled in with the service itself. In the long run, that should be good for consumers and even the possibility of market entry for new firms.

In the long run, I’m not sure “The Web” is that important. Maybe it was a beautiful disruptive moment that will never happen again. Or maybe, if there were many more kinds of alternatives, “The Web” would return to being the quirky, radically free and interesting thing it was before it got so mainstream. Remember when The Web was just The Well (which is still around), and only people who were really curious about it bothered to use it? I don’t, because that was well before my time. But it’s possible that the Internet in its browse-happy form will become something like that again.

I hadn’t really thought about net neutrality very much before, to be honest. Maybe there are some good rebuttals to this argument. I’d love to hear them! But for now, I think I’m willing to give the shuttering of net neutrality a shot.

by Sebastian Benthall at December 15, 2017 04:31 AM

December 14, 2017

Ph.D. student

Marcuse, de Beauvoir, and Badiou: reflections on three strategies

I have written in this blog about three different philosophers who articulated a vision of hope for a more free world, including in their account an understanding of the role of technology. I would like to compare these views because nuanced differences between them may be important.

First, let’s talk about Marcuse, a Frankfurt School thinker whose work was an effective expression of philosophical Marxism that catalyzed the New Left. Marcuse was, like other Frankfurt School thinkers, concerned about the role of technology in society. His proposed remedy was “the transcendent project”, which involves an attempt at advancing “the totality” through an understanding of its logic and action to transform it into something that is better, more free.

As I began to discuss here, there is a problem with this kind of Marxist aspiration for a transformation of all of society through philosophical understanding, which is this: the political and technical totality exists as it does in no small part to manage its own internal information flows. Information asymmetries and differentiation of control structures are a feature, not a bug. The convulsions caused by the Internet as it tears and repairs the social fabric have not created the conditions of unified enlightened understanding. Rather, they have exposed that given nearly boundless access to information, most people will ignore it and maintain, against all evidence to the contrary, the dignity of one who has a valid opinion.

The Internet makes a mockery of expertise, and makes no exception for the expertise necessary for the Marcusian “transcendent project”. Expertise may be replaced with the technological apparatuses of artificial intelligence and mass data collection, but the latter are a form of capital whose distribution is a part of the totality. If they are having their transcendent effect today, as the proponents of AI claim, this effect is in the hands of a very few. Their motivations are inscrutable. As they have their own opinions and courtiers, writing for them is futile. They are, properly speaking, a great uncertainty that shows that centralized control does not close down all options. It may be that the next defining moment in history is set by how Jeff Bezos decides to spend his wealth, and that is his decision alone. For “our” purposes–yours, my reader, and mine–this arbitrariness of power must be seen as part of the totality to be transcended, if that is possible.

It probably isn’t. And if it Really isn’t, that may be the best argument for something like the postmodern breakdown of all epistemes. There are at least two strands of postmodern thought coming from the denial of traditional knowledge and university structure. The first is the phenomenological privileging of subjective experience. This approach has the advantage of never being embarrassed by the fact that the Internet is constantly exposing us as fools. Rather, it allows us to narcissistically and uncritically indulge in whatever bubble we find ourselves in. The alternative approach is to explicitly theorize about one’s finitude and the radical implications of it, to embrace a kind of realist skepticism or at least acknowledgement of the limitations of the human condition.

It’s this latter approach which was taken up by the existentialists in the mid-20th century. In particular, I keep returning to de Beauvoir as a hopeful voice that recognizes a role for science that is not totalizing, but nevertheless liberatory. De Beauvoir does not take aim, like Marcuse and the Frankfurt School, at societal transformation. Her concern is with individual transformation, which is, given the radical uncertainty of society, a far more tractable problem. Individual ethics are based in local effects, not grand political outcomes. The desirable local effects are personal liberation and liberation of those one comes in contact with. Science, and other activities, is a way of opening new possibilities, not limited to what is instrumental for control.

Such a view of incremental, local, individual empowerment and goodness seems naive in the face of pessimistic views of society’s corruptedness. Whether these be economic or sociological theories of how inequality and oppression are locked into society, and however emotionally compelling and widespread they may be in social media, it is necessary by our previous argument to remember that these views are always mere ideology, not scientific fact, because an accurate totalizing view of society is impossible given real constraints on information flow and use. Totalizing ideologies that are not rigorous in their acceptance of basic realistic points are a symptom of more complex social structure (i.e. the distribution of capitals, the reproduction of many habiti) not a definition of it.

It is consistent for a scientific attitude to deflate political ideology because this deflation is an opening of possibility against both utopian and dystopian trajectories. What’s missing is a scientific proof of this very point, comparable to a Halting Problem or Incompleteness Theorem, but for social understanding.

A last comment, comparing Badiou to de Beauvoir and Marcuse. Badiou’s theory of the Event as the moment that may be seized to effect a transformation is perhaps a synthesis of existentialist and Marxian philosophies. Badiou is still concerned with transcendence, i.e. the moment when, given one assumed structure to life or reality or psychology, one discovers an opening into a renewed life with possibilities that the old model did not allow. But (at least as far as I have read him, which is not enough) he sees the Event as something that comes from without. It cannot be predicted or anticipated within the system but is instead a kind of grace. Without breaking explicitly from professional secularism, Badiou’s work suggests that we must have faith in something outside our understanding to provide an opportunity for transcendence. This is opposed to the more muscular theories described above: Marcuse’s theory of transcendent political activism and de Beauvoir’s active individual projects are not as patient.

I am still young and strong and so prefer the existentialist position on these matters. I am politically engaged to some extent and so, as an extension of my projects of individual freedom, am in search of opportunities for political transcendence as well–a kind of Marcuse light, as politics like science is a field of contest that is reproduced as its games are played and this is its structure. But life has taught me again and again to appreciate Badiou’s point as well, which is the appreciation of the unforeseen opportunity, the scientific and political anomaly.

What does this reflection conclude?

First, it acknowledges the situatedness and fragility of expertise, which deflates grand hopes for transcendent political projects. Pessimistic ideologies that characterize the totality as beyond redemption are false; indeed it is characteristic of the totality that it is incomprehensible. This is a realistic view, and transcendence must take it seriously.

Second, it acknowledges the validity of more localized liberatory projects despite the first point.

Third, it acknowledges that the unexpected event is a feature of the totality to be embraced, contrary to what pessimistic ideologies claim. The latter, far from encouraging transcendence, are blinders that prevent the recognition of events.

Because realism requires that we not abandon core logical principles despite our empirical uncertainty, you may permit one more deduction. To the extent that actors in society pursue the de Beauvoiran strategy of engaging in local liberatory projects that affect others, the probability of a Badiousian event in the life of another increases. Solipsism is false, and so (to put it tritely) “random acts of kindness” do have their effect on the totality, in aggregate. In fact, there may be no more radical political agenda than this opening up of spaces of local freedom, which shrugs off the depression of pessimistic ideology and suppression of technical control. Which is not a new view at all. What is perhaps surprising is how easy it may be.

by Sebastian Benthall at December 14, 2017 03:41 PM

December 13, 2017

Ph.D. student

transcending managerialism

What motivates my interest in managerialism?

It may be a bleak topic to study, but recent traffic to this post on Marcuse has reminded me of the terms to explain my intention.

For Marcuse, a purpose of scholarship is the transcendent project, whereby an earlier form of rationality and social totality are superseded by a new one that offers “a greater chance for the free development of human needs and faculties.” In order to accomplish this, it has to first “define[] the established totality in its very structure, basic tendencies, and relations”.

Managerialism, I propose, is a way of defining and articulating the established totality: the way everything in our social world (the totality) has been established. Once this is understood, it may be possible to identify a way of transcending that totality. But, the claim is, you can’t transcend what you don’t understand.

Marx had a deeply insightful analysis of capitalism and then used that to develop an idea of socialism. The subsequent century indeed saw the introduction of many socialistic ideas into the mainstream, including labor organizing and the welfare state. Now it is inadequate to consider the established totality through a traditional or orthodox Marxist lens. It doesn’t grasp how things are today.

Arguably, critiques of neoliberalism, enshrined in academic discourse since the 80’s, have the same problem. The world is different from how it was in the 80’s, and civil society has already given what it can to resist neoliberalism. So a critical perspective that uses the same tropes as those used in the 80’s is going to be part of the established totality, but not definitive of it. Hence, it will fail to live up to the demands of the transcendent project.

So we need a new theory of the totality that is adequate to the world today. It can’t look exactly like the old views.

Gilman’s theory of plutocratic insurgency is a good example of the kind of theorizing I’m talking about, but this obviously leaves a lot out. Indeed, the biggest challenge to defining the established totality is the complexity of the totality; this complexity could make the transcendent project literally impossible. But to stop there is a tremendous cop out.

Rather, what’s needed is an explicit theorization of the way societal complexity, and society’s response to it, shape the totality in systematic ways. “Complexity” can’t be used in a fuzzy way for this to work. It has to be defined in the mathematically precise ways that the institutions that manage and create this complexity think about it. That means–and this is the hardest thing for a political or social theorist to swallow–that computer science and statistics have to be included as part of the definition of totality. Which brings us back to the promise of computational social science if and when it incorporates its mathematical methodological concepts into its own vocabulary of theorization.


Benthall, Sebastian. “Philosophy of computational social science.” Cosmos and History: The Journal of Natural and Social Philosophy 12.2 (2016): 13-30.

Gilman, Nils. “The twin insurgency.” American Interest 15 (2014).

Marcuse, Herbert. One-dimensional man: Studies in the ideology of advanced industrial society. Routledge, 2013.

by Sebastian Benthall at December 13, 2017 06:54 PM

Notes on Clark Kerr’s “The ‘City of Intellect’ in a Century for Foxes?”, in The Uses of the University 5th Edition

I am in my seventh and absolutely, definitely last year of a doctoral program and so have many questions about the future of higher education and whether or not I will be a part of it. For insight, I have procured an e-book copy of Clark Kerr’s The Uses of the University (5th Edition, 2001). Clark Kerr was the 12th President of the University of California system and became famous, among other things, for his candid comments on university administration, which included such gems as

“I find that the three major administrative problems on a campus are sex for the students, athletics for the alumni and parking for the faculty.”


“One of the most distressing tasks of a university president is to pretend that the protest and outrage of each new generation of undergraduates is really fresh and meaningful. In fact, it is one of the most predictable controversies that we know. The participants go through a ritual of hackneyed complaints, almost as ancient as academe, while believing that what is said is radical and new.”

The Uses of the University is a collection of lectures on the topic of the university, most of which were given in the second half of the 20th century. The most recent edition contains a lecture given in the year 2000, after Kerr had retired from administration, but anticipating the future of the university in the 21st century. The title of the lecture is “The ‘City of Intellect’ in a Century for Foxes?”, and it is encouragingly candid and prescient.

To my surprise, Kerr approaches the lecture as a forecasting exercise. Intriguingly, Kerr employs the hedgehog/fox metaphor from Isaiah Berlin in a lecture about forecasting five years before the publication of Tetlock’s 2005 book Expert Political Judgment, which used the fox/hedgehog distinction to cluster properties that were correlated with political experts’ predictive power. Kerr’s lecture is structured partly as the description of a series of future scenarios, reminiscent of scenario planning as a forecasting method. I didn’t expect any of this, and it goes to show perhaps how pervasive scenario thinking was as a 20th century rhetorical technique.

Kerr makes a number of warnings about the university in the 21st century, especially with respect to the glory of the university in the 20th century. He makes a historical case for this: universities in the 20th century thrived on new universal access to students, federal investment in universities as the sites of basic research, and general economic prosperity. He doesn’t see these guaranteed in the 21st century, though he also makes the point that in official situations, the only thing a university president should do is discuss the past with pride and the future with apprehension. He has a rather detailed analysis of the incentives guiding this rhetorical strategy as part of the lecture, which makes you wonder how much salt to take the rest of the lecture with.

What are the warnings Kerr makes? Some are a continuation of the problems universities experienced in the 20th century. Military and industrial research funding changed the roles of universities away from liberal arts education into research shops. This was not a neutral process. Undergraduate education suffered, and in 1963 Kerr predicted that this slackening of the quality of undergraduate education would lead to student protests. He was half right; students instead turned their attention externally to politics. Under these conditions, there grew to be a great tension between the “internal justice” of a university that attempted to have equality among its faculty and the permeation of external forces that made more of the professoriate face outward. A period of attempted reforms through “participatory democracy” was “a flash in the pan”, resulting mainly in “the creation of courses celebrating ethnic, racial, and gender diversities.” “This experience with academic reform illustrated how radical some professors can be when they look at the external world and how conservative when they look inwardly at themselves–a split personality”.

This turn to industrial and military funding and the shift of universities away from training in morality (theology), traditional professions (medicine, law), self-chosen intellectual interest for its own sake, and entrance into elite society towards training for the labor force (including business administration and computer science) is now quite old–at least 50 years. Among other things, Kerr predicts, this means that we will be feeling the effects of the hollowing out of the education system that happened as higher education deprioritized teaching in favor of research. The baby boomers who went through this era of vocational university education become, in Kerr’s analysis, an enormous class of retirees by 2030, putting new strain on the economy at large. Meanwhile, without naming computers and the Internet, Kerr acknowledged that the “electronic revolution” is the first major change to affect universities for three hundred years, and could radically alter their role in society. He speaks highly of Peter Drucker, who in 1997 was already calling the university “a failure” that would be made obsolete by long-distance learning.

An intriguing comment on aging baby boomers, which Kerr discusses under the heading “The Methuselah Scenario”, is that the political contest between retirees and new workers will break down partly along racial lines: “Nasty warfare may take place between the old and the young, parents and children, retired Anglos and labor force minorities.” Almost twenty years later, this line makes me wonder how much current racial tensions are connected to age and aging. Have we seen the baby boomer retirees rise as a political class to vigorously defend the welfare state from plutocratic sabotage? Will we?

Kerr discusses the scenario of the ‘disintegration of the integrated university’. The old model of medicine, agriculture, and law integrated into one system is coming apart as external forces become controlling factors within the university. Kerr sees this in part as a source of ethical crises for universities.

“Integration into the external world inevitably leads to disintegration of the university internally. What are perceived by some as the injustices in the external labor market penetrate the system of economic rewards on campus, replacing policies of internal justice. Commitments to external interests lead to internal conflicts over the impartiality of the search for truth. Ideologies conflict. Friendships and loyalties flow increasingly outward. Spouses, who once held the academic community together as a social unit, now have their own jobs. “Alma Mater Dear” to whom we “sing a joyful chorus” becomes an almost laughable idea.”

A factor in this disintegration is globalization, which Kerr identifies with the mobility of those professors who are most able to get external funding. These professors have increased bargaining power and can use “the banner of departmental autonomy” to fight among themselves for industrial contracts. Without oversight mechanisms, “the university is helpless in the face of the combined onslaught of aggressive industry and entrepreneurial faculty members”.

Perhaps most fascinating for me, because it resonates with some of my more esoteric passions, is Kerr’s section on “The fractionalization of the academic guild”. Subject matter interest breaks knowledge into tiny disconnected topics–"Once upon a time, the entire academic enterprise originated in and remained connected to philosophy." The tension between "internal justice" and the "injustices of the external labor market" creates a conflict over monetary rewards. Poignantly, "fractionalization also increases over differing convictions about social justice, over whether it should be defined as equality of opportunity or equality of results, the latter often taking the form of equality of representation. This may turn out to be the penultimate ideological battle on campus."

And then:

The ultimate conflict may occur over models of the university itself, whether to support the traditional or the “postmodern” model. The traditional model is based on the enlightenment of the eighteenth century–rationality, scientific processes of thought, the search for truth, objectivity, “knowledge for its own sake and for its practical applications.” And the traditional university, to quote the Berkeley philosopher John Searle, “attempts to be apolitical or at least politically neutral.” The university of postmodernism thinks that all discourse is political anyway, and it seeks to use the university for beneficial rather than repressive political ends… The postmodernists are attempting to challenge certain assumptions about the nature of truth, objectivity, rationality, reality, and intellectual quality.”

… Any further politicization of the university will, of course, alienate much of the public at large. While most acknowledge that the traditional university was partially politicized already, postmodernism will further raise questions of whether the critical function of the university is based on political orientation rather than on nonpolitical scientific analysis.”

I could go on endlessly about this topic; I’ll try to be brief. First, as per Lyotard’s early analysis of the term, postmodernism is as much a result of the permeation of the university by industrial interests as anything else. Second, we are seeing, right now today in Congress and on the news etc., the eroded trust that a large portion of the public has in university “expertise”, as they assume (having perhaps internalized a reductivist version of the postmodern message despite or maybe because they were being taught by teaching assistants instead of professors) that the professoriate is politically biased. And now the students are in revolt over Free Speech again as a result.

Kerr entertains for a paragraph the possibility of a Hobbesian doomsday free-for-all over the university before considering more mundane possibilities such as a continuation of the status quo. Adapting to new telecommunications (including “virtual universities”), new amazing discoveries in biological sciences, and higher education as a step in mid-career advancement are all in Kerr’s more pragmatic view of the future. The permeability of the university can bring good as well as bad as it is influenced by traffic back and forth across its borders. “The drawbridge is now down. Who and what shall cross over it?”

Kerr counts five major wildcards determining the future of the university. The first is overall economic productivity; the second is fluctuations in returns to a higher education. The third is the United States’ role in the global economy “as other nations or unions of nations (for example, the EU) may catch up with and even surpass it. The quality of education and training for all citizens will be to this contest. The American university may no longer be supreme.” Fourth, student unrest turning universities into the “independent critic”. And fifth, the battles within the professoriate, “over academic merit versus social justice in treatment of students, over internal justice in the professional reward system versus the pressures of external markets, over the better model for the university–modern or post-modern.”

He concludes with three wishes for the open-minded, cunning, savvy administrator of the future, the “fox”:

  1. Careful study of new information technologies and their role.
  2. “An open, in-depth debate…between the proponents of the traditional and the postmodern university instead of the sniper shots of guerilla warfare…”
  3. An “in-depth discussion…about the ethical systems of the future university”. “Now the ethical problems are found more in the flow of contacts between the academic and the external worlds. There have never been so many ethical problems swirling about as today.”

by Sebastian Benthall at December 13, 2017 01:28 AM

December 12, 2017

Ph.D. student

Re: a personal mission statement

Awesome. I hadn't considered a personal "mission statement" before now, even though I often consider and appreciate organizational mission statements. However, I do keep a yearly plan, including my personal goals.

Doty Plan 2017:
Doty Plan 2016:

I like that your categories let you provide a little more text than my bare-bones list of goals/areas/actions. I especially like the descriptions of role and mission; I feel like I both understand you more and I find those inspiring. That said, it also feels like a lot! Providing a coherent set of beliefs, values and strategies seems like more than I would be comfortable committing to. Is that what you want?

The other difference in my practice that I have found useful is the occasional updates: what is started, what is on track and what is at risk. Would it be useful for you to check in with yourself from time to time? I suppose I picked up that habit from Microsoft's project management practices, but despite its corporate origins, it helps me see where I'm doing well and where I need to re-focus or pick a new approach.


BCC my public blog, because I suppose these are documents that I could try to share with a wider group.

by at December 12, 2017 02:40 AM

Ph.D. student

Contextual Integrity as a field

There was a nice small gathering of nearby researchers (and one important call-in) working on Contextual Integrity at Princeton’s CITP today. It was a nice opportunity to share what we’ve been working on and make plans for the future.

There was a really nice range of different contributions: systems engineering for privacy policy enforcement, empirical survey work testing contextualized privacy expectations, a proposal for a participatory design approach to identifying privacy norms in marginalized communities, a qualitative study on how children understand privacy, and an analysis of the privacy implications of the Cybersecurity Information Sharing Act, among other work.

What was great is that everybody was on the same page about what we were after: getting a better understanding of what privacy really is, so that we can design better policies, educational tools, and technologies that preserve it. For one reason or another, the people in the room had been attracted to Contextual Integrity. Many of us have reservations about the theory in one way or another, but we all see its value and potential.

One note of consensus was that we should try to organize a workshop dedicated specifically to Contextual Integrity, widening what we accomplished today to bring in more researchers. Today’s meeting was a convenience sample, leaving out a lot of important perspectives.

Another interesting thing that happened today was a general acknowledgment that Contextual Integrity is not a static framework. As a theory, it is subject to change as scholars critique and contribute to it through their empirical and theoretical work. A few of us are excited about the possibility of a Contextual Integrity 2.0, extending the original theory to fill theoretical gaps that have been identified in it.

I’d articulate the aspiration of the meeting today as being about letting Contextual Integrity grow from being a framework into a field–a community of people working together to cultivate something, in this case, a kind of knowledge.

by Sebastian Benthall at December 12, 2017 02:03 AM

December 10, 2017

Ph.D. student

Appearance, deed, and thing: meta-theory of the politics of technology

Flammarion engraving

Much is written today about the political and social consequences of technology. This writing often maintains that this inquiry into politics and society is distinct from the scientific understanding that informs the technology itself. This essay argues that this distinction is an error. Truly, there is only one science of technology and its politics.

Appearance, deed, and thing

There are worthwhile distinctions made between how our experience of the world feels to us directly (appearance), how we can best act strategically in the world (deed), and how the world is “in itself” or, in a sense, despite ourselves (individually) (thing).


The world as we experience it has been given the name “phenomenon” (late Latin from Greek phainomenon ‘thing appearing to view’) and so “phenomenology” is the study of what we colloquially call today our “lived experience”. Some anthropological methods are a kind of social phenomenology, and some scholars will deny that there is anything beyond phenomenology. Those that claim to have a more effective strategy or truer picture of the world may have rhetorical power, powers that work on the lived experience of the more oppressed people because they have not been adequately debunked and shown to be situated, relativized. The solution to social and political problems, to these scholars, is more phenomenology.*


There are others that see things differently. A perhaps more normal attitude is that the outcomes of one’s actions are more important than how the world feels. Things can feel one way now and another way tomorrow; does it much matter? If one holds some beliefs that don’t work when practically applied, one can correct oneself. The name for this philosophical attitude is pragmatism (from Greek pragma, ‘deed’). There are many people, including some scholars, who find this approach entirely sufficient. The solution to social and political problems is more pragmatism. Sometimes this involves writing off impractical ideas and the people who hold them as either useless or mere pawns. It is their loss.


There are others that see things still differently. A perhaps diminishing portion of the population holds theories of how the world works that transcend both their own lived experience and individual practical applications. Scientific theories about the physical nature of the universe, though tested pragmatically and through the phenomena apparent to the scientists, are based in a higher claim about their value. As Bourdieu (2004) argues, the whole field of science depends on the accepted condition that scientists fairly contend for a “monopoly on the arbitration of the real”. Scientific theories are tested through contest, with a deliberate effort by all parties to prove their theory to be the greatest. These conditions of contest hold science to a more demanding standard than pragmatism, as results of applying a pragmatic attitude will depend on the local conditions of action. Scientific theories are, in principle, accountable to the real (from late Latin realis, from Latin res ‘thing’); these scientists may be called ‘realists’ in general, though there are many flavors of realism as, appropriately, theories of what is real and how to discover reality have come and gone (see post-positivism and critical realism, for example).

Realists may or may not be concerned with social and political problems. Realists may ask: What is a social problem? What do solutions to these problems look like?

By this account, these three foci and their corresponding methodological approaches are not equivalent to each other. Phenomenology concerns itself with documenting the multiplicity of appearances. Pragmatism introduces something over and above this: a sorting or evaluation of appearances based on some goals or desired outcomes. Realism introduces something over and above pragmatism: an attempt at objectivity based on the contest of different theories across a wide range of goals. ‘Disinterested’ inquiry, or equivalently inquiry that is maximally inclusive of all interests, further refines the evaluation of which appearances are valid.

If this account sounds disparaging of phenomenology as merely a part of higher and more advanced forms of inquiry, that is truly how it is intended. However, it is equally notable that to live up to its own standard of disinterestedness, realism must include phenomenology fully within itself.

Nature and technology

It would be delightful if we could live forever in a world of appearances that takes the shape that we desire of it when we reason about it critically enough. But this is not how any but the luckiest live.

Rather, the world acts on us in ways that we do not anticipate. Things appear to us unbidden; they are born, and sometimes this is called ‘nature’ (from Latin natura ‘birth, nature, quality,’ from nat- ‘born’). The first snow of Winter comes as a surprise after a long warm Autumn. We did nothing to summon it, it was always there. For thousands of years humanity has worked to master nature through pragmatic deeds and realistic science. Now, very little of nature has been untouched by human hands. The stars are still things in themselves. Our planetary world is one we have made.

“Technology” (from Greek tekhnologia ‘systematic treatment,’ from tekhnē ‘art, craft’) is what we call those things that are made by skillful human deed. A glance out the window into a city, or at the device one uses to read this blog post, is all one needs to confirm that the world is full of technology. Sitting in the interior of an apartment now, literally everything in my field of vision, except perhaps my own two hands and the potted plant, is a technological artifact.

Science and technology studies: political appearances

According to one narrative, Winner (1980) famously asked the galling question “Do artifacts have politics?” and spawned a field of study** that questions the social consequences of technology. Science and Technology Studies (STS) is, purportedly, this field.
The insight this field claims as its own is that technology has social impact that is politically interesting, that the specifics of its design determine these impacts, and that the social context of the design therefore influences the consequences of the technology. At its most ambitious, STS attempts to take the specifics of the technology out of the explanatory loop, showing instead how politics drives design and implementation to further political ends.

Anthropological methods are popular among STS scholars, who often commit themselves to revealing appearances that demonstrate the political origins and impacts of technology. The STS researcher might ask, rhetorically, “Did you know that this interactive console is designed and used for surveillance?”

We can nod sagely at these observations. Indeed, things appear to people in myriad ways, and critical analysis of those appearances does expose that there is a multiplicity of ways of looking at things. But what does one do with this picture?

The pragmatic turn back to realism

When one starts to ask the pragmatic question “What is to be done?”, one leaves the domain of mere appearances and begins to question the consequences of one’s deeds. This leads one to take actions and observe the unanticipated results. Suddenly, one is engaging in experimentation, and new kinds of knowledge are necessary. One needs to study organizational theory to understand the role of technology within a firm, economics to understand how it interacts with the economy. One quickly leaves the field of study known as “science and technology studies” as soon as one begins to consider one’s practical effects.

Worse (!), the pragmatist quickly discovers that discovering the impact of one’s deeds requires an analysis of probabilities and the difficult techniques of sampling data and correcting for bias. These techniques have been proven through the vigorous contest of the realists, and the pragmatist discovers that many tools–technologies–have been invented and provisioned for them to make it easier to use these robust strategies. The pragmatist begins to use, without understanding them, all the fruits of science. Their successes are alienated from their narrow lived experience, which is not enough to account for the miracles the world–one others have invented for them–performs for them every day.

The pragmatist must draw the following conclusions. The world is full of technology, is constituted by it. The world is also full of politics. Indeed, the world is both politics and technology; politics is a technology; technology is a form of politics. The world that must be mastered, for pragmatic purposes, is this politico-technical*** world.

What is technical about the world is that it is a world of things created through deed. These things manifest themselves in appearances in myriad and often unpredictable ways.

What is political about the world is that it is a contest of interests. To the most naive student, it may be a shock that technology is part of this contest of interests, but truly this is the most extreme naivete. What adolescent is not exposed to some form of arms race, whether it be in sports equipment, cosmetics, transportation, recreation, etc.? What adult does not encounter the reality of technology’s role in their own business or home, and the choice of what to procure and use?

The pragmatist must be struck by the sheer obviousness of the observation that artifacts “have” politics, though they must also acknowledge that “things” are different from the deeds that create them and the appearances they create. There are, after all, many mistakes in design. The effects of technology may as often be due to incompetence as they are to political intent. And to determine the difference, one must contest the designer of the technology on their own terms, in the engineering discourse that has attempted to prove which qualities of a thing survive scrutiny across all interests. The pragmatist engaging the politico-technical world has to ask: “What is real?”

The real thing

“What is real?” This is the scientific question. It has been asked again and again for thousands of years for reasons not unlike those traced in this essay. The scientific struggle is the political struggle for mastery over our own politico-technical world, over the reality that is being constantly reinvented as things through human deeds.

There are no short cuts to answering this question. There are only many ways to cop out. These steps take one backward into striving for one’s local interest or, further, into mere appearance, with its potential for indulgence and delusion. This is the darkness of ignorance. Forward, far ahead, is a horizon, an opening, a strange new light.

* This narrow view of the ‘privilege of subjectivity’ is perhaps a cause of recent confusion over free speech on college campuses. Realism, as proposed in this essay, is a possible alternative to that.

** It has been claimed that this field of study does not exist, much to the annoyance of those working within it.

*** I believe this term is no uglier than the now commonly used “sociotechnical”.


Bourdieu, Pierre. Science of science and reflexivity. Polity, 2004.

Winner, Langdon. “Do artifacts have politics?” Daedalus (1980): 121-136.

by Sebastian Benthall at December 10, 2017 05:22 PM

December 08, 2017

Ph.D. student

managerialism, continued

I’ve begun preliminary skimmings of Enteman’s Managerialism. It is a dense work of analytic philosophy, thick with argument. Sporadic summaries may not do it justice. That said, the principle of this blog is that the bar for ‘publication’ is low.

According to its introduction, Enteman’s Managerialism is written by a philosophy professor (Willard Enteman) who kept finding that the “great thinkers”–Adam Smith, Karl Marx–and the theories espoused in their writing kept getting debunked by his students. Contemporary examples showed that, contrary to conventional wisdom, the United States was not a capitalist country whose only alternative was socialism. In his observation, the United States in 1993 was neither strictly speaking capitalist, nor was it socialist. There was a theoretical gap that needed to be filled.

One of the concepts reintroduced by Enteman is Robert Dahl’s concept of polyarchy, or “rule by many”. A polyarchy is neither a dictatorship nor a democracy, but rather is a form of government where many different people with different interests, but then again probably not everybody, are in charge. It represents some necessary but probably insufficient conditions for democracy.

This view of power seems evidently correct in most political units within the United States. Now I am wondering if I should be reading Dahl instead of Enteman. It appears that Dahl was mainly offering this political theory in contrast to a view that posited that political power was mainly held by a single dominant elite. In a polyarchy, power is held by many different kinds of elites in contest with each other. At its democratic best, these elites are responsive to citizen interests in a pluralistic way, and this works out despite the inability of most people to participate in government.

I certainly recommend the Wikipedia articles linked above. I find I’m sympathetic to this view, having come around to something like it myself but through the perhaps unlikely path of Bourdieu.

This still limits the discussion of political power in terms of the powers of particular people. Managerialism, if I’m reading it right, makes the case that individual power is not atomic but is due to organizational power. This makes sense; we can look at powerful individuals having an influence on government, but a more useful lens could look to powerful companies and civil society organizations, because these shape the incentives of the powerful people within them.

I should make a shift I’ve made just now explicit. When we talk about democracy, we are often talking about a formal government, like a sovereign nation or municipal government. But when we talk about powerful organizations in society, we are no longer just talking about elected officials and their appointees. We are talking about several different classes of organizations–businesses, civil society organizations, and governments among them–interacting with each other.

It may be that that’s all there is to it. Maybe Capitalism is an ideology that argues for more power to businesses, Socialism is an ideology that argues for more power to formal government, and Democracy is an ideology that argues for more power to civil society institutions. These are zero-sum ideologies. Managerialism would be a theory that acknowledges the tussle between these sectors at the organizational level, as opposed to at the atomic individual level.

The reason why this is a relevant perspective to engage with today is that there has probably in recent years been a transfer of power (I might say ‘control’) from government to corporations–especially Big Tech (Google, Amazon, Facebook, Apple). Frank Pasquale makes the argument for this in a recent piece. He writes and speaks with a particular policy agenda that is far better researched than this blog post. But a good deal of the work is framed around the surprise that ‘governance’ might shift to a private company in the first place. This is a framing that will always be striking to those who are invested in the politics of the state; the very word “govern” is unmarkedly used for formal government and then surprising when used to refer to something else.

Managerialism, then, may be a way of pointing to an option where more power is held by non-state actors. Crucially, though, managerialism is not the same thing as neoliberalism, because neoliberalism is based on laissez-faire market ideology and contemporary information infrastructure oligopolies look nothing like laissez-faire markets! Calling the transfer of power from government to corporation today neoliberalism is quite anachronistic and misleading, really!

Perhaps managerialism, like polyarchy, is a descriptive term of a set of political conditions that does not represent an ideal, but a reality with potential to become an ideal. In that case, it’s worth investigating managerialism more carefully and determining what it is and isn’t, and why it is on the rise.

by Sebastian Benthall at December 08, 2017 01:20 AM

December 06, 2017

Ph.D. student

beginning Enteman’s Managerialism

I’ve been writing about managerialism without having done my homework.

Today I got a new book in the mail, Willard Enteman’s Managerialism: The Emergence of a New Ideology, a work of analytic political philosophy that came out in 1993. The gist of the book is that none of the dominant world ideologies of the time–capitalism, socialism, and democracy–actually describe the world as it functions.

Enter Enteman’s managerialism, which considers a society composed of organizations, not individuals, and social decisions as a consequence of the decisions of organizational managers.

It’s striking that this political theory has been around for so long, though it is perhaps more relevant today because of large digital platforms.

by Sebastian Benthall at December 06, 2017 07:49 PM

Ph.D. student

Assembling Critical Practices Reading List Posted

At the Berkeley School of Information, a group of researchers interested in the areas of critically-oriented design practices, critical social theory, and STS have hosted a reading group called “Assembling Critical Practices,” bringing together literature from these fields, in part to track their historical continuities and discontinuities, as well as to see new opportunities for design and research when putting them in conversation together.
I’ve posted our reading list from our first iterations of this group. Sections 1-3 focus on critically-oriented HCI, early critiques of AI, and an introduction to critical theory through the Frankfurt School. This list comes from an I School reading group put together in collaboration with Anne Jonas and Jenna Burrell.

Section 4 covers a broader range of social theories. This comes from a reading group sponsored by the Berkeley Social Science Matrix organized by myself and Anne Jonas with topic contributions from Nick Merrill, Noura Howell, Anna Lauren Hoffman, Paul Duguid, and Morgan Ames (Feedback and suggestions are welcome! Send an email to

See the whole reading list on this page.

by Richmond at December 06, 2017 07:29 AM

December 02, 2017

Ph.D. student

How to promote employees using machine learning without societal bias

Though it may at first read as being callous, a managerialist stance on inequality in statistical classification can help untangle some of the rhetoric around this tricky issue.

Consider the example that’s been in the news lately:

Suppose a company begins to use an algorithm to make decisions about which employees to promote. It uses a classifier trained on past data about who has been promoted. Because of societal bias, women are systematically under-promoted; this is reflected in the data set. The algorithm, naively trained on the historical data, reproduces the historical bias.
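The mechanism in this example can be sketched in a few lines. What follows is a minimal illustration, not anybody's actual promotion system: the records in `HISTORY`, the 1 to 3 performance score, and the function names are all invented for this sketch. The "classifier" is just a majority vote within each (gender, score) cell, which is enough to show how a model fit naively to biased history hands the bias back.

```python
from collections import defaultdict

# Hypothetical toy records: (gender, performance_score, was_promoted).
# Invented for illustration only. Scores are distributed identically
# across genders, but past promotion decisions were not.
HISTORY = [
    ("M", 3, True), ("M", 3, True), ("M", 2, True), ("M", 2, True), ("M", 1, False),
    ("F", 3, True), ("F", 3, True), ("F", 2, False), ("F", 2, False), ("F", 1, False),
]


def train_naive(history):
    """Learn the majority historical outcome for each (gender, score) cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [times not promoted, times promoted]
    for gender, score, promoted in history:
        counts[(gender, score)][int(promoted)] += 1
    # Predict promotion only where promotions strictly outnumber non-promotions.
    return {cell: c[1] > c[0] for cell, c in counts.items()}


def predicted_promotion_rate(model, history, gender):
    """Share of employees of one gender that the model would promote."""
    cells = [(g, s) for g, s, _ in history if g == gender]
    return sum(model[cell] for cell in cells) / len(cells)


model = train_naive(HISTORY)
rate_m = predicted_promotion_rate(model, HISTORY, "M")
rate_f = predicted_promotion_rate(model, HISTORY, "F")

# Same score, different prediction: the historical bias is reproduced.
print(model[("M", 2)], model[("F", 2)])
print(rate_m, rate_f)
```

With these toy records, a man and a woman with the same middling score get different predictions, purely because the training labels encode past decisions rather than anything about the work.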

This example describes a bad situation. It is bad from a social justice perspective; by assumption, it would be better if men and women had equal opportunity in this work place.

It is also bad from a managerialist perspective. Why? Because if the point of using an algorithm were not to correct for societal biases introducing irrelevancies into the promotion decision, then it would not make managerial sense to change business practices over to using an algorithm. The whole point of using an algorithm is to improve on human decision-making. This is a poor match of an algorithm to a problem.

Unfortunately, what makes this example compelling is precisely what makes it a bad example of using an algorithm in this context. The only variables discussed in the example are the socially salient ones thick with political implications: gender, and promotion. What are more universal concerns than gender relations and socioeconomic status?!

But from a managerialist perspective, promotions should be issued based on a number of factors not mentioned in the example. What factors are these? That’s a great and difficult question. Promotions can reward hard work and loyalty. They can also be issued to those who demonstrate capacity for leadership, which can be a function of how well they get along with other members of the organization. There may be a number of features that predict these desirable qualities, most of which will have to do with working conditions within the company as opposed to qualities inherent in the employee (such as their past education, or their gender).

If one were to start to use machine learning intelligently to solve this problem, then one would go about solving it in a way entirely unlike the procedure in the problematic example. One would rather draw on soundly sourced domain expertise to develop a model of the relationship between relevant, work-related factors. For many of the key parts of the model, such as general relationships between personality type, leadership style, and cooperation with colleagues, one would look outside the organization for gold standard data that was sampled responsibly.

Once the organization has this model, then it can apply it to its own employees. For this to work, employees would need to provide significant detail about themselves, and the company would need to provide contextual information about the conditions under which employees work, as these may be confounding factors.

Part of the merit of building and fitting such a model would be that, because it is based on a lot of new and objective scientific considerations, it would produce novel results in recommending promotions. Again, if the algorithm merely reproduced past results, it would not be worth the investment in building the model.

When the algorithm is introduced, it ideally is used in a way that maintains traditional promotion processes in parallel so that the two kinds of results can be compared. Evaluation of the algorithm’s performance, relative to traditional methods, is a long, arduous process full of potential insights. Using the algorithm as an intervention at first allows the company to develop a causal understanding of its impact. Insights from the evaluation can be factored back into the algorithm, improving the latter.
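As a rough sketch of what the first step of such a parallel run might look like, assuming the company logs both the traditional decision and the algorithm's recommendation for each candidate: the records, field layout, and function names below are hypothetical. This is only a descriptive comparison; it is not the causal evaluation itself, which would need a designed intervention.

```python
# Hypothetical parallel-run log, invented for illustration:
# (employee_id, gender, traditional_decision, algorithm_recommendation)
PARALLEL_RUN = [
    ("e1", "M", True, True),
    ("e2", "M", True, True),
    ("e3", "M", False, False),
    ("e4", "F", False, True),
    ("e5", "F", True, True),
    ("e6", "F", False, False),
]

TRADITIONAL, ALGORITHM = 2, 3  # column positions in each record


def agreement_rate(records):
    """Fraction of candidates on whom the two processes agree."""
    return sum(rec[TRADITIONAL] == rec[ALGORITHM] for rec in records) / len(records)


def rate_by_group(records, column):
    """Promotion rate per gender under one of the two processes."""
    totals, promoted = {}, {}
    for rec in records:
        group = rec[1]
        totals[group] = totals.get(group, 0) + 1
        promoted[group] = promoted.get(group, 0) + int(rec[column])
    return {group: promoted[group] / totals[group] for group in totals}


def disagreements(records):
    """Employee ids where the processes differ; these are the cases to review by hand."""
    return [rec[0] for rec in records if rec[TRADITIONAL] != rec[ALGORITHM]]


agreement = agreement_rate(PARALLEL_RUN)
traditional_rates = rate_by_group(PARALLEL_RUN, TRADITIONAL)
algorithm_rates = rate_by_group(PARALLEL_RUN, ALGORITHM)
to_review = disagreements(PARALLEL_RUN)
```

The disagreements are where the interesting work is. Each one is a case where either the traditional process or the model is missing something, and looking into which is what feeds back into the next version of the model.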

In all these cases, the company must keep its business goals firmly in mind. If they do this, then the rest of the logic of their method falls out of data science best practices, which are grounded in mathematical principles of statistics. While the political implications of poorly managed machine learning are troubling, effective management of machine learning which takes the precautions necessary to develop objectivity is ultimately a corrective to social bias. This is a case where sound science and managerialist motives and social justice are aligned.

by Sebastian Benthall at December 02, 2017 03:40 PM

Enlightening economics reads

Nils Gilman argues that the future of the world is wide open because neoliberalism has been discredited. So what’s the future going to look like?

Given that neoliberalism is for the most part an economic vision, and that competing theories have often also been economic visions (when they have not been political or theological theories), a compelling futurist approach is to look out for new thinking about economics. The three articles below have recently taught me something new about economics:

Dani Rodrik. “Rescuing Economics from Neoliberalism”, Boston Review. (link)

This article makes the case that the association frequently made between economics as a social science and neoliberalism as an ideology is overdrawn. Of course, probably the majority of economists are not neoliberals. Rodrik is defending a view of economics that keeps its options open. I think he overstates the point with the claim, “Good economists know that the correct answer to any question in economics is: it depends.” This is just simply incorrect, if questions have their assumptions bracketed well enough. But since Rodrik’s rhetorical point appears to be that economists should not be dogmatists, he can be forgiven this overstatement.

As an aside, there is something compelling but also dangerous to the view that a social science can provide at best narrowly tailored insights into specific phenomena. These kinds of ‘sciences’ wind up being unaccountable, because the specificity of particular events prevents the repeated testing of the theories that are used to explain them. There is a risk of too much nuance, which is akin to the statistical concept of overfitting.

A different kind of article is:

Seth Ackerman. “The Disruptors” Jacobin. (link)

An interview with J.W. Mason in the smart socialist magazine, Jacobin, that had the honor of a shout out from Matt Levine’s popular “Money Stuff” Bloomberg column (column?). One of the interesting topics it raises is whether or not mutual funds, in which many people invest in a fund that then owns a wide portfolio of stocks, are in a sense socialist and anti-competitive because shareholders no longer have an interest in seeing competition in the market.

This is original thinking, and the endorsement by Levine is an indication that it’s not a crazy thing to consider even for the seasoned practical economists in the financial sector. My hunch at this point in life is that if you want to understand the economy, you have to understand finance, because the people in finance are the ones whose job it is to profit from their understanding of the economy. As a corollary, I don’t really understand the economy because I don’t have a great grasp of the financial sector. Maybe one day that will change.

Speaking of expertise being enhanced by having ‘skin in the game’, the third article is:

Nassim Nicholas Taleb. “Inequality and Skin in the Game,” Medium. (link)

I haven’t read a lot of Taleb though I acknowledge he’s a noteworthy and important thinker. This article confirmed for me the reputation of his style. It was also a strikingly fresh look at economics of inequality, capturing a few of the important things mainstream opinion overlooks about inequality, namely:

  • Comparing people at different life stages is a mistake when analyzing inequality of a population.
  • A lot of the cause of inequality is randomness (as opposed to fixed population categories), and this inequality is inevitable.

He’s got a theory of what kinds of inequality people resent versus what they tolerate, which is a fine theory. It would be nice to see some empirical validation of it. He writes about the relationship between ergodicity and inequality, which is interesting. He is scornful of Piketty and everyone who was impressed by Piketty’s argument, which comes off as unfriendly.

Much of what Taleb writes about the need to understand the economy through a richer understanding of probability and statistics strikes me as correct. If it is indeed the case that mainstream economics has not caught up to this, there is an opportunity here!

by Sebastian Benthall at December 02, 2017 02:28 AM

November 28, 2017

Ph.D. student

mathematical discourse vs. exit; blockchain applications

Continuing my effort to tie together the work on this blog into a single theory, I should address the theme of an old post that I’d forgotten about.

The post discusses the discourse theory of law, attributed to the later, matured Habermas. According to it, the law serves as a transmission belt between legitimate norms established by civil society and a system of power, money, and technology. When it is efficacious and legitimate, society prospers in its legitimacy. The blog post toys with the idea of normatively aligned algorithm law established in a similar way: through the norms established by civil society.

I wrote about this in 2014 and I’m surprised to find myself revisiting these themes in my work today on privacy by design.

What this requires, however, is that civil society must be able to engage in mathematical discourse, or mathematized discussion of norms. In other words, there has to be an intersection of civil society and science for this to make sense. I’m reminded of how inspired I’ve felt by Nick Doty’s work on multistakeholderism in Internet standards as a model.

I am more skeptical of this model than I have been before, if only because in the short term I’m unsure if a critical mass of scientific talent can engage with civil society well enough to change the law. This is because scientific talent is a form of capital which has no clear incentive for self-regulation. Relatedly, I’m no longer as confident that civil society carries enough clout to change policy. I must consider other options.

The other option, besides voicing one’s concerns in civil society, is, of course, exit, in Hirschman’s sense. Theoretically an autonomous algorithmic law could be designed such that it encourages exit from other systems into itself. Or, more ecologically, competing autonomous (or decentralized, …) systems can be regulated by an exit mechanism. This is in fact what happens now with blockchain technology and cryptocurrency. Whenever there is a major failure of one of these currencies, there is a fork.

by Sebastian Benthall at November 28, 2017 04:25 PM

November 27, 2017

Ph.D. student

Re: Tear down the new institutions

Hiya Ben,

And with enough social insight, you can build community standards into decentralized software.

Yes! I might add, though, that community standards don't need to be enacted entirely in the source code, although code could certainly help. I was in New York earlier this month talking with Cornell Tech folks (for example, Helen Nissenbaum, a philosopher) about exactly this thing: there are "handoffs" between human and technical mechanisms to support values in sociotechnical systems.

What makes federated social networking like Mastodon most of interest to me is that different smaller communities can interoperate while also maintaining their own community standards. Rather than every user having to maintain massive blocklists or trying alone to encourage better behavior in their social network, we can support admins and moderators, self-organize into the communities we prefer and have some investment in, and still basically talk with everyone we want to.

As I understand it, one place to have this design conversation is the Social Web Incubator Community Group (SocialCG), which you can find on W3C IRC (#social) and Github (but no mailing list!), and we talked about harassment challenges at a small face-to-face Social Web meeting at TPAC a few weeks back. Or I'm; there is a special value (in a Kelty recursive publics kind of way) in using a communication system to discuss its subsequent design decisions. I think, as you note, that working on mitigations for harassment and abuse (whether it's dogpiling or fake news distribution) in the fediverse is an urgent and important need.

In a way, then, I guess I'm looking to the creation of new institutions, rather than their dismantling. Or, as cwebber put it:

I'm not very interested in how to tear systems down nearly as much as what structure to replace them with (and how you realistically think we'll get there)

While I agree that the outsize power of large social networking platforms can be harmful even as it seemed to disrupt old gatekeepers, I do want to create new institutions, institutions that reflect our values and involve widespread participation from often underserved groups. The utopia that "everything would be free" doesn't really work for autonomy, free expression and democracy; rather, we need to build the system we really want. We need institutions both in the sense of valued patterns of behavior and in the sense of community organizations.

If you're interested in helping or have suggestions of people that are, do let me know.

by at November 27, 2017 11:55 PM

November 26, 2017

MIMS 2012

My Talk at Lean Kanban Central Europe 2017

On a chilly fall day a few weeks back, I gave a talk at the cozy Lean Kanban Central Europe in Hamburg, Germany. I was honored to be invited to give a reprise of the talk I gave with Keith earlier this year at Lean Kanban North America.

I spoke about Optimizely’s software development process, and how we’ve used ideas from Lean Kanban and ESP (Enterprise Service Planning) to help us ship faster, with higher quality, to better meet customer needs. Overall it went well, but I had too much content and rushed at the end. If I do this talk again, I would cut some slides and make the presentation more focused and concise. Watch the talk below.

Jeff Zych - From 20/20 Hindsight to ESP at Optimizely @ LKCE17 from Lean Kanban Central Europe on Vimeo.


One of the cool things this conference does is give the audience green, yellow, and red index cards they can use to give feedback to the speakers. Green indicates you liked the talk, red means you didn’t like it, and yellow is neutral.

I got just one red card, with the comment, “topic title not accurate (this is not ESP?!).” In retrospect, I realized this person is correct — my talk really doesn’t talk about ESP much. I touch on it, but that was what Keith covered. Since he dropped out, I mostly cut those sections of the presentation since I can’t speak as confidently about them. If I did this talk solo again, I would probably change the title. So thank you, anonymous commenter 🙏

I also got two positive comments on green cards:

Thanks for sharing. Some useful insights + good to see it used in industry. - Thanks.


Thank you! Great examples, (maybe less slides next time?) but this was inspiring

I also got some good tweets, like this and this.

by Jeff Zych at November 26, 2017 10:57 PM

Ph.D. student


Sometimes traffic on this blog draws attention to an old post from years ago. This can be a reminder that I’ve been repeating myself, encountering the same themes over and over again. This is not necessarily a bad thing, because I hope to one day compile the ideas from this blog into a book. It’s nice to see what points keep resurfacing.

One of these points is that liberalism assumes equality, but this is challenged by society’s need for control structures, which creates inequality, which then undermines liberalism. This post calls in Charles Taylor (writing about Hegel!) to make the point. This post makes the point more succinctly. I’ve been drawing on Beniger for the ‘society needs control to manage its own integration’ thesis. I’ve pointed to the term managerialism as referring to an alternative to liberalism based on the acknowledgement of this need for control structures. Managerialism looks a lot like liberalism, it turns out, but it justifies things on different grounds and does not get so confused. As an alternative, more Bourdieusian view of the problem, I consider the relationship between capital, democracy, and oligarchy here. There are some useful names for what happens when managerialism goes wrong and people seem disconnected from each other–anomie–or from the control structures–alienation.

A related point I’ve made repeatedly is the tension between procedural legitimacy and getting people the substantive results that they want. That post about Hegel goes into this. But it comes up again in very recent work on antidiscrimination law and machine learning. What this amounts to is that attempts to come up with a fair, legitimate procedure are going to divide up the “pie” of resources, or be perceived to divide up the pie of resources, somehow, and people are going to be upset about it, however the pie is sliced.

A related theme that comes up frequently is mathematics. My contention is that effective control is a technical accomplishment that is mathematically optimized and constrained. There are mathematical results that reveal necessary trade-offs between values. Data science has been misunderstood as positivism when in fact it is a means of power. Technical knowledge and technology are forms of capital (Bourdieu again). Perhaps precisely because it is a rare form of capital, science is politically distrusted.

To put it succinctly: lack of mathematics education, due to lack of opportunity or mathophobia, leads to alienation and anomie in an economy of control. This is partly reflected in the chaotic disciplinarity of the social sciences, especially as they react to computational social science, at the intersection of social sciences, statistics, and computer science.

Lest this all seem like an argument for the mathematical certitude of totalitarianism, I have elsewhere considered and rejected this possibility of ‘instrumentality run amok’. I’ve summarized these arguments here, though this appears to have left a number of people unconvinced. I’ve argued this further, and think there’s more to this story (a formalization of Scott’s arguments from Seeing Like a State, perhaps), but I must admit I don’t have a convincing solution to the “control problem” yet. However, it must be noted that the answer to the control problem is an empirical or scientific prediction, not a political inclination. Whether or not it is the most interesting or important question regarding technological control has been debated to a stalemate, as far as I can tell.

As I don’t believe singleton control is a likely or interesting scenario, I’m more interested in practical ways of offering legitimacy or resistance to control structures. I used to think the “right” political solution was a kind of “hacker class consciousness”; I don’t believe this any more. However, I still think there’s a lot to the idea of recursive publics as actually existing alternative power structures. Platform coops are interesting for the same reason.

All this leads me to admit my interest in the disruptive technology du jour, the blockchain.

by Sebastian Benthall at November 26, 2017 05:44 AM

November 24, 2017

Ph.D. student

Values in design and mathematical impossibility

Under pressure from the public and no doubt with sincere interest in the topic, computer scientists have taken up the difficult task of translating commonly held values into the mathematical forms that can be used for technical design. Commonly, what these researchers discover is some form of mathematical impossibility of achieving a number of desirable goals at the same time. This work has demonstrated the impossibility of having a classifier that is fair with respect to a social category without data about that very category (Dwork et al., 2012), having a fair classifier that is both statistically well calibrated for the prediction of properties of persons and equalizing the false positive and false negative rates of partitions of that population (Kleinberg et al., 2016), of preserving privacy of individuals after an arbitrary number of queries to a database, however obscured (Dwork, 2008), or of a coherent notion of proxy variable use in privacy and fairness applications that is based on program semantics (as opposed to syntax) (Datta et al., 2017).

These are important results. An important thing about them is that they transcend the narrow discipline in which they originated. As mathematical theorems, they will be true whether or not they are implemented on machines or in human behavior. Therefore, these theorems have a role comparable to other core mathematical theorems in social science, such as Arrow’s Impossibility Theorem (Arrow, 1950), a theorem about the impossibility of having a voting system with reasonable desiderata for determining social welfare.

There can be no question of the significance of this kind of work. It was significant a hundred years ago. It is perhaps of even more immediate, practical importance when so much public infrastructure is computational. For what computation is is automation of mathematics, full stop.

There are some scholars, even some ethicists, for whom this is an unwelcome idea. I have been recently told by one ethics professor that to try to mathematize core concepts in ethics is to commit a “category mistake”. This is refuted by the clearly productive attempts to do this, some of which I’ve cited above. This belief that scientists and mathematicians are on a different plane than ethicists is quite old: Hannah Arendt argued that scientists should not be trusted because their mathematical language prevented them from engaging in normal political and ethical discourse (Arendt, 1959). But once again, this recent literature (as well as much older literature in such fields as theoretical economics) demonstrates that this view is incorrect.

There are many possible explanations for the persistence of the view that mathematics and the hard sciences do not concern themselves with ethics, are somehow lacking in ethical education, or that engineers require non-technical people to tell them how to engineer things more ethically.

One reason is that the sciences are much broader in scope than the ethical results mentioned here. It is indeed possible to get a specialist’s education in a technical field without much ethical training, even in the mathematical ethics results mentioned above.

Another reason is that whereas understanding the mathematical tradeoffs inherent in certain kinds of design is an important part of ethics, it can be argued by others that what’s most important about ethics is some substantive commitment that cannot be mathematically defended. For example, suppose half the population believes that it is most ethical for members of the other half to treat them with special dignity and consideration, at the expense of the other half. It may be difficult to arrive at this conclusion from mathematics alone, but this group may advocate for special treatment out of ethical consideration nonetheless.

These two reasons are similar. The first states that mathematics includes many things that are not ethics. The second states that ethics potentially (and certainly in the minds of some people) includes much that is not mathematical.

I want to bring up a third reason, which is perhaps more profound than the other two, which is this: what distinguishes mathematics as a field is its commitment to logical non-contradiction, which means that it is able to baldly claim when goals are impossible to achieve. Acknowledging tradeoffs is part of what mathematicians and scientists do.

Acknowledging tradeoffs is not something that everybody else is trained to do, and indeed many philosophers are apparently motivated by the ability to surpass limitations. Alain Badiou, who is one of the living philosophers that I find to be most inspiring and correct, maintains that mathematics is the science of pure Being, of all possibilities. Reality is just a subset of these possibilities, and much of Badiou’s philosophy is dedicated to the Event, those points where the logical constraints of our current worldview are defeated and new possibilities open up.

This is inspirational work, but it contradicts what many mathematicians do in fact, which is identify impossibility. Science forecloses possibilities where a poet may see infinite potential.

Other ethicists, especially existentialist ethicists, see the limitation and expansion of possibility, especially in the possibility of personal accomplishment, as fundamental to ethics. This work is inspiring precisely because it states so clearly what it is we hope for and aspire to.

What mathematical ethics often tells us is that these hopes are fruitless. The desiderata cannot be met. Somebody will always get the short stick. Engineers, unable to triumph against mathematics, will always disappoint somebody, and whoever that somebody is can always argue that the engineers have neglected ethics, and demand justice.

There may be good reasons for making everybody believe that they are qualified to comment on the subject of ethics. Indeed, in a sense everybody is required to act ethically even when they are not ethicists. But the preceding argument suggests that perhaps mathematical education is an essential part of ethical education, because without it one can have unrealistic expectations of the ethics of others. This is a scary thought because mathematics education is so often so poor. We live today, as we have lived before, in a culture with great mathophobia (Papert, 1980) and this mathophobia is perpetuated by those who try to equate mathematical training with immorality.


Arendt, Hannah. The human condition: a study of the central dilemmas facing modern man. Doubleday, 1959.

Arrow, Kenneth J. “A difficulty in the concept of social welfare.” Journal of political economy 58.4 (1950): 328-346.

Benthall, Sebastian. “Philosophy of computational social science.” Cosmos and History: The Journal of Natural and Social Philosophy 12.2 (2016): 13-30.

Datta, Anupam, et al. “Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs.” arXiv preprint arXiv:1705.07807 (2017).

Dwork, Cynthia. “Differential privacy: A survey of results.” International Conference on Theory and Applications of Models of Computation. Springer, Berlin, Heidelberg, 2008.

Dwork, Cynthia, et al. “Fairness through awareness.” Proceedings of the 3rd Innovations in Theoretical Computer Science Conference. ACM, 2012.

Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. “Inherent trade-offs in the fair determination of risk scores.” arXiv preprint arXiv:1609.05807 (2016).

Papert, Seymour. Mindstorms: Children, computers, and powerful ideas. Basic Books, Inc., 1980.

by Sebastian Benthall at November 24, 2017 11:09 PM

November 22, 2017

Ph.D. student

Pondering “use privacy”

I’ve been working carefully with Datta et al.’s “Use Privacy” work (link), which makes a clear case for how a programmatic, data-driven model may be statically analyzed for its use of a proxy of a protected variable, and repaired.

Their system has a number of interesting characteristics, among which are:

  1. The use of a normative oracle for determining which proxy uses are prohibited.
  2. A proof that there is no coherent definition of proxy use which has all of a set of very reasonable properties defined over function semantics.

Given (2), they continue with a compelling study of how a syntactic definition of proxy use, one based on the explicit contents of a function, can support a system of detecting and repairing proxies.

My question is to what extent the sources of normative restriction on proxies (those characterized by the oracle in (1)) are likely to favor syntactic proxy use restrictions, as opposed to semantic ones. Since ethicists and lawyers, who are the purported sources of these normative restrictions, are likely to consider any technical system a black box for the purpose of their evaluation, they will naturally be concerned with program semantics. It may be comforting for those responsible for a technical program to be able to, in a sense, avoid liability by assuring that their programs are not using a restricted proxy. But, truly, so what? Since these syntactic considerations do not make any semantic guarantees, will they really plausibly address normative concerns?
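The gap between a syntactic and a semantic notion of "use" can be made concrete with a toy sketch. This is not Datta et al.'s formal definition, only an illustration under assumptions: the variable names (`income`, `zip_code`), the tiny input domain, and the helper functions `mentions` and `depends_on` are all hypothetical. The two expressions below compute the same function, yet only one of them references the proxy in its text.

```python
import ast
from itertools import product

def mentions(expr, name):
    """Syntactic check: does the expression's text reference `name`?"""
    tree = ast.parse(expr, mode="eval")
    return any(isinstance(node, ast.Name) and node.id == name
               for node in ast.walk(tree))

def depends_on(expr, name, domain):
    """Semantic check: can changing `name` alone ever change the output?"""
    others = [k for k in domain if k != name]
    for values in product(*(domain[k] for k in others)):
        env = dict(zip(others, values))
        # Hold every other input fixed and vary only `name`.
        outputs = {eval(expr, {}, {**env, name: v}) for v in domain[name]}
        if len(outputs) > 1:
            return True
    return False

# Hypothetical inputs: a small, exhaustively checkable domain.
DOMAIN = {"income": [30, 60], "zip_code": ["A", "B"]}

plain = "income > 50"
padded = "(income > 50 and zip_code == 'A') or (income > 50 and zip_code != 'A')"

for expr in (plain, padded):
    print(mentions(expr, "zip_code"), depends_on(expr, "zip_code", DOMAIN))
```

A purely syntactic audit flags `padded` and clears `plain`, while an outside observer treating the program as a black box could not tell them apart.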

A striking result from their analysis which has perhaps broader implications is the incoherence of a semantic notion of proxy use. Perhaps sadly but also substantively, this result shows that a certain plausible norm is impossible for a system to fulfill in general. Only restricted conditions make such a thing possible. This seems to be part of a pattern in these rigorous computer science evaluations of ethical problems; see also Kleinberg et al. (2016) on how it’s impossible to meet several plausible definitions of “fairness” in the risk-assessment scores across social groups except under certain conditions.

The conclusion for me is that what this nobly motivated computer science work reveals is that what people are actually interested in normatively is not the functioning of any particular computational system. They are rather interested in social conditions more broadly, which are rarely aligned with our normative ideals. Computational systems, by making realities harshly concrete, are disappointing, but it’s a mistake to make that a disappointment with the computing systems themselves. Rather, there are mathematical facts that are disappointing regardless of what sorts of systems mediate our social world.

This is not merely a philosophical consideration or sociological observation. Since the interpretation of laws is part of the process of informing normative expectations (as in a normative oracle), it is an interesting and perhaps open question how lawyers and judges, in their task of legal interpretation, make use of the mathematical conclusions about normative tradeoffs being offered up by computer scientists.


Datta, Anupam, et al. “Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs.” arXiv preprint arXiv:1705.07807 (2017).

Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. “Inherent trade-offs in the fair determination of risk scores.” arXiv preprint arXiv:1609.05807 (2016).

by Sebastian Benthall at November 22, 2017 06:02 PM

Ph.D. student

Interrogating Biosensing Privacy Futures with Design Fiction (video)


I presented this talk in November 2017, at the Berkeley I School PhD Research Reception. The talk discusses findings from 2 of our papers:

Richmond Y. Wong, Ellen Van Wyk and James Pierce. (2017). Real-Fictional Entanglements: Using Science Fiction and Design Fiction to Interrogate Sensing Technologies. In Proceedings of the ACM Conference on Designing Interactive Systems (DIS ’17).

Richmond Y. Wong, Deirdre K. Mulligan, Ellen Van Wyk, James Pierce and John Chuang. (2017). Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks. Proceedings of the ACM Human Computer Interaction (CSCW 2018 Online First). 1, 2, Article 111 (November 2017), 27 pages.

More about this project and some of the designs can be found here:

by Richmond at November 22, 2017 05:25 PM

November 19, 2017

Ph.D. student

On achieving social equality

When evaluating a system, we have a choice of evaluating its internal functions–the inside view–or evaluating its effects situated in a larger context–the outside view.

Decision procedures (whether they are embodied by people or performed in concert with mechanical devices–I don’t think this distinction matters here) for sorting people are just such a system. If I understand correctly, the question of which principles animate antidiscrimination law hinges on this difference between the inside and outside view.

We can look at a decision-making process and evaluate whether as a procedure it achieves its goals of e.g. assigning credit scores without bias against certain groups. Even including processes of the gathering of evidence or data in such a system, it can in principle be bounded and evaluated by its ability to perform its goals. We do seem to care about the difference between procedural discrimination and procedural nondiscrimination. For example, an overtly racist policy that ignores true talent and opportunity seems worse than a bureaucratic system that is indifferent to external inequality between groups that then gets reflected in decisions made according to other factors that are merely correlated with race.

The latter case has been criticized in the outside view. The criticism is captured by the phrasing that “algorithms can reproduce existing biases”. The supposedly neutral algorithm (which can, again, be either human or machine) is not neutral in its impact because its considerations of e.g. business interest are indifferent to the conditions outside it. The business is attracted to wealth and opportunity, which are held disproportionately by some part of the population, so the business is attracted to that population.

There is great wisdom in recognizing that institutions that are neutral in their inside view will often reproduce bias in the outside view. But it is incorrect to therefore conflate neutrality in the inside view with a biased inside view, even though their effects may be under some circumstances the same. When I say it is “incorrect”, I mean that they are in fact different because, for example, if the external conditions of a procedurally neutral institution change, then it will reflect those new conditions. A procedurally biased institution will not reflect those new conditions in the same way.
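The difference between the two kinds of institution can be sketched in a few lines. Everything here is a hypothetical toy: the group labels, the incomes, the income cutoff of 50, and the rule names are assumptions chosen only to show that two rules with identical effects under one set of conditions can diverge when the conditions change.

```python
def neutral_rule(applicant):
    """Procedurally neutral: looks only at income, never at group."""
    return applicant["income"] >= 50

def biased_rule(applicant):
    """Procedurally biased: rejects group "b" regardless of income."""
    return applicant["group"] == "a" and applicant["income"] >= 50

def approval_rate(rule, population, group):
    """Share of a group's members that the rule approves."""
    members = [p for p in population if p["group"] == group]
    return sum(rule(p) for p in members) / len(members)

# Initial conditions: group "b" is poorer, so both rules reject it.
before = [{"group": "a", "income": 60}, {"group": "a", "income": 70},
          {"group": "b", "income": 30}, {"group": "b", "income": 40}]

# External conditions change: group "b" incomes rise.
after = [{"group": "a", "income": 60}, {"group": "a", "income": 70},
         {"group": "b", "income": 60}, {"group": "b", "income": 70}]

for label, population in (("before", before), ("after", after)):
    print(label,
          approval_rate(neutral_rule, population, "b"),
          approval_rate(biased_rule, population, "b"))
```

In the "before" population the two rules are indistinguishable from the outside; only the neutral rule tracks the change in the "after" population.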

Empirically it is very hard to tell when an institution is being procedurally neutral and indeed this is the crux of an enormous amount of political tension today. The first line of defense of an institution accused of bias is to claim that its procedural neutrality is merely reflecting environmental conditions outside of its control. This is unconvincing for many politically active people. It seems to me that it is now much more common for institutions to avoid this problem by explicitly declaring their bias. Rather than try to accomplish the seemingly impossible task of defending their rigorous neutrality, it’s easier to declare where one stands on the issue of resource allocation globally and adjust one’s procedure accordingly.

I don’t think this is a good thing.

One consequence of evaluating all institutions based on their global, “systemic” impact as opposed to their procedural neutrality is that it hollows out the political center. The evidence is in that politics has become more and more polarized. This is inevitable if politics becomes so explicitly about maintaining or reallocating resources as opposed to about building neutrally legitimate institutions. When one party in Congress considers a tax bill which seems designed mainly to enrich one’s own constituencies at the expense of the other’s, things have gotten out of hand. The idea of a unified ‘good government’ has been all but abandoned.

An alternative is a commitment to procedural neutrality in the inside view of institutions, or at least some institutions. The fact that there are many different institutions that may have different policies is indeed quite relevant here. For while it is commonplace to say that a neutral institution will “reproduce existing biases”, “reproduction” is not a particularly helpful word here. Neither is “bias”. What we can say more precisely is that the operations of a procedurally neutral institution will not change the distribution of resources even though they are unequal.

But if we do not hold all institutions accountable for correcting the inequality of society, isn’t that the same thing as approving of the status quo, which is so unequal? A thousand times no.

First, there’s the problem that many institutions are not, currently, procedurally neutral. Procedural neutrality is a higher standard than what many institutions are currently held to. Consider what is widely known about human beings and their implicit biases. One good argument for transferring decision-making authority to machine learning algorithms, even standard ones not augmented for ‘fairness’, is that they will not have the same implicit, inside, biases as the humans that currently make these decisions.

Second, there’s the fact that responsibility for correcting social inequality can be taken on by some institutions that are dedicated to this task while others are procedurally neutral. For example, one can consistently believe in the importance of a progressive social safety net combined with procedurally neutral credit reporting. Society is complex and perhaps rightly has many different functioning parts; not all the parts have to reflect socially progressive values for the arc of history to bend towards justice.

Third, there is reason to believe that even if all institutions were procedurally neutral, there would eventually be social equality. This has to do with the mathematically bulletproof but often ignored phenomenon of regression towards the mean. When values are sampled from a process at random, their average will approach the mean of the distribution as more values are accumulated. In terms of the allocation of resources in a population, there is some random variation in the way resources flow. When institutions are fair, inequality in resource allocation will settle into an unbiased distribution. While there may continue to be some apparent inequality due to disorganized heavy tail effects, these will not be biased, in a political sense.
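The statistical claim in the paragraph above, that the average of random draws approaches the mean of the distribution as more draws accumulate, is the behavior usually formalized as the law of large numbers. A minimal sketch follows, assuming draws that are uniform on [0, 1) with a fixed seed; both are illustrative choices and not a model of how resources actually flow through institutions.

```python
import random

def running_mean_error(n_draws, true_mean=0.5, seed=0):
    """Absolute gap between the sample average and the true mean.

    Draws are uniform on [0, 1); the distribution and the seed are
    illustrative assumptions, not part of the argument above.
    """
    rng = random.Random(seed)
    total = sum(rng.random() for _ in range(n_draws))
    return abs(total / n_draws - true_mean)

# The gap tends to shrink as the number of draws grows.
for n in (10, 1_000, 100_000):
    print(n, running_mean_error(n))
```

The sketch only speaks to averages of independent draws from one fixed distribution; whether real allocation processes satisfy those conditions is a separate, empirical question.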

Fourth, there is the problem of political backlash. Whenever political institutions are weak enough to be modified towards what is purported to be a ‘substantive’ or outside view neutrality, that will always be because some political coalition has attained enough power to swing the pendulum in their favor. The more explicit they are about doing this, the more it will mobilize the enemies of this coalition to try to swing the pendulum back the other way. The result is war by other means, the outcome of which will never be fair, because in war there are many who wind up dead or injured.

I am arguing for a centrist position on these matters, one that favors procedural neutrality in most institutions. This is not because I don’t care about substantive, “outside view” inequality. On the contrary, it’s because I believe that partisan bickering that explicitly undermines the inside neutrality of institutions undermines substantive equality. Partisan bickering over the scraps within narrow institutional frames is a distraction from, for example, the way the most wealthy avoid taxes while the middle class pays even more. There is a reason why political propaganda that induces partisan divisions is a weapon. Agreement about procedural neutrality is a core part of civic unity that allows for collective action against the very most abusively powerful.


Zachary C. Lipton, Alexandra Chouldechova, Julian McAuley. “Does mitigating ML’s disparate impact require disparate treatment?” 2017

by Sebastian Benthall at November 19, 2017 06:03 PM

November 18, 2017

Ph.D. student

what to do about the blog

Initially, I thought, I needed to get the blog to load over HTTPS. Previously I had been using TLS transit part of the way using Cloudflare, but I've moved away from that: I'd rather not have the additional service, it was only a partial solution, and I'm tired of seeing Certificate Transparency alerts from Facebook when Cloudflare creates a new cert every week for my domain name and a thousand others. But now I've heard that Google has announced good HTTPS support for custom domain names when using Google App Engine, and so I should be good to go. HTTPS is important, and I should fix that before I post more on this blog.

I was plagued for weeks trying to use Google's new developer console, reading through various documentation that was out of date, confronted by the vaguest possible error messages. Eventually, I discover that there's just a bug for most or all long-time App Engine users who created custom domains on applications years ago using a different system; the issue is acknowledged; no timeline for a fix; no documentation; no workaround.* Just a penalty for being a particularly long-time customer. Meanwhile, Google is charging me for server time on the blog that sees no usage, for some other reason I haven't been able to nail down.

I start to investigate other blogging software: is Ghost the preferred customizable blogging platform these days? What about static-site generation, from Jekyll, or Hugo? Can I find something written in a language where I could comfortably customize it (JavaScript, Python) and still have a well-supported and simple infrastructure for creating static pages that I can easily host on my existing simple infrastructure? I go through enough of the process to actually set up a sample Ghost installation on WebFaction, before realizing (and I really credit the candor of their documentation here) that this is way too heavyweight for what I'm trying to do.

Ah, I fell into that classic trap! This isn't blogging. This isn't even working on building a new and better blogging infrastructure or social media system. This isn't writing prose, this isn't writing code. This is meta-crap, this is clicking around, comparing feature lists, being annoyed about technology. So, to answer the original small question to myself "what to do about the blog", how about, for now, "just fucking post on whatever infrastructure you've got".


* I see that at least one of the bugs has some updates now, and maybe using a different (command-line) tool I could unblock myself with that particular sub-issue.
Maybe. Or maybe I would hit their next undocumented error message and get stuck again, having invested several more hours in it. And it does actually seem important to move away from this infrastructure; I'm not really sure to what extent Google is supporting it, but I do know that when I run into completely blocking issues that there is no way for me to contact Google's support team or get updates on issues (beyond, search various support forums for hours to reverse-engineer your problem, see if there's an open bug on their issue tracker, click Star), and that in the meantime they are charging me what I consider a significant amount of money.

by at November 18, 2017 09:43 PM

November 15, 2017

Ph.D. student

Notes on fairness and nondiscrimination in machine learning

There has been a lot of work done lately on “fairness in machine learning” and related topics. It cannot be a coincidence that this work has paralleled a rise in political intolerance that is sensitized to issues of gender, race, citizenship, and so on. I more or less stand by my initial reaction to this line of work. But very recently I’ve done a deeper and more responsible dive into this literature and it’s proven to be insightful beyond the narrow problems which it purports to solve. These are some notes on the subject, ordered so as to get to the point.

The subject of whether and to what extent computer systems can enact morally objectionable bias goes back at least as far as Friedman and Nissenbaum’s 1996 article, in which they define “bias” as systematic unfairness. They mean this very generally, not specifically in a political sense (though inclusive of it). Twenty years later, Kleinberg et al. (2016) prove that there are multiple, competing notions of fairness in machine classification which generally cannot be satisfied all at once; they must be traded off against each other. In particular, a classifier that uses all available information to optimize accuracy–one that achieves what these authors call calibration–cannot also have equal false positive and false negative rates across population groups (read: race, sex), a property that Hardt et al. (2016) call “equalized odds”. This is no doubt inspired by a now very famous ProPublica article asserting that a particular kind of commercial recidivism prediction software was “biased against blacks” because it had a higher false positive rate for black defendants than white defendants. Because bail and parole rates are set according to predicted recidivism, this led to cases where a non-recidivist was denied bail because they were black, which sounds unfair to a lot of people, including myself.
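The tension described above can be seen in a toy calculation. This is not the proof in Kleinberg et al., only a numeric illustration under assumptions: two hypothetical groups, a perfectly calibrated score that takes just two values, and a decision threshold of 1/2. Because the groups have different base rates, the same calibrated score yields different false positive and false negative rates.

```python
from fractions import Fraction as F

def error_rates(score_dist, threshold=F(1, 2)):
    """Exact (FPR, FNR) for a perfectly calibrated score.

    score_dist maps a score s to the fraction of the group holding it;
    calibration means a fraction s of those people are true positives.
    Anyone with a score at or above the threshold is predicted positive.
    """
    positives = sum(w * s for s, w in score_dist.items())
    negatives = sum(w * (1 - s) for s, w in score_dist.items())
    false_pos = sum(w * (1 - s) for s, w in score_dist.items() if s >= threshold)
    false_neg = sum(w * s for s, w in score_dist.items() if s < threshold)
    return false_pos / negatives, false_neg / positives

# Hypothetical groups: same two score values, different mixes of them.
group_a = {F(1, 5): F(1, 2), F(4, 5): F(1, 2)}  # base rate 1/2
group_b = {F(1, 5): F(4, 5), F(4, 5): F(1, 5)}  # base rate 8/25

print(error_rates(group_a))  # FPR 1/5, FNR 1/5
print(error_rates(group_b))  # FPR 1/17, FNR 1/2
```

Equalizing the error rates here would require giving up calibration, changing thresholds by group, or having equal base rates to begin with.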

While I understand that there is a lot of high quality and well-intentioned research on this subject, I haven’t found anybody who could tell me why the solution to this problem was to stop using predicted recidivism to set bail, as opposed to futzing around with a recidivism prediction algorithm which seems to have been doing its job (Dieterich et al., 2016). Recidivism rates are actually correlated with race (Hartney and Vuong, 2009). This is probably because of centuries of systematic racism. If you are serious about remediating historical inequality, the least you could do is cut black people some slack on bail.

This gets to what for me is the most baffling aspect of this whole research agenda, one that I didn’t have the words for before reading Barocas and Selbst (2016). A point well-made by them is that the interpretation of anti-discrimination law, which motivates a lot of this research, is fraught with tensions that complicate its application to data mining.

“Two competing principles have always undergirded anti-discrimination law: nondiscrimination and antisubordination. Nondiscrimination is the narrower of the two, holding that the responsibility of the law is to eliminate the unfairness individuals experience at the hands of decisionmakers’ choices due to membership in certain protected classes. Antisubordination theory, in contrast, holds that the goal of antidiscrimination law is, or at least should be, to eliminate status-based inequality due to membership in those classes, not as a matter of procedure, but substance.” (Barocas and Selbst, 2016)

More specifically, these two principles motivate different interpretations of the two pillars of anti-discrimination law, disparate treatment and disparate impact. I draw on Barocas and Selbst for my understanding of each:

A judgment of disparate treatment requires either a formal disparate treatment (across protected groups) of similarly situated people, or an intent to discriminate. Since in a large data mining application protected group membership will be proxied by many other factors, it’s not clear if the ‘formal’ requirement makes much sense here. And since machine learning applications only very rarely have racist intent, that option seems challengeable as well. While there are interpretations of these criteria that are tougher on decision-makers (i.e. unconscious intents), these seem to be motivated by antisubordination rather than the weaker nondiscrimination principle.

A judgment of disparate impact is perhaps more straightforward, but it can be mitigated in cases of “business necessity”, which (to get to the point) is vague enough to plausibly include optimization in a technical sense. Once again, there is nothing to see here from a nondiscrimination standpoint, though an antisubordinationist would rather that these decision-makers have to take correcting for historical inequality into account.

I infer from their writing that Barocas and Selbst believe that antisubordination is an important principle for nondiscrimination. In any case, they maintain that making the case for applying nondiscrimination laws to data mining effectively requires a commitment to “substantive remediation”. This is insightful!

Just to put my cards on the table: as much as I may like the idea of substantive remediation in principle, I personally don’t think that every application of nondiscrimination law needs to be animated by it. For many institutions, narrow nondiscrimination seems to be adequate if not preferable. I’d prefer remediation to occur through other specific policies, such as more public investment in schools in low-income districts. Perhaps for this reason, I’m not crazy about “fairness in machine learning” as a general technical practice. It seems to me to be trying to solve social problems with a technical fix, which despite being quite technical myself I don’t always see as a good idea. It seems like in most cases you could have a machine learning mechanism based on normal statistical principles (the learning step) and then use a decision procedure separately that achieves your political ends.
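To make that last point concrete, here is a minimal sketch of the separation, under assumptions that are mine and not drawn from any real system: a toy frequency-based estimator stands in for the learning step, and a hypothetical `decide` function holds the policy. The feature values, group labels and thresholds are invented for illustration.

```python
from collections import defaultdict


def learn_risk(history):
    """Learning step: estimate P(outcome = 1 | feature) by observed frequency.

    `history` is a list of (feature, outcome) pairs with outcome in {0, 1}.
    Nothing about policy or group membership enters here.
    """
    counts = defaultdict(lambda: [0, 0])  # feature -> [positives, total]
    for feature, outcome in history:
        counts[feature][0] += outcome
        counts[feature][1] += 1
    return {f: pos / total for f, (pos, total) in counts.items()}


def decide(risk, group, thresholds, default=0.5):
    """Decision step: the (hypothetical) policy lives here, not in the estimator."""
    return "approve" if risk <= thresholds.get(group, default) else "deny"


# Invented toy data: the "high" feature value is followed by the outcome 3 times in 4.
history = [
    ("low", 0), ("low", 0), ("low", 1), ("low", 0),
    ("high", 1), ("high", 1), ("high", 0), ("high", 1),
]
risk = learn_risk(history)

# A policy choice expressed purely as decision thresholds (assumed values).
thresholds = {"a": 0.5, "b": 0.8}
print(risk)
print(decide(risk["high"], "a", thresholds))
print(decide(risk["high"], "b", thresholds))
```

The estimator is the same for everyone; only the thresholds encode the political choice, so they can be debated and changed without touching the statistics.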

I wish that this research community (and here I mean the qualitative research community surrounding it more than the technical community, which tends to define its terms carefully) would be more careful about the ways it talks about “bias”, because often it seems to encourage a conflation between statistical or technical senses of bias and political senses. The latter carry so much political baggage that it can be intimidating to try to wade in and untangle the two senses. And it’s important to do this untangling, because while bad statistical bias can lead to political bias, it can, depending on the circumstances, lead to either “good” or “bad” political bias. But it’s important, for the sake of numeracy (mathematical literacy), to understand that even if a statistically bad process has a politically “good” outcome, that is still, statistically speaking, bad.
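For readers who want the statistical sense pinned down, here is a small worked example of my own (it is not taken from any of the cited papers). An estimator is statistically biased when its expected value differs from the quantity it estimates. The sketch enumerates every size-2 sample, drawn with replacement, from an invented two-element population and compares two variance estimators.

```python
from itertools import product


def mean(xs):
    return sum(xs) / len(xs)


def variance_with_denominator(xs, denom):
    """Sum of squared deviations from the sample mean, divided by `denom`."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / denom


population = [0, 2]  # toy population, chosen so the arithmetic is exact
n = 2                # sample size

true_variance = variance_with_denominator(population, len(population))

# Every equally likely sample of size n drawn with replacement.
samples = list(product(population, repeat=n))

# Expected value of each estimator, averaged over all samples.
expected_divide_by_n = mean([variance_with_denominator(s, n) for s in samples])
expected_divide_by_n_minus_1 = mean(
    [variance_with_denominator(s, n - 1) for s in samples]
)

print(true_variance)                 # 1.0
print(expected_divide_by_n)          # 0.5, so this estimator is biased
print(expected_divide_by_n_minus_1)  # 1.0, so this estimator is unbiased
```

Nothing in this calculation says whether the consequences of using the biased estimator are politically good or bad; that is a separate question.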

My sense is that there are interpretations of nondiscrimination law that make it illegal to make certain judgments taking into account certain facts about sensitive properties like race and sex. There are also theorems showing that if you don’t take into account those sensitive properties, you are going to discriminate against them by accident because those sensitive variables are correlated with anything else you would use to judge people. As a general principle, while being ignorant may sometimes make things better when you are extremely lucky, in general it makes things worse! This should be a surprise to nobody.
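A toy illustration of the correlation point, with invented data and names (this is a sketch of the intuition, not a rendering of any particular theorem): a rule that never sees group membership, only a correlated feature, can still produce different outcome rates per group.

```python
# Invented records: (group, neighborhood). Group "a" mostly lives in the
# "north", group "b" mostly in the "south".
records = [
    ("a", "north"), ("a", "north"), ("a", "north"), ("a", "south"),
    ("b", "south"), ("b", "south"), ("b", "south"), ("b", "north"),
]


def blind_rule(neighborhood):
    """A decision rule that is 'blind' to group: it only looks at neighborhood."""
    return neighborhood == "north"


def approval_rate_by_group(records, rule):
    """Fraction of each group that the rule approves."""
    totals, approved = {}, {}
    for group, neighborhood in records:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + int(rule(neighborhood))
    return {g: approved[g] / totals[g] for g in totals}


rates = approval_rate_by_group(records, blind_rule)
print(rates)  # {'a': 0.75, 'b': 0.25}
```

The rule never reads the group label, yet the approval rates differ threefold, and you can only measure that gap if you keep the sensitive variable around.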


Barocas, Solon, and Andrew D. Selbst. “Big data’s disparate impact.” (2016).

Dieterich, William, Christina Mendoza, and Tim Brennan. “COMPAS risk scales: Demonstrating accuracy equity and predictive parity.” Northpoint Inc (2016).

Friedman, Batya, and Helen Nissenbaum. “Bias in computer systems.” ACM Transactions on Information Systems (TOIS) 14.3 (1996): 330-347.

Hardt, Moritz, Eric Price, and Nati Srebro. “Equality of opportunity in supervised learning.” Advances in Neural Information Processing Systems. 2016.

Hartney, Christopher, and Linh Vuong. “Created equal: Racial and ethnic disparities in the US criminal justice system.” (2009).

Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. “Inherent trade-offs in the fair determination of risk scores.” arXiv preprint arXiv:1609.05807 (2016).

by Sebastian Benthall at November 15, 2017 03:41 AM

November 10, 2017

Ph.D. student

Stewart+Brown Production Management

Style Archive: Early in my Stewart+Brown career, I was the production assistant and part of my job was tracking samples and production. I kept a list in Excel of all the styles we had ever made, and one of my first programs ever was putting that list online. I just looked at some of the other code on the site and, after figuring out what an array was, made it work for me. Over the years I built more and more functions into the system, and by the time I left it had a life of its own and was responsible for tracking nearly every aspect of design, development and production. It was so efficient that we even opted out of purchasing an expensive out-of-the-box system that was really popular within the industry. I like to pat myself on the back for that and for the fact that even a year after I left, the system is still up and running with no major problems or errors. What you see to the left is a list of all the styles for a season with boxes representing every season they have been produced in.

Style Info Page: This is the page you get to when you click from the previous page.  It displays all of the product information and is continually updated as the product is developed.  Available colors are specific to the season (as tracked by another tool) and are specific to each delivery.  This information is used everywhere this style is shown on the site and so updating information is as easy as updating it in this single location.
Autogenerated Line Sheets: The information in the style archive populates a line sheet that is sent to buyers and showrooms so they can place their orders. Previously, we built these files in Illustrator; they took forever to make and were always filled with errors since information is constantly changing. Since all of the information online was the most up to date, I proposed converting the line sheet to something created online. I had to fight for it since others were afraid that it would compromise the design and layout. I created the layout based on an existing line sheet and made it fully customizable. You can control which styles go on which page, the order of the styles and even which distributors can see which styles. To create the PDF, one just hits print and prints it to a PDF without all the editing marks. Works like a charm.
Style Orders Page: This shows the order grids for this style. Orders are input online in a common area and are used by the system to calculate the amount of fabric and materials to order, in addition to providing a central location in which to view and share information. Before this, all orders were on paper, revisions were lost and mistakes were rampant. Even when we were just testing the system, we found a discrepancy between orders, huzzah! Orders are also subdivided for delivery and tracked by projections and actuals.
Style Materials: This tab on the style shows the materials and amounts of materials used based on the order grids.  This aids the production team in tracking orders and pricing.
Color Archive: Similar to the styles archive, another archive exists for each color Stewart+Brown developed, showing which season it was used in as well as a swatch that is used in every other place on the site where this color is shown. When adding a color, it checks to make sure the color code or name hasn’t been used.
Pattern Specs Tracking: Before a style can be sampled there has to be an approved pattern for the style. It works as follows: the designer comes up with an idea for a garment. The pattern maker (whose craft is amazingly interesting to me) makes a pattern for the idea. The pattern is sewn and fit, and the pattern is adjusted for a better fit. This database tracks these revisions, and all the files are hosted on the server so they can be downloaded and shared at any time. Status updates are also applied to let others know when the pattern is approved or what adjustments it needs.

Fabric Usage Chart: This chart is an extension to our production tracking system. I love how colorful it is; I like to think that it makes looking at the information a little more fun. What we are showing in this chart is how much of each fabric it takes to make one unit of a certain style. They are ordered by fabrication (i.e. organic jersey, hemp-jersey, fleece) and totals are shown at the bottom of each fabrication as well as a grand total grid at the bottom of the sheet. This helps the girls in development get an accurate picture of how much fabric they will have to order for a given season. Before orders are actually placed, they can just get a rough idea by style, and after orders are input, they get an even more accurate number because for each production run, the system will automatically calculate how much of each style was ordered and multiply it by the yield. This tool also references our “Spec Archive” where the girls in development and the pattern maker go to upload specs for each style and, if the spec is approved, the style shows up in yellow.
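The calculation described above (units ordered multiplied by the yield, totaled per fabrication and overall) can be sketched roughly as follows. This is an illustration in Python, not the code the actual system ran; the style names, yields and order quantities are made up.

```python
from collections import defaultdict

# Hypothetical yields: yards of each fabric needed to make one unit of a style.
YIELDS = {
    "tee": {"organic jersey": 1.5},
    "hoodie": {"fleece": 2.0, "organic jersey": 0.5},
}


def fabric_needed(orders, yields):
    """Multiply units ordered per style by the per-unit yield, totaled per fabric."""
    totals = defaultdict(float)
    for style, units in orders.items():
        for fabric, per_unit in yields[style].items():
            totals[fabric] += units * per_unit
    return dict(totals)


# Hypothetical orders for one production run: style -> units ordered.
orders = {"tee": 100, "hoodie": 40}

needed = fabric_needed(orders, YIELDS)
grand_total = sum(needed.values())

print(needed)       # {'organic jersey': 170.0, 'fleece': 80.0}
print(grand_total)  # 250.0
```

Keeping the yields in one place means that a revised pattern spec only has to be updated once for every total to follow.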


This Buyers Area is a password-protected area on the Stewart+Brown site where retailers can preview the incoming season early. The design was based on our e-store but formatted so that the buyer can see each fabrication on its own page. Currently, I have the buyers area linked into a number of back-end tools that manage the production and development process. If someone in production decides we’re not going to run a color in a style, they can update it in the back end and it will automatically update in the Buyers Area as well. This has really helped us cut down on communication errors and it ensures that buyers are always getting the most up-to-date information.

Buyers Area Style Selection: I built this tool that guides users through the process of adding a new season to the buyer area.  Each link on the left is a step in the order it should be performed and the first step is selecting the styles that you want to be shown.
Edit Fabrications and Sidebar Ordering:  In the buyers area, styles are organized by fabrication.  This interface allows you to edit the images shown at the header for each fabrication, the text used to describe the interface as well as the order of the fabrication on the sidebar.

Buyers Area Editing Sandbox: I realize that using admin tools isn’t the ideal way for many people to add, edit and view information, so I built this “sandbox” to use for editing. It is an exact replica of the buyers area except that each editable field has a link to the place where that information can be edited.

Admin Documentation: I set up another WordPress blog to serve as a help area for anyone using the system. Every tool I built has its own “how to” page, and since it was built in WordPress, it came with search functionality, categorization and comment capabilities. The comments have been used as a way to add to or comment on the help instructions.

by admin at November 10, 2017 01:39 AM

November 09, 2017

Center for Technology, Society & Policy

Data for Good Competition — Call for Proposals

The Center for Technology, Society & Policy (CTSP) seeks proposals for a Data for Good Competition. The competition will be hosted and promoted by CTSP in coordination with the UC Berkeley School of Information IMSA, and made possible through funds provided by Facebook.

Team proposals will apply data science skills to address a social good problem with public open data. The objective of the Data for Good Competition is to incentivize students from across the UC Berkeley campus to apply their data science skills towards a compelling public policy or social justice issue.

The competition is intended to encourage the creation of data tools or analyses of open data. Open datasets may be local, state, national, or international so long as they are publicly accessible. The data tool or analysis may include, but is not limited to:

  1. integration or combination of two or more disparate datasets, including integration with private datasets;
  2. data conversions into more accessible formats;
  3. visualization of data graphically, temporally, and/or spatially;
  4. data validations or verifications with other open data sources;
  5. platforms that help citizens access and/or manipulate data without coding experience; etc.

Issues that may be relevant and addressed via this competition include environmental issues, civic engagement (e.g., voting), government accountability, land use (e.g., housing challenges, agriculture), criminal justice, access to health care, etc. CTSP suggests that teams should consider using local or California state data since there may be additional opportunities for access and collaboration with agencies who produce and maintain these datasets.

The competition will consist of three phases:

  • an initial proposal phase when teams work on developing proposals
  • seed grant execution phase when selected teams execute on their proposals
  • final competition and presentation of completed projects at an event in early April 2018

Teams selected for the seed grant must be able to complete a working prototype or final product ready for demonstration at the final competition and presentation event. It is acceptable for submitted proposals to have some groundwork already completed or to serve as a substantial extension of an existing project, but we are looking to fund something novel and not already completed work.

Initial Proposal Phase

The initial proposal phase ends at 11:59pm (PST) on January 28th, 2018 when proposals are due. Proposals will then be considered against the guidelines below. CTSP will soon announce events to support teams in writing proposals and to share conversations on data for good and uses of public open data.

Note: This Data for Good Competition is distinct from the CTSP yearlong fellowship RFP.

Proposal Guidelines

Each team proposal (approximately 2-3 pages) is expected to answer the following questions:

Project Title and Team Composition

  • What is the title of your project, and what are the names, department affiliations, student classifications (undergraduate/graduate), and email contact information of your team members?


  • What is the social good problem?
  • How do you know it is a real problem?
  • If you are successful how will your data science approach address this problem?  Who will use the data and how will they use it to address the problem?  


  • What public open data will you be using?

Output & Projected Timeframe

  • What will your output be? How may this be used by the public, stakeholders, or otherwise used to address your social good problem?
  • Outline a timeframe of how the project will be executed in order to become a finished product or working prototype by the April competition. Will any additional resources be needed in order to achieve the outlined goal?

Privacy Risks and Social Harms

  • What, if any, are the potential negative consequences of your project and how do you propose to minimize them? For example, does your project create new privacy risks?  Are there other social harms?  Is the risk higher for any particular group?  Alternatively, does your project aim to address known privacy risks, social harms, and/or aid open data practitioners in assessing risks associated with releasing data publicly?

Proposals will be submitted through the CTSP website. Successful projects will demonstrate knowledge of the proposed subject area by explaining expertise and qualifications of team members and/or citing sources that validate claims presented. This should be a well-developed proposal, and the team should be prepared to execute the project in a short timeframe before the competition. Please include all relevant information needed for CTSP evaluation–a bare bones proposal is unlikely to advance to the seed funding stage.

Seed Grant Phase

Four to six teams will advance to the seed grant phase. This will be announced in February 2018. Each member of an accepted project proposal team becomes a CTSP Data for Good grantee, and each team will receive $800 to support development of their project. If you pass to the seed grant phase we will be working with you to connect you with stakeholder groups and other resources to help improve the final product. CTSP will not directly provide teams with hardware, software, or data.

Final Competition and Presentation Phase

This phase consists of an April evening of public presentation before judges from academia, Facebook, and the public sector and a decision on the competition winner. The top team will receive $5000 and the runner-up will receive $2000. 

Note: The presentation of projects will support the remote participation of distance-learning Berkeley students, including Master of Information and Data Science (MIDS) students in the School of Information.

Final Judging Criteria

In addition to continued consideration of the project proposal guidelines, final projects will be judged by the following criteria, and those judgments are final:

  • Quality of the application of data science skills
  • Demonstration of how the proposal or project addresses a social good problem
  • Advancing the use of public open data

After the Competition

Materials from the final event (e.g., video) and successful projects will be hosted on a public website for use by policymakers, citizens, and students. Teams will be encouraged to publish a blogpost on CTSP’s Citizen Technologist Blog sharing their motivation, process, and lessons learned.

General Rules

  • Open to current UC Berkeley students (undergraduate and graduate) from all departments (Teams with outside members will not be considered. However, teams that have a partnership with an external organization who might use the tool or analysis will be considered.)
  • Teams must have a minimum of two participants
  • Participants must use data sets that are considered public or open.

Code of Conduct

This code of conduct has been adapted from the 2017 Towards Inclusive Tech conference held at the UC Berkeley School of Information:

The organizers of this competition are committed to principles of openness and inclusion. We value the participation of every participant and expect that we will show respect and courtesy to one another during each phase and event in the competition. We aim to provide a harassment-free experience for everyone, regardless of gender, sexual orientation, disability, physical appearance, body size, race, or religion. Attendees who disregard these expectations may be asked to leave the competition. Thank you for helping make this a respectful and collaborative event for all.


Please direct all questions about the application or competition process to


Please submit your application at this link.

by Daniel Griffin at November 09, 2017 11:58 PM

Ph.D. student

Personal data property rights as privacy solution. Re: Cofone, 2017

I’m working my way through Ignacio Cofone’s “The Dynamic Effect of Information Privacy Law” (2017), which is an economic analysis of privacy. Without doing justice to the full scope of the article, it must be said that it is a thorough discussion of previous information economics literature and a good case for property rights over personal data. In a nutshell, one can say that markets are good for efficient and socially desirable resource allocation, but they are only good at this when there are well-crafted property rights to the goods involved. Personal data, like intellectual property, is a tricky case because of the idiosyncrasies of data: it has zero-ish marginal cost, it seems to get more valuable when it’s aggregated, etc. But like intellectual property, we should expect under normal economic rationality assumptions that the more we protect the property rights of those who create personal data, the more they will be incentivized to create it.

I am very warm to this kind of argument because I feel there’s been a dearth of good information economics in my own education, though I have been looking for it! I do believe there are economic laws and that they are relevant for public policy, let alone business strategy.

I have concerns about Cofone’s argument specifically, which are these:

First, I have my doubts that seeing data as a good in any classical economic sense is going to work. Ontologically, data is just too weird for a lot of earlier modeling methods. I have been working on a different way of modeling information flow economics that tries to capture how much of what we’re concerned with are information services, not information goods.

My other concern is that Cofone’s argument gives users/data subjects credit for being rational agents, capable of addressing the risks of privacy and acting accordingly. Hoofnagle and Urban (2014) show that this is empirically not the case. In fact, if you take the average person who is not that concerned about their privacy on-line and start telling them facts about how their data is being used by third-parties, etc., they start to freak out and get a lot more worried about privacy.

This throws a wrench in the argument that stronger personal data property rights would lead to more personal data creation, therefore (I guess it’s implied) more economic growth. People seem willing to create personal data and give it away, despite actual adverse economic incentives, because cat videos are just so damn appealing. Or something. It may generally be the case that economic modeling is used by information businesses but not information policy people because average users are just so unable to act rationally; it really is a domain better suited to behavioral economics and usability research.

I’m still holding out though. Just because big data subjects are not homo economicus doesn’t mean that an economic analysis of their activity is pointless. It just means we need to have a more sophisticated economic model, one that takes into account how there are many different classes of user that are differently informed. This kind of economic modeling, and empirically fitting it to data, is within our reach. We have the technology.


Cofone, Ignacio N. “The Dynamic Effect of Information Privacy Law.” Minn. JL Sci. & Tech. 18 (2017): 517.

Hoofnagle, Chris Jay, and Jennifer M. Urban. “Alan Westin’s privacy homo economicus.” (2014).

by Sebastian Benthall at November 09, 2017 08:44 PM

November 07, 2017

Ph.D. student

Why managerialism: it acknowledges political role of internal corporate policies

One difficulty with political theory in contemporary times is the confusion between government and corporate policy. This is due in no small part to the extent to which large corporations now mediate social life. Telecommunications, the Internet, mobile phones, and social media all depend on layers and layers of operating organizations. The search engine, which didn’t exist thirty years ago, now is arguably an essential cultural and political facility (Pasquale, 2011), which sharpens the concerns that have been raised about its politics (Introna and Nissenbaum, 2000; Bracha and Pasquale, 2007).

Corporate policies influence customers when those policies drive product design or are put into contractual agreements. They can also govern employees and shape corporate culture. Sometimes these two kinds of policies are not easily demarcated. For example, Uber has an internal privacy policy about who can access which users’ information, like most companies with a lot of user data. The privacy features that Uber implicitly guarantees to their customers are part of their service. But their ability to provide this service is only as good as their company culture is reliable.

Classically, there are states, which may or may not be corrupt, and there are markets, which may or may not be competitive. With competitive markets, corporate policies are part of what make firms succeed or fail. One point of success is a company’s ability to attract and maintain customers. This should in principle drive companies to improve their policies.

An interesting point made recently by Robert Post is that in some cases, corporate policies can adopt positions that would be endorsed by some legal scholars even if the actual laws state otherwise. His particular example was a case enforcing the right to be forgotten in Spain against Google.

Since European law is statute-driven, the judgments of its courts are not amenable to creative legal reasoning as they are in the United States. Post criticizes the EU’s judgment in this case for its rigid interpretation of data protection directives. Post argues a different legal perspective on privacy is better at balancing other social interests. But putting aside the particulars of the law, Post makes the point that Google’s internal policy matches his own legal and philosophical framework (which prefers dignitary privacy over data privacy) more than EU statutes do.

One could argue that we should not trust the market to make Google’s policies just. But we could also argue that Google’s market share, which is significant, depends so much on its reputation and users’ trust that in fact it is under great pressure to adjudicate disputes with its users wisely. It is a company that must set its own policies, which do have political significance. It has the benefits of more direct control over the way these policies get interpreted and enforced in the state, faster feedback on whether the policies are successful, and a less chaotic legislative process for establishing policy in the first place.

Political liberals would dismiss this kind of corporate control as just one commercial service among many, or else wring their hands with concern over a company coming to have such power over the public sphere. But managerialists would see the emergence of search engines as an organization among others, comparable to other private entities that have been part of the public sphere, such as newspapers.

But a sound analysis of the politics of search engines need not depend on analogies with past technologies. This is a function of legal reasoning. Managerialism, which is perhaps more a descendant of business reasoning, would ask how, in fact, search engines make policy decisions and how this affects political outcomes. It does not prima facie assume that a powerful or important corporate policy is wrong. It does ask what the best corporate policy is, given a particular sector.


Bracha, Oren, and Frank Pasquale. “Federal Search Commission-Access, Fairness, and Accountability in the Law of Search.” Cornell L. Rev. 93 (2007): 1149.

Introna, Lucas D., and Helen Nissenbaum. “Shaping the Web: Why the politics of search engines matters.” The information society 16.3 (2000): 169-185.

Pasquale, Frank A. “Dominant search engines: an essential cultural & political facility.” (2011).

by Sebastian Benthall at November 07, 2017 04:15 AM

November 06, 2017

Ph.D. student

Why managerialism: it’s tolerant and meritocratic

In my last post, I argued that we should take managerialism seriously as a political philosophy. A key idea in managerialism (as I’m trying to define it) is that it acknowledges that sociotechnical organizations are relevant units of political power, and is concerned with the relationship between these organizations. These organizations can be functionally specific. They can have hierarchical, non-democratic control in limited, not totalitarian ways. They check and balance each other, probably. Managerialism tends to think that organizations can be managed well, and that good management matters, politically.

This is as opposed to liberalism, which is grounded in rights of the individual, which then becomes a foundation for democracy. It’s also opposed to communitarianism, which holds the political unit of interest to be a family unit or other small community. I’m positioning managerialism as a more cybernetic political idea, as well as one more adapted to present economic conditions.

It may sound odd to hear somebody argue in favor of managerialism. I’ll admit that I am doing so tentatively, to see what works and what doesn’t. Given that a significant percentage of American political thought now is considering such baroque alternatives to liberalism as feudalism and ethnic tribalism, perhaps because liberalism everywhere has been hijacked by plutocracy, it may not be crazy to discuss alternatives.

One reason why somebody might be attracted to managerialism is that it is (I’d argue) essentially tolerant and meritocratic. Sociotechnical organizations that are organized efficiently to perform their main function need not make a lot of demands of their members besides whatever protocols are necessary for the functioning of the whole. In many cases, this should lead to a basic indifference to race, gender, and class background, from the internal perspective of the organization. As there’s good research indicating that diversity leads to greater collective intelligence in organizations, there’s a good case for tolerant policies in managerial institutions. Merit, defined relative to the needs of the particular organization, would be the privileged personal characteristic here.

I’d like to distinguish managerialism from technocracy in the following sense, which may be a matter of my own terminological invention. Technocracy is the belief that experts should run the state. It offers an expansion of centralized power. Managerialism is, I want to argue, not compatible with centralized state control. Rather, it recognizes many different spheres of life that nevertheless need to be organized to be effective. These spheres or sectors will be individually managed, perhaps by competing organizations, but regulate each other more than they require central regulation.

The way these organizations can regulate each other is Exit, in Hirschman’s sense. While the ideas of Exit, Loyalty, and Voice are most commonly used to discuss how individuals can affect the organizations they are a part of, similar ideas can function at higher scales of analysis, as organizations interact with each other. Think about international trade agreements, and sanctions.

The main reason to support managerialism is not that it is particularly just or elegant. It’s that it is more or less the case that the political structures in place now are some assemblage of sociotechnical organizations interacting with each other. Those people who have power are those with power within one or more of these organizations. And to whatever extent there is a shared ideological commitment among people, it is likely because a sociotechnical organization has been turned to the effect of spreading that ideology. This is a somewhat abstract way of saying what lots of people say in a straightforward way all the time: that certain media institutions are used to propagate certain ideologies. This managerialist framing is just intended to abstract away from the particulars in order to develop a political theory.

by Sebastian Benthall at November 06, 2017 03:18 AM

November 05, 2017

Ph.D. student

Managerialism as political philosophy

Technologically mediated spaces and organizations are frequently described by their proponents as alternatives to the state. From David Clark’s maxim of Internet architecture, “We reject: kings, presidents and voting. We believe in: rough consensus and running code”, to cyberanarchist efforts to bypass the state via blockchain technology, to the claims that Google and Facebook, as they mediate between billions of users, are relevant non-state actors in international affairs, to Lessig’s (1999) ever-prescient claim that “Code is Law”, there is undoubtedly something going on with technology’s relationship to the state which is worth paying attention to.

There is an intellectual temptation (one that I myself am prone to) to take seriously the possibility of a fully autonomous technological alternative to the state. Something like a constitution written in source code has an appeal: it would be clear, precise, and presumably based on something like a consensus of those who participate in its creation. It is also an idea that can be frightening (Give up all control to the machines?) or ridiculous. The example of The DAO, the Ethereum ‘decentralized autonomous organization’ that raised millions of dollars only to have them stolen in a technical hack, demonstrates the value of traditional legal institutions which protect the parties that enter contracts with processes that ensure fairness in their interpretation and enforcement.

It is more sociologically accurate, in any case, to consider software, hardware, and data collection not as autonomous actors but as parts of a sociotechnical system that maintains and modifies them. This is obvious to practitioners, who spend their lives negotiating the social systems that create technology. For those for whom it is not obvious, there are reams of literature on the social embeddedness of “algorithms” (Gillespie, 2014; Kitchin, 2017). These themes are recited again in recent critical work on Artificial Intelligence; there are those who wisely point out that a functioning artificially intelligent system depends on a lot of labor (those who created and cleaned data, those who built the systems they are implemented on, those who monitor the system as it operates) (Kelkar, 2017). So rather than discussing the role of particular technologies as alternatives to the state, we should shift our focus to the great variety of sociotechnical organizations.

One thing that is apparent, when taking this view, is that states, as traditionally conceived, are themselves sociotechnical organizations. This is, again, an obvious point well illustrated in economic histories such as (Beniger, 1986). Communications infrastructure is necessary for the control and integration of society, let alone effective military logistics. The relationship between those industrial actors developing this infrastructure, whether it be building roads, running a postal service, laying rail or telegram wires, telephone wires, satellites, Internet protocols, and now social media–and the state has always been interesting and a story of great fortunes and shifts in power.

What is apparent after a serious look at this history is that political theory, especially liberal political theory as it developed from the 1700s onward as a theory of the relationship between individuals bound by social contract emerging from nature to develop a just state, leaves out essential scientific facts of the matter of how society has ever been governed. Control of communications and control infrastructure has never been equally dispersed and has always been a source of power. Late modern rearticulations of liberal theory and reactions against it (Rawls and Nozick, both) leave out technical constraints on the possibility of governance and even the constitution of the subject on which a theory of justice would have its ground.

Were political theory to begin from a more realistic foundation, it would need to acknowledge the existence of sociotechnical organizations as a political unit. There is a term for this view, “managerialism”, which, as far as I can tell, is used somewhat pejoratively, like “neoliberalism”. As an “-ism”, it’s implied that managerialism is an ideology. When we talk about ideologies, what we are doing is looking from an external position onto an interdependent set of beliefs in their social context and identifying, through genealogical method or logical analysis, how those beliefs are symptoms of underlying causes that are not precisely as represented within those beliefs themselves. For example, one critiques neoliberal ideology, which purports that markets are the best way to allocate resources and advocates for the expansion of market logic into more domains of social and political life, by pointing out that markets are great for reallocating resources to capitalists, who bankroll neoliberal ideologues, but that many people who are subject to neoliberal policies do not benefit from them. While this is a bit of a parody of both neoliberalism and the critiques of it, you’ll catch my meaning.

We might avoid the pitfalls of an ideological managerialism (I’m not sure what those would be, exactly, having not read the critiques) by taking from it, to begin with, only the urgency of describing social reality in terms of organization and management without assuming any particular normative stake. It will be argued that this is not a neutral stance because to posit that there is organization, and that there is management, is to offend certain kinds of (mainly academic) thinkers. I get the sense that this offendedness is similar to the offense taken by certain critical scholars to the idea that there is such a thing as scientific knowledge, especially social scientific knowledge. Namely, it is an offense taken to the idea that a patently obvious fact entails one’s own ignorance of otherwise very important expertise. This is encouraged by the institutional incentives of social science research. Social scientists are required to maintain an aura of expertise even when their particular sub-discipline excludes from its analysis the very systems of bureaucratic and technical management that its university depends on. University bureaucracies are, strangely, in the business of hiding their managerialist reality from their own faculty, as alternative avenues of research inquiry are of course compelling in their own right. When managerialism cannot be contested on epistemic grounds (because the bluff has been called), it can be rejected on aesthetic grounds: managerialism is not “interesting” to a discipline, perhaps because it does not engage with the personal and political motivations that constitute it.

What sets managerialism apart from other ideologies, however, is that when we examine its roots in social context, we do not discover a contradiction. Managerialism is not, as far as I can tell, successful as a popular ideology. Managerialism is attractive only to that rare segment of the population that works closely with bureaucratic management. It is here that the technical constraints of information flow and its potential uses, the limits of autonomy especially as it confronts the autonomies of others, the persistence of hierarchy despite the purported flattening of social relations, and so on become unavoidable features of life. And though one discovers in these situations plenty of managerial incompetence, one also comes to terms with why that incompetence is a necessary feature of the organizations that maintain it.

Little of what I am saying here is new, of course. It is only new in relation to more popular or appealing forms of criticism of the relationship between technology, organizations, power, and ethics. So often the political theory implicit in these critiques is a form of naive egalitarianism that sees a differential in power as an ethical red flag. Since technology can give organizations a lot of power, this generates a lot of heat around technology ethics. Starting from the perspective of an ethicist, one sees an uphill battle against an increasingly inscrutable and unaccountable sociotechnical apparatus. What I am proposing is that we look at things a different way. If we start from general principles about technology and its role in organizations–the kinds of principles one would get from an analysis of microeconomic theory, artificial intelligence as a mathematical discipline, and so on–one can try to formulate managerial constraints that truly confront society. These constraints are part of how subjects are constituted and should inform what we see as “ethical”. If we can broker between these hard constraints and the societal values at stake, we might come up with a principle of justice that, if unpopular, may at least be realistic. This would be a contribution, at the end of the day, to political theory, not as an ideology, but as a philosophical advance.


Beniger, James R. “The Control Revolution: Technological and Economic Origins of the Information Society.” (1986).

Bird, Sarah, et al. “Exploring or Exploiting? Social and Ethical Implications of Autonomous Experimentation in AI.” (2016).

Gillespie, Tarleton. “The relevance of algorithms.” Media technologies: Essays on communication, materiality, and society 167 (2014).

Kelkar, Shreeharsh. “How (Not) to Talk about AI.” Platypus, 12 Apr. 2017,

Kitchin, Rob. “Thinking critically about and researching algorithms.” Information, Communication & Society 20.1 (2017): 14-29.

Lessig, Lawrence. “Code is law.” The Industry Standard 18 (1999).

by Sebastian Benthall at November 05, 2017 06:45 PM

November 02, 2017

Ph.D. student

Robert Post on Data vs. Dignitary Privacy

I was able to see Robert Post present his article, “Data Privacy and Dignitary Privacy: Google Spain, the Right to Be Forgotten, and the Construction of the Public Sphere”, today. My other encounter with Post’s work was quite positive, and I was very happy to learn more about his thinking at this talk.

Post’s argument was based off of the facts of the Google Spain SL v. Agencia Española de Protección de Datos (“Google Spain”) case in the EU, which set off a lot of discussion about the right to be forgotten.

I’m not trained as a lawyer, and will leave the legal analysis to the verbatim text. There were some broader philosophical themes that resonate with topics I’ve discussed on this blog and in my other research. These I wanted to note.

If I follow Post’s argument correctly, it is something like this:

  • According to EU Directive 95/46/EC, there are two kinds of privacy. Data privacy rules over personal data, establishing control and limitations on use of it. The emphasis is on the data itself, which is property reasoned about analogously to. Dignitary privacy is about maintaining appropriate communications between people and restricting those communications that may degrade, humiliate, or mortify them.
  • EU rules about data privacy are governed by rules specifying the purpose for which data is used, thereby implying that the use of this data must be governed by instrumental reason.
  • But there’s the public sphere, which must not be governed by instrumental reason, for Habermasian reasons. The public sphere is, by definition, the domain of communicative action, where actions must be taken with the ambiguous purpose of open dialogue. That is why free expression is constitutionally protected!
  • Data privacy, formulated as an expression of instrumental reason, is incompatible with the free expression of the public sphere.
  • The Google Spain case used data privacy rules to justify the right to be forgotten, and in this it developed an unconvincing and sloppy precedent.
  • Dignitary privacy is in tension with free expression, but not incompatible with it. This is because it is based not on instrumental reason, but rather on norms of communication (which are contextual).
  • Future right to be forgotten decisions should be made on the basis of dignitary privacy. This will result in more cogent decisions.

I found Post’s argument very appealing. I have a few notes.

First, I had never made the connection between what Hildebrandt (2013, 2014) calls “purpose binding” in EU data protection regulation and instrumental reason, but there it is. There is a sense in which these purpose clauses are about optimizing something that is externally and specifically defined before the privacy judgment is made (cf. Tschantz, Datta, and Wing, 2012, for a formalization).

This approach seems generally in line with the view of a government as a bureaucracy primarily involved in maintaining control over a territory or population. I don’t mean this in a bad way, but in a literal way of considering control as feedback into a system that steers it to some end. I’ve discussed the pervasive theme of ‘instrumentality run amok’ in questions of AI superintelligence here. It’s a Frankfurt School trope that appears to have made its way in a subtle way into Post’s argument.

The public sphere is not, in Habermasian theory, supposed to be dictated by instrumental reason, but rather by communicative rationality. This has implications for the technical design of networked publics that I’ve scratched the surface of in this paper. By pointing to the tension between instrumental/purpose/control based data protection and the free expression of the public sphere, I believe Post is getting at a deep point about how we can’t have the public sphere be too controlled lest we lose the democratic property of self-governance. It’s a serious argument that probably should be addressed by those who would like to strengthen rights to be forgotten. A similar argument might be made for other contexts whose purposes seem to transcend circumscription, such as science.

Post’s point is not, I believe, to weaken these rights to be forgotten, but rather to put the arguments for them on firmer footing: dignitary privacy, or the norms of communication and the awareness of the costs of violating them. Indeed, the facts behind right to be forgotten cases I’ve heard of (there aren’t many) all seem to fall under these kinds of concerns (humiliation, etc.).

What’s very interesting to me is that the idea of dignitary privacy as consisting of appropriate communication according to contextually specific norms feels very close to Helen Nissenbaum’s theory of Contextual Integrity (2009), with which I’ve become very familiar in the past year through my work with Prof. Nissenbaum. Contextual integrity posits that privacy is about adherence to norms of appropriate information flow. Is there a difference between information flow and communication? Isn’t Shannon’s information theory a “mathematical theory of communication”?

The question of whether and under what conditions information flow is communication and/or data is quite deep, actually. More on that later.

For now though it must be noted that there’s a tension, perhaps a dialectical one, between purposes and norms. For Habermas, the public sphere needs to be a space of communicative action, as opposed to instrumental reason. This is because communicative action is how norms are created: through the agreement of people who bracket their individual interests to discuss collective reasons.

Nissenbaum also has a theory of norm formation, but it does not depend so tightly on the rejection of instrumental reason. In fact, it accepts the interests of stakeholders as among several factors that go into the determination of norms. Other factors include societal values, contextual purposes, and the differentiated roles associated with the context. Because contexts, for Nissenbaum, are defined in part by their purposes, this has led Hildebrandt (2013) to make direct comparisons between purpose binding and Contextual Integrity. They are similar, she concludes, but not the same.

It would be easy to say that the public sphere is a context in Nissenbaum’s sense, with a purpose, which is the formation of public opinion (which seems to be Post’s position). Properly speaking, social purposes may be broad or narrow, and specially defined social purposes may be self-referential (why not?), and indeed these self-referential social purposes may be the core of society’s “self-consciousness”. Why shouldn’t there be laws to ensure the freedom of expression within a certain context for the purpose of cultivating the kinds of public opinions that would legitimize laws and cause them to adapt democratically? We could possibly make these frameworks more precise if we could make them a little more formal and could lose some of the baggage; that would be useful theory building in line with Nissenbaum and Post’s broader agendas.

A test of this perhaps more nuanced but still teleological view (indeed, instrumental, but maybe actually more properly speaking pragmatic (a la Dewey), in that it can blend several different metaethical categories) is to see if one can motivate a right to be forgotten in a public sphere by appealing to the need for communicative action, and thereby to appropriate communication norms around it and to dignitary privacy.

This doesn’t seem like it should be hard to do at all.


Hildebrandt, Mireille. “Slaves to big data. Or are we?.” (2013).

Hildebrandt, Mireille. “Location Data, Purpose Binding and Contextual Integrity: What’s the Message?.” Protection of Information and the Right to Privacy-A New Equilibrium?. Springer International Publishing, 2014. 31-62.

Nissenbaum, Helen. Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press, 2009.

Post, Robert, Data Privacy and Dignitary Privacy: Google Spain, the Right to Be Forgotten, and the Construction of the Public Sphere (April 15, 2017). Duke Law Journal, Forthcoming; Yale Law School, Public Law Research Paper No. 598. Available at SSRN: or

Tschantz, Michael Carl, Anupam Datta, and Jeannette M. Wing. “Formalizing and enforcing purpose restrictions in privacy policies.” Security and Privacy (SP), 2012 IEEE Symposium on. IEEE, 2012.

by Sebastian Benthall at November 02, 2017 02:41 AM

October 30, 2017

MIMS 2012

My Progress with Hand Lettering

I started hand lettering about a year and a half ago, and I thought it would be fun to see the progress I’ve made by comparing my early, crappy work to my recent work. I started hand lettering because a coworker of mine is a great letterer and I was inspired by the drawings he would make. I tried a few of his pens and found that trying to recreate the words he drew forced me to focus on the shape of the letter and the movement of the pen, which was intoxicating and meditative. So I bought a few pens and started practicing. Here’s the progress I’ve made so far.

Early Shiz

A bunch of shitty G’s. Poor control of the pen; poor letter shapes.

Goldsmiths. Better pen control and shapes, but still pretty bad.

A lot of really inconsistent and shaky “a’s” and “n’s”.

Happy Holidays. Better, but still some pretty poor loops and letter spacing.

Some really shitty looking M’s.

Newer Shiz

Fitter, Happier, More Productive. Much cleaner and smoother. Better loops and letter spacing. You can also see my warm-ups at the top :)

Aluminum. Better spacing, and fairly consistent thicks, thins, and letter shapes. I like drawing “aluminum” because it has a lot of repeating letters and shapes that require consistent strokes and spacing.

I still have a lot of room to improve, but compared to a year ago I’ve made a lot of progress. You can see all of my work at

by Jeff Zych at October 30, 2017 04:21 AM

October 29, 2017

Ph.D. student

A short introduction to existentialism

I’ve been hinting that a different moral philosophical orientation towards technical design, one inspired by existentialism, would open up new research problems and technical possibilities.

I am trying to distinguish this philosophical approach from consequentialist approaches that aim for some purportedly beneficial change in objective circumstances and from deontological approaches that codify the rights and duties of people towards each other. Instead of these, I’m interested in a philosophy that prioritizes individual meaningful subjective experiences. While it is possible that this reduces to a form of consequentialism, because of the shift of focus from objective consequences to individual situations in the phenomenological sense, I will bracket that issue for now and return to it when the specifics of this alternative approach have been fleshed out.

I have yet to define existentialism and indeed it’s not something that’s easy to pin down. Others have done it better than I will ever do; I recommend for example the Stanford Encyclopedia of Philosophy article on the subject. But here is what I am getting at by use of the term, in a nutshell:

In the mid-19th century, there was (according to Badiou) a dearth of good philosophy due to the new prestige of positivism, on the one hand, and the high quality of poetry, on the other. After the death of Hegel, who claimed to have solved all philosophical problems through his phenomenology of Spirit and its corollary, the science of Logic, arts and sciences became independent of each other. And as happens during such periods, the people (of Europe, we’re talking about now) became disillusioned. The sciences undermined Christian metanarratives that had previously given life its meaning through the promise of a heavenly afterlife to those who lived according to moral order. There was what has been called by subsequent scholars a “nihilism crisis”.

Friedrich Nietzsche began writing and shaking things up by proposing a new radical form of individualism that placed self-enhancement over social harmony. An important line of argumentation showed that the moral assumptions of conventional philosophy in his day contained contradictions and false promises that would lead the believer to either total disorientation or life-negating despair. What was needed was an alternative, and Nietzsche began working on one. It made the radical step of not grounding morality in abolishing suffering (which he believed was a necessary part of life) but rather in life itself. In his conception, what was most characteristic of life was the will to power, which has been characterized (by Bernard Reginster, I believe) as a second-order desire to overcome resistance in the pursuit of other, first-order desires. In other words, Nietzsche’s morality is based on the principle that the greatest good in life is to overcome adversity.

Nietzsche is considered one of the fathers of existentialist thought (though he is also considered many other things, as he is a writer known for his inconsistency). Another of these foundational thinkers is Søren Kierkegaard. Now that I look him up, I see that his life falls within what Badiou characterizes as the “age of poets” and/or the dark age of 19th century philosophy, and I wonder if Badiou would consider him an exception. A difficult thing about Kierkegaard in terms of his relevance to today’s secular academic debates is that he was explicitly and emphatically working within a Christian framework. Without going too far into it, it’s worth noting a couple of things about his work. In The Sickness Unto Death (1849), Kierkegaard also deals with the subject of despair and its relationship to one’s capabilities. For Kierkegaard, a person is caught between their finite (which means “limited” in this context) existence with all of its necessary limitations and their desire to transcend these limitations and attain the impossible, the infinite. In his terminology, he discusses the finite self and the infinite self, because his theology allows for the idea that there is an infinite self, which is God, and that the important philosophical crisis is about establishing one’s relationship to God despite the limitations of one’s situation. Whereas Nietzsche proposes a project of individual self-enhancement to approach what was impossible, Kierkegaard’s solution is a Christian one: to accept Jesus and God’s love as the bridge between infinite potential and one’s finite existence. This is not a universally persuasive solution, though I feel it sets up the problem rather well.

The next great existentialist thinker, and indeed the one who promoted the term “existentialism” as a philosophical brand, is Jean-Paul Sartre. However, I find Sartre uninspiring and will ignore his work for now.

On the other hand, Simone de Beauvoir, who was closely associated with Sartre, has one of the best books on ethics and the human condition I’ve ever read, the highly readable The Ethics of Ambiguity (1949), which the Marxists have kindly put online for your reading pleasure. This work lays out the ethical agenda of existentialism in phenomenological terms that resonate well with more contemporary theory. The subject finds itself in a situation (cf. theories of situated learning common now in HCI), in a place and time and a particular body with certain capacities. What is within the boundaries of their conscious awareness and capacity for action is their existence, and they are aware that beyond the boundaries of their awareness is Being, which is everything else. And what the subject strives for is to expand their existence in being, subsuming it. One can see how this synthesizes the positions of Nietzsche and Kierkegaard. Where de Beauvoir goes farther is the demonstration of how one can start from this characterization of the human condition and derive from it a substantive ethics about how subjects should treat each other. It is true that the subject can never achieve the impossible of the infinite…alone. However, by investing themselves through their “projects”, subjects can extend themselves. And when these projects involve the empowerment of others, this allows a finite subject to extend themselves through a larger and less egoistic system of life.

De Beauvoirian ethics are really nice because they are only gently prescriptive, are grounded very closely in the individual’s subjective experience of their situation, and have social justice implications that are appealing to many contemporary liberal intellectuals without grounding these justice claims in resentment or zero-sum claims for reparation or redistribution. Rather, their orientation is the positive-sum, win-win relationship between the one who empowers another and the one being empowered. This is the relationship, not of master and slave, but of master and apprentice.

When I write about existentialism in design, I am talking about using an ethical framework similar to de Beauvoir’s totally underrated existentialist ethics and using them as principles for technical design.


Brown, John Seely, Allan Collins, and Paul Duguid. “Situated cognition and the culture of learning.” Educational researcher 18.1 (1989): 32-42.

De Beauvoir, Simone. The ethics of ambiguity, tr. Citadel Press, 1948.

Lave, Jean, and Etienne Wenger. Situated learning: Legitimate peripheral participation. Cambridge university press, 1991.

by Sebastian Benthall at October 29, 2017 05:45 PM

October 28, 2017

Ph.D. student

Subjectivity in design

One of the reasons why French intellectuals have developed their own strange way of talking is that they have implicitly embraced a post-Heideggerian phenomenological stance which deals seriously with the categories of experience of the individual subject. Americans don’t take this sort of thing so seriously because our institutions have been more post-positivist and now, increasingly, computationalist. If post-positivism makes the subject of science the powerful bureaucratic institution able to leverage statistically sound and methodologically responsible survey methodology, computationalism makes the subject of science the data analyst operating a cloud computing platform with data sourced from wherever. These movements are, probably, increasingly alienating to “regular people”, including humanists, who are attracted to phenomenology precisely because they have all the tools for it already.

To the extent that humanists are best informed about what it really means to live in the world, their position must be respected. It is really out of deference to the humble (or, sometimes, splendidly arrogant) representatives of the human subject as such that I have written about existentialism in design, which is really an attempt to ground technical design in what is philosophically “known” about the human condition.

This approach differs from “human centered design” in an important way. Human centered design wisely considers design to be an empirically rigorous task that demands sensitivity to the particular needs of situated users. This is wise and perfectly fine except for one problem: it doesn’t scale. And as we all know, the great and animal impulse of technological progress, especially today, is to develop the one technology that revolutionizes everything for everyone, becoming new essential infrastructure that reveals a new era of mankind. Human centered designers have everything right about design except for the maniacal ambition of it, without which it will never achieve technology’s paramount calling. So we will put it to one side and take a different approach.

The problem is that computationalist infrastructure projects, and by this I’m referring to the Googles, the Facebooks, the Amazons, Tencents, the Ali Babas, etc., are essentially about designing efficient machines and so they ultimately become about objective resource allocation in one sense or another. The needs of the individual subject are not as relevant to the designers of these machines as are the behavioral responses of their users to their user interfaces. What will result in more clicks, more “conversions”? Asking users what they really want on the scale that it would affect actual design is secondary and frivolous when A/B testing can optimize practical outcomes as efficiently as it does.

I do not mean to cast aspersions at these Big Tech companies by describing their operations so baldly. I do not share the critical perspective of many of my colleagues who write as if they have discovered, for the first time, that corporate marketing is hypocritical and that businesses are mercenary. This is just the way things are; what’s more, the engineering accomplishments involved are absolutely impressive and worth celebrating, as is the business management.

What I would like to do is propose that a technology of similar scale can be developed according to general principles that nevertheless make more adept use of what is known about the human condition. Rather than be devoted to cheap proxies of human satisfaction that address his or her objective condition, I’m proposing a service that delivers something tailored to the subjectivity of the user.

by Sebastian Benthall at October 28, 2017 10:56 PM

October 27, 2017

Ph.D. student

Alain Badiou and artificial intelligence

Last week I saw Alain Badiou speak at NYU on “Philosophy between Mathematics and Poetry”, followed by a comment by Alexander Galloway, and then questions fielded from the audience.

It was wonderful to see Badiou speak, as ever since I became acquainted with his work (which was rather recently, Summer of 2016) I have seen it as a very hopeful direction for philosophy. As perhaps implied by the title of his talk, Badiou takes mathematics very seriously, perhaps more seriously than most mathematicians, and this distinguishes him from many other philosophers for whom mathematics is somewhat of an embarrassment. There are few fields more intellectually rarified than mathematics, philosophy, and poetry, and yet somehow Badiou treats each fairly in a way that reflects how broader disciplinary and cultural divisions between the humanities and technical fields may be reconciled. (This connects to some of my work on Philosophy of Computational Social Science.)

I have written a bit recently about existentialism in design only to falter at the actual definition of existentialism. While it would I’m sure be incorrect to describe Badiou as an existentialist, there’s no doubt that he represents the great so-called Continental philosophical tradition, is familiar with Heidegger and Nietzsche, and so on. I see certain substantive resonances between Badiou and other existentialist writers, though I think to make the comparison now would be putting the cart before the horse.

Badiou’s position, in a nutshell, is like this:

Mathematics is a purely demonstrative form of writing and thinking. It communicates by proof, and has a special kind of audience to it. It is a science. In particular it is a science of all the possible forms of multiplicity, which is the same thing as saying that it is the science of all being, or ontology.

Poetry, on the other hand, is not about being but rather about becoming. “Becoming” for Badiou is subjective: the conscious subject encounters something new, experiences a change, sees an unrealized potential. These are events, and perhaps the greatest contribution of Badiou is his formulation and emphasis on the event as a category. In reference to earlier works, the event might be when through Hegelian dialectic a category is sublated. It could also perhaps correspond to when existence overcomes being in de Beauvoir’s ethics (hence the connection to existentialism I’m proposing). Good poetry, in Badiou’s thought, shows how the things we experience can break out of the structures that objectify them, turning the (subjectively perceived) impossible into a new reality.

Poetry is also, perhaps because it is connected to realizing the impossible but perhaps just because it’s nice to listen to (I’m unclear on Badiou’s position on this point), “seductive”, encouraging psychological connections to the speaker (such as transference) whether or not it’s “true”. Classically, poetry meant epic poems and tragic theater. It could be cinema today.

Philosophy has the problem that it has historically tried to be both demonstrative, like mathematics, and seductive, like poetry. It’s this impurity or tension that defines it. Philosophers need to know mathematics because it is ontology, but have to go beyond mathematics because their mission is to create events in subjectively experienced reality, which is historically situated, and therefore not merely a matter of mathematical abstraction. Philosophers are in the business of creating new forms of subjectivity, which is not the same as creating a new form of being.

I’m fine with all this.

Galloway made some comments I’m somewhat skeptical of, though I may not have understood them since he seems to build mostly on Deleuze and Lacan, who are two intellectual sources I’ve never gotten into. But Galloway’s idea is to draw a connection between the “digital”, with all of its connections to computing technology, algorithms, the Internet, etc., and Badiou’s understanding of the mathematical, and to connect the “analog”, which is not discretized like the digital, to poetry. He suggested that Badiou’s sense of mathematics was arithmetic and excluded the geometric.

I take this interpretation of Galloway’s as clever, but incorrect and uncharitable. It’s clever because it co-opts a great thinker’s work into the sociopolitical agenda of trying to bolster the cultural capital of the humanities against the erosion of algorithmic curation and diminution relative to the fortunes of technology industries. This has been the agenda of professional humanists for a long time and it is annoying (to me) but I suppose necessary for the maintenance of the humanities, which are important.

However, I believe the interpretation is incorrect and uncharitable to Badiou because though Badiou’s paradigmatic example of mathematics is set theory, he seems to have a solid enough grasp of Kurt Gödel’s main points to understand that mathematics includes the great variety of axiomatic systems and these, absolutely, indisputably, include geometry and real analysis and all the rest. The fact that logical proof is a discrete process which can be reduced to and from Boolean logic and automated in an electric circuit is, of course, the foundation of the science of computation that we owe to Turing, Church, von Neumann, and others. It’s for these reasons that the potential of computation is so impressive and imposing: it potentially represents all possible forms of being. There are no limits to AI, at least none based on these mathematical foundations.

There were a number of good questions from the audience which led Badiou to clarify his position. The Real is relational, it is for a subject. This distinguishes it from Being, which is never relational (though of course, there are mathematical theories of relations, and this would seem to be a contradiction in Badiou’s thought?) He acknowledges that a difficult question is the part of Being in the real.

Meanwhile, the Subject is always the result of an event.

Physics is a science of the existing form of the real, as opposed to the possible forms. Mathematics describes the possible forms of what exists. So empirical science can discover which mathematical form is the one that exists for us.

Another member of the audience asked about the impossibility of communism, which was on point because Badiou has at times defended communism or argued that the purpose of philosophy is to bring about communism. He made the point that one could not mathematically disprove the possibility of communism.

The real question, I may be so bold as to comment afterwards, is whether communism can exist in our reality. Suppose that economics is like physics in that it is a science of the real as it exists for us. What if economics shows that communism is impossible in our reality?

Though it wasn’t quite made explicitly, here is the subtle point of departure Badiou makes from what is otherwise conventionally unobjectionable. He would argue, I believe, that the purpose of philosophy is to create a new subjective reality where the impossible is made real, and he doesn’t see this process as necessarily bounded by, say, physics in its current manifestation. There is the possibility of a new event, and of seizing that event, through, for example, poetry. This is the article of faith in philosophy, and in poets, that has established them as the last bastion against dehumanization, objectification, reification, and the dangers of technique and technology since at least Heidegger’s Question Concerning Technology.

Which circles us back to the productive question: how would we design a technology that furthers this objective of creating new subjective realities, new events? This is what I’m after.

by Sebastian Benthall at October 27, 2017 09:52 PM

October 26, 2017

Ph.D. student

education and intelligibility

I’ve put my finger on the problem I’ve had with scholarly discourse about intelligibility over the years.

It is so simple, really.

Sometimes, some group of scholars, A, will argue that the work of another group of scholars, B, is unintelligible. Because it is unintelligible, it should not be trusted. Rather, it has to be held accountable to the scholars in A.

Typically, the scholars in B are engaged in some technical science, while the scholars in A are writers.

Scholars in B meanwhile say: well, if you want to understand what we do, then you could always take some courses in it. Here (in the modern day): we’ve made an on-line course which you can take if you want to understand what we do.

The existence of the on-line course or whatever other resources express the knowledge of B tends not to impress those in A. If A is persistent, they will come up with reasons why these resources are insufficient, or why there are barriers to people in A making proper use of those resources.

But ultimately, what A is doing is demanding that B make itself understood. What B is offering is education. And though some people are averse to the idea that some things are just inherently hard to understand, this is a minority opinion that is rarely held by, for example, those who have undergone arduous training in B.

Generally speaking, if everybody were educated in B, then there wouldn’t be so much of a reason for demanding its intelligibility. Education, not intelligibility, seems to be the social outcome we would really like here. Naturally, only people in B will really understand how to educate others in B; this leaves those in A with little to say except to demand, as a stopgap, intelligibility.

But what if the only way for A to truly understand B is for A to be educated by B? Or to educate itself in something essentially equivalent to B?

by Sebastian Benthall at October 26, 2017 09:21 PM

October 21, 2017

Ph.D. student

nothing is intelligible, nothing is legible

I’ve been revisiting comments by Arendt about the unintelligibility of science and comparing them with my own experiences as a scientist. I’ve also found myself working closely with researchers who are involved in the Fairness, Accountability, and Transparency in Machine Learning (FATML, or now more generally FAT*) research field, despite my original resistance to the topic. I see now that my original attitude was a bit curmudgeonly, and that there are good legal reasons to focus on interpretability of some algorithms. I would also continue to argue that the senses of ‘fairness’ and ‘accountability’ at work in this field are essentially bourgeois, though I would love to be proven wrong by a reference to a counterexample if there’s one out there. In any case, it seems impossible for me to escape this topic, even though it is truly one I would like to escape.

I think that what is happening is that scholars more broadly are starting to own up to the unintelligibility of the world in its buzzing, blooming complexity. All these predictions of Arendt and Horkheimer have played out and indeed nobody really understands what’s going on in thought, and as a consequence we now really are turning over our confidence in the self-regulatory nature of society to machines and hoping it doesn’t destroy us.

There’s some research from David Wolpert, who I’m convinced has really had it all figured out for a long time, that a physical system in a universe can’t understand the world that it’s in. My hunch is that the faster the world spins (so to speak), the harder it is for it to understand itself. As the world gets bigger, and faster, and more computationally mediated, the less anybody can really claim to know what’s going on. The exceptions are those who sit on top of significantly sized data sets and can sift through them, and there are only a handful of these.

The upshot of this is that though it may be the case that science is not speaking normal language and so is inaccessible to politics in the normal way (Arendt’s point), people are not speaking normal language relative to each other and so they too are politically inaccessible to each other in the grand scheme of things. Certainly this is true in the narrow “linguistic” communities of academic disciplines, which are fun-house, distorted microcosms of the public sphere. But what I’m pointing to really is the ignorance of even these so-called experts given the immense complexity and individuality of everybody else. Except for macro-scale trends, social scientists can’t sample complex reality with the multi-dimensional penetration they would need to come up with credible predictive claims. They can be “better than nothing”.

What this is doing, in effect, is forcing scholars (and presumably businesses, politicians, and activists as well) to articulate their agendas as mechanisms. This is values in design in a serious way. To articulate values and advocate for them in traditional humanistic ways is still a noble pursuit. But it doesn’t have the same promise of potential revolutionary human impact as it once did. All ideas, more or less, have been expressed already. Communication media are so fragmented, and seemingly irredeemably fragmented, that the man-made disasters of the 20th century that happened when mass media met with mass ignorance and naivete can never happen again. We have inoculated ourselves against that kind of idiocy; there will always be some amount of the virus active in the system, but that is held in equilibrium with the immune system that stops it.

All this means that it really is now time to think about how to build ethical machines, as well as the sociotechnical conditions for sustaining the ones that we like. This is not a topic considered far-fetched science fiction, as it was just a few years ago (!). That these machines act with autonomy much of the time is as true as the fact that they are responding all the time to human inputs (their users, their owners).

by Sebastian Benthall at October 21, 2017 03:33 AM

October 19, 2017

adjunct professor

About FTC PL&P

“Chris Hoofnagle has written the definitive book about the FTC’s involvement in privacy and security. This is a deep, thorough, erudite, clear, and insightful work – one of the very best books on privacy and security.”

Daniel J. Solove, John Marshall Harlan Research Professor of Law, George Washington University, Washington DC

“A landmark work for anyone interested in privacy or consumer protection law.”

Paul M. Schwartz, Jefferson E. Peyser Professor of Law, Berkeley Law School

“This well-written, comprehensive history of the Federal Trade Commission shows once again the primary importance the agency has played in shaping the regulatory environment of the United States. It is essential reading for anyone who deals regularly with the FTC, and is a good primer for those coming in contact with the agency for the first time. Clear, thoughtful and engaging.”

Kirstin Downey, Editor, FTC:WATCH

“A timely and insightful analysis of the FTC as a key actor in protecting information privacy. The historical context provides a solid basis for Hoofnagle’s well-supported policy recommendations.”

Priscilla M. Regan, George Mason University, Virginia

“A welcome perspective on challenges facing a great agency designed to “rein in” the American market.”

Norman I. Silber, Hofstra University, New York

“Hoofnagle masterfully distills and concentrates the major steps in the development of the FTC’s consumer protection authority…This is a serious work of historical scholarship.”

Aaron Burstein, Partner, Wilkinson Barker Knauer LLP

“This book offers a fascinating, informed exploration into the dangers of the Internet and the problems and potentials of the FTC in effectively dealing with them. It is well worth our attention.”

William L. Wilkie, Aloysius and Eleanor Nathe Professor of Marketing Strategy, University of Notre Dame, Indiana

“Chris Hoofnagle has done an enormous public service by writing a comprehensive and critical guide to the Federal Trade Commission’s consumer protection efforts, which started over a century ago in reaction to a changing economy and industrialization […] we could not ask for a better primer than this incisive and informative book.”

Astra Taylor, Author of The People's Platform
“Chris Hoofnagle has put together an impressive, authoritative and useful treatise on the law of consumer privacy in the U.S. and the role of the Federal Trade Commission.  This book is an excellent read for all those interested in consumer privacy, and should prove to be a valuable resource for years to come.”

Dee Pridgen, Professor of Law, University of Wyoming
“This book succeeds as a work of history, a deep analysis of law and institutions, and advocacy for a better regime for the key issue of our times.”
Spencer Weber Waller, Professor of Law, Loyola University Chicago School

…Through his analysis of the role played by the courts, Congress, and the Commission itself, he illustrates the doctrines and dynamics that have contributed to shaping this agency. This makes the book a valuable tool for European privacy experts who wish to better understand the US regulatory approach to privacy protection and understand how political and social forces have affected the powers given to the Commission.
Alessandro Mantelero, Professor of Private Law and of Innovation & International Transactions Law at the Polytechnic University of Turin
…Overall, Chris Hoofnagle’s Federal Trade Commission Privacy Law and Policy is a fascinating read and a treasure trove of useful references for further research.
Bilyana Petkova, Max Weber Fellow, European University Institute

Federal Trade Commission Privacy Law and Policy (FTCPL&P) is my 2016 book on the FTC.  It is really two books. The first part details the agency’s consumer protection history from its founding, and in so doing, it sets the context for the FTC’s powers and how it is apt to apply them. The book has an institutional analysis discussing the internal dynamics that shape agency behavior. It details how the FTC policed advertising with treatments of substantiation, the Chicago School debates, the problem of advertising to children, and the Reagan revolution. The second part of the book explains the FTC’s approach to privacy in different contexts (online privacy, security, financial, children’s, marketing, and international). One thesis of the book is that the FTC has adapted its decades of advertising law cases to the problem of privacy. There are advantages and disadvantages to the advertising law approach, but do understand that if you are a privacy lawyer, you are really an advertising law lawyer 🙂

FTCPL&P has been reviewed in the Journal of Economic Literature, the ABA Antitrust Source, the European Data Protection Law Review, World Competition, and the International Journal of Constitutional Law.

by web at October 19, 2017 07:04 PM

October 18, 2017

Ph.D. student

“To be great is to be misunderstood.”

A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. With consistency a great soul has simply nothing to do. He may as well concern himself with his shadow on the wall. Speak what you think now in hard words, and to-morrow speak what to-morrow thinks in hard words again, though it contradict every thing you said to-day. — `Ah, so you shall be sure to be misunderstood.’ — Is it so bad, then, to be misunderstood? Pythagoras was misunderstood, and Socrates, and Jesus, and Luther, and Copernicus, and Galileo, and Newton, and every pure and wise spirit that ever took flesh. To be great is to be misunderstood. –
Emerson, Self-Reliance

Lately in my serious scientific work again I’ve found myself bumping up against the limits of intelligibility. This time, it is intelligibility from within a technical community: one group of scientists who are, I’ve been advised, unfamiliar with another, different technical formalism. As a new entrant, I believe the latter would be useful to understand the domain of the former. But to do this, especially in the context of funders (who need to explain things to their own bosses in very concrete terms), would be unproductive, a waste of precious time.

Reminded by recent traffic of some notes I wrote long ago in frustration at Hannah Arendt, I found something apt about her comments. Science in the mode of what Kuhn calls “normal science” must be intelligible to itself and its benefactors. But that is all. It need not be generally intelligible to other scientists; it need not understand other scientists. It need only be a specialized and self-sustaining practice, a discipline.

Programming (which I still study) is actually quite different from science in this respect. Because software code is a medium used for communication by programmers, and software code is foremost interpreted by a compiler, one relates as a programmer to other programmers differently than the way scientists relate to other scientists. To some extent the productive formal work has moved over into software, leaving science to be less formal and more empirical. This is, in my anecdotal experience, now true even in the fields of computer science, which were once one of the bastions of formalism.

Arendt’s criticism of scientists, that they should be politically distrusted because “they move in a world where speech has lost its power”, is therefore not precisely true because scientific operations are, certainly, mediated by language.

But this is normal science. Perhaps the scientists who Arendt distrusted politically were not normal scientists, but rather those sorts of scientists that were responsible for scientific revolutions. These scientists must not have used language that was readily understood by their peers, at least initially, because they were creating new concepts, new ideas.

Perhaps these kinds of scientists are better served by existentialism, as in Nietzsche’s brand, as an alternative to politics. Or by Emerson’s transcendentalism, which Sloterdijk sees as very spiritually kindred to Nietzsche but more balanced.

by Sebastian Benthall at October 18, 2017 03:14 AM

October 17, 2017

Ph.D. student

A quick recap: from political to individual reasoning about ends

So to recap:

Horkheimer warned in Eclipse of Reason that formalized subjective reason that optimizes means was going to eclipse “objective reason” about social harmony, the good life, the “ends” that really matter. Technical efficacy which is capitalism which is AI would expose how objective reason is based in mythology and so society would be senseless and miserable forever.

There was at one point a critical reaction against formal, technical reason that was called the Science Wars in the 90’s, but though it continues to have intellectual successors it is for the most part self-defeating and powerless. Technical reasoning is powerful because it is true, not true because it is powerful.

It remains an open question whether it’s possible to have a society that steers itself according to something like objective reason. One could argue that Habermas’s project of establishing communicative action as a grounds for legitimate pluralistic democracy was an attempt to show the possibility of objective reason after all. This is, for some reason, an unpopular view in the United States, where democracy is often seen as a way of mediating agonistic interests rather than finding common ones.

But Horkheimer’s Frankfurt School is just one particularly depressing and insightful view. Maybe there is some other way to go. For example, one could decide that society has always been disappointing, and that determining one’s true “ends” is an individual, rather than collective, endeavor. Existentialism is one such body of work that posits a substantive moral theory (or at least works at one) that is distrustful of political as opposed to individual solutions.

by Sebastian Benthall at October 17, 2017 03:29 AM

MIMS 2014

Front Page Clues

They say the best place to hide a dead body is on page two of the Google search results. I’d argue that a similar rule applies to reading the news, especially online. If a story is not on the landing page of whatever news site I’m looking at, chances are I’m not gonna find it. All this is to say: news outlets wield considerable power to direct our attention where they want it simply by virtue of how they organize content on their sites.

During presidential elections, the media is often criticized for giving priority to the political horse race between dueling candidates, preoccupying us with pageantry over policy. But to what extent is this true? And if it is true, which specific policy issues suffer during election cycles? Do some suffer more than others? What are we missing out on because we are too busy keeping up with the horse race instead?

If you want to go straight to the answers to these questions (and some other interesting stuff), skip down to the Findings section of the post. For those interested in the technical details, the next two sections are for you.


Lucky for us (or maybe just me), the New York Times generously makes a ton of its data available online for free, easily retrievable via calls to a REST API (specifically, their Archive API). Just a few dozen calls and I was in business. This amazing resource not only has information going back to 1851 (!!), it also includes keywords from each article as part of its metadata. Even better, since 2006, they have ranked the keywords in each article by their importance. This means that for any article that is keyword-ranked, you can easily extract its main topic—whatever person, place, or subject it might be.

Having ranked keywords makes this analysis much easier. For one thing, we don’t have to sift through words from mountains of articles in order to surmise what each article is about using fuzzy or inexact NLP methods. And since they’ve been ranking keywords since 2006, this gives us three presidential elections to include as part of our analysis (2008, 2012, and 2016).

The other crucial dimension included in the NYT article metadata is the print page. Personally, I don’t ever read the NYT on paper anymore (or any newspaper, for that matter—they’re just too unwieldy), so you might argue that the print page is irrelevant. Possibly, but unfortunately we don’t have data about placement on the NYT’s website. And moreover, I would argue that the print page is a good proxy for this. It gets at the essence of what we’re trying to measure, which is the importance NYT editors place on a particular topic over others.
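To make the two pieces of metadata concrete, here is a minimal sketch of pulling the main topic and the front-page flag out of one article record. It is written in Python for illustration only and is not the code used for this analysis. The field names (`keywords`, `rank`, `value`, `print_page`) and the sample record are assumptions about the shape of the metadata, so check them against the actual API response before relying on them.

```python
from typing import Optional


def main_topic(article: dict) -> Optional[str]:
    """Return the value of the top-ranked keyword, or None if nothing is ranked."""
    ranked = [kw for kw in article.get("keywords", []) if kw.get("rank") is not None]
    if not ranked:
        return None
    # Rank 1 is assumed to mark the most important keyword.
    best = min(ranked, key=lambda kw: int(kw["rank"]))
    return best["value"]


def on_front_page(article: dict) -> bool:
    """True when the print page recorded in the metadata is page 1."""
    page = article.get("print_page")
    return page is not None and str(page).strip() == "1"


# Hypothetical record, for illustration only (not real NYT data).
sample = {
    "keywords": [
        {"name": "subject", "value": "Example Topic B", "rank": 2},
        {"name": "subject", "value": "Example Topic A", "rank": 1},
    ],
    "print_page": "1",
}

print(main_topic(sample), on_front_page(sample))
```

Articles published before keyword ranking began would come back as `None` from `main_topic` under these assumptions, which is one way to restrict the sample to ranked articles.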


logit(\pi_t) = \log\left(\frac{\pi_t}{1-\pi_t}\right) = \alpha + \sum_{k=1}^{K} \beta_{k} \cdot Desk_k + \beta \cdot is\_election

A logistic regression model underpins the analysis here. The log-odds that a topic \textit{t} will appear on the front page of the NYT is modeled as a function of the other articles appearing on the front page (the Desk variables, more on those below), as well as a dummy variable indicating whether or not the paper was published during an election cycle.
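As a rough illustration of how the equation turns into a probability, the sketch below applies the inverse logit to the linear predictor. The coefficient values and desk counts are made up purely to show the mechanics; they are not estimates from the model, and the function and variable names are hypothetical.

```python
import math


def front_page_probability(alpha, desk_betas, desk_counts, election_beta, is_election):
    """Inverse logit of alpha + sum_k beta_k * Desk_k + beta * is_election."""
    log_odds = alpha + election_beta * (1 if is_election else 0)
    for desk, beta in desk_betas.items():
        # Desk_k is taken here to be the number of front-page articles from desk k.
        log_odds += beta * desk_counts.get(desk, 0)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up numbers, chosen only to show how the pieces combine.
desk_betas = {"Foreign": -0.2, "National": 0.1}
desk_counts = {"Foreign": 2, "National": 1}

p_off = front_page_probability(-1.0, desk_betas, desk_counts, -0.5, False)
p_on = front_page_probability(-1.0, desk_betas, desk_counts, -0.5, True)
print(round(p_off, 3), round(p_on, 3))
```

With a negative election coefficient, switching the dummy on lowers the predicted probability while the desk counts stay fixed. That is the kind of with-and-without comparison used in the findings.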

Modeling the other articles on the front page is essential since they have obvious influence over whether topic \textit{t} will make the front page on a given day. But in modeling these other articles, a choice is made to abstract from the topics of the articles to the news desk from which they originated. Using the topics themselves unfortunately leads to two problems: sparsity and singularity. Singularity is a problem that arises when your data has too many variables and too few observations. Fortunately, there are statistical methods to overcome this issue, namely penalized regression. Penalized regression is often applied to machine learning problems, but recent developments in statistics have extended the methodology of significance testing to penalized models like ridge regression. This is great since we are actually concerned with interpreting our model rather than just pure prediction, which is the more common aim in machine learning applications.

Ultimately though, penalized methods do not overcome the sparsity problem. Simply put, there are too many other topics that might appear (and appear too infrequently) on the front page to get a good read on the situation. Therefore as an alternative, we aggregate the other articles on the front page according to the news desk they came from (things like Foreign, Style, Arts & Culture, etc). Doing so allows our model to be readily interpretable while retaining information about the kinds of articles that might be crowding out topic \textit{t}.

The  is\_election variable is a dummy variable indicating whether or not the paper was published in an election season. This is determined via a critical threshold illustrated by the red line in the graph below. The same threshold was applied across all three elections.


In some specifications, the is\_election variable might be broken into separate indicators, one for each election. In other specifications, these indicators might be interacted with one or several news desk variables—though only when the interactions add explanatory value to the overall model as determined by an analysis of deviance.

Two other modeling notes. First, for some topics, the model might suffer from quasi or complete separation. This occurs when, for example, all instances of topic \textit{t} appearing on page one occur when there are also fewer than two Sports desk articles appearing on page one. Separation can mess up logistic regression coefficient estimates, but fortunately, a guy named Firth (not Colin, le sigh) came up with a clever workaround, which is known as Firth Regression. In cases where separation is an issue, I switch out the standard logit model for Firth’s alternative. This is easily done using R’s logistf package, and reinforces why I favor R over Python when it comes to doing serious stats.
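For a sense of what separation looks like in the data, here is a small sketch of a check on a single predictor. It is in Python only to keep it dependency-free, and it is a simplification rather than the diagnostic used in this analysis: separation can also involve combinations of predictors, which this does not detect. The function name and toy data are hypothetical.

```python
def separation_status(x, y):
    """Classify one numeric predictor x against a binary outcome y.

    Returns "complete", "quasi", or "none".
    """
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    if not x1 or not x0:
        return "none"  # only one outcome class present
    if min(x) == max(x):
        return "none"  # constant predictor carries no information
    if max(x1) < min(x0) or max(x0) < min(x1):
        return "complete"  # a threshold splits the outcomes perfectly
    if max(x1) <= min(x0) or max(x0) <= min(x1):
        return "quasi"  # perfect split except for ties at the boundary
    return "none"


# Toy data: d = 1 when two or more Sports articles are on page one,
# y = 1 when the topic made page one. The topic never appears when d = 1.
d = [0, 0, 0, 1, 1]
y = [1, 1, 0, 0, 0]
print(separation_status(d, y))
```

In the toy data the topic shows up on page one only when the Sports indicator is zero, so the cell "indicator on, topic on page one" is empty. A standard logit fit would push that coefficient toward negative infinity, which is the situation a penalized fit such as Firth's is meant to handle.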

Second, it should be pointed out that our model does run somewhat afoul of one of the basic assumptions of logistic regression—namely, independence. In regression models, it is regularly assumed that the observations are independent of (i.e. don’t influence) each other. That is probably not true in this case, since news cycles can stretch over the span of several newspaper editions. And whether a story makes front page news is likely influenced by whether it was on the front page the day before.

Model-wise, this is a tough nut to crack since the data is not steadily periodic—as is the case with regular time series data. It might be one, two, or sixty days between appearances of a given topic. In the absence of a completely different approach, I test the robustness of my findings by including an additional variable in my specification—a dummy indicating whether or not topic \textit{t} appeared on the front page the day before.
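The day-before dummy can be built directly from dates, which sidesteps the irregular spacing between appearances. A minimal sketch, assuming one observation per newspaper edition and a list of dates on which the topic made the front page (the function name and example dates are hypothetical):

```python
from datetime import date, timedelta


def lagged_front_page_dummy(edition_dates, topic_dates):
    """For each edition date, 1 if the topic was on the front page the previous day, else 0."""
    on_front = set(topic_dates)
    one_day = timedelta(days=1)
    return {d: int((d - one_day) in on_front) for d in edition_dates}


# Illustrative dates only.
editions = [date(2016, 10, day) for day in range(1, 6)]
topic_on_front = [date(2016, 10, 1), date(2016, 10, 2), date(2016, 10, 5)]

lag = lagged_front_page_dummy(editions, topic_on_front)
for d in editions:
    print(d.isoformat(), lag[d])
```

Because the lookup is by calendar date, a gap of two or sixty days between appearances simply yields zeros for the days in between.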


For this post, I focused on several topics I believe are consistently relevant to national debate, but which I suspected might get less attention during a presidential election cycle. It appears the 2016 election cycle was particularly rough on healthcare coverage. The model finds a statistically significant effect (\beta = -4.519; p = 0.003), which means that for the average newspaper, the probability that a healthcare article made the front page dropped by 60% during the 2016 election season—from a probability of 0.181 to 0.071. This calculation is made by comparing the predicted values with and without the 2016 indicator activated—while holding all other variables fixed at their average levels during the 2016 election season.
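The percentage drop quoted above is a relative change between the two predicted probabilities. As a quick arithmetic check using only the two figures reported in the text (the helper name is hypothetical):

```python
def relative_change(p_before, p_after):
    """Relative change from p_before to p_after, as a fraction of p_before."""
    return (p_after - p_before) / p_before


# Predicted probabilities reported above, without and with the 2016 indicator.
drop = relative_change(0.181, 0.071)
print(f"{drop:.1%}")
```

This works out to a drop of roughly 61% from the rounded probabilities, in line with the approximately 60% figure given.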

Another interesting finding is the significant coefficient (p = 0.045) found on the interaction term between the 2016 election and articles from the NYT’s National desk, which is actually positive (\beta = 0.244).  Given that the National desk is one of the top-five story-generating news desks at the New York Times, you would think that more National stories would come at the expense of just about any other story. And this is indeed the case outside of the 2016 election season, where the probability healthcare will make the front page drops 45% when an additional National article is run on the front page of the average newspaper. During the 2016 election season, however, the probability actually increases by 25%.  These findings were robust to whether or not healthcare was front page news the day before.

The flip on the effect of National coverage here is curious, and raises the question as to why it might be happening. Perhaps NYT editors had periodic misgivings about the adequacy of their National coverage during the 2016 election and decided to make a few big pushes to give more attention to domestic issues including healthcare. Finding the answer requires more digging. In the end though, even if healthcare coverage was buoyed by other National desk articles, it still suffered overall during the 2016 election.

The other topic strongly associated with election cycles is gun control (\beta = -3.822; p = 0.007). Articles about gun control are 33% less likely to be run on the front page of the average newspaper during an election cycle. One thing that occurred to me about gun control however is that it generally receives major coverage boons in the wake of mass shootings. It’s possible that the association here is being driven by a dearth of mass shootings during presidential elections, but I haven’t looked more closely to see whether a drop off in mass shootings during election cycles actually exists.

Surprisingly, coverage about the U.S. economy is not significantly impacted by election cycles, which ran against my expectations. However, coverage about the economy was positively associated with coverage about sports, which raises yet more interesting questions. For example, does our attention naturally turn to sports when the economic going is good?

Unsurprisingly, elections don’t make a difference to coverage about terrorism. However, when covering stories about terrorism in foreign countries, other articles from the Foreign desk significantly influence whether the story will make the front page cut (\beta = -1.268; p = 8.49e-12). Starting with zero Foreign stories on page one, just one other Foreign article will lower the chances that an article about foreign terrorism will appear on page one by 40%. In contrast, no news desk has any systematic influence on whether stories about domestic terrorism make it on to page one.

Finally, while elections don’t make a difference to front page coverage about police brutality and misconduct, interestingly, articles from the NYT Culture desk do. There is a significant and negative effect (\beta = -0.364; p = 0.033), which for the average newspaper, means a roughly 17% drop in the probability that a police misconduct article will make the front page with an additional Culture desk article present. Not to knock the Culture desk or nothing, but this prioritization strikes me as somewhat problematic.

In closing, while I have managed to unearth several insights in this blog post, many more may be surfaced using this rich data source from the New York Times. Even if some of these findings raise questions about how the NYT does its job, it is a testament to the paper as an institution that they are willing to open themselves to meta-analyses like this. Such transparency enables an important critical discussion about the way we consume our news. More informed debate—backed by hard numbers—can hopefully serve the public good in an era when facts in the media are often under attack.


by dgreis at October 17, 2017 01:53 AM

October 16, 2017

Ph.D. student

Notes on Sloterdijk’s “Nietzsche Apostle”

Fascisms, past and future, are politically nothing than insurrections of energy-charged losers, who, for a time of exception, change the rules in order to appear as victors.
— Peter Sloterdijk, Nietzsche Apostle

Speaking of existentialism, today I finished reading Peter Sloterdijk’s Semiotext(e) issue, “Nietzsche Apostle”. A couple of existing reviews can sum it up better than I can. These are just some notes.

Sloterdijk has a clear-headed, modern view of the media and cultural complexes around writing and situates his analysis of Nietzsche within these frames. He argues that Nietzsche created an “immaterial product”, a “brand” of individualism that was a “market maker” because it anticipated what people would crave when they realized they were allowed to want. He does this through a linguistic innovation: blatant self-aggrandizement on a level that had been previously taboo.

One of the most insightful parts of this analysis is Sloterdijk’s understanding of the “eulogistic function” of writing, something about which I have been naive. He’s pointing to the way writing increases its authority by referencing other authorities and borrowing some of their social capital. This was once done, in ancient times, through elaborate praises of kings and ancestors. There have been and continue to be (sub)cultures where references to God or gods or prophets or scriptures give a text authority. In the modern West among the highly educated this is no longer the case. However, in the academy, citations of earlier scholars serve some of this function: citing a classic work still gives scholarship some gravitas, though I’ve noted this seems to be less and less the case all the time. Most academic work these days serves its ‘eulogistic function’ in a much more localized way, by mutually honoring peers within a discipline and the still living and active professors who might have influence over one’s hiring, grants, and/or tenure.

Sloterdijk’s points about the historical significance of Nietzsche are convincing, and he succeeds in building an empathetic case for the controversial and perhaps troubled figure. Sloterdijk also handles most gracefully the dangerous aspects of Nietzsche’s legacy, most notably when in a redacted and revised version his work was coopted by the Nazis. Partly through references to Nietzsche’s text and partly by illustrating the widespread phenomenon of self-serving redactionist uses of hallowed texts (he goes into depth about Jefferson’s bible, for example), he shows that any use of his work to support a movement of nationalist resentment is a blatant misappropriation.

Indeed, Sloterdijk’s discussion of Nietzsche and fascism is prescient for U.S. politics today (I’ve read this volume was based on a lecture in 2000). For Sloterdijk, both far right and far left politics are often “politics of resentment”, which is why it is surprisingly easy for people to switch from one side to the other when the winds and opportunities change. Nietzsche famously denounced “herd morality” as that system of morality that deplores the strong and maintains the moral superiority of the weak. In Nietzsche’s day, this view was represented by Christianity. Today, it is (perhaps) represented by secular political progressivism, though it may just as well be represented by those reactionary movements that feed on resentment towards coastal progressive elites. All these political positions that are based on arguments about who is entitled to what and who isn’t getting their fair share are the same for Sloterdijk’s Nietzsche. They miss the existential point.

Rather, Nietzsche advocates for an individualism that is free to pursue self-enhancement despite social pressures to the contrary. Nietzsche is anti-egalitarian, at least in the sense of not prioritizing equality for its own sake. Rather, he proposes a morality that is libertarian without any need for communal justification through social contract or utilitarian calculus. If there is social equality to be had, it is through the generosity of those who have excelled.

This position is bound to annoy the members of any political movement whose modus operandi is mobilization of resentful solidarity. It is a rejection of that motive and tactic in favor of more joyful and immediate freedom. It may not be universally accessible; it does not brand itself that way. Rather, it’s a lifestyle option for “the great”, and it’s left open who may self-identify as such.

Without judging its validity, it must be noted that it is a different morality than those based on resentment or high-minded egalitarianism.

by Sebastian Benthall at October 16, 2017 01:35 AM

October 10, 2017

MIDS student

Privacy matters of nations..conclusion

Apologies for the delay in posting this piece.


Espionage, or spying, stands in stark contrast with intelligence. Intelligence is essentially the gathering of information, whether public or private in nature, whereas espionage involves obtaining classified information through human or other sources. Stephen Grey, in his wonderfully written masterpiece “The New Spymasters”, calls spies “the best ever liars”.

Espionage, by definition, may violate a number of international treaties concerning human rights as well as civil liberties such as the right to privacy. Many national laws, such as the Espionage Act of 1917/Sedition Act (United States), aim to curb leaks of a nation’s own sensitive information to other nations while remaining silent about that nation collecting information on others. Edward Snowden, the NSA whistleblower, was charged under this act. Recently Germany passed a controversial espionage act that allows its intelligence agency (BND) to spy under ambiguous conditions such as “early interception of dangers”. In fact, no international law has been able to address espionage directly. As a 2007 paper on Espionage and International Law states, “Espionage and international law are not in harmony.” Diplomatic missions, under the protection of diplomatic immunity, are a common way of carrying out such clandestine activities. In fact, espionage is probably the only institutionalised clandestine activity carried out for political or military gains. It needs to be noted that spying occurs not only on rogue nations or known enemy states but also on so-called allies.

As mentioned earlier, I will not be focussing on internal surveillance carried on by governments on their own citizens.

This field of spying was once the domain of government agencies, but the equation is now more fluid and complex with private players like WikiLeaks coming into play.

The question is why espionage has become a necessary part of a nation’s policy. Thomas Fingar, in his 2011 publication, points out that a short answer to this question is “to reduce uncertainty”. Reducing uncertainty involves research and analysis to gain “new knowledge”, i.e. better understanding and new insights derived from information already in one’s possession, or efforts to substantiate or disconfirm a hunch regarding another nation. Historical origins aside, all countries feel that they may be left behind if they do not have inside information about what their enemies and allies are up to at all times. In this sense, it acts as a deterrent along the same lines as the acquisition of nuclear weapons. It also acts as an equaliser between countries of uneven economic might. It also seems to be the only way to get reliable information about “rogue nations” such as North Korea.

From the days of its origin until the very recent past, the intelligence community in the US was focussed on the actions of other nation states (such as the former USSR during the Cold War era). The September 11, 2001 attacks brought attention to non-state entities that seemed to be causing more damage and disruption. Thus the evolution of the source of national threat convinced nations to be more vigilant, i.e. to increase spying.

However, it is important to understand that more espionage does not imply a more secure nation. The field of international espionage is rife with examples of failures. One such case was the invasion of Iraq by the USA on the basis of the Weapons of Mass Destruction National Intelligence Estimate produced in 2002. Chemical weapons analysts mistook a fire truck for a “special” truck for the transfer of munitions. This was a very costly mistake in both financial and human terms.

It is easy to see that the issues in the field of espionage, especially given its pervasiveness in today’s technologically connected world, are not a simple matter of black or white but consist of all possible hues of grey.

Ethical Spy

The classification of moral and immoral acts in the field of spying is difficult to achieve due to the lack of clarity in the laws. As an example, the National Intelligence Strategy of the United States of America says very little about the ethical code to be followed by agents in the field. It states that the members of the Intelligence Community (IC) need to uphold the “Principles of Professional Ethics for the Intelligence Community”, which include respect for civil liberties and integrity. These seem to be applicable only in relation to domestic matters and seem contrary to the job requirements of, for example, a spy placed in a foreign country by the United States. More details can be found here. It is an impossible task to list all possible scenarios that a spy may face when on duty, especially in relation to a foreign nation. However, it is imperative to acknowledge that, in practice, the guiding principles do not remain the same in the two situations. Once this step is taken, drawing up basic guidelines for conduct will be an easier task. Such a moral framework is essential to contain the harm done to international relationships by an agent’s on-field actions based on his/her own moral compass. Author Bruce Schneier, in his book “Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World” (2015), suggests that the NSA, for example, should in fact be split into two divisions, one focussed on surveillance and the other on espionage, to clearly demarcate guidelines and duties.

Such a framework can be built by leaning on principles that are used in related fields such as competitive intelligence. The Society for Competitive Intelligence lists a detailed guideline and code of conduct to differentiate competitive intelligence from corporate espionage. The laws for corporate espionage are not very well developed and hence companies need to fall back on their own codes of ethics. As the scope of corporate espionage and its effects are much smaller than those of national espionage activities, it is imperative that such guidelines for national espionage be built with a sense of urgency. However, as with any other global initiative, this will succeed only if ratified by all nations, a monumental if not impossible task.

Reuse & Recycle?

Is it possible to reuse the privacy laws and frameworks that protect an individual’s privacy to protect the privacy of a nation against peeping toms? It would be useful to see if the current privacy definitions and laws work well for aggregated levels beyond an individual.

Extensive work has been done to provide a framework for the definition and protection of an individual’s privacy. In fact, the laws have become very specialized and focussed, e.g. the law for the protection of children’s privacy (Children’s Online Privacy Protection Act of 1998). These laws cover a wide range of fields of application such as healthcare, social media and finance, among others.

As the guiding concerns and effects are the same, these principles can be easily extended to higher aggregates such as a family unit. For example, the concerns raised in Google’s “Wi-Fi Sniffing Debacle” were linked to the tracking of the wi-fi payload of various homes as the Street View cars were being driven around. The payload was linked to the computer and not necessarily to an individual. The Federal Communications Commission made references to the federal Electronic Communications Privacy Act (ECPA) in its report. Similar concerns were raised elsewhere in the world in relation to this unconsented collection of data. Another incident which highlighted the need to address family-level privacy was the famous HeLa genome study. Henrietta Lacks was a woman from Baltimore suffering from cervical cancer. Her cells were taken in 1951 without her consent. Scientists have since been studying her genome sequences to solve some challenging medical concerns. By publishing the genome sequence of her cells, the scientists had inadvertently advertised this private aspect of everyone connected to Henrietta by genes, i.e. her family. The study had to be taken down when it became clear that the family’s consent had not been sought. These cases highlight the fact that the guidelines that protect individuals can also be used as a guiding principle in the context of families as a unit.

As a next level of aggregation, we look at society as a unit. Society, as a concept, can be quite ambiguous. We assume that any group of people bound together by a common thread, such as residents of a given neighbourhood or consumers of a certain product, can be thought of as belonging to a society. For example, in the case of the website Ashley Madison’s data breach, the whole user group’s privacy (or in this case, secrecy) was at stake. Hackers had threatened to release private information of many of its users unless the website was shut down. While this was related to the personally identifiable information of each individual, the issue escalated drastically as it affected a majority of the 36 million users of the website. The Privacy Commissioner of Canada stated that the Toronto-based company had in fact breached many privacy laws in Canada and elsewhere. Thus, any privacy violation that is not specific to one particular individual but to a much larger group of which the individual is a member is also looked at through the lens of the same privacy laws. There are many other instances of “us vs the nosy corporates” that have been spoken about recently. For example, due to the privacy setup and the inherent nature of the product, the location of all users of Foursquare can be tracked in real time. Additionally, the concepts of society and privacy are quite intertwined, as pointed out by sociologist Barrington Moore (1984): “the need for privacy is a socially created need. Without society there would be no need for privacy.” As an interesting observation, Dan Solove states, “Society is fraught with conflict and friction. Individuals, institutions, and governments can all engage in activities that have problematic effects on the lives of others.”

Let us now turn our attention to the next higher level of aggregation: nations. The scale of impact of any privacy violation is enormous, as it affects not just the nation’s population but, depending on the nature of the violation, also its allies and enemy states, and can eventually have a global impact (e.g. the Weapons of Mass Destruction “discovery” in Iraq). Additional complications arise from the fact that the privacy of a nation affects its economic development, defence strategies and regional power balances, and has other world-wide impacts. Thus, while we can take inspiration from existing frameworks, the scale of impact makes it imperative to modify them dramatically.


As Sun Microsystems’ CEO Scott McNealy said in 1999, “You have zero privacy anyway. Get over it.” It seems that the nations of the world have accepted this as a reality and are in a race to outdo each other. A new challenge in this arena comes from non-state entities, such as terrorist organizations and private players like WikiLeaks, which are forcing a rethink of the level of cooperation required between nations against such “outsiders”. Contrary to the popular notion fostered by spy thrillers in mainstream cinema, intelligence agencies prefer to use publicly available data because of its low risk and low cost. It may be combined with some clandestinely acquired information to improve accuracy. There are only a few cases, such as terrorist activities, where the agencies have to rely exclusively on the latter. More details can be found here.

Taking a cue from Raab and Wright’s four-level Privacy Impact Assessment (PIA) from their 2012 work, the inclusion of a PIA within the relevant intelligence organization will influence its culture, structure and behaviour, helping make the necessity of an espionage operation more palpable, even though the PIA cannot act as a panacea for all types of privacy violations. A PIA in its usual form can only assess the impact that a given functionality has on an individual’s privacy. However, for a measure with widespread potential impact like espionage, the PIA needs to be done at multiple levels. These levels are incrementally built on top of one another. In the suggested approach, the following four levels are a must in order for the PIA to be effective.

Level 1: PIA1. This follows the common wisdom of assessing the impact of a spying operation, at a personal level, on any individual who may have been a subject of it.

Level 2: PIA2. This covers the impacts from PIA1 and additionally covers the effect such an operation will have on the individual’s social and political standing and relationships.

Level 3: PIA3. This includes PIA2 impacts as well as the impact on any groups or categories that the individual may belong to, for example a “vulnerable population” such as children and adults who are not in a decision-making capacity at the moment.

Level 4: PIA4. This includes PIA3 impacts and the impact of spying on the working of society and the political system per se.

It is important to note that different spying activities will affect the PIAs in differing forms. It is vital to understand the context before making any recommendations based on the PIA. As expected, the severity of the effects on privacy will differ across the spectrum of espionage activities. However, it is necessary to have this structure in place to provide a common thread of assessment across different agents and departments, bringing uniformity of structure and ease of transfer of information across sister agencies.
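The nesting of the four levels can be sketched as a small data structure. This is only an illustration under my own assumptions: the names `PIALevel`, `ADDED_SCOPE` and `cumulative_scope` are hypothetical, and the scope strings paraphrase the level descriptions given in this post rather than quoting Raab and Wright.

```python
from enum import IntEnum
from typing import List


class PIALevel(IntEnum):
    """The four assessment levels, ordered so each includes those below it."""
    PIA1 = 1
    PIA2 = 2
    PIA3 = 3
    PIA4 = 4


# What each level adds on top of the previous one (paraphrased from the post).
ADDED_SCOPE = {
    PIALevel.PIA1: "personal impact on the individual subject",
    PIALevel.PIA2: "the individual's social and political standing and relationships",
    PIALevel.PIA3: "groups or categories the individual belongs to",
    PIALevel.PIA4: "the working of society and the political system",
}


def cumulative_scope(level: PIALevel) -> List[str]:
    """Everything an assessment at this level must cover, lowest level first."""
    return [ADDED_SCOPE[lower] for lower in PIALevel if lower <= level]


for item in cumulative_scope(PIALevel.PIA3):
    print(item)
```

Modelling the levels as an ordered enumeration makes the "built on top of one another" property explicit: an assessment at PIA3 cannot skip what PIA1 and PIA2 require.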

An imbalance of economic power between countries also brings in an additional level of fear, i.e. the fear of having no bargaining power in any bilateral or multilateral disputes. This fear can cause countries to behave irrationally, and hence it is imperative for the economically advanced nations to be more active in region-based or commerce-based organisations. This will help ease concerns for the smaller nations in the group. Additionally, more efforts are being made to establish regional cooperation on intelligence exercises, especially for battling issues like terrorism.


by arvinsahni at October 10, 2017 10:52 AM

October 06, 2017

Ph.D. student

Existentialism in Design: Comparison with “Friendly AI” research

Turing Test [xkcd]

I made a few references to Friendly AI research in my last post on Existentialism in Design. I positioned existentialism as an ethical perspective that contrasts with the perspective taken by the Friendly AI research community, among others. This prompted a response by a pseudonymous commenter (in a sadly condescending way, I must say) who linked me to a post, “Complexity of Value”, on what I suppose you might call the elite rationalist forum Arbital. I’ll take this as an invitation to elaborate on how I think existentialism offers an alternative to the Friendly AI perspective of ethics in technology, and particularly the ethics of artificial intelligence.

The first and most significant point of departure between my work on this subject and Friendly AI research is that I emphatically don’t believe the most productive way to approach the problem of ethics in AI is to consider the problem of how to program a benign Superintelligence. This is for reasons I’ve written up in “Don’t Fear the Reaper: Refuting Bostrom’s Superintelligence Argument”, which sums up arguments made in several blog posts about Nick Bostrom’s book on the subject. This post goes beyond the argument in the paper to address further objections I’ve heard from Friendly AI and X-risk enthusiasts.

What superintelligence gives researchers is a simplified problem. Rather than deal with many of the inconvenient contingencies of humanity’s technically mediated existence, superintelligence makes these irrelevant in comparison to the limiting case where technology not only mediates, but dominates. The question asked by Friendly AI researchers is how an omnipotent computer should be programmed so that it creates a utopia and not a dystopia. It is precisely because the computer is omnipotent that it is capable of producing a utopia and is in danger of creating a dystopia.

If you don’t think superintelligences are likely (perhaps because you think there are limits to the ability of algorithms to improve themselves autonomously), then you get a world that looks a lot more like the one we have now. In our world, artificial intelligence has been incrementally advancing for maybe a century now, starting with the foundations of computing in mathematical logic and electrical engineering. It proceeds through theoretical and engineering advances in fits and starts, often through the application of technology to solve particular problems, such as natural language processing, robotic control, and recommendation systems. This is the world of “weak AI”, as opposed to “strong AI”.

It is also a world where AI is not the great source of human bounty or human disaster. Rather, it is a form of economic capital with disparate effects throughout the total population of humanity. It can be a source of inspiring serendipity, banal frustration, and humor.

Let me be more specific, using the post that I was linked to. In it, Eliezer Yudkowsky posits that a (presumably superintelligent) AI will be directed to achieve something, which he calls “value”. The post outlines a “Complexity of Value” thesis. Roughly, this means that the things that we want AI to do cannot be easily compressed into a brief description. For an AI to not be very bad, it will need to either contain a lot of information about what people really want (more than can be easily described) or collect that information as it runs.

That sounds reasonable to me. There’s plenty of good reasons to think that even a single person’s valuations are complex, hard to articulate, and contingent on their circumstances. The values appropriate for a world dominating supercomputer could well be at least as complex.

But so what? Yudkowsky argues that this thesis, if true, has implications for other theoretical issues in superintelligence theory. But does it address any practical questions of artificial intelligence problem solving or design? That it is difficult to mathematically specify all of values or normativity, and that to attempt to do so one would need to have a lot of data about humanity in its particularity, is a point that has been apparent to ethical philosophy for a long time. It’s a surprise or perhaps disappointment only to those who must mathematize everything. Articulating this point in terms of Kolmogorov complexity does not particularly add to the insight so much as translate it into an idiom used by particular researchers.
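To make the idiom concrete for readers who don't use it: Kolmogorov complexity is the length of the shortest program that outputs a given string, and it is uncomputable in general. A rough sketch, assuming we are happy with a crude stand-in, is to use the size of a compressed encoding as an upper bound. The function name and the two example byte strings below are my own illustrative choices and have nothing to do with actual human values.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data: a crude upper bound on complexity."""
    return len(zlib.compress(data, 9))


# A highly regular string: describable as "repeat 'ab' 500 times".
regular = b"ab" * 500

# A pseudo-random string of the same length (fixed seed for repeatability).
rng = random.Random(0)
irregular = bytes(rng.getrandbits(8) for _ in range(1000))

print(len(regular), compressed_size(regular))
print(len(irregular), compressed_size(irregular))
```

The regular string shrinks to a small fraction of its length while the pseudo-random one does not shrink at all. The thesis amounts to saying that a full specification of what people want sits much closer to the second case than the first.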

Where am I departing from this with “Existentialism in Design”?

Rather than treat “value” as a wholly abstract metasyntactic variable representing the goals of a superintelligent, omniscient machine, I’m approaching the problem more practically. First, I’m limiting myself to big sociotechnical complexes wherein a large number of people have some portion of their interactions mediated by digital networks and data centers and, why not, smartphones and even the imminent dystopia of IoT devices. This may be setting my work up for obsolescence, but it also grounds the work in potential action. Since these practical problems rely on much of the same mathematical apparatus as the more far-reaching problems, there is a chance that a fundamental theorem may arise from even this applied work.

That restriction on hardware may seem banal; but it’s a particular philosophical question that I am interested in. The motivation for considering existentialist ethics in particular is that it suggests new kinds of problems that are relevant to ethics but which have not been considered carefully or solved.

As I outlined in a previous post, many ethical positions are framed either in terms of consequentialism, evaluating the utility of a variety of outcomes, or deontology, concerned with the consistency of behavior with more or less objectively construed duties. Consequentialism is attractive to superintelligence theorists because they imagine their AIs to have the ability to cause any consequence. The critical question is how to give it a specification that leads to the best or adequate consequences for humanity. This is a hard problem, under their assumptions.

Deontology is, as far as I can tell, less interesting to superintelligence theorists. This may be because deontology tends to be an ethics of human behavior, and for superintelligence theorists human behavior is rendered virtually insignificant by superintelligent agency. But deontology is attractive as an ethics precisely because it is relevant to people’s actions. It is intended as a way of prescribing duties to a person like you and me.

With Existentialism in Design (a term I may go back and change in all these posts at some point; I’m not sure I love the phrase), I am trying to do something different.

I am trying to propose an agenda for creating a more specific goal function for a limited but still broad-reaching AI, assigning something to its ‘value’ variable, if you will. Because the power of the AI to bring about consequences is limited, its potential for success and failure is also more limited. Catastrophic and utopian outcomes are not particularly relevant; performance can be evaluated in a much more pedestrian way.

Moreover, the valuations internalized by the AI are not to be done in a directly consequentialist way. I have suggested that an AI could be programmed to maximize the meaningfulness of its choices for its users. This is introducing a new variable, one that is more semantically loaded than “value”, though perhaps just as complex and amorphous.

Particular to this variable, “meaningfulness”, is that it is a feature of the subjective experience of the user, or human interacting with the system. It is only secondarily or derivatively an objective state of the world that can be evaluated for utility. To unpack it into a technical specification, we will require a model (perhaps a provisional one) of the human condition and what makes life meaningful. This very well may include such things as autonomy, or the ability to make one’s own choices.

I can anticipate some objections along the lines that what I am proposing still looks like a special case of more general AI ethics research. Is what I’m proposing really fundamentally any different than a consequentialist approach?

I will punt on this for now. I’m not sure of the answer, to be honest. I could see it going one of two different ways.

The first is that yes, what I’m proposing can be thought of as a narrow special case of a more broadly consequentialist approach to AI design. However, I would argue that the specificity matters because of the potency of existentialist moral theory. The project of specifying the latter as a kind of utility function suitable for programming into an AI is in itself a difficult and interesting problem without it necessarily overturning the foundations of AI theory itself. It is worth pursuing at the very least as an exercise and beyond that as an ethical intervention.

The second case is that there may be something particular about existentialism that makes encoding it different from encoding a consequentialist utility function. I suspect, but leave to be shown, that this is the case. Why? Because existentialism (which I haven’t yet gone into much detail describing) is largely a philosophy about how we (individually, as beings thrown into existence) come to have values in the first place and what we do when those values or the absurdity of circumstances lead us to despair. Existentialism is really a kind of phenomenological metaethics in its own right, one that is quite fluid and resists encapsulation in a utility calculus. Most existentialists would argue that at the point where one externalizes one’s values as a utility function as opposed to living as them and through them, one has lost something precious. The kinds of things that existentialism derives ethical imperatives from, such as the relationship between one’s facticity and transcendence, or one’s will to grow in one’s potential and the inevitability of death, are not the kinds of things a (limited, realistic) AI can have much effect on. They are part of what has been perhaps quaintly called the human condition.

To even try to describe this research problem, one has to shift linguistic registers. The existentialist and AI research traditions developed in very divergent contexts. This is one reason to believe that their ideas are new to each other, and that a synthesis may be productive. In order to accomplish this, one needs a charitably considered, working understanding of existentialism. I will try to provide one in my next post in this series.

by Sebastian Benthall at October 06, 2017 01:15 PM

October 03, 2017

Ph.D. student

“The Microeconomics of Complex Economies”

I’m dipping into The microeconomics of complex economies: Evolutionary, institutional, neoclassical, and complexity perspectives, by Elsner, Heinrich, and Schwardt, all professors at the University of Bremen.

It is a textbook, as one would teach a class from. It is interesting because it is self-consciously written as a break from neoclassical microeconomics. According to the authors, this break had been a long time coming but the last straw was the 2008 financial crisis. This at last, they claim, showed that neoclassical faith in market equilibrium was leaving something important out.

Meanwhile, “heterodox” economics has been maturing for some time in the economics blogosphere, while complex systems people have been interested in economics since the emergence of the field. What Elsner, Heinrich, and Schwardt appear to be doing with this textbook is providing a template for an undergraduate level course on the subject, legitimizing it as a discipline. They are not alone. They cite Bowles’s Microeconomics as worthy competition.

I have not yet read the chapter of the Elsner, Heinrich, and Schwardt book that covers philosophy of science and its relationship to the validity of economics. At a glance, it looks very well done. But I wanted to note my preliminary opinion on the matter given my recent interest in Shapiro and Varian’s information economics and their claim to be describing ‘laws of economics’ that provide a reliable guide to business strategy.

In brief, I think Shapiro and Varian are right: they do outline laws of economics that provide a reliable guide to business strategy. This is in fact what neoclassical economics is good for.

What neoclassical economics is not always great at is predicting aggregate market behavior in a complex world. It’s not clear if any theory could ever be good at predicting aggregate market behavior in a complex world. It is likely that if there were one, it would be quickly gamed by investors in a way that would render it invalid.

Given vast information asymmetries it seems the best one could hope for is a theory of the market being able to assimilate the available information and respond wisely. This is the Hayekian view, and it’s not mainstream. It suffers the difficulty that it is hard to empirically verify that a market has performed optimally given that no one actor, including the person attempting to verify Hayekian economic claims, has all the information to begin with. Meanwhile, it seems that there is no sound a priori reason to believe this is the case. Epstein and Axtell (1996) have some computational models where they test when agents capable of trade wind up in an equilibrium with market-clearing prices, and in their models this happens under only very particular and unrealistic conditions.

That said, predicting aggregate market outcomes is a vastly different problem than providing strategic advice to businesses. This is the point where academic critiques of neoclassical economics miss the mark. Since phenomena concerning supply and demand, pricing and elasticity, competition and industrial organization, and so on are part of the lived reality of somebody working in industry, formalizations of these aspects of economic life can be tested and propagated by many more kinds of people than the phenomena of total market performance. The latter is actionable only for a very rare class of policy-maker or financier.


Bowles, S. (2009). Microeconomics: behavior, institutions, and evolution. Princeton University Press.

Elsner, W., Heinrich, T., & Schwardt, H. (2014). The microeconomics of complex economies: Evolutionary, institutional, neoclassical, and complexity perspectives. Academic Press.

Epstein, Joshua M., and Robert Axtell. Growing artificial societies: social science from the bottom up. Brookings Institution Press, 1996.

by Sebastian Benthall at October 03, 2017 02:41 PM

September 29, 2017

Center for Technology, Society & Policy

Join CTSP for social impact Un-Pitch Day on October 27th

Are you a local nonprofit or community organization that has a pressing challenge that you think technology might be able to address, but you don’t know where to start?

If so, join us and the UC Berkeley School of Information’s IMSA (Information Management Student Association) for Un-Pitch Day on October 27th from 4 – 7pm, where graduate students will offer their technical expertise to help address your organization’s pressing technology challenges. During the event, we’ll have you introduce your challenge(s) and desired impact, then partner you with grad students in activities to explore your challenge(s) and develop refined questions to push the conversation forward.

You’d then have the opportunity to pitch your challenge(s) with the goal of potentially matching with a student project group to adopt your project. By attending Un-Pitch Day, you would gain a more defined sense of how to address your technology challenge, and, potentially, a team of students interested in working with your org to develop a prototype or a research project to address it.

Our goal is to help School of Information grad students (and other UCB grad students) identify potential projects they can adopt for the 2017-2018 academic year (ending in May). Working in collaboration with your organization, our students can help develop a technology-focused project or conduct technology-related research to aid your organization.

There is also the possibility of qualifying for funding ($2000 per project team member) for technology projects with distinct public interest/public policy goals through the Center for Technology, Society & Policy (funding requires submitting an application to the Center, due in late November). Please note that we cannot guarantee that each project presented at Un-Pitch Day will match with an interested team.

Event Agenda

Friday, October 27th from 4 – 7pm at South Hall on the UC Berkeley campus

Light food & drinks will be provided for registered attendees.

Registration is required for this event; click here to register.

4:00 – 4:45pm Social impact organization introductions and un-pitches of challenges

4:45 – 5:00pm CTSP will present details about public interest project funding opportunities and deadlines.

5:00 – 6:00pm Team up with grad students through “speed dating” activities to break the ice and explore challenge definitions and develop fruitful questions from a range of diverse perspectives.

6:00 – 7:00pm Open house for students and organizations to mingle and connect over potential projects. Appetizers and refreshments provided by CTSP.

by Daniel Griffin at September 29, 2017 05:45 PM

September 27, 2017

Ph.D. student

New article about algorithmic systems in Wikipedia and going ‘beyond opening up the black box’

I'm excited to share a new article, "Beyond opening up the black box: Investigating the role of algorithmic systems in Wikipedian organizational culture" (open access PDF here). It is published in Big Data & Society as part of a special issue on "Algorithms in Culture," edited by Morgan Ames, Jason Oakes, Massimo Mazzotti, Marion Fourcade, and Gretchen Gano. The special issue came out of a fantastic workshop of the same name held last year at UC-Berkeley, where we presented and workshopped our papers, which were all taking some kind of socio-cultural approach to algorithms (broadly defined). This was originally a chapter of my dissertation based on my ethnographic research into Wikipedia, and it has gone through many rounds of revision across a few publications, as I've tried to connect what I see in Wikipedia to broader conversations about the role of highly-automated, data-driven systems across platforms and domains.

I use the case of Wikipedia's unusually open algorithmic systems to rethink the "black box" metaphor, which has become a standard way to think about ethical, social, and political issues around artificial intelligence, machine learning, expert systems, and other automated, data-driven decisionmaking processes. Entire conferences are being held on these topics, like Fairness, Accountability, and Transparency in Machine Learning (FATML) and Governing Algorithms. In much current scholarship and policy advocacy, there is often an assumption that we are after some internal logic embedded into the codebase (or "the algorithm") itself, which has been hidden from us for reasons of corporate or state secrecy. Many times this is indeed the right goal, but scholars are increasingly raising broader and more complex issues around algorithmic systems, such as work from Nick Seaver (PDF), Tarleton Gillespie (PDF), Kate Crawford (link), and Jenna Burrell (link), which I build on in the case of Wikipedia. What happens when the kind of systems that are kept under tight lock-and-key at Google, Facebook, Uber, the NSA, and so on are not just open sourced in Wikipedia, but also typically designed and developed in an open, public process in which developers have to explain their intentions and respond to questions and criticism?

In the article, I discuss these algorithmic systems as being a part of Wikipedia's particular organizational culture, focusing on how becoming and being a Wikipedian involves learning not just traditional cultural norms, but also familiarity with various algorithmic systems that operate across the site. In Wikipedia's unique setting, we see how the questions of algorithmic transparency and accountability subtly shift away from asking if such systems are open to an abstract, aggregate "public." Based on my experiences in Wikipedia, I instead ask: For whom are these systems open, transparent, understandable, interpretable, negotiable, and contestable? And for whom are they as opaque, inexplicable, rigid, bureaucratic, and even invisible as the jargon, rules, routines, relationships, and ideological principles of any large-scale, complex organization? Like all cultures, Wikipedian culture can be quite opaque, hard to navigate, difficult to fully explain, constantly changing, and has implicit biases – even before we consider the role of algorithmic systems. In looking to approaches to understanding culture from the humanities and the interpretive social sciences, we get a different perspective on what it means for algorithmic systems to be open, transparent, accountable, fair, and explainable.

I should say that I'm a huge fan and advocate of work on "opening the black box" in a more traditional information theory approach, which tries to audit and/or reverse engineer how Google search results are ranked, how Facebook news feeds are filtered, how Twitter's trending topics are identified, or similar kinds of systems that are making (or helping make) decisions about setting bail for a criminal trial, who gets a loan, or who is a potential terrorist threat. So many of these systems that make decisions about the public are opaque to the public, protected as trade secrets or for reasons of state security. There is a huge risk that such systems have deeply problematic biases built-in (unintentionally or otherwise), and many people are trying to reverse engineer or otherwise audit such systems, as well as looking at issues like biases in the underlying training data used for machine learning. For more on this topic, definitely look through the proceedings of FATML, read books like Frank Pasquale's The Black Box Society and Cathy O'Neil's Weapons of Math Destruction, and check out the Critical Algorithms Studies reading list.

Yet when I read this kind of work and hear these kinds of conversations, I often feel strangely out of place. I've spent many years investigating the role of highly-automated algorithmic systems in Wikipedia, whose community has strong commitments to openness and transparency. And now I'm in the Berkeley Institute for Data Science, an interdisciplinary academic research institute where open source, open science, and reproducibility are not only core values many people individually hold, but also a major focus area for the institute's work.

So I'm not sure how to make sense of my own position in the "algorithms studies" sub-field when I hear of heroic (and sometimes tragic) efforts to try and pry open corporations and governmental institutions that are increasingly relying on new forms of data-driven, automated decision-making and classification. If anything, I have the opposite problem: in the spaces I tend to spend time in, the sheer amount of code and data I can examine can be so open that it is overwhelming to navigate. There are so many people in academic research and the open source / free culture movements who want a fresh pair of eyes on the work they've done, which often uses many of the same fundamental approaches and technologies that concern us when hidden away by corporations and governments.

Wikipedia has received very little attention from those who focus on issues around algorithmic opacity and interpretability (even less so than scientific research, but that's a different topic). Like almost all the major user-generated content platforms, Wikipedia deeply relies on automated systems for reviewing and moderating the massive number of contributions made to Wikipedia articles every day. Yet almost all of the code and most of the data keeping Wikipedia running is open sourced, including the state-of-the-art machine learning classifiers trained to distinguish good contributions from bad ones (for different definitions of good and bad).

The design, development, deployment, and discussion of such systems generally takes place in public forums, including wikis, mailing lists, chat rooms, code repositories, and issue/bug trackers. And this is not just a one-way mirror into the organization, as volunteers can and do participate in these debates and discussions. In fact, the people who are paid staff at the Wikimedia Foundation tasked with developing and maintaining these systems are often recruiting volunteers to help, since the Foundation is a non-profit that doesn't have the resources that a large company or even a smaller startup has.

From all this, Wikipedia may appear to be the utopia of algorithmic transparency and accountability that many scholars, policymakers, and even some industry practitioners are calling for in other major platforms and institutions. So for those of us who are concerned with black-boxed algorithmic systems, I ask: are open source, open data, and open process the solution to all our problems? Or more constructively, when those artificial constraints of secrecy are not merely removed by some external fiat, but are something that the people designing, developing, and deploying such systems strongly oppose on ideological grounds, what will our next challenge be?

In trying to work through my understanding of this issue, I argue we need to take an expanded micro-sociological view of algorithmic systems as deeply entwined with particular facets of culture. We need to look at algorithmic systems not just in terms of how they make decisions or recommendations by transforming inputs into outputs, but also in terms of how they transform what it means to participate in a particular socio-technical space. Wikipedia is a great place to study that, and many Wikipedia researchers have focused on related topics. For example, newcomers to Wikipedia must learn that in order to properly participate in the community, they have to directly and indirectly interact with various automated systems, such as tagging requests with machine-readable codes so that they are properly circulated to others in the community. And in terms of newcomer socialization, it probably isn't wise to tell someone about how to properly use these machine-readable templates, then send them to the code repository for the bot that parses these templates to assist with the task at hand.

It certainly makes sense that newcomers to a place like Wikipedia have to learn its organizational culture to fully participate. I'm not arguing that these barriers to entry are inherently bad and should be dismantled as a matter of principle. Over time, Wikipedians have developed a specific organizational culture through various norms, jargon, rules, processes, standards, communication platforms beyond the wiki, routinized co-located events, as well as bots, semi-automated tools, browser extensions, dashboards, scripted templates, and code directly built into the platform. This is a serious accomplishment and it is a crucial part of the story about how Wikipedia became one of the most widely consulted sources of knowledge today, rather than the frequently-ridiculed curiosity I remember it being in the early 2000s. And it is an even greater accomplishment that virtually all of this work is done in ways that are, in principle, accessible to the general public.

But what does that openness of code and development mean in practice? Who can meaningfully make use of what even to a long-time Wikipedian like me often feels like an overwhelming amount of openness? My argument isn't that open source, open code, and open process somehow don't make a difference. They clearly do in many different ways, but Wikipedia shows us that we should be asking: when, where, and for whom does openness make more or less of a difference? Openness is not equally distributed, because openness takes certain kinds of work, expertise, self-efficacy, time, and autonomy to properly take advantage of it, as Nate Tkacz has noted with Wikipedia in general. For example, I reference Eszter Hargittai's work on digital divides, in which she argues that just giving access to the Internet isn't enough; we have to also teach people how to use and take advantage of the Internet, and these "second-level digital divides" are often where demographic gaps widen even more.

There is also an analogy here with Jo Freeman's famous piece The Tyranny of Structurelessness, in which she argues that documented, formalized rules and structures can be far more inclusive than informal, unwritten rules and structures. Newcomers can more easily learn what is openly documented and formalized, while it is often only possible to learn the informal, unwritten rules and structures by either having a connection to an insider or accidentally breaking them and being sanctioned. But there is also a problem with the other extreme, when the rules and structures grow so large and complex that they become a bureaucratic labyrinth that is just as hard for the newcomer to learn and navigate.

So for veteran Wikipedians, highly-automated workflows like speedy deletion can be a powerful way to navigate and act within Wikipedia at scale, in a similar way that Wikipedia's dozens of policies make it easy for veterans to speak volumes just by saying that an article is a CSD#A7, for example. For its intended users, it sinks into the background and becomes second nature, like all good infrastructure does. The veteran can also foreground the infrastructure and participate in complex conversations and collective decisions about how these tools should change based on various ideas about how Wikipedia should change – as Wikipedians frequently do. But for the newcomer, the exact same system – which is in principle almost completely open and contestable to anyone who opens up a ticket on Phabricator – can look and feel quite different. And just knowing "how to code" in the abstract isn't enough, as newcomers must learn how code operates in Wikipedia's unique organizational culture, which has many differences from other large-scale open source software projects.

So this article might seem on the surface to be a critique of Wikipedia, but it is more a critique of my wonderful, brilliant, dedicated colleagues who are doing important work to try and open up (or at least look inside) the proprietary algorithmic systems that are playing important roles in major platforms and institutions. Make no mistake: despite my critiques of the information theory metaphor of the black box, their work within this paradigm is crucial, because there can be many serious biases and inequalities that are intentionally or unintentionally embedded in and/or reinforced through such systems.

However, we must also do research in the tradition of the interpretive social sciences to understand the broader cultural dynamics around how people learn, navigate, and interpret algorithmic systems, alongside all of the other cultural phenomena that remain as "black boxed" as the norms, discourses, practices, procedures and ideological principles present in all cultures. I'm not the first one to raise these kinds of concerns, and I also want to highlight work like that of Motahhare Eslami et al (PDF1, PDF2) on people's various "folk theories" of opaque algorithmic systems in social media sites. The case of Wikipedia shows how when such systems are quite open, it is perhaps even more important to understand how these differences make a difference.

by R. Stuart Geiger at September 27, 2017 07:00 AM

September 24, 2017

Ph.D. student

Existentialism in Design: Motivation

There has been a lot of recent work on the ethics of digital technology. This is a broad area of inquiry, but it includes such topics as:

  • The ethics of Internet research, including the Facebook emotional contagion study and the Encore anti-censorship study.
  • Fairness, accountability, and transparency in machine learning.
  • Algorithmic price-gouging.
  • Autonomous car trolley problems.
  • Ethical (Friendly?) AI research? This last one is maybe on the fringe…

If you’ve been reading this blog, you know I’m quite passionate about the intersection of philosophy and technology. I’m especially interested in how ethics can inform the design of digital technology, and how it can’t. My dissertation is exploring this problem in the privacy engineering literature.

I have some dissatisfactions with this field which I don’t expect to make it into my dissertation. One is that the privacy engineering literature and academic “ethics of digital technology” more broadly tend to be heavily informed by the law, in the sense of courts, legislatures, and states. This is motivated by the important consideration that technology, and especially technologists, should in a lot of cases be compliant with the law. As a practical matter, it certainly spares technologists the trouble of getting sued.

However, being compliant with the law is not precisely the same thing as being ethical. There’s a long ethical tradition of civil disobedience (certain non-violent protest activities, for example) which is not strictly speaking legal, though it has certainly had an impact on what is considered legal later on. Meanwhile, the point has been made, but maybe not often enough, that legal language often looks like ethical language but really shouldn’t be interpreted that way. This is a point made by Oliver Wendell Holmes Jr. in his notable essay, “The Path of the Law”.

When the ethics of technology are not being framed in terms of legal requirements, they are often framed in terms of one of two prominent ethical frameworks. One framework is consequentialism: ethics is a matter of maximizing the beneficial consequences and minimizing the harmful consequences of one’s actions. One variation of consequentialist ethics is utilitarianism, which attempts to solve ethical questions by reducing them to a calculus over “utility”, or benefit as it is experienced or accrued by individuals. A lot of economics takes this ethical stance. Another, less quantitative variation of consequentialist ethics is present in the research ethics principle that research should maximize benefits and minimize harms to participants.

The other major ethical framework used in discussions of ethics and technology is deontological ethics. These are ethics that are about rights, duties, and obligations. Justifying deontological ethics can be a little trickier than justifying consequentialist ethics. Frequently this is done by invoking social norms, as in the case of Nissenbaum’s contextual integrity theory. Another variation of a deontological theory of ethics is Habermas’s theory of transcendental pragmatics and legitimate norms developed through communicative action. In the ideal case, these norms become encoded into law, though it is rarely true that laws are ideal.

Consequentialist considerations probably make the world a better place in some aggregate sense. Deontological considerations probably make the world a fairer or at least more socially agreeable place, as in their modern formulations they tend to result from social truces or compromises. I’m quite glad that these frameworks are taken seriously by academic ethicists and by the law.

However, as I’ve said I find these discussions dissatisfying. This is because I find both consequentialist and deontological ethics to be missing something. They both rely on some foundational assumptions that I believe should be questioned in the spirit of true philosophical inquiry. A more thorough questioning of these assumptions, and tentative answers to them, can be found in existentialist philosophy. Existentialism, I would argue, has not had its due impact on contemporary discourse on ethics and technology, and especially on the questions surrounding ethical technical design. This is a situation I intend to one day remedy. Though Zach Weinersmith has already made a fantastic start:

“Self Driving Car Ethics”, by Weinersmith

SMBC: Autonomous vehicle ethics

What kinds of issues would be raised by existentialism in design? Let me try out a few examples of points made in contemporary ethics of technology discourse and a preliminary existentialist response to them.

Ethical Charge: A superintelligent artificial intelligence could, if improperly designed, result in the destruction or impairment of all human life. This catastrophic risk must be avoided. (Bostrom, 2014)
Existentialist Response: We are all going to die anyway. There is no catastrophic risk; there is only catastrophic certainty. We cannot make an artificial intelligence that prevents this outcome. We must instead design artificial intelligence that makes life meaningful despite its finitude.

Ethical Charge: Internet experiments must not direct the browsers of unwitting people to test the URLs of politically sensitive websites. Doing this may lead to those people being harmed for being accidentally associated with the sensitive material. Researchers should not harm people with their experiments. (Narayanan and Zevenbergen, 2015)
Existentialist Response: To be held responsible by a state’s criminal justice system for the actions taken by one’s browser, controlled remotely from America, is absurd. This absurdity, which pervades all life, is the real problem, not the suffering potentially caused by the experiment (because suffering in some form is inevitable, whether it is from painful circumstance or from ennui.) What’s most important is the exposure of this absurdity and the potential liberation from false moralistic dogmas that limit human potential.

Ethical Charge: Use of Big Data to sort individual people, for example in the case of algorithms used to choose among applicants for a job, may result in discrimination against historically disadvantaged and vulnerable groups. Care must be taken to tailor machine learning algorithms to adjust for the political protection of certain classes of people. (Barocas and Selbst, 2016)
Existentialist Response: The egalitarian tendency in ethics which demands that the greatest should invest themselves in the well-being of the weakest is a kind of herd morality, motivated mainly by ressentiment of the disadvantaged who blame the powerful for their frustrations. This form of ethics, which is based on base emotions like pity and envy, is life-negating because it denies the most essential impulse of life: to overcome resistance and to become great. Rather than restrict Big Data’s ability to identify and augment greatness, it should be encouraged. The weak must be supported out of a spirit of generosity from the powerful, not from a curtailment of power.

As a first cut at existentialism’s response to ethical concerns about technology, it may appear that existentialism is more permissive about the use and design of technology than consequentialism and deontology. It is possible that this conclusion will be robust to further investigation. There is a sense in which existentialism may be the most natural philosophical stance for the technologist because a major theme in existentialist thought is the freedom to choose one’s values and the importance of overcoming the limitations on one’s power and freedom. I’ve argued before that Simone de Beauvoir, who is perhaps the most clear-minded of the existentialists, has the greatest philosophy of science because it respects this purpose of scientific research. There is a vivacity to existentialism that does not sweat the small stuff and thinks big while at the same time acknowledging that suffering and death are inevitable facts of life.

On the other hand, existentialism is a morally demanding line of inquiry precisely because it does not use either easy metaethical heuristics (such as consequentialism or deontology) or the bald realities of the human condition as a stopgap. It demands that we tackle all the hard questions, sometimes acknowledging that they are unanswerable or answerable only in the negative, and muddle on despite the hardest truths. Its aim is to provide a truer, better morality than the alternatives.

Perhaps this is best illustrated by some questions implied by my earlier “existentialist responses” that address the currently nonexistent field of existentialism in design. These are questions I haven’t yet heard asked by scholars at the intersection of ethics and technology.

  • How could we design an artificial intelligence (or, to make it simpler, a recommendation system) that makes the most meaningful choices for its users?
  • What sort of Internet intervention would be most liberatory for the people affected by it?
  • What technology can best promote generosity from the world’s greatest people as a celebration of power and life?

These are different questions from any that you read about in the news or in the ethical scholarship. I believe they are nevertheless important ones, maybe more important than the ethical questions that are more typically asked. The theoretical frameworks employed by most ethicists make assumptions that obscure what everybody already knows about the distribution of power and its abuses, the inevitability of suffering and death, life’s absurdity and especially the absurdity of moralizing sentiment in the face of the cruelty of reality, and so on. At best, these ethical discussions inform the interpretation and creation of law, but law is not the same as morality, and to confuse the two robs morality of what is perhaps its most essential component, which is that it is grounded meaningfully in the experience of the subject.

In future posts (and, ideally, eventually in a paper derived from those posts), I hope to flesh out more concretely what existentialism in design might look like.


Barocas, S., & Selbst, A. D. (2016). Big data’s disparate impact.

Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. OUP Oxford.

Narayanan, A., & Zevenbergen, B. (2015). No Encore for Encore? Ethical questions for web-based censorship measurement.

Weinersmith, Z. “Self Driving Car Ethics”. Saturday Morning Breakfast Cereal.

by Sebastian Benthall at September 24, 2017 03:19 AM

September 20, 2017

Ph.D. student

Market segments and clusters of privacy concerns

One result from earlier economic analysis is that in the cases where personal information is being used to judge the economic value of an agent (such as when they are going to be hired, or offered a loan), the market is divided between those that would prefer more personal information to flow (because they are highly qualified, or highly credit-worthy), and those that would rather information not flow.
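To make that split concrete, here is a minimal sketch under assumptions of my own, not the model from the earlier analysis. Suppose an employer pays each applicant their true quality when personal information flows, and the pool average when it does not. The function name prefers_disclosure and the quality numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean


def prefers_disclosure(qualities):
    """Toy model (an assumption, not the earlier analysis): with information
    flow an applicant is paid their own quality; without it, everyone is
    paid the pool average. An applicant prefers disclosure exactly when
    their own quality exceeds that average."""
    pooled_wage = mean(qualities)
    return {q: q > pooled_wage for q in qualities}


# Hypothetical applicant qualities on a 0-1 scale.
qualities = [0.2, 0.4, 0.6, 0.9]
segments = prefers_disclosure(qualities)

for quality, wants_flow in segments.items():
    label = "prefers information to flow" if wants_flow else "prefers it not flow"
    print(quality, label)
```

In this toy setup the applicants above the pool average want the information to flow and those below it do not, which is the market division described above. Real hiring and lending markets are of course far messier than a single pooled wage.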

I am naturally concerned about whether this microeconomic modeling has any sort of empirical validity. However, there is some corroborating evidence in the literature on privacy attitudes. Several surveys (see references) have discovered that people’s privacy attitudes cluster into several groups: the “marginally concerned”, the “pragmatists”, and the “privacy fundamentalists”. These groups have, respectively, stronger and stronger views on the restriction of their flow of personal information.

It would be natural to suppose that some of the variation in privacy attitudes has to do with expected outcomes of information flow. I.e., if people are worried that their personal information will make them ineligible for a job, they are more likely to be concerned about this information flowing to potential employers.

I need to dig deeper into the literature to see whether factors like income have been shown to be correlated with privacy attitudes.


Ackerman, M. S., Cranor, L. F., & Reagle, J. (1999, November). Privacy in e-commerce: examining user scenarios and privacy preferences. In Proceedings of the 1st ACM conference on Electronic commerce (pp. 1-8). ACM.

B. Berendt et al., “Privacy in E-Commerce: Stated Preferences versus Actual Behavior,” Comm. ACM, vol. 48, no. 4, pp. 101-106, 2005.

K.B. Sheehan, “Toward a Typology of Internet Users and Online Privacy Concerns,” The Information Soc., vol. 18, no. 1, pp. 21-32, 2002.

by Sebastian Benthall at September 20, 2017 03:50 PM

September 18, 2017

Ph.D. student

Economic costs of context collapse

One motivation for my recent studies on information flow economics is that I’m interested in what the economic costs are when information flows across the boundaries of specific markets.

For example, there is a folk theory of why it’s important to have data protection laws in certain domains, such as health care. The idea is that it’s essential to have health care providers maintain the confidentiality of their patients because if they didn’t then (a) the patients could face harm due to this information getting into the wrong hands, such as those considering them for employment, and (b) this would disincentivize patients from seeking treatment, which causes them other harms.

In general, a good approximation of general expectations of data privacy is that data should not be used for purposes besides those for which the data subjects have consented. Something like this was encoded in the 1973 Fair Information Practices, for example. A more modern take on this from contextual integrity (Nissenbaum, 2004) argues that privacy is maintained when information flows appropriately with respect to the purposes of its context.

A widely acknowledged phenomenon in social media, context collapse (Marwick and boyd, 2011; Davis and Jurgenson, 2014), is when multiple social contexts in which a person is involved begin to interfere with each other because members of those contexts use the same porous information medium. Awkwardness and sometimes worse can ensue. These are some of the major ways the world has become aware of what a problem the Internet is for privacy.

I’d like to propose that an economic version of context collapse happens when different markets interfere with each other through network-enabled information flow. The bogeyman of Big Brother through Big Data, the company or government that has managed to collect data about everything about you in order to infer everything else about you, has as much to do with the ways information is being used in cross-purposed ways as it has to do with the quantity or scope of data collection.

It would be nice to get a more formal grip on the problem. Since we’ve already used it as an example, let’s try to model the case where health information is disclosed (or not) to a potential employer. We already have the building blocks for this case in our model of expertise markets and our model of labor markets.

There are now two uncertain variables of interest. First, let’s consider a variety of health treatments J such that m = \vert J \vert. Health conditions are distributed in society such that the utility of a random person i receiving a treatment j is w_{i,j}. Utility from one treatment is not independent of utility from another. So in general \vec{w}_i \sim W, meaning a person’s utilities for all treatments are sampled jointly from an underlying distribution W.

There is also the uncertain variable of how effective somebody will be at a job they are interested in. We’ll say this is distributed according to X, and that a person’s aptitude for the job is x_i \sim X.

We will also say that W and X are not independent from each other. In this model, there are certain health conditions that are disabling with respect to a job, and this has an effect on expected performance.

I must note here that I am not taking any position on whether or not employers should take disabilities into account when hiring people. I don’t even know for sure the consequences of this model yet. You could imagine this scenario taking place in a country which does not have the Americans with Disabilities Act and other legislation that affects situations like this.

As per the models that we are drawing from, let’s suppose that normal people don’t know how much they will benefit from different medical treatments; i doesn’t know \vec{w}_i. They may or may not know x_i (I don’t yet know if this matters). What i does know is their symptoms, y_i \sim Y.

Let’s say person i goes to the doctor, reporting y_i, on the expectation that the doctor will prescribe them the treatment \hat{j} that maximizes their expected welfare:

\hat j = arg \max_{j \in J} E[w_{i,j} \vert y_i]

Now comes the tricky part. Let’s say the doctor is corrupt and willing to sell the medical records of her patients to her patients’ potential employers. By assumption y_i reveals information both about \vec{w}_i and x_i. We know from our earlier study that information about x_i is indeed valuable to the employer. There must be some price (at least within our neoclassical framework) that the employer is willing to pay the corrupt doctor for information about patient symptoms.

We also know that having potential employers know more about your aptitudes is good for highly qualified applicants and bad for not as qualified applicants. The more information employers know about you, the more likely they will be able to tell if you are worth hiring.

The upshot is that there may be some patients who are more than happy to have their medical records sold off to their potential employers because those particular symptoms are correlated with high job performance. These will be attracted to systems that share their information across medical and employment purposes.

But for those with symptoms correlated with lower job performance, there is now a trickier decision. If doctors are corrupt, these patients may choose not to reveal their symptoms accurately (or at all), because this information might hurt their chances of employment.

A few more wrinkles here. Suppose it’s true that fewer people will go to corrupt doctors because they suspect or know that information will leak to their employers. If there are people who suspect or know that the information that leaks to their employers will reflect on them favorably, that creates a selection effect on who goes to the doctor. This means that the information that i has gone to the doctor, or not, is a signal employers can use to discriminate between potential applicants. So to some extent the harms of the corrupt doctors fall on the less able even if they opt out of health care. They can’t opt out entirely of the secondary information effects.
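
The selection effect can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not part of the model above: aptitude x is uniform on [0, 1], each patient knows their own x, symptoms fully reveal it, and patients avoid the doctor whenever x falls below a hypothetical cutoff called threshold.

```python
import random


def mean_aptitude_by_visit(threshold, n_patients=100_000, seed=0):
    """Average aptitude of patients who visit the doctor vs. those who stay away.

    Assumption (illustrative only): a patient visits iff their aptitude x
    exceeds `threshold`, with 0 < threshold < 1.
    """
    rng = random.Random(seed)
    visited, stayed_away = [], []
    for _ in range(n_patients):
        x = rng.random()  # aptitude x ~ Uniform[0, 1]
        if x > threshold:
            visited.append(x)
        else:
            stayed_away.append(x)
    return sum(visited) / len(visited), sum(stayed_away) / len(stayed_away)


if __name__ == "__main__":
    mean_visited, mean_away = mean_aptitude_by_visit(threshold=0.4)
    print(f"mean aptitude, visited doctor: {mean_visited:.3f}")
    print(f"mean aptitude, stayed away:    {mean_away:.3f}")
```

Under these assumptions an employer who observes only whether someone has seen a doctor already learns something about x, even without buying any records.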

We can also add the possibility that not all doctors are corrupt. Only some are. But if it’s unknown which doctors are corrupt, the possibility of corruption still affects the strategies of patients/employees in a similar way, only now in expectation. Just as in the Akerlof market for lemons, a few corrupt doctors ruin the market.

I have not made these arguments mathematically specific. I leave that to a later date. But for now I’d like to draw some tentative conclusions about what mandating the protection of health information, as in HIPAA, means for the welfare outcomes in this model.

If doctors are prohibited from selling information to employers, then the two markets do not interfere with each other. Doctors can solicit symptoms in a way that optimizes benefits to all patients. Employers can make informed choices about potential candidates through an independent process. The latter will serve to select more promising applicants from less promising applicants.

But if doctors can sell health information to employers, several things change.

  • Employers will benefit from information about employee health and offer to pay doctors for the information.
  • Some doctors will discreetly do so.
  • The possibility of corrupt doctors will scare off those patients who are afraid their symptoms will reveal a lack of job aptitude.
  • These patients no longer receive treatment.
  • This reduces the demand for doctors, shrinking the health care market.
  • The most able will continue to see doctors. If their information is shared with employers, they will be more likely to be hired.
  • Employers may take not having medical records available to be bought from corrupt doctors as a signal that the patient is hiding something that would reveal poor aptitude.

In sum, without data protection laws, there are fewer people receiving beneficial treatment and fewer jobs for doctors providing beneficial treatment. Employers are able to make more advantageous decisions, and the most able employees are able to signal their aptitude through the corrupt health care system. Less able employees may wind up being identified anyway through their non-participation in the medical system. If that’s the case, they may wind up returning to doctors for treatment anyway, though they would need to have a way of paying for it besides employment.

That’s what this model says, anyway. The biggest surprise for me is the implication that data protection laws serve the interests of service providers by expanding their customer base. That is a point that is not made enough! Too often, the need for data protection laws is framed entirely in terms of the interests of the consumer. This is perhaps a politically weaker argument, because consumers are not united in their political interest (some consumers would be helped, not harmed, by weaker data protection).


Akerlof, G. A. (1970). The market for “lemons”: Quality uncertainty and the market mechanism. The quarterly journal of economics, 488-500.

Davis, J. L., & Jurgenson, N. (2014). Context collapse: theorizing context collusions and collisions. Information, Communication & Society, 17(4), 476-485.

Marwick, A. E., & Boyd, D. (2011). I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience. New media & society, 13(1), 114-133.

Nissenbaum, H. (2004). Privacy as contextual integrity. Wash. L. Rev., 79, 119.

by Sebastian Benthall at September 18, 2017 02:08 AM

September 13, 2017

Ph.D. student

Credit scores and information economics

The recent Equifax data breach brings up credit scores and their role in the information economy. Credit scoring is a controversial topic in the algorithmic accountability community. Frank Pasquale, for example, writes about it in The Black Box Society. Most of the critical writing on the subject points to how credit scoring might be done in a discriminatory or privacy-invasive way. As interesting as those critiques are from a political and ethical perspective, it’s worth reviewing what credit scores are for in the first place.

Let’s model this as we have done in other cases of information flow economics.

There’s a variable of interest, the likelihood that a potential borrower will not default on a loan, X. Note that any value x sampled from X will lie within the interval [0,1] because it is a probability.

There’s a decision to be made by a bank: whether or not to provide a random borrower a loan.

To keep things very simple, let’s suppose that the bank gets a payoff of 1 if the borrower is given a loan and does not default and gets a payoff of -1 if the borrower gets the loan and defaults. The borrower gets a payoff of 1 if he gets the loan and 0 otherwise. The bank’s strategy is to avoid giving loans that lead to negative expected payoff. (This is a gross oversimplification of, but is essentially consistent with, the model of credit used by Blöchlinger and Leippold (2006).)

Given a particular x, the expected utility of the bank is:

x (1) + (1 - x) (-1) = 2x - 1

Given the domain of [0,1], this function ranges from -1 to 1, hitting 0 when x = .5.

We can now consider welfare outcomes under conditions of no information flow, total information flow, and partial information flow.

Suppose the bank has no insight into x besides the prior distribution X. Then the expected value to the bank of offering the loan is E[2x - 1]. If it is above zero, the bank will offer the loan and the borrower gets a positive payoff. If it is below zero, the bank will not offer the loan and both the bank and potential borrower will get zero payoff. The outcome depends entirely on the prior probability of loan default and either rewards borrowers or not depending on that distribution.

If the bank has total insight into x, then the outcomes are different. The bank can use the option to reject borrowers for whom x is less than .5, and accept those for whom x is greater than .5. If we see the game as repeated over many borrowers whose chances of paying off their loan are all sampled from X, then the additional knowledge of the bank creates two classes of potential borrowers, one that gets loans and one that does not. This increases inequality among borrowers.

It also increases the utility of the bank. This is perhaps best illustrated with a simple example. Suppose the distribution X is uniform over the unit interval [0,1]. Then the expected value of the bank’s payoff under complete information is

\int_{.5}^{1} (2x - 1) dx = 0.25

which is a significant improvement over the expected payoff of 0 in the uninformed case.
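
As a rough numerical check of the two extreme cases, here is a minimal Monte Carlo sketch. It assumes, as in the example above, that X is uniform on [0,1]; the function names and the sample size are my own illustrative choices.

```python
import random


def bank_payoff(x):
    """Bank's expected payoff from lending to a borrower who repays with probability x."""
    return x * 1 + (1 - x) * (-1)  # equals 2x - 1


def compare_information_regimes(n_borrowers=200_000, seed=0):
    """Per-borrower bank payoff with no information and with total information."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_borrowers)]  # assumption: X ~ Uniform[0, 1]

    # No information: every borrower looks the same, so the bank lends to all
    # of them only if the average payoff E[2x - 1] is positive.
    average_payoff = sum(bank_payoff(x) for x in xs) / n_borrowers
    uninformed = max(average_payoff, 0.0)

    # Total information: the bank lends exactly to borrowers with 2x - 1 > 0.
    informed = sum(max(bank_payoff(x), 0.0) for x in xs) / n_borrowers
    share_funded = sum(1 for x in xs if x > 0.5) / n_borrowers
    return uninformed, informed, share_funded


if __name__ == "__main__":
    uninformed, informed, share_funded = compare_information_regimes()
    print(f"uninformed payoff per borrower: {uninformed:.3f}")
    print(f"informed payoff per borrower:   {informed:.3f}")
    print(f"share of borrowers funded:      {share_funded:.3f}")
```

The informed payoff should come out close to 0.25 and the uninformed payoff close to 0, with about half of the borrowers funded once the bank can see x.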

Putting off an analysis of the partial information case for now, suffice it to say that we expect partial information (such as a credit score) to lead to an intermediate result, improving bank profits and differentiating borrowers with respect to the bank’s choice to loan.
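
To make the intermediate case slightly more concrete, here is a sketch of one possible partial-information setup. The signal is an assumption of mine, not something from the analysis above: the bank sees a single binary indicator y (say, whether one earlier loan was repaid) with P(y = 1 | x) = x, and X is uniform on [0,1].

```python
def partial_information_payoff(grid_size=100_000):
    """Bank's expected payoff per borrower when it observes one binary signal y.

    Assumptions (illustrative): x ~ Uniform[0, 1] and P(y = 1 | x) = x.
    Integrals are approximated with a midpoint grid over [0, 1].
    """
    dx = 1.0 / grid_size
    xs = [(k + 0.5) * dx for k in range(grid_size)]
    total = 0.0
    for y in (0, 1):
        # Joint weight P(y | x) P(x) dx at each grid point.
        weights = [(x if y == 1 else 1.0 - x) * dx for x in xs]
        p_y = sum(weights)
        # Posterior mean E[x | y] by Bayes' rule.
        posterior_mean = sum(x * w for x, w in zip(xs, weights)) / p_y
        expected_payoff = 2.0 * posterior_mean - 1.0
        # The bank lends to this group only if the expected payoff is positive.
        total += p_y * max(expected_payoff, 0.0)
    return total


if __name__ == "__main__":
    print(f"payoff with one binary signal: {partial_information_payoff():.4f}")
```

With this particular signal the bank lends only when y = 1 and earns 1/6 per borrower, which sits between the uninformed payoff of 0 and the fully informed payoff of 0.25.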

What is perhaps most interesting about this analysis is the similarity between it and Posner’s employment market. In both cases, the subject of the variable of interest X is a person’s prospects for improving the welfare of the principal decision-maker upon their being selected, where selection also implies benefit to the subject. Uncertainty about the prospects leads to equal treatment of prospective persons and reduced benefit to the principal. More information leads to differentiated impact to the prospects and benefit to the principal.


Blöchlinger, A., & Leippold, M. (2006). Economic benefit of powerful credit scoring. Journal of Banking & Finance, 30(3), 851-873.

by Sebastian Benthall at September 13, 2017 06:01 PM

September 12, 2017

Ph.D. alumna

Data & Society’s Next Stage

In March 2013, in a flurry of days, I decided to start a research institute. I’d always dreamed of doing so, but it was really my amazing mentor and boss – Jennifer Chayes – who put the fire under my toosh. I’d been driving her crazy about the need to have more people deeply interrogating how data-driven technologies were intersecting with society. Microsoft Research didn’t have the structure to allow me to move fast (and break things). University infrastructure was even slower. There were a few amazing research centers and think tanks, but I wanted to see the efforts scale faster. And I wanted to build the structures to connect research and practices, convene conversations across sectors, and bring together a band of what I loved to call “misfit toys.”  So, with the support of Jennifer and Microsoft, I put pen to paper. And to my surprise, I got the green light to help start a wholly independent research institute.

I knew nothing about building an organization. I had never managed anyone, didn’t know squat about how to put together a budget, and couldn’t even create a check list of to-dos. So I called up people smarter than I to help learn how other organizations worked and figure out what I should learn to turn a crazy idea into reality. At first, I thought that I should just go and find someone to run the organization, but I was consistently told that I needed to do it myself, to prove that it could work. So I did. It was a crazy adventure. Not only did I learn a lot about fundraising, management, and budgeting, but I also learned all sorts of things about topics I didn’t even know I would learn to understand – architecture, human resources, audits, non-profit law. I screwed up plenty of things along the way, but most people were patient with me and helped me learn from my mistakes. I am forever grateful to all of the funders, organizations, practitioners, and researchers who took a chance on me.

Still, over the next four years, I never lost that nagging feeling that someone smarter and more capable than me should be running Data & Society. I felt like I was doing the organization a disservice by not focusing on research strategy and public engagement. So when I turned to the board and said, it’s time for an executive director to take over, everyone agreed. We sat down and mapped out what we needed – a strategic and capable leader who’s passionate about building a healthy and sustainable research organization to be impactful in the world. Luckily, we had hired exactly that person to drive program and strategy a year before when I was concerned that I was flailing at managing the fieldbuilding and outreach part of the organization.

I am overwhelmingly OMG ecstatically bouncing for joy to announce that Janet Haven has agreed to become Data & Society’s first executive director. You can read more about Janet through the formal organizational announcement here. But since this is my blog and I’m telling my story, what I want to say is more personal. I was truly breaking when we hired Janet. I had bitten off more than I could chew. I was hitting rock bottom and trying desperately to put on a strong face to support everyone else. As I see it, Janet came in, took one look at the duct tape upon which I’d built the organization and got to work with steel, concrete, and wood in her hands. She helped me see what could happen if we fixed this and that. And then she started helping me see new pathways for moving forward. Over the last 18 months, I’ve grown increasingly confident that what we’re doing makes sense and that we can build an organization that can last. I’ve also been in awe watching her enable others to shine.

I’m not leaving Data & Society. To the contrary, I’m actually taking on the role that my title – founder and president – signals. And I’m ecstatic. Over the last 4.5 years, I’ve learned what I’m good at and what I’m not, what excites me and what makes me want to stay in bed. I built Data & Society because I believe that it needs to exist in this world. But I also realize that I’m the classic founder – the crazy visionary that can kickstart insanity but who isn’t necessarily the right person to take an organization to the next stage. Lucky for me, Janet is. And together, I can’t wait to take Data & Society to the next level!

by zephoria at September 12, 2017 02:34 PM

September 11, 2017

Ph.D. student

Information flow in economics

We have formalized three different cases of information economics:

What we discovered is that each of these cases has, to some extent, a common form. That form is this:

There is a random variable of interest, x \sim X (that is, a value x sampled from a probability distribution X), that has direct effect on the welfare outcome of decisions made by agents in the economy. In our cases this was the aptitude of job applicants, consumers’ willingness to pay, and the utility of receiving a range of different expert recommendations, respectively.

In the extreme cases, the agent at the focus of the economic model could act with extreme ignorance of x, or extreme knowledge of it. Generally, the agent’s situation improves the more knowledgeable they are about x. The outcomes for the subjects of X vary more widely.

We also considered the possibility that the agent has access to partial information about X through the observation of a different variable y \sim Y. Upon observation of y, they can make their judgments based on an improved subjective expectation of the unknown variable, P(x \vert y). We assumed that the agent was a Bayesian reasoner and so capable of internalizing evidence according to Bayes’ rule, hence they are able to compute:

P(X \vert Y) \propto P(Y \vert X) P(X)

However, this depends on two very important assumptions.

The first is that the agent knows the distribution X. This is the prior in their subjective calculation of the Bayesian update. In our models, we have been perhaps sloppy in assuming that this prior probability corresponds to the true probability distribution from which the value x is drawn. We are somewhat safe in this assumption because for the purposes of determining strategy, only subjective probabilities can be taken into account and we can relax the distribution to encode something close to zero knowledge of the outcome if necessary. In more complex models, the difference between agents with different knowledge of X may be more strategically significant, but we aren’t there yet.

The second important assumption is that the agent knows the likelihood function P(Y | X). This is quite a strong assumption, as it implies that the agent knows truly how Y covaries with X, allowing them to “decode” the message y into useful information about x.

It may be best to think of access and usage of the likelihood function as a rare capability. Indeed, in our model of expertise, the assumption was that the service provider (think doctor) knew more about the relationship between X (appropriate treatment) and Y (observable symptoms) than the consumer (patient) did. In the case of companies that use data science, the idea is that some combination of data and science gives the company an edge over its competitors in knowing the true value of some uncertain property.

What we are discovering is that it’s not just the availability of y that matters, but also the ability to interpret y with respect to the probability of x. Data does not speak for itself.
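
A small discrete example may help show why the likelihood function matters. This is a sketch with made-up numbers: two hypothetical values of x ("high" and "low"), two possible observations y ("good" and "bad"), and two agents who see the same y but hold different beliefs about P(Y \vert X).

```python
def posterior(prior, likelihood, y):
    """Return P(x | y) for every x, given P(x) and P(y | x) as dictionaries."""
    unnormalized = {x: likelihood[x][y] * p for x, p in prior.items()}
    evidence = sum(unnormalized.values())  # P(y)
    return {x: value / evidence for x, value in unnormalized.items()}


# Hypothetical numbers, chosen only for illustration.
prior = {"high": 0.5, "low": 0.5}

# An agent who knows how Y actually covaries with X.
informed_likelihood = {
    "high": {"good": 0.9, "bad": 0.1},
    "low": {"good": 0.2, "bad": 0.8},
}

# An agent who sees the same y but believes Y is unrelated to X.
naive_likelihood = {
    "high": {"good": 0.5, "bad": 0.5},
    "low": {"good": 0.5, "bad": 0.5},
}

if __name__ == "__main__":
    print(posterior(prior, informed_likelihood, "good"))
    print(posterior(prior, naive_likelihood, "good"))
```

Both agents observe y = "good", but only the one with the informative likelihood moves away from the prior (to 9/11 on "high" with these numbers). The other agent's posterior is just the prior again.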

This incidentally ties in with a point which we have perhaps glossed over too quickly in the present discussion, which is: what is information, really? This may seem like a distraction in a discussion about economics but it is a question that’s come up in my own idiosyncratic “disciplinary” formation. One of the best intuitive definitions of information is provided by philosopher Fred Dretske (1981; 1983). I made a presentation of Fred Dretske’s view on information and its relationship to epistemological skepticism and Shannon information theory; you can find this presentation here. But for present purposes I want to call attention to his definition of what it means for a message to carry information, which is:

[A] message carries the information that X is a dingbat, say, if and only if one could learn (come to know) that X is a dingbat from the message.

When I say that one could learn that X was a dingbat from the message, I mean, simply, that the message has whatever reliable connection with dingbats is required to enable a suitably equipped, but otherwise ignorant receiver, to learn from it that X is a dingbat.

This formulation is worth mentioning because it supplies a kind of philosophical validation for our Bayesian formulation of information flow in the economy. We are modeling situations where Y is a signal that is reliably connected with X such that instantiations of Y carry information about the value of X. We might express this in terms of conditional entropy:

H(X|Y) < H(X)

While this is sufficient for Y to carry information about X, it is not sufficient for any observer of Y to consequently know X. An important part of Dretske's definition is that the receiver must be suitably equipped to make the connection.
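
The entropy condition can be checked on a small discrete example. This is a sketch with hypothetical numbers: X takes two values with equal prior probability, and Y is either a noisy signal of X or pure noise.

```python
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a distribution given as {value: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def conditional_entropy(prior, likelihood):
    """H(X | Y) = sum over y of P(y) * H(X | Y = y)."""
    signals = list(next(iter(likelihood.values())).keys())
    total = 0.0
    for y in signals:
        joint = {x: likelihood[x][y] * prior[x] for x in prior}
        p_y = sum(joint.values())
        if p_y == 0:
            continue  # this signal never occurs
        posterior = {x: j / p_y for x, j in joint.items()}
        total += p_y * entropy(posterior)
    return total


# Hypothetical numbers, for illustration only.
prior = {"high": 0.5, "low": 0.5}
informative = {"high": {"good": 0.9, "bad": 0.1}, "low": {"good": 0.2, "bad": 0.8}}
pure_noise = {"high": {"good": 0.5, "bad": 0.5}, "low": {"good": 0.5, "bad": 0.5}}

if __name__ == "__main__":
    print(f"H(X)                  = {entropy(prior):.3f} bits")
    print(f"H(X|Y), informative Y = {conditional_entropy(prior, informative):.3f} bits")
    print(f"H(X|Y), pure noise Y  = {conditional_entropy(prior, pure_noise):.3f} bits")
```

Only the informative signal reduces the entropy of X. Note that the computation itself uses the likelihood table, which is exactly the knowledge a "suitably equipped" receiver needs to have.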

In our models, the “suitably equipped” condition is represented as the ability to compute the Bayesian update using a realistic likelihood function P(Y \vert X). This is a difficult demand. A lot of computational statistics has to do with the difficulty of tractably estimating the likelihood function, let alone computing it perfectly.


Dretske, F. I. (1983). The epistemology of belief. Synthese, 55(1), 3-19.

Dretske, F. (1981). Knowledge and the Flow of Information.

by Sebastian Benthall at September 11, 2017 09:42 PM

Economics of expertise and information services

We have now considered two models of how information affects welfare outcomes.

In the first model, inspired by an argument from Richard Posner, there are many producers (employees, in the specific example, but it could just as well be cars, etc.) and a single consumer. When the consumer knows nothing about the quality of the producers, the consumer gets an average quality producer and the producers split the expected utility of the consumer’s purchase equally. When the consumer is informed, she benefits and so does the highest quality producer, to the detriment of the other producers.

In the second example, inspired by Shapiro and Varian’s discussion of price differentiation in the sale of information goods, there was a single producer and many consumers. When the producer knows nothing about the “quality” of the consumers–their willingness to pay–the producer charges all consumers a profit-maximizing price. This price leaves many customers out of reach of the product, and many others getting a consumer surplus because the product is cheap relative to their demand. When the producer is more informed, they make more profit by selling at personalized prices. This lets the previously unreached customers in on the product at a compellingly low price. It also allows the producer to charge higher prices to willing customers; they capture what was once consumer surplus for themselves.

In both these cases, we have assumed that there is only one kind of good in play. It can vary numerically in quality, which is measured in the same units as cost and utility.

In order to bridge from theory of information goods to theory of information services, we need to take into account a key feature of information services. Consumers buy information when they don’t know what it is they want, exactly. Producers of information services tailor what they provide to the specific needs of the consumers. This is true for information services like search engines but also other forms of expertise like physician’s services, financial advising, and education. It’s notable that these last three domains are subject to data protection laws in the United States (HIPAA, GLBA, and FERPA, respectively), and on-line information services are an area where privacy and data protection are a public concern. By studying the economics of information services and expertise, we may discover what these domains have in common.

Let’s consider just a single consumer and a single producer. The consumer has a utility function \vec{x} \sim X (that is, sampled from random variable X), specifying the values it gets for the consumption of each of m = \vert J \vert products. We’ll denote with x_j the utility awarded to the consumer for the consumption of product j \in J.

The catch is that the consumer does not know X. What they do know is y \sim Y, which is correlated with X in some way that is unknown to them. The consumer tells the producer y, and the producer’s job is to recommend to them the j \in J that will most benefit them. We’ll assume that the producer is interested in maximizing consumer welfare in good faith because, for example, they are trying to promote their professional reputation and this is roughly in proportion to customer satisfaction. (Let’s assume they pass on costs of providing the product to the consumer).

As in the other cases, let’s consider first the case where the acting party has no useful information about the particular customer. In this case, the producer has to choose their recommendation \hat j based on their knowledge of the underlying probability distribution X, i.e.:

\hat j = arg \max_{j \in J} E[X_j]

where X_j is the probability distribution over x_j implied by X.

In the other extreme case, the producer has perfect information of the consumer’s utility function. They can pick the truly optimal product:

\hat j = arg \max_{j \in J} x_j

How much better off the consumer is in the second case, as opposed to the first, depends on the specifics of the distribution X. Suppose X_j are all independent and identically distributed. Then an ignorant producer would be indifferent to the choice of \hat j, leaving the expected outcome for the consumer E[X_j], whereas the higher the number of products m the more \max_{j \in J} x_j will approach the maximum value of X_j.
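
The gap between the two extremes can be seen in a quick simulation. This is a minimal sketch assuming the x_j are independent and uniform on [0,1], which is my choice of distribution for illustration; in that case the expected maximum of m draws is m/(m+1).

```python
import random


def expected_utilities(m, trials=20_000, seed=0):
    """Average consumer utility under an ignorant and a fully informed producer.

    Assumption (illustrative): the m utilities x_j are iid Uniform[0, 1].
    """
    rng = random.Random(seed)
    ignorant_total = 0.0
    informed_total = 0.0
    for _ in range(trials):
        x = [rng.random() for _ in range(m)]
        ignorant_total += x[0]    # any fixed recommendation is as good as another
        informed_total += max(x)  # the producer sees x and picks the best product
    return ignorant_total / trials, informed_total / trials


if __name__ == "__main__":
    for m in (1, 2, 10, 100):
        ignorant, informed = expected_utilities(m)
        print(f"m={m:3d}  ignorant={ignorant:.3f}  informed={informed:.3f}")
```

The ignorant producer's result stays near E[X_j] = 0.5 regardless of m, while the informed producer's result climbs toward 1 as the catalogue of products grows.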

In the intermediate cases where the producer knows y which carries partial information about \vec{x}, they can choose:

\hat j = arg \max_{j \in J} E[X_j \vert y] =

arg \max_{j \in J} \sum x_j P(x_j = X_j \vert y) =

arg \max_{j \in J} \sum x_j P(y \vert x_j = X_j) P(x_j = X_j)

The precise values of the terms here depend on the distributions X and Y. What we can know in general is that the more informative y is about x_j, the more the likelihood term P(y \vert x_j = X_j) dominates the prior P(x_j = X_j) and the condition of the consumer improves.

Note that in this model, it is the likelihood function P(y \vert x_j = X_j) that is the special information that the producer has. Knowledge of how evidence (a search query, a description of symptoms, etc.) is caused by underlying desire or need is the expertise the consumers are seeking out. This begins to tie the economics of information to theories of statistical information.
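
Here is a sketch of the intermediate case with a tiny discrete example. Everything named here is hypothetical: the consumer has one of two underlying needs, each need fixes the utility of two products, and the reported y is a noisy signal of the need. The producer recommends the product with the highest posterior expected utility.

```python
def recommend(prior, likelihood, utilities, y=None):
    """Pick the product j maximizing E[x_j | y]; with y=None, use the prior alone."""
    if y is None:
        weights = dict(prior)
    else:
        # Bayes' rule up to a constant: P(need | y) is proportional to P(y | need) P(need).
        weights = {need: likelihood[need][y] * p for need, p in prior.items()}
    norm = sum(weights.values())
    products = list(next(iter(utilities.values())).keys())
    expected = {
        j: sum(utilities[need][j] * w / norm for need, w in weights.items())
        for j in products
    }
    best = max(expected, key=expected.get)
    return best, expected


# Hypothetical numbers, for illustration only.
prior = {"need_a": 0.6, "need_b": 0.4}
utilities = {
    "need_a": {"product_a": 1.0, "product_b": 0.0},
    "need_b": {"product_a": 0.1, "product_b": 1.0},
}
likelihood = {
    "need_a": {"signal_1": 0.5, "signal_2": 0.5},
    "need_b": {"signal_1": 0.95, "signal_2": 0.05},
}

if __name__ == "__main__":
    print(recommend(prior, likelihood, utilities))              # no information
    print(recommend(prior, likelihood, utilities, "signal_1"))  # partial information
```

With these numbers the uninformed producer recommends product_a, but after hearing signal_1 the recommendation flips to product_b. The flip is driven entirely by the likelihood table, which the consumer does not have.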

by Sebastian Benthall at September 11, 2017 01:25 AM

September 09, 2017

Ph.D. student

Formalizing welfare implications of price discrimination based on personal information

In my last post I formalized Richard Posner’s 1981 argument concerning the economics of privacy. This is just one case of the economics of privacy. A more thorough analysis of the economics of privacy would consider the impact of personal information flow in more aspects of the economy. So let’s try another one.

One major theme of Shapiro and Varian’s Information Rules (1999) is the importance of price differentiation when selling information goods and how the Internet makes price differentiation easier than ever. Price differentiation likely motivates much of the data collection on the Internet, though it’s a practice that long predates the Internet. Shapiro and Varian point out that the “special offers” one gets from magazines for an extension to a subscription may well offer a personalized price based on demographic information. What’s more, this personalized price may well be an experiment, testing for the willingness of people like you to pay that price. (See Acquisti and Varian, 2005 for a detailed analysis of the economics of conditioning prices on purchase history.)

The point of this post is to analyze how a firm’s ability to differentiate its prices is a function of the knowledge it has about its customers and hence outcomes change with the flow of personal information. This makes personalized price differentiation a sub-problem of the economics of privacy.

To see this, let’s assume there are a number of customers for a product, i \in I, where the number of customers is n = \left\vert{I}\right\vert. Let’s say each has a willingness to pay for the firm’s product, x_i. Their willingness to pay is sampled from an underlying probability distribution x_i \sim X.

Note two things about how we are setting up this model. The first is that it closely mirrors our formulation of Posner’s argument about hiring job applicants. Whereas before the uncertain personal variable was aptitude for a job, in this case it is willingness to pay.

The second thing to note is that whereas it is typical to analyze price differentiation according to a model of supply and demand, here we are modeling the distribution of demand as a random variable. This is because we are interested in modeling information flow in a specific statistical sense. What we will find is that many of the more static economic tools translate well into a probabilistic domain, with some twists.

Now suppose the firm knows X but does not know any specific x_i. Knowing nothing to differentiate the customers, the firm will choose to offer the product at the same price z to everybody. Each customer will buy the product if x_i > z, and otherwise won’t. Each customer that buys the product contributes z to the firm’s utility (we are assuming an information good with near zero marginal cost). Hence, the firm will pick \hat z according to the following function:

\hat z = \arg\max_z E[\sum_i z [x_i > z]]

= \arg\max_z \sum_i E[z [x_i > z]]

= \arg\max_z \sum_i z E[[x_i > z]]

= \arg\max_z \sum_i z P(x_i > z)

= \arg\max_z \sum_i z P(X > z)

Where [x_i > z] is a function with value 1 if x_i > z and 0 otherwise; this is using Iverson bracket notation.
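As a rough illustration of the last line of the derivation, here is a minimal sketch that searches a grid of prices for the one maximizing n z P(X > z). The choice X ~ Uniform(0, 1), the grid, and the function names are assumptions made for the example, not part of the original argument.

```python
# Sketch: grid search for the revenue-maximizing uniform price.
# Assumption (not from the post): X ~ Uniform(0, 1), so P(X > z) = 1 - z.

def survival_uniform(z):
    """P(X > z) for X ~ Uniform(0, 1)."""
    if z <= 0.0:
        return 1.0
    if z >= 1.0:
        return 0.0
    return 1.0 - z


def best_uniform_price(survival, n_customers, candidate_prices):
    """Return the price z maximizing n * z * P(X > z), and that revenue."""
    best_z, best_revenue = None, float("-inf")
    for z in candidate_prices:
        revenue = n_customers * z * survival(z)
        if revenue > best_revenue:
            best_z, best_revenue = z, revenue
    return best_z, best_revenue


grid = [k / 1000 for k in range(1001)]  # candidate prices 0.000 ... 1.000
z_hat, expected_revenue = best_uniform_price(survival_uniform, 100, grid)
print(z_hat, expected_revenue)
```

Under this assumed distribution the objective is proportional to z(1 - z), which peaks at z = 1/2, so the grid search should land on the midpoint of the range. Other distributions only require swapping in a different survival function.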

This is almost identical to the revenue-optimizing strategy of price selection more generally, and it has a number of similar properties. One property is that for every customer for whom x_i > z, there is a consumer surplus of utility x_i - z, that feeling of joy the customer gets for having gotten something valuable for less than they would have been happy to pay for it. There is also the deadweight loss of customers for whom z > x_i. These customers get 0 utility from the product and pay nothing to the producer despite their willingness to pay.

Now consider the opposite extreme, wherein the producer knows the willingness to pay of each customer x_i and can pick a personalized price z_i accordingly. The producer can price z_i = x_i - \epsilon, effectively capturing the entire demand \sum_i x_i as producer surplus, while reducing all consumer surplus and deadweight loss to zero.
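To make the two extremes concrete, the following sketch tallies producer revenue, consumer surplus, and deadweight loss for a handful of customers. The willingness-to-pay values, the uniform price of 2.5 (arbitrary, not optimized), and the value of epsilon are all hypothetical; deadweight loss is measured here as the forgone value x_i of each non-buyer, which assumes zero marginal cost as in the model.

```python
# Sketch: welfare accounting under uniform vs. personalized pricing.
# All numbers are illustrative assumptions, not data from the post.

def market_outcome(willingness, prices):
    """Return (revenue, consumer_surplus, deadweight_loss).

    Customer i buys when x_i > z_i, matching the model's purchase rule.
    """
    revenue = consumer_surplus = deadweight_loss = 0.0
    for x, z in zip(willingness, prices):
        if x > z:
            revenue += z
            consumer_surplus += x - z
        else:
            deadweight_loss += x  # value that is never realized
    return revenue, consumer_surplus, deadweight_loss


willingness = [1.0, 2.0, 3.0, 4.0]
epsilon = 0.01

# Same (arbitrary) price for everybody.
uniform = market_outcome(willingness, [2.5] * len(willingness))

# Perfect price discrimination: z_i = x_i - epsilon.
personalized = market_outcome(willingness, [x - epsilon for x in willingness])

print("uniform:     ", uniform)
print("personalized:", personalized)
```

With personalized prices the revenue approaches the sum of all x_i as epsilon shrinks, while consumer surplus and deadweight loss both approach zero.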

What are the welfare implications of the lack of consumer privacy?

Like in the case of Posner’s employer, the real winner here is the firm, who is able to capture all the value added to the market by the increased flow of information. In both cases we have assumed the firm is a monopoly, which may have something to do with this result.

As for consumers, there are two classes of impact. For those with x_i > \hat z, having their personal willingness to pay revealed to the firm means that they lose their consumer surplus. Their welfare is reduced.

Those consumers with x_i < \hat z discover that they can now afford the product, as it is priced close to their willingness to pay.

Unlike in Posner's case, "the people" here are more equal when their personal information is revealed to the firm because now the firm is extracting every spare ounce of joy it can from each of them, whereas before some consumers were able to enjoy low prices relative to their idiosyncratically high appreciation for the good.

What if the firm has access to partial information about each consumer y_i that is a clue to their true x_i without giving it away completely? Well, since the firm is a Bayesian reasoner they now have the subjective belief P(x_i \vert y_i) and will choose each z_i in a way that maximizes their expected profit from each consumer.

z_i = \arg\max_z E[z [x_i > z] \vert y_i] = \arg\max_z z P(x_i > z \vert y_i)

The specifics of the distributions X, Y, and P(Y | X) all matter for the particular outcomes here, but intuitively one would expect the results of partial information to fall somewhere between the extremes of undifferentiated pricing and perfect price discrimination.
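A small discrete case may help show how a partial signal moves the personalized price. In this sketch the three willingness-to-pay levels, the uniform prior, the "low"/"high" signal, and its likelihoods are all invented for illustration. Candidate prices are restricted to just below each possible value, since with discrete support revenue only rises as the price approaches the next value from below.

```python
# Sketch: Bayesian personalized pricing with a noisy signal y.
# The prior, likelihoods, and signal labels are hypothetical.

def posterior(prior, likelihood, signal):
    """Bayes' rule: P(x | y) is proportional to P(y | x) P(x)."""
    unnormalized = {x: likelihood[x][signal] * p for x, p in prior.items()}
    total = sum(unnormalized.values())
    return {x: w / total for x, w in unnormalized.items()}


def best_personalized_price(belief, epsilon=0.01):
    """Maximize z * P(x > z) over prices just below each possible value."""
    best_z, best_revenue = None, float("-inf")
    for candidate in sorted(belief):
        z = candidate - epsilon
        prob_buy = sum(p for x, p in belief.items() if x > z)
        if z * prob_buy > best_revenue:
            best_z, best_revenue = z, z * prob_buy
    return best_z, best_revenue


prior = {1.0: 1 / 3, 2.0: 1 / 3, 3.0: 1 / 3}
likelihood = {
    1.0: {"low": 0.9, "high": 0.1},
    2.0: {"low": 0.5, "high": 0.5},
    3.0: {"low": 0.1, "high": 0.9},
}

price_high, _ = best_personalized_price(posterior(prior, likelihood, "high"))
price_low, _ = best_personalized_price(posterior(prior, likelihood, "low"))
price_no_signal, _ = best_personalized_price(prior)
print(price_high, price_low, price_no_signal)
```

In this toy setup a "low" signal pushes the firm toward the lowest price tier, while a "high" signal leaves it at the middle tier it would have chosen with no signal at all. That is one small instance of partial information landing between the two extremes.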

Perhaps the more interesting consequence of this analysis is that the firm has, for each consumer, a subjective probabilistic distribution of that consumer’s demand. Their best strategy for choosing the personalized price is similar to that of choosing a price for a large uncertain consumer demand base, only now the uncertainty is personalized. This probabilistic version of classic price differentiation theory may be more amenable to Bayesian methods, data science, etc.


Acquisti, A., & Varian, H. R. (2005). Conditioning prices on purchase history. Marketing Science, 24(3), 367-381.

Shapiro, C., & Varian, H. R. (1998). Information rules: a strategic guide to the network economy. Harvard Business Press.

by Sebastian Benthall at September 09, 2017 02:15 PM

September 07, 2017

Ph.D. student

Formalizing Posner’s economics of privacy argument

I’d like to take a more formal look at Posner’s economics of privacy argument, in light of other principles in economics of information, such as those in Shapiro and Varian’s Information Rules.

By “formal”, what I mean is that I want to look at the mathematical form of the argument. This is intended to strip out some of the semantics of the problem, which in the case of economics of privacy can lead to a lot of distracting anxieties, often for legitimate ethical reasons. However, there are logical realities that one must face despite the ethical conundrums they cause. Indeed, if there weren’t logical constraints on what is possible, then ethics would be unnecessary. So, let’s approach the blackboard, shall we?

In our interpretation of Posner’s argument, there are a number of applicants for a job, i \in I, where the number of candidates is n = \left\vert{I}\right\vert. Let’s say each is capable of performing at a certain level based on their background and aptitude, x_i. Their aptitude is sampled from an underlying probability distribution x_i \sim X.

There is an employer who must select an applicant for the job. Let’s assume that their capacity to pay for the job is fixed, for simplicity, and that all applicants are willing to accept the wage. The employer must pick an applicant i and gets utility x_i for their choice. Given no information on which to base her choice, she chooses a candidate randomly, which is equivalent to sampling once from X. Her expected value, given no other information on which to make the choice, is E[X]. The expected welfare of each applicant is their utility from getting the job (let’s say it’s 1 for simplicity) times their probability of being picked, which comes to \frac{1}{n}.

Now suppose the other extreme: the employer has perfect knowledge of the abilities of the applicants. Since she is able to pick the best candidate, her utility is \max x_i. Let \hat i = arg\max_{i \in I} x_i. Then the utility for applicant \hat i is 1, and it is 0 for the other applicants.

Some things are worth noting about this outcome. There is more inequality. All expected utility from the less qualified applicants has moved to the most qualified applicant. There is also an expected surplus of (\max x_i) - E[X] that accrues to the totally informed employer. One wonders whether a “safety net” could be provided to those who have lost out in this change; if so, it would presumably be funded from this surplus. If the surplus were entirely taxed and redistributed among the applicants who did not get the job, it would provide each rejected applicant with \frac{(\max x_i) - E[X]}{n-1} utility. Adding a little complexity to the model, we could be more precise by computing the wage paid to the worker and identifying whether redistribution could potentially recover the losses of the weaker applicants.
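The two extremes and the redistribution figure can be tabulated in a few lines. This is only a sketch: the aptitude values and the value passed in for E[X] are made up, and the function name and dictionary keys are hypothetical.

```python
# Sketch: the no-information and full-information hiring outcomes.
# Aptitudes and expected_aptitude (standing in for E[X]) are illustrative.

def hiring_extremes(aptitudes, expected_aptitude):
    """Compare random hiring with perfectly informed hiring (needs n >= 2)."""
    n = len(aptitudes)
    best = max(range(n), key=lambda i: aptitudes[i])
    surplus = aptitudes[best] - expected_aptitude
    return {
        # No information: employer expects E[X]; each applicant expects 1/n.
        "uninformed_employer_utility": expected_aptitude,
        "uninformed_applicant_welfare": 1 / n,
        # Full information: the best applicant is hired with certainty.
        "informed_employer_utility": aptitudes[best],
        "hired": best,
        # Surplus accruing to the informed employer, and its equal split
        # among the n - 1 rejected applicants if fully taxed.
        "surplus": surplus,
        "transfer_per_rejected": surplus / (n - 1),
    }


result = hiring_extremes([0.2, 0.5, 0.9, 0.4], expected_aptitude=0.5)
for key, value in result.items():
    print(key, value)
```

Comparing transfer_per_rejected with the 1/n each applicant expected before gives a first, crude sense of whether redistribution could make the weaker applicants whole. The units differ (employer utility versus applicant utility), which is part of why the text suggests modeling the wage explicitly.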

What about intermediary conditions? These get more analytically complex. Suppose that each applicant i produces an application y_i which is reflective of their abilities. When the employer makes her decision, her belief about the aptitude of each applicant is the posterior

P(x_i \vert y_i) \propto P(y_i \vert x_i)P(x_i)

because naturally the employer is a Bayesian reasoner. She makes her decision by maximizing her expected gain, based on this evidence:

\hat i = \arg\max_i E[x_i \vert y_i] =

\arg\max_i \sum_{x_i} x_i p(x_i \vert y_i) =

\arg\max_i \frac{\sum_{x_i} x_i p(y_i \vert x_i) p(x_i)}{\sum_{x_i} p(y_i \vert x_i) p(x_i)}

The particulars of the distributions X and Y and especially P(Y \vert X) matter a great deal to the outcome. But from the expanded form of the equation we can see that the more revealing y_i is about x_i, the more the likelihood term p(y_i \vert x_i) will overcome the prior expectations. It would be nice to be able to capture the impact of this additional information in a general way. One would think that providing limited information about applicants to the employer would result in an intermediate outcome. Under reasonable assumptions, more qualified applicants would be more likely to be hired and the employer would accrue more value from the work.
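Here is a minimal sketch of the intermediate case, with the employer ranking applicants by posterior expected aptitude. The aptitude levels, prior, application grades ("weak"/"strong"), and likelihoods are assumptions chosen for the example. Note that the posterior is normalized per applicant, because the evidence term depends on each applicant's own y_i.

```python
# Sketch: a Bayesian employer choosing among applicants from noisy applications.
# Prior, likelihoods, and grade labels are hypothetical.

def posterior_mean(prior, likelihood, application):
    """E[x | y] = sum_x x p(y | x) p(x) / sum_x p(y | x) p(x)."""
    weights = {x: likelihood[x][application] * p for x, p in prior.items()}
    evidence = sum(weights.values())
    return sum(x * w for x, w in weights.items()) / evidence


def choose_applicant(prior, likelihood, applications):
    """Return the index with the highest posterior mean, plus all scores."""
    scores = [posterior_mean(prior, likelihood, y) for y in applications]
    return max(range(len(scores)), key=scores.__getitem__), scores


prior = {1: 0.5, 2: 0.3, 3: 0.2}  # P(x): aptitude levels 1..3
likelihood = {                     # P(y | x)
    1: {"weak": 0.8, "strong": 0.2},
    2: {"weak": 0.5, "strong": 0.5},
    3: {"weak": 0.2, "strong": 0.8},
}

hired, scores = choose_applicant(prior, likelihood, ["weak", "strong", "weak"])
prior_mean = sum(x * p for x, p in prior.items())
print(hired, scores, prior_mean)
```

A "strong" application raises the expected aptitude above the prior mean and a "weak" one lowers it. The sharper the likelihoods, the further the scores move from the prior, which is the sense in which the likelihood term overcomes prior expectations.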

What this goes to show is how one's evaluation of Posner's argument about the economics of privacy really has little to do with the way one feels about privacy and much more to do with how one feels about equality and economic surplus. I've heard that a similar result has been discovered by Solon Barocas, though I'm not sure where in his large body of work to find it.

by Sebastian Benthall at September 07, 2017 10:21 PM

September 06, 2017

Ph.D. student

From information goods to information services

Continuing to read through Information Rules, by Shapiro and Varian (1998), I’m struck once again by its clear presentation and precise wisdom. Many of the core principles resonate with my experience in the software business when I left it in 2011 for graduate school. I think it’s fair to say that Shapiro and Varian anticipated the following decade of the economics of content and software distribution.

What they don’t anticipate, as far as I can tell, is what has come to dominate the decade after that, this decade. There is little in Information Rules that addresses the contemporary phenomena of cloud computing and information services, such as Software-as-a-Service, Platforms-as-a-Service, and Infrastructure-as-a-Service. Yet these are clearly the kinds of services that have come to dominate the tech market.

That’s an opening. According to a business manager in 2014, there’s no book yet on how to run an SaaS company. While I’m sure that if I were slightly less lazy I would find several, I wonder if they are any good. By “any good”, I mean would they hold up to scientific standards in their elucidation of economic law, as opposed to being, you know, business books.

One of the challenges of working on this, which has bothered me since I first became curious about these problems, is that there is no very good, elegant formalism available for representing competition between computing agents. The best that’s out there is probably in the AI literature. But that literature is quite messy.

Working up from something like Information Rules might be a more promising way of getting at some of these problems. For example, Shapiro and Varian start from the observation that information goods have high fixed (often, sunk) costs and low marginal costs to reproduce. This leads them to the conclusion that the market cannot look like a traditional competitive market with multiple firms selling similar goods but rather must either have a single dominant firm or a market of many similar but differentiated products.

The problem here is that most information services, even “simple” ones like a search engine, are not delivering a good. They are being responsive to some kind of query. The specific content and timing of the query, along with the state of the world at the time of the query, are unique. Consumers may make the same query with varying demand. The value-adding activity is not so much creating the good as it is selecting the right response to the query. And who can say how costly this is, marginally?

On the other hand, this framing obscures something important about information goods, which is that all information goods are, in a sense, a selection of bits from the wide range of possible bits one might send or receive. This leads to my other frustration with information economics, which is that it is insufficiently tied to the statistical definition of information and the modeling tools that have been built around it. This is all the more frustrating because I suspect that in advanced industrial settings these connections have been made and are used with confidence. However, they have been slow to make it into mainstream understanding. There’s another opportunity here.

by Sebastian Benthall at September 06, 2017 01:37 AM

September 02, 2017

MIMS 2014

Movies! (Now with More AI!!)

Earlier, I played around with topic modeling/recommendation engines in Apache Spark. Since then, I’ve been curious to see if I could make any gains by adopting another text processing approach in place of topic modeling—word2vec. For those who don’t know word2vec, it takes individual words and maps them into a vector space where the vector weights are determined by a neural network that trains on a corpus of text documents.

I won’t go into major depth on neural networks here (a gentle introduction for those who are interested), except to say that they are considered by many to be the bleeding edge of artificial intelligence. Personally, I like word2vec because you don’t necessarily have to train the vectors yourself. Google has pre-trained vectors derived from a massive corpus of news documents they’ve indexed. These vectors are rich in semantic meaning, so it’s pretty cool that you can leverage their value with no extra work. All you have to do is download the (admittedly large 1.5 gig) file onto your computer and you’re good to go.

Almost. Originally, I had wanted to do this on top of my earlier Spark project, using the same pseudo-distributed docker cluster on my old-ass laptop. But when I tried to load the pre-trained Google word vectors into memory, I got a big fat MemoryError, which I actually thought was pretty generous because it was nice enough to tell me exactly what it was.

I had three options: commandeer some computers in the cloud on Amazon, try to finagle Spark’s configuration like I did last time, or finally, try running Spark in local mode. Since I am still operating on the cheap, I wasn’t gonna go with option one. And since futzing around with Spark’s configuration put me in a dead end last time, I decided to ditch the pseudo-cluster and try running Spark in local mode.

Although local mode was way slower on some tasks, it could still load Google’s pre-trained word2vec model, so I was in business. Similar to my approach with topic modeling, I created a representative vector (or ‘profile’) for each user in the Movielens dataset. But whereas in the topic model, I created a profile vector by taking the max value in each topic across a user’s top-rated movies, here I instead averaged the vectors I derived from each movie (which were themselves averages of word vectors).

Let’s make this a bit more clear. First you take a plot summary scraped from Wikipedia, and then you remove common stop words (‘the’, ‘a’, ‘my’, etc.). Then you pass those words through the pre-trained word2vec model. This maps each word to a vector of length 300 (a word vector can in principle be of any length, but Google’s are of length 300). Now you have D vectors of length 300, where D is the number of words in a plot summary. If you average the values in those D vectors, you arrive at a single vector that represents one movie’s plot summary.

Note: there are other ways of aggregating word vectors into a single document representation (including doc2vec), but I proceeded with averages because I was curious to see whether I could make any gains by using the most dead simple approach.
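The averaging step described above can be sketched in plain Python. To keep the example self-contained, it uses a tiny made-up dictionary of 3-dimensional vectors in place of Google's 300-dimensional pre-trained model. The stop-word list, the toy vectors, and the function names are all hypothetical, and this is not the Spark code used for the actual experiment.

```python
# Sketch: turn a plot summary into one vector by averaging word vectors.
# TOY_VECTORS stands in for a real pre-trained word2vec model.

STOP_WORDS = {"the", "a", "my", "of", "and"}

TOY_VECTORS = {
    "spy": [0.9, 0.1, 0.0],
    "chase": [0.7, 0.3, 0.0],
    "wedding": [0.1, 0.9, 0.2],
    "spaceship": [0.2, 0.1, 0.9],
}


def average_vectors(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def summary_vector(summary, word_vectors, stop_words):
    """Drop stop words, look up the remaining words, and average them."""
    words = [w for w in summary.lower().split() if w not in stop_words]
    known = [word_vectors[w] for w in words if w in word_vectors]
    return average_vectors(known)


movie_vector = summary_vector("The spy and the chase", TOY_VECTORS, STOP_WORDS)
print(movie_vector)
```

Words missing from the vocabulary are simply skipped here, which is one of several reasonable choices. A real pipeline would also need proper tokenization and punctuation handling.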

Once you have an average vector for each movie, you can get a profile vector for each user by averaging (again) across a user’s top-rated movies. At this point, recommendations can be made by ranking the cosine similarity between a user’s profile and the average vectors for each movie. This could power a recommendation engine on its own, or supplement explicit ratings for (user, movie) pairs that aren’t observed in the training data.
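The profile-and-ranking step can be sketched in the same spirit. The movie titles and 3-dimensional vectors below are placeholders rather than real Movielens or word2vec data, and the helper names are hypothetical.

```python
# Sketch: build a user profile and rank movies by cosine similarity.
# Titles and vectors are invented for illustration.
import math


def average_vectors(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recommend(profile, movie_vectors, exclude):
    """Titles not in `exclude`, most similar to the profile first."""
    scored = [
        (cosine_similarity(profile, vector), title)
        for title, vector in movie_vectors.items()
        if title not in exclude
    ]
    return [title for _, title in sorted(scored, reverse=True)]


movie_vectors = {
    "Spy Thriller A": [0.9, 0.1, 0.0],
    "Spy Thriller B": [0.8, 0.2, 0.1],
    "Romantic Comedy": [0.1, 0.9, 0.2],
    "Space Opera": [0.2, 0.1, 0.9],
}

top_rated = ["Spy Thriller A"]
profile = average_vectors([movie_vectors[title] for title in top_rated])
ranking = recommend(profile, movie_vectors, exclude=set(top_rated))
print(ranking)
```

Excluding the movies that went into the profile is a design choice for the sketch. When evaluating against held-out ratings, as in the experiment, one would instead score the covered-up (user, movie) pairs.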

Cognizant of the hardware limitations I ran up against last time, I opted for the same approach I adopted then, which was to pretend I knew less about users and their preferences than I really did. My main goal was to see whether word2vec could beat out the topic modeling approach, and in fact it did. With 25% of the data covered up, the two algorithms performed roughly the same against the covered up data. But with 75% of the data covered up, word2vec resulted in an 8% performance boost (as compared with 3% gained from topic modeling).

So with very little extra work (simple averaging and pre-trained word vectors), word2vec has pretty encouraging out of the box performance. It definitely makes me eager to use word2vec in the future.

Also a point in word2vec’s favor: when I sanity checked the cosine similarity scores of word2vec’s average vectors across different movies, The Ipcress File shot to the top of the list of movies most similar to The Bourne Ultimatum. Still don’t know what The Ipcress File is? Then I don’t feel bad re-using the same joke as a meme sign-off.


by dgreis at September 02, 2017 08:23 PM

August 28, 2017

Ph.D. student

Shapiro and Varian: scientific “laws of economics”

I’ve been remiss in not studying Shapiro and Varian’s Information Rules: A Strategic Guide to the Network Economy (1998) more thoroughly. In my years in the tech industry and academic study, there are few sources that deal with the practical realities of technology and society as clearly as Shapiro and Varian. As I now turn my attention more towards the rationale for various forms of information law and find how much of it is driven by considerations of economics, I have to wonder why this was not something I’ve given more emphasis in my graduate study so far.

The answer that comes immediately to mind is that throughout my academic study of the past few years I’ve encountered a widespread hostility to economics from social scientists of other disciplines. This hostility resembles, though is somewhat different from, the hostility social scientists of other stripes have had (in my experience) for engineers. The critiques have been along the lines that economists are powerful disproportionately to the insight provided by the field, that economists are focused too narrowly on certain aspects of social life to the exclusion of others that are just as important, that economists are arrogant in their belief that their insights about incentives apply to other areas of social life besides the narrow concerns of the economy, that economists mistakenly think their methods are more scientific or valid than other social scientists, that economics is in the business of enshrining legal structures into place that give their conclusions more predictive power than they would have in other legal regimes and, as of the most recent news cycle, that the field of economics is hostile to women.

This is a strikingly familiar pattern of disciplinary critique, as it seems to be the same one levied at any field that aims to “harden” inquiry into social life. The encroachment of engineering disciplines and physicists into social explanation has come with similar kinds of criticism. These criticisms, it must be noted, contain at least one contradiction: should economists be concerned about issues besides the economy, or not? But the key issue, as with most disciplinary spats, is the politics of a lot of people feeling dismissed or unheard or unfunded.

Putting all this aside, what’s interesting about the opening sections of Shapiro and Varian’s book is their appeal to the idea of laws of economics, as if there were such laws analogous to laws of physics. The idea is that trends in the technology economy are predictable according to these laws, which have been learned through observation and formalized mathematically, and that these laws should therefore be taught for the benefit of those who would like to participate successfully in that economy.

This is an appealing idea, though one that comes under criticism, you know, from the critics, with a predictability that almost implies a social scientific law. This has been a debate going back to discussions of Marx and communism. Early theorists of the market declared themselves to have discovered economic laws. Marx, incidentally, also declared that he had discovered (different) economic laws, albeit according to the science of dialectical materialism. But the latter declared that the former economic theories hide the true scientific reality of the social relations underpinning the economy. These social relations allowed for the possibility of revolution in a way that an economy of goods and prices abstracted from society did not.

As one form of the story goes, the 20th century had its range of experiments with ways of running an economy. Those most inspired by Marxism had mass famines and other unfortunate consequences. Those that took their inspiration from the continually evolving field of increasingly “neo”-classical economics, with its variations of Keynesianism, monetarism, and the rest, had some major bumps (most recently the 2008 financial crisis) but have tended to improve over time with historical understanding and the discovery of, indeed, laws of economics. And this is why Janet Yellen and Mario Draghi are now warning against removing the post-crisis financial market regulations.

This offers an anecdotal counter to the narrative that all economists ever do is justify more terrible deregulation at the expense of the lived experience of everybody else. The discovery of laws of economics can, indeed, be the basis for economic regulation; in fact this is often the case. In point of fact, it may be that this is one of the things that tacitly motivates the undermining of economic epistemology: the fact that if the laws of economics were socially determined to be true, like the laws of physics, such that everybody ought to know them, it would lead to democratic will for policies that would be opposed to the interests of those who have heretofore enjoyed the advantage of their privileged (i.e., not universally shared) access to the powerful truth about markets, technology, etc.

Which is all to say: I believe that condemnations of economics as a field are quite counterproductive, socially, and that the scientific pursuit of the discovery of economic laws is admirable and worthy. Those that criticize economics for this ambition, and teach their students to do so, imperil everyone else and should stop.

by Sebastian Benthall at August 28, 2017 05:12 PM

August 25, 2017

Ph.D. student

Reason returns to Berkeley

I’ve been struck recently by a subtle shift in messaging at UC Berkeley since Carol T. Christ has become the university’s Chancellor. Incidentally, she is the first woman chancellor of the university, with a research background in Victorian literature. I think both of these things may have something to do with the bold choice she’s made in recent announcements: the inclusion of reason as among the University’s core values.

Notably, the word has made its appearance next to three other terms that have had much more prominence in the university in recent years: equity, inclusion, and diversity. For example, in the following statements:

In “Thoughts on Charlottesville”:

We must now come together to oppose what are dangerous threats to the values we hold dear as a democracy and as a nation. Our shared belief in reason, diversity, equity, and inclusion is what animates and supports our campus community and the University’s academic mission. Now, more than ever, those values are under assault; together we must rise to their defense.

And, strikingly, this message on “Free Speech”:

Nonetheless, defending the right of free speech for those whose ideas we find offensive is not easy. It often conflicts with the values we hold as a community—tolerance, inclusion, reason and diversity. Some constitutionally-protected speech attacks the very identity of particular groups of individuals in ways that are deeply hurtful. However, the right response is not the heckler’s veto, or what some call platform denial. Call toxic speech out for what it is, don’t shout it down, for in shouting it down, you collude in the narrative that universities are not open to all speech. Respond to hate speech with more speech.

The above paragraph comes soon after this one, in which Chancellor Christ defends Free Speech on Millian philosophical grounds:

The philosophical justification underlying free speech, most powerfully articulated by John Stuart Mill in his book On Liberty, rests on two basic assumptions. The first is that truth is of such power that it will always ultimately prevail; any abridgement of argument therefore compromises the opportunity of exchanging error for truth. The second is an extreme skepticism about the right of any authority to determine which opinions are noxious or abhorrent. Once you embark on the path to censorship, you make your own speech vulnerable to it.

This slight change in messaging strikes me as fundamentally wise. In the past year, the university has been wracked by extreme passions and conflicting interests, resulting in bad press externally and I imagine discomfort internally. But this was not unprecedented; the national political bifurcation could take hold at Berkeley precisely because it had for years been, with every noble intention, emphasizing inclusivity and equity without elevating a binding agent that makes diversity meaningful and productive. This was partly due to the influence of late 20th century intellectual trends that burdened “reason” with the historical legacy of those regimes that upheld it as a virtue, which tended to be white and male. There was a time when “reason” was so associated with these powers that the term was used for the purposes of exclusion–i.e. with the claim that new entrants to political and intellectual power were being “unreasonable”.

Times have changed precisely because the exclusionary use of “reason” was a corrupt one; reason in its true sense is impersonal and transcends individual situation even as it is immanent in it. This meaning of reason would be familiar to one steeped in an older literature.

Carol Christ’s wording reflects a 21st century theme which gives me profound confidence in Berkeley’s future: the recognition that reason does not oppose inclusion, but rather demands it, just as scientific logic demands properly sampled data. Perhaps the new zeitgeist at Berkeley has something to do with the new Data Science undergraduate curriculum. Given the state of the world, I’m proud to see reason make a comeback.

by Sebastian Benthall at August 25, 2017 02:52 PM

August 24, 2017

Center for Technology, Society & Policy

Preparing for Blockchain

by Ritt Keerati, CTSP Fellow

Policy Considerations and Challenges for Financial Regulators (Part I)

Blockchain―a distributed ledger technology that maintains a continuously-growing list of records―is an emerging technology that has captured the imagination and investment of Silicon Valley and Wall Street. The technology has propelled the invention of virtual currencies such as Bitcoin and now holds promise to revolutionize a variety of industries including, most notably, the financial sector. Accompanying its disruptive potential, blockchain also carries significant implications and raises questions for policymakers. How will blockchain change the ways financial transactions are conducted? What risks will that pose to consumers and the financial system? How should the new technology be regulated? What roles should the government play in promoting and managing the technology?

Blockchain represents a disruptive technology because it enables the creation of a “trustless network.” The technology enables parties lacking pre-existing trust to transact with one another without the need for intermediaries or central authority. It may revolutionize how financial transactions are conducted, eliminate certain roles of existing institutions, improve transaction efficiencies, and reduce costs.

Despite its massive potential, blockchain is still in its early innings in terms of deployment. So far, the adoption of blockchain within the financial industry has been to facilitate business-to-business transactions or to improve record-keeping processes of existing financial institutions. Besides Bitcoin, direct-to-consumer applications remain limited, and such applications still rely on the existing financial infrastructure. For instance, although blockchain has the potential to disintermediate banks and enable customers to transfer money directly between each other, money transfer applications using blockchain are still linked to bank accounts. As a result, financial institutions still serve as gatekeepers, helping ensure regulatory compliance and consumer protection.

With that said, new use-cases of blockchain are emerging rapidly, and accompanying these developments are risks and challenges. From the regulators’ perspectives, below are some of the key risks that financial regulators must consider in dealing with the emergence of blockchain:

  • Lack of Clarity on Compliance Requirements: New use-cases of blockchain, such as digital tokens and decentralized payment systems, raise questions about the applicability of the existing regulatory requirements. For instance, how should an application created by a community of developers to facilitate transfer of digital currencies be regulated? Who should be regulated, given that the software is created by a group of independent developers? How should state-level regulations be applied, especially if the states cannot identify the actual users given blockchain anonymity? Such lack of clarity could lead to the failure to comply and/or higher costs of compliance.
  • Difficulty in Adjusting Regulations to Handle Industry Changes: The lack of effective engagement by the regulators could prevent them from acquiring sufficient knowledge about the technology to be able to issue proper rules and responses or to assist Congress in devising appropriate legislation. For instance, there remains a disagreement on how digital tokens should be treated: as currencies, commodities, or securities?
  • Risks from Industry Front-running Policymakers: The lack of clarity on the existing regulatory framework, coupled with possible emergence of new regulations, could incentivize some industry players to “front-run” the regulators by rolling out their products before new guidelines emerge in hope of forcing the regulators to yield to industry demand. The most evident comparison is Uber, whose application continues to operate in violation of labor laws.
  • Challenges arising from New Business Models: Blockchain will propel several new business models, some of which could pose regulatory challenges and unknown consequences. For instance, the emergence of a decentralized transaction system raises questions about how such a system should be regulated, how to confirm the identities of relevant parties, how to prevent fraud and money laundering, who should be responsible in the case of fraud and errors, and more.
  • Potential Technical Issues: Blockchain is a new technology—it has been in existence for less than a decade. Therefore, the robustness of the technology has not yet been proven. In fact, there remain several issues to be resolved even with Bitcoin―the most recognized blockchain application―such as scalability, lag time, and other technical glitches. Moreover, features such as identity verification, privacy, and security also have not been fully integrated. Finally, the use of blockchain to upgrade the technical infrastructure also raises questions about interoperability, technology transition, and system robustness.
  • Potential New Systemic Risks: Blockchain has the potential to transform the nature of the transaction network from a centralized to a decentralized system. In addition, it enhances the speed of transaction settlement and clearing and improves transaction visibility. Questions remain whether these features will increase or undermine the stability of the financial system. For instance, given the transaction expediency enabled by blockchain, will the regulators be able to analyze transactional data in real-time, and will they be able to respond quickly to prevent a potential disaster?
  • Risks from Bad Actors: Any financial system is exposed to risks from bad actors; unfortunately, fraud, pyramid schemes, and scams are bound to happen. Because blockchain and digital currencies are new, such risks are potentially heightened as consumers, companies, and regulators are less familiar with the technology. The fact that blockchain changes the way people do business also raises questions about who should be responsible in cases of fraud, whether the damaged parties should be protected and compensated, and who should bear the responsibility of preventing such events and safeguarding consumers.
  • Other Potential Challenges and Opportunities: Blockchain’s revolutionary potential could unveil other policy and societal challenges, not only in the financial industry but also in society at large. For instance, blockchain could alter the roles of some financial intermediaries, such as banks and brokers, leading to job shrinkage and displacement. At the same time, it could provide other opportunities that would benefit society.

Because blockchain has the potential to transform several industries and because the technology is evolving rapidly, unified and consistent engagement by financial regulators is crucial. However, based on the current dynamics, there is a lack of unified and effective engagement by regulators and legislators in the development and deployment of blockchain technology. The regulators, therefore, must find better ways to interact with the financial and technology industries, balancing between (1) regulating too loosely and thereby introducing risks into the financial system, and (2) regulating too tightly and thereby stifling innovation. Such engagement should aim to help the government monitor activities within industry, learn about the technology and its use-cases, collaborate with industry players, and lead the industry to produce public benefits. Policy alternatives that would facilitate such engagement should aim to achieve the following three objectives:

  • Engage Policymakers in Discussions on Blockchain in a Unified and Effective Manner: The policy should promote collaboration between the regulators and industry participants as well as coordination across regulatory agencies. It should create a platform that allows the regulators to (1) convey clear and consistent messages to industry participants, (2) learn from such interaction and use the lessons learned to adjust their rules and responses, and (3) provide appropriate recommendations to legislators to help them adjust the policy frameworks, if necessary.
  • Allow Policymakers to Ensure Regulatory Compliance and Maintain Stability of the Financial System: The policy should enable the regulators to ensure industry compliance. More importantly, it should preserve the stability of the financial system. This means that the policy should allow the regulators to anticipate and respond quickly to potential risks that may be introduced by the technology into the financial system.
  • Promote Technological Innovation in Blockchain / Distributed Ledger Technology: Finally, while the policy should aim to enhance the regulators’ understanding of the technology, it should refrain from undermining the industry’s incentives to innovate and utilize the technology. While regulatory compliance and consumer protection are crucial, they should not come at the price of innovation.

Part II of this series will discuss potential alternatives that policymakers may utilize to enhance collaboration among various regulatory agencies and to improve interactions with industry participants.



Policy Alternatives for Financial Regulators and Policymakers (Part II)

In Part I, we discussed potential regulatory concerns arising from the emergence of blockchain technology. Such issues include lack of clarity on compliance requirements, challenges in regulating new business models, potential technical glitches, potential new systemic risks, and challenges in controlling bad actors.

To mitigate these issues, effective interaction between regulators and industry participants is crucial. Currently, there is a lack of unified and effective engagement by regulators and policymakers in the development and deployment of blockchain. Soundly addressing these matters will require better collaboration among regulators and more frequent interactions with industry participants. Rather than maintaining the status quo, policymakers may choose among these alternatives to enhance collaboration between the regulators and industry participants:

  • Adjustment of Existing Regulatory Framework: Under this approach, the regulators either modify the existing laws or issue new laws to facilitate the emergence of the new technology. Examples of this approach include (1) the plan by the Office of the Comptroller of the Currency (OCC) to issue a fintech charter to technology companies offering financial services and (2) the enactment of the BitLicense regulation by the State of New York. Essentially, this policy alternative allows financial regulators to create a “derivative framework” based on existing regulations.
  • Issuance of Regulatory Guideline: Because some regulations are ambiguous when applied to blockchain-based businesses, regulatory agencies may choose to provide preliminary perspectives on how they plan to regulate the new technology. This may come in the form of a statement specifying how the regulators plan to manage blockchain applications, how actively or passively the regulators will engage with industry players, how strict or flexible the rules will be, what the key priorities are, and how the regulators plan to use the technology themselves. Such a guideline will provide industry participants with added clarity, while offering them flexibility and autonomy for self-regulation.
  • Creation of Multi-Party Working Group: A multi-party working group represents an effort by regulatory agencies and industry participants to collaborate and arrive at a standard framework or shared best practices for technology development and regulation. Under this approach, various regulatory agencies would work together to formulate and issue a single policy framework for the industry. They may also collaborate with industry participants to learn from their experiences and use their feedback to adjust their policies accordingly.
  • Establishment of Regulatory Sandbox: Regulators in several foreign jurisdictions, such as the United Kingdom, Singapore, Australia, Hong Kong, France, and Canada, have established regulatory sandboxes to manage the emergence of blockchain. A sandbox essentially provides a well-defined space in which companies can experiment with new technology and business models in a relaxed regulatory environment and, in some cases, with the support of the regulators for a period of time. This leads to several potential benefits, including reduced time-to-market of new technology, reduced cost, better access to financing for companies, and more innovative products reaching the market.

Each of the aforementioned policy alternatives has different advantages and disadvantages. For instance, while the status quo is clearly the easiest to implement, it fails to solve many policy problems arising from the existing regulatory framework. On the other hand, although a regulatory sandbox will be the most effective in promoting innovation while protecting consumers, it will also be the most difficult to implement and the costliest to scale. Given the trade-offs between these alternatives and the fact that these alternatives are not mutually exclusive, the best solution will likely be a combination of some or all of the above approaches. Specifically, this report recommends a three-pronged approach, including:

  • Issuance of Regulatory Guideline: Financial regulators should provide a general guideline of how they plan to regulate blockchain-based applications. Such a guideline should include details such as: key priorities from the regulators’ perspectives (such as consumer protection and overall financial stability), the nature of engagement between the regulators and industry players (such as how actively the regulators plan to monitor companies’ activities and how much leeway the industry will have for self-regulation), how the regulators plan to address potential issues that may arise (such as those arising from the incompatibility between the new business models and the existing regulations), and how industry players may correspond with the regulators to avoid noncompliance. To the extent that such an indication could come from the President, it would also provide consistency in the framework across agencies.
  • Creation of Public-Private Working Group: The regulators should establish a public-private working group that would allow various financial regulatory agencies and industry players to interact, share insights and best practices, and brainstorm ideas to promote innovation and effective regulation. Participants in the working group will include representatives from various financial regulatory agencies as well as industry players. The working group will aim to promote knowledge sharing, while the actual authority will remain with each regulatory agency. It will also serve as a central point of contact when interacting with foreign and international agencies. Note that although similar working groups, such as the Blockchain Alliance, exist currently, they are typically spearheaded by the industry and geared toward promoting industry’s preferences. The regulators should instead create their own platform that would allow them to learn about the technology, discuss emerging risks and potential options, and explore potential policy options in an unbiased fashion.
  • Enactment of Suitable Safe Harbor: Although blockchain may expose consumers and the financial system to some risks, regulators may not need to regulate every minute aspect of these new use-cases, particularly if the risks are small. Hence, under certain conditions, the regulators may consider creating a safe harbor that would allow industry players to experiment with their ideas without being overly concerned with the regulatory burden while also limiting the risks to consumers and the financial system. For instance, with respect to money transfer applications, FinCEN may consider creating a safe harbor for transactions below a certain amount.

This recommendation essentially aims to promote a prudent and flexible market-based solution. The recommendation affords industry players the freedom to operate within the existing regulatory environment, while also giving them greater clarity on the applicability of the regulations and enabling productive interaction with the regulators. It also allows the regulators to protect consumers and the financial system without stifling innovation. Lastly, this solution is viable within the existing political context and despite the complex regulatory regime that exists currently.

For policymakers, the most important near-term goal should be to ensure that regulators are well educated about blockchain and that they understand its trends and implications. With respect to regulatory compliance, policymakers should be attentive to the adoption of the technology by existing financial institutions, particularly in the areas of money transfer, clearing and settlement of assets, and trade finance. Longer-term, Congress also ought to find ways to reform the existing financial regulatory framework and to consolidate both regulatory agencies and regulations in order to reduce cross-jurisdictional complexity and promote innovation and efficiency.

The emergence of blockchain and distributed ledger technology represents a potential pivot-point in the ongoing global efforts to apply technology to improve the financial system. The United States has the opportunity to strengthen its leadership in the world of global finance by pursuing supportive policies that promote financial technology innovation, while making sure that consumers are protected and the financial system remains sound. This will require a policy framework that balances an open-market approach with circumspect supervision. The next 5-10 years represent an opportune time for U.S. policymakers to evaluate their approaches toward financial regulation, pursue necessary reform and adjustment efforts, and work together with technology companies and financial institutions to make the United States both a global innovation hub and an international financial center.


Link: Preparing for Blockchain Whitepaper

by Rohit Raghavan at August 24, 2017 02:50 AM

August 23, 2017

Ph.D. student

Notes on Posner’s “The Economics of Privacy” (1981)

Lately my academic research focus has been privacy engineering, the designing of information processing systems that preserve the privacy of their users. I have been looking at the problem particularly through the lens of Contextual Integrity, a theory of privacy developed by Helen Nissenbaum (2004, 2009). According to this theory, privacy is defined as appropriate information flow, where “appropriateness” is determined relative to social spheres (such as health, education, finance, etc.) that have evolved norms based on their purpose in society.

To my knowledge most existing scholarship on Contextual Integrity consists of applications of a heuristic process associated with Contextual Integrity that evaluates the privacy impact of new technology. In this process, one starts by identifying a social sphere (or context, but I will use the term social sphere as I think it’s less ambiguous) and its normative structure. For example, if one is evaluating the role of a new kind of education technology, one would identify the roles of the education sphere (teachers, students, guardians of students, administrators, etc.), the norms of information flow that hold in the sphere, and the disruptions to these norms the technology is likely to cause.

I’m coming at this from a slightly different direction. I have a background in enterprise software development, data science, and social theory. My concern is with the ways that technology is now part of the way social spheres are constituted. For technology to not just address existing norms but deal adequately with how it self-referentially changes how new norms develop, we need to focus on the parts of Contextual Integrity that have heretofore been in the background: the rich social and metaethical theory of how social spheres and their normative implications form.

Because the ultimate goal is the engineering of information systems, I am leaning towards mathematical modeling methods that translate well between social scientific inquiry and technical design. Mechanism design, in particular, is a powerful framework from mathematical economics that looks at how different kinds of structures change the outcomes for actors participating in “games” that involve strategic action and information flow. While mathematical economic modeling has been heavily critiqued over the years, for example on the basis that people do not act with the unbounded rationality such models can imply, these models can be a valuable first step in a technical context, especially as they establish the limits of a system’s manipulability by non-human actors such as AI. This latter standard makes this sort of model more relevant than it has ever been.

This is my roundabout way of beginning to investigate the fascinating field of privacy economics. I am a new entrant. So I found what looks like one of the earliest highly cited articles on the subject written by the prolific and venerable Richard Posner, “The Economics of Privacy”, from 1981.

Richard Posner, from Wikipedia

Wikipedia reminds me that Posner is politically conservative, though apparently he has recently changed his mind to support gay marriage and, since the 2008 financial crisis, has reconsidered the laissez-faire rational choice economic model that underlies his legal theory. As I have mainly learned about privacy scholarship from more left-wing sources, it was interesting reading an article that comes from a different perspective.

Posner’s opening position is that the most economically interesting aspect of privacy is the concealment of personal information, and that this is interesting mainly because privacy is bad for market efficiency. He raises examples of employers and employees searching for each other and potential spouses searching for each other. In these cases, “efficient sorting” is facilitated by perfect information on all sides. Privacy is foremost a way of hiding disqualifying information–such as criminal records–from potential business associates and spouses, leading to a market inefficiency. I do not know why Posner does not cite Akerlof (1970) on the “market for ‘lemons’” in this article, but it seems to me that this is the economic theory most reflective of this economic argument. The essential question raised by this line of argument is whether there’s any compelling reason why the market for employees should be any different from the market for used cars.

Posner raises and dismisses each objection he can find. One objection is that employers might heavily weight factors they should not, such as mental illness, gender, or homosexuality. He claims that there’s evidence to show that people are generally rational about these things and there’s no reason to think the market can’t make these decisions efficiently despite fear of bias. I assume this point has been hotly contested from the left since the article was written.
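
For readers who have not seen the lemons argument, here is a minimal sketch of the unraveling logic. It is a textbook stylization and not anything taken from Posner’s article: the uniform quality distribution, the function name lemons_price, and the buyer_premium parameter are all assumptions made for illustration.

```python
def lemons_price(start_price=1.0, buyer_premium=1.5, rounds=50):
    """Iterate the market price in a stylized lemons market.

    Assumptions (all hypothetical, for illustration only):
    - quality is private to sellers and uniform on [0, 1];
    - a seller stays in the market only if price >= their quality;
    - buyers pay buyer_premium times the average quality on offer.
    """
    price = start_price
    for _ in range(rounds):
        # Only sellers with quality <= price remain, so the average
        # quality on offer is half the (capped) price.
        avg_quality = min(price, 1.0) / 2
        price = buyer_premium * avg_quality
    return price


if __name__ == "__main__":
    # Buyers value quality only modestly more than sellers: the price
    # shrinks every round and the market unravels toward zero.
    print(lemons_price(buyer_premium=1.5))
    # Buyers value quality much more than sellers: trade survives.
    print(lemons_price(buyer_premium=2.5))
```

In this toy version, concealed quality drives good sellers out of the market unless buyers value the good far more than sellers do, which is the kind of inefficiency Posner attributes to concealed personal information.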

Posner then looks at the objection that privacy provides a kind of social insurance to those with “adverse personal characteristics” who would otherwise not be hired. He doesn’t like this argument because he sees it as allocating the costs of that person’s adverse qualities to a small group that has to work with that person, rather than spreading the cost very widely across society.

Whatever one thinks about whose interests Posner seems to side with and why, it is refreshing to read an article that at the very least establishes the trade offs around privacy somewhat clearly. Yes, discrimination of many kinds is economically inefficient. We can expect the best performing companies to have progressive hiring policies because that would allow them to find the best talent. That’s especially true if there are large social biases otherwise unfairly skewing hiring.

On the other hand, the whole idea of “efficient sorting” assumes a policy-making interest that I’m pretty sure logically cannot serve the interests of everyone so sorted. It implies a somewhat brutally Darwinist stratification of personnel. It’s quite possible that this is not healthy for an economy in the long term. On the other hand, in this article Posner seems open to other redistributive measures that would compensate for opportunities lost due to revelation of personal information.

There’s an empirical part of the paper in which Posner shows that the percentage of black and Hispanic populations in a state is significantly correlated with the existence of state-level privacy statutes relating to credit, arrest, and employment history. He tries to spin this as an explanation for privacy statutes as the result of strongly organized black and Hispanic political organizations successfully continuing to lobby in their interest on top of existing anti-discrimination laws. I would say that the article does not provide enough evidence to strongly support this causal theory. It would be a stronger argument if the regression had taken into account the racial differences in credit, arrest, and employment state by state, rather than just assuming that this connection is so strong it supports this particular interpretation of the data. However, it is interesting that this variable was more strongly correlated with the existence of privacy statutes than several other variables of interest. It was probably my own ignorance that made me not consider how strongly privacy statutes are part of a social justice agenda, broadly speaking. Considering that disparities in credit, arrest, and employment history could well be the result of other unjust biases, privacy winds up mitigating the anti-signal that these injustices have in the employment market. In other words, it’s not hard to get from Posner’s arguments to a pro-privacy position based, of all things, on market efficiency.

It would be nice to model that more explicitly, if it hasn’t been done yet already.

Posner is quite bullish on privacy tort, thinking that it is generally not so offensive from an economic perspective largely because it’s about preventing misinformation.

Overall, the paper is a valuable starting point for further study in economics of privacy. Posner’s economic lens swiftly and clearly puts the trade-offs around privacy statutes in the light. It’s impressively lucid work that surely bears directly on arguments about privacy and information processing systems today.


Akerlof, G. A. (1970). The market for “lemons”: Quality uncertainty and the market mechanism. The Quarterly Journal of Economics, 488-500.

Nissenbaum, H. (2004). Privacy as contextual integrity. Wash. L. Rev., 79, 119.

Nissenbaum, H. (2009). Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.

Posner, R. A. (1981). The economics of privacy. The American Economic Review, 71(2), 405-409. (jstor)

by Sebastian Benthall at August 23, 2017 07:41 PM