School of Information Blogs

January 16, 2019

Ph.D. student

Notes on O’Neil, Chapter 2, “Bomb Parts”

Continuing with O’Neil’s Weapons of Math Destruction on to Chapter 2, “Bomb Parts”. This is a popular book and these are quick chapters. But that’s no reason to underestimate them! This is some of the most lucid work I’ve read on algorithmic fairness.

This chapter talks about three kinds of “models” used in prediction and decision making, with three examples. O’Neil speaks highly of the kinds of models used in baseball to predict the trajectory of hits and determine the optimal placement of players in the field. (Ok, I’m not so good at baseball terms). These are good, O’Neil says, because they are transparent, they are consistently adjusted with new data, and the goals are well defined.

O’Neil then very charmingly writes about the model she uses mentally to determine how to feed her family. She juggles a lot of variables: the preferences of her kids, the nutrition and cost of ingredients, and time. This is all hugely relatable–everybody does something like this. Her point, it seems, is that this form of “model” encodes a lot of opinions or “ideology” because it reflects her values.

O’Neil then discusses recidivism prediction, specifically the LSI-R (Level of Service Inventory–Revised) tool. It asks questions like “How many previous convictions have you had?” and uses the answers to predict the likelihood of future recidivism. The problem is that (a) this is sensitive to overpolicing in neighborhoods, which has little to do with actual recidivism rates (as opposed to rearrest rates), and (b) black neighborhoods, for example, are more likely to be overpoliced, meaning that the tool, which is not very good at predicting recidivism, has disparate impact. This is an example of what O’Neil calls a weapon of math destruction (WMD), the book’s eponymous term.

She argues that the three qualities of a WMD are Scale, Opacity, and Damage, which makes sense.

As I’ve said, I think this is a better take on algorithmic ethics than almost anything I’ve read on the subject before. Why?

First, it doesn’t use the word “algorithm” at all. That is huge, because 95% of the time the use of the word “algorithmic” in the technology-and-society literature is stupid. People use “algorithm” when they really mean “software”. Now, they use “AI System” to mean “a company”. It’s ridiculous.

O’Neil makes it clear in this chapter that what she’s talking about are different kinds of models. Models can be in one’s head (as in her plan for feeding her family) or in a computer, and both kinds of models can be racist. That’s a helpful, sane view. It’s been the consensus of computer scientists, cognitive scientists, and AI types for decades.

The problem with WMDs, as opposed to other, better models, is that WMD models are unhinged from reality. O’Neil’s complaint is not with the use of models, but rather with models that are used without being properly trained on sound data sampling and statistics. WMDs are not artificial intelligences; they are artificial stupidities.

In more technical terms, it seems like the problem with WMDs is not that they don’t properly trade off predictive accuracy with fairness, as some computer science literature would suggest is necessary. It’s that the systems have high error rates in the first place because the training and calibration systems are poorly designed. What’s worse, this avoidable error is disparately distributed, causing more harm to some groups than others.
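To make that concrete, here’s a toy sketch (the data are invented, not drawn from O’Neil) of what “disparately distributed error” means: compute the model’s error rate separately for each group and compare.

```python
# Toy illustration (data invented): measuring whether a model's error
# falls more heavily on one group than another.
from collections import defaultdict

# (group, actual outcome, predicted outcome)
records = [
    ("A", 0, 0), ("A", 0, 1), ("A", 1, 1), ("A", 0, 0),
    ("B", 0, 1), ("B", 0, 1), ("B", 1, 1), ("B", 0, 1),
]

errors = defaultdict(list)
for group, actual, predicted in records:
    errors[group].append(int(actual != predicted))

for group, errs in sorted(errors.items()):
    print(f"group {group}: error rate {sum(errs) / len(errs):.2f}")
# group A: error rate 0.25
# group B: error rate 0.75  -> the avoidable error is disparately distributed
```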

This is a wonderful and eye-opening account of unfairness in the models used by automated decision-making systems (note the language). Why? Because it shows that there is a connection between statistical bias, the kind of bias that creates distortions in a quantitative predictive process, and social bias, the kind of bias people worry about politically; the word “bias” is routinely used in both senses. If there is statistical bias that weighs against some social group, then that’s definitely, 100% a form of bias.

Importantly, this kind of bias–statistical bias–is not something that every model must have. Only badly made models have it. It’s something that can be mitigated using scientific rigor and sound design. If we see the problem the way O’Neil sees it, then we can see clearly how better science, applied more rigorously, is also good for social justice.

As a scientist and technologist, it’s been terribly discouraging in the past years to be so consistently confronted with a false dichotomy between sound engineering and justice. At last, here’s a book that clearly outlines how the opposite is the case!

by Sebastian Benthall at January 16, 2019 04:44 AM

January 15, 2019

Ph.D. student

Researchers receive grant to study the invisible work of maintaining open-source software

Researchers at the UC Berkeley Institute for Data Science (BIDS), the University of California, San Diego, and the University of Connecticut have been awarded a grant of $138,055 from the Sloan Foundation and the Ford Foundation as part of a broad initiative to investigate the sustainability of digital infrastructures. The grant funds research into the maintenance of open-source software (OSS) projects, particularly focusing on the visible and invisible work that project maintainers do to support their projects and communities, as well as issues of burnout and maintainer sustainability. The research project will be led by BIDS staff ethnographer and principal investigator Stuart Geiger and will be conducted in collaboration with Lilly Irani and Dorothy Howard at UC San Diego, Alexandra Paxton at the University of Connecticut, and Nelle Varoquaux and Chris Holdgraf at UC Berkeley.

Many open-source software projects have become foundational components for many stakeholders and are now widely used behind-the-scenes to support activities across academia, the tech industry, government, journalism, and activism. OSS projects are often initially created by volunteers and provide immense benefits for society, but their maintainers can struggle with how to sustain and support their projects, particularly when widely used in increasingly critical contexts. Most OSS projects are maintained by only a handful of individuals, and community members often talk about how their projects might collapse if only one or two key individuals leave the project. Project leaders and maintainers must do far more than just write code to ensure a project’s long-term success: They resolve conflicts, perform community outreach, write documentation, review others’ code, mentor newcomers, coordinate with other projects, and more. However, many OSS project leaders and maintainers have publicly discussed the effects of burnout as they find themselves doing unexpected and sometimes thankless work.

The one-year research project — The Visible and Invisible Work of Maintaining Open-Source Digital Infrastructure — will study these issues in various software projects, including software libraries, collaboration platforms, and discussion platforms that have come to be used as critical digital infrastructure. The researchers will conduct interviews with project maintainers and contributors from a wide variety of projects, as well as analyze projects’ code repositories and communication platforms. The goal of the research is to better understand what project maintainers do, the challenges they face, and how their work can be better supported and sustained. This research on the invisible work of maintenance will help maintainers, contributors, users, and funders better understand the complexities within such projects, helping set expectations, develop training programs, and formulate evaluations.

by R. Stuart Geiger at January 15, 2019 08:00 AM

January 12, 2019

Ph.D. student

Reading O’Neil’s Weapons of Math Destruction

I probably should have already read Cathy O’Neil’s Weapons of Math Destruction. It was a blockbuster of the tech/algorithmic ethics discussion. It’s written by an accomplished mathematician, which I admire. I’ve also now seen O’Neil perform bluegrass music twice in New York City and think her band is great. At last I’ve found a copy and have started to dig in.

On the other hand, as is probably clear from other blog posts, I have a hard time swallowing a lot of the gloomy political work that puts the role of algorithms in society in such a negative light. I encounter it very frequently, and every time I feel that some misunderstanding must have happened; something seems off.

It’s very clear that O’Neil can’t be accused of mathophobia or of not understanding the complexity of the algorithms at play, which is an easy way to throw doubt on the arguments of some technology critics. Yet perhaps because it’s a popular book and not an academic work of Science and Technology Studies, I haven’t seen its arguments parsed through and analyzed in much depth.

This is a start. These are my notes on the introduction.

O’Neil describes the turning point in her career where she soured on math. After being an academic mathematician for some time, O’Neil went to work as a quantitative analyst for D.E. Shaw. She saw it as an opportunity to work in a global laboratory. But then the 2008 financial crisis made her see things differently.

The crash made it all too clear that mathematics, once my refuge, was not only deeply entangled in the world’s problems but also fueling many of them. The housing crisis, the collapse of major financial institutions, the rise of unemployment–all had been aided and abetted by mathematicians wielding magic formulas. What’s more, thanks to the extraordinary powers that I loved so much, math was able to combine with technology to multiply the chaos and misfortune, adding efficiency and scale to systems I now recognized as flawed.

O’Neil, Weapons of Math Destruction, p.2

As an independent reference on the causes of the 2008 financial crisis, which of course has been a hotly debated and disputed topic, I point to Sassen’s 2017 “Predatory Formations” article. Indeed, the systems that developed the sub-prime mortgage market were complex, opaque, and hard to regulate. Something went seriously wrong there.

But was it mathematics that was the problem? This is where I get hung up. I don’t understand the mindset that would attribute a crisis in the financial system to the use of abstract, logical, rigorous thinking. Consider the fact that there would not have been a financial crisis if there had not been a functional financial services system in the first place. Getting a mortgage and paying it off, and the systems that allow this to happen, all require mathematics to function. When these systems operate normally, they are taken for granted. When they suffer a crisis, when the system fails, the mathematics takes the blame. But a system can’t suffer a crisis if it didn’t start working rather well in the first place–otherwise, nobody would depend on it. Meanwhile, the regulatory reaction to the 2008 financial crisis required, of course, more mathematicians working to prevent the same thing from happening again.

So in this case (and I believe others) the question can’t be whether mathematics, but rather which mathematics. It is so sad to me that these two questions get conflated.

O’Neil goes on to describe a case where an algorithm results in a teacher losing her job for not adding enough value to her students one year. An analysis makes a good case that the cause of her students’ scores not going up is that in the previous year, the students’ scores were inflated by teachers cheating the system. This argument was not considered conclusive enough to change the administrative decision.

Do you see the paradox? An algorithm processes a slew of statistics and comes up with a probability that a certain person might be a bad hire, a risky borrower, a terrorist, or a miserable teacher. That probability is distilled into a score, which can turn someone’s life upside down. And yet when the person fights back, “suggestive” countervailing evidence simply won’t cut it. The case must be ironclad. The human victims of WMDs, we’ll see time and again, are held to a far higher standard of evidence than the algorithms themselves.

O’Neil, WMD, p.10

Now this is a fascinating point, and one that I don’t think has been taken up enough in the critical algorithms literature. It resonates with a point that came up earlier: traditional collective human decision making is often driven by agreement on narratives, whereas automated decisions can be a qualitatively different kind of collective action because they can act directly on probabilistic judgments.

I have to wonder what O’Neil would argue the solution to this problem is. From her rhetoric, it seems like her recommendation must be to prevent automated decision systems from acting on probabilistic judgments. In other words, one could raise the evidentiary standard for algorithms so that it is equal to the standard that people use with each other.

That’s an interesting proposal. I’m not sure what the effects of it would be. I expect that the result would be lower expected values of whatever target was being optimized for, since the system would not be able to “take bets” below a certain level of confidence. One wonders if this would be a more or less arbitrary system.
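Here’s a toy calculation of the effect I’d expect, with invented probabilities and payoffs: as the confidence threshold rises, the system forgoes bets that still had positive expected value, so the total expected value of the optimized target falls.

```python
# Toy calculation (probabilities and payoffs invented): raising the
# confidence threshold below which the system may not "take the bet"
# lowers the expected value of whatever target is being optimized.

cases = [0.95, 0.90, 0.70, 0.65, 0.55]   # predicted probability each decision pays off
payoff_if_right, cost_if_wrong = 1.0, -1.0

def expected_value(threshold):
    total = 0.0
    for p in cases:
        if p >= threshold:                # only act when confident enough
            total += p * payoff_if_right + (1 - p) * cost_if_wrong
    return total

for threshold in (0.0, 0.6, 0.8):
    print(f"threshold {threshold:.1f}: expected value {expected_value(threshold):.2f}")
# threshold 0.0: expected value 2.50
# threshold 0.6: expected value 2.40
# threshold 0.8: expected value 1.70
```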

Sadly, in order to evaluate this proposal seriously, one would have to employ mathematics. Which is, in O’Neil’s rhetoric, a form of evil magic. So, perhaps it’s best not to try.

O’Neil attributes the problems of WMDs to the incentives of the data scientists building the systems. Maybe they know that their work affects people, especially the poor, in negative ways. But they don’t care.

But as a rule, the people running the WMDs don’t dwell on these errors. Their feedback is money, which is also their incentive. Their systems are engineered to gobble up more data and fine-tune their analytics so that more money will pour in. Investors, of course, feast on these returns and shower WMD companies with more money.

O’Neil, WMD, p.13

Calling out greed as the problem is effective and true in a lot of cases. I’ve argued myself that the real root of the technology ethics problem is capitalism: the way investors drive what products get made and deployed. This is a worthwhile point to make and one that doesn’t get made enough.

But the logical implications of this argument are off. Suppose it is true that “as a rule”, the algorithms that do harm are made by people responding to the incentives of private capital. (IF harmful algorithm, THEN private capital created it.) That does not mean that there can’t be good algorithms as well, such as those created in the public sector. In other words, there are algorithms that are not WMDs.

So the insight here has to be that private capital investment corrupts the process of designing algorithms, making them harmful. One could easily make the case that private capital investment corrupts and makes harmful many things that are not algorithmic as well. For example, the historic trans-Atlantic slave trade was a terribly evil manifestation of capitalism. It did not, as far as I know, depend on modern day computer science.

Capitalism here looks to be the root of all evil. The fact that companies are using mathematics is merely incidental. And O’Neil should know that!

Here’s what I find so frustrating about this line of argument. Mathematical literacy is critical for understanding what’s going on with these systems and how to improve society. O’Neil certainly has this literacy. But there are many people who don’t have it. There is a power disparity there which is uncomfortable for everybody. But while O’Neil is admirably raising awareness about how these kinds of technical systems can and do go wrong, the single-minded focus and framing risk giving people the wrong idea that these intellectual tools are always bad or dangerous. That is not a solution to anything, in my view. Ignorance is never more ethical than education. But there is an enormous appetite among ignorant people for being told that it is so.

References

O’Neil, Cathy. Weapons of math destruction: How big data increases inequality and threatens democracy. Broadway Books, 2017.

Sassen, Saskia. “Predatory Formations Dressed in Wall Street Suits and Algorithmic Math.” Science, Technology and Society 22.1 (2017): 6-20.

by Sebastian Benthall at January 12, 2019 08:00 PM

January 11, 2019

Ph.D. student

"no photos please" and other broadcasts

We've spent a lot of collective time and effort on design and policy to support the privacy of the user of a piece of software, whether it's the Web or a mobile app or a device. But more current and more challenging is the privacy of the non-user of the app, the privacy of the bystander. With the ubiquity of sensors, we are increasingly observed, not just by giant corporations or government agencies, but by, as they say, little brothers.

Consider the smartphone camera. Taking digital photos is free, quick and easy; resolution and quality increase; metadata (like precise geolocation) is attached; sharing those photos is easy via online services. As facial recognition has improved, it has become easier to automatically identify the people depicted in a photo, whether they're the subject of a portrait or just in the background. If you don't want to share records of your precise geolocation and what you're doing in public places, with friends, family, strangers and law enforcement, it's no longer enough to be careful with the technology you choose to use, you'd also have to be constantly vigilant about the technology that everyone around you is using.

While it may be tempting to draw a "throw your hands up" conclusion from this -- privacy is dead, get over it, there's nothing we can easily do about it -- we actually have widespread experience with this kind of value and various norms to protect it. At conferences and public events, it's not uncommon to have a system of stickers on nametags to either opt-in or opt-out of photos. This is a help (not a hindrance) for event photographers: rather than asking everyone to pose in your photo, or asking everyone after the fact if they're alright with your posting a public photo, or being afraid of posting a photo and facing the anger of your attendees, you can just keep an eye on the red and green dots on those plastic nametags and feel confident that you're respecting the attendees at your event.

There are similar norms in other settings. Taking video in the movie theater violates legal protections, but there are also widespread and reasonably well-enforced norms against capturing video of live theater productions or comedians who test out new material in clubs, on grounds other than copyright. Art museums will often tell you whether photos are welcome or prohibited. In some settings the privacy of the people present is so essential that unwritten or written rules prohibit cameras altogether: at nude hot springs, for example, you just can’t use a camera at all. You wouldn’t take a photo in the waiting room of your doctor’s office and you’ll invite anger and social confrontation if you’re taking photos of other people’s children at your local playground.

And even in "public" or in contexts with friends, there are spoken or unspoken expectations. "Don't post that photo of me drinking, please." "Let me see how I look in that before you post it on Facebook." "Everyone knows that John doesn't like to have his photo taken."

As cameras become smaller and more widely used, and encompass depictions of more people, and are shared more widely and easily, and identifications of depicted people can also be shared, our social norms and spoken discussions don’t easily keep up. Checking with people before you post a photo of them is absolutely a good practice and I encourage you to follow it. But why not also use technology to facilitate checking others’ preferences?

We have all the tools we need to make "no photos please" nametag stickers into unobtrusive and efficiently communicated messages. If you're attending a conference or party and don't want people to take your photo, just tap the "no photos please" setting on your smartphone before you walk in. And if you're taking photos at an event, your camera will show a warning when it knows that someone in the room doesn't want their photo taken, so that you can doublecheck with the people in your photo and make sure you're not inadvertently capturing someone in the background. And the venue can remind you that way too, in case you don't know the local norm that pictures shouldn't be taken in the church or museum.

Mockup of turning on No Photos Please mode. Camera icon by Mourad Mokrane from the Noun Project.

As a technical matter, I think we're looking at Bluetooth broadcast beacons, from smartphones or stationary devices. That could be a small Arduino-based widget on the wall of a commercial venue, or one day you might have a poker-chip-sized device in your pocket that you can click into private mode. When you're using a compatible camera app on your phone or a compatible handheld camera, your device regularly scans for nearby Bluetooth beacons and if it sees a "no photos please" message, it shows a (dismissable) warning.
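To sketch what the camera-side logic might look like (the message format is invented, and the beacon scan is stubbed out where a real app would call the platform’s Bluetooth Low Energy API):

```python
# Rough sketch (message format invented). scan_nearby_beacons() stands in
# for a real Bluetooth Low Energy scan; here it returns canned data so the
# sketch runs on its own.

NO_PHOTOS_SERVICE = "no-photos-please/v0"   # hypothetical beacon identifier

def scan_nearby_beacons():
    """Stub for a BLE scan: pretend we heard a venue beacon and one other device."""
    return [
        {"service": NO_PHOTOS_SERVICE, "source": "venue", "note": "gallery policy"},
        {"service": "unrelated-service", "source": "unknown"},
    ]

def photo_warning(beacons):
    """Return a dismissable warning string if anyone nearby has opted out of photos."""
    opted_out = [b for b in beacons if b.get("service") == NO_PHOTOS_SERVICE]
    if not opted_out:
        return None
    sources = ", ".join(b.get("source", "someone nearby") for b in opted_out)
    return f"'No photos please' signal detected ({sources}). Check before you shoot?"

if __name__ == "__main__":
    warning = photo_warning(scan_nearby_beacons())
    if warning:
        print(warning)   # a camera app would show this as a dismissable dialog
```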

Mockup of camera showing no photos warning.

The discretionary communication of preferences is ideal in part because it isn't self-enforcing. For example, if the police show up at the political protest you're attending and broadcast a "no photos please" beacon, you can (and should) override your camera warning to take photos of their official activity, as a safeguard for public safety and accountability. An automatically-enforcing DRM-style system would be both infeasible to construct and, if it were constructed, inappropriately inviting to government censorship or aggressive copyright maximalism. Technological hints are also less likely to confusingly over-promise a protection: we can explain to people that the "no photos please" beacon doesn't prevent impolite or malicious people from surreptitiously taking your photo, just as people are extremely familiar with the fact that placards, polite requests and even laws are sometimes ignored.

Making preferences technically available could also help with legal compliance. If you’re taking a photo at an event and get a “no photos” warning, your device UI can help you log why you might be taking the photo anyway. Tap “I got consent” and your camera can embed metadata in the file recording that you gathered consent from the depicted people. Tap “Important public purpose” at the protest and you’ll have a machine-readable affirmation of what you’re doing in place, and your Internet-connected phone can also use that signal to make sure photos in this area are promptly backed up securely in case your device is confiscated.
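A sketch of what that machine-readable record might look like; the schema is invented, and a real camera app would presumably embed it as photo metadata or a sidecar file:

```python
# Invented schema for a capture-time override record; a real camera app
# might embed something like this as photo metadata or a sidecar file.
import json
import time

def override_record(reason, note=None):
    """Log why a photo was taken despite a 'no photos please' beacon."""
    assert reason in {"consent_obtained", "important_public_purpose"}
    return {
        "no_photos_beacon_seen": True,
        "override_reason": reason,
        "note": note,
        "timestamp": time.time(),
    }

print(json.dumps(override_record("consent_obtained", note="asked the two people in frame"), indent=2))
```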

People's preferences are of course more complicated than just "no photos please" or "sure, take my photo". While I, like many, have imagined that sticky policies could facilitate rules of how data is subsequently shared and used, there are good reasons to start with the simple capture-time question. For one, it's familiar, from these existing social and legal norms. For another, it can be a prompt for real-time in-person conversation. Rather than assuming an error-free technical-only system of preference satisfaction, this can be a quick reminder to check with the people right there in front of you for those nuances, and to do so prior to making a digital record.

Broadcast messages provide opportunities that I think we haven't fully explored or embraced in the age of the Internet and the (rightfully lauded) end-to-end principle. Some communications just naturally take the form of letting people in a geographic area know something relevant to the place. "The cafe is closing soon." "What's the history of that statue?" "What's the next stop on this train and when are we scheduled to arrive?" If WiFi routers included latitude and longitude in the WiFi network advertisement, your laptop could quickly and precisely geolocate even in areas where you don't have Internet access, and do so passively, without broadcasting your location to a geolocation provider. (That one is a little subtle; we wrote a paper on it back when we were evaluating the various privacy implications of WiFi geolocation databases at Berkeley.) What about, "Anyone up for a game of chess?" (See also, Grindr.) eBook readers could optionally broadcast the title of the current book to re-create the lovely serendipity of seeing the book cover a stranger is reading on the train. Music players could do the same.
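As a toy illustration of the WiFi idea (the beacon data and the signal-strength weighting are invented): if each router advertised its own coordinates, the client could compute a weighted centroid entirely locally.

```python
# Toy sketch of the idea (beacon data and weighting invented): if nearby
# WiFi beacons advertised their own latitude/longitude, the client could
# estimate its position locally, without asking a geolocation provider.

# (latitude, longitude, received signal strength in dBm)
beacons = [
    (37.8715, -122.2600, -40),   # strong signal, probably close by
    (37.8702, -122.2590, -70),
    (37.8720, -122.2620, -85),
]

def estimate_position(beacons):
    """Weighted centroid: stronger signals (less negative dBm) count for more."""
    weights = [1.0 / max(1.0, -rssi) for _, _, rssi in beacons]
    total = sum(weights)
    lat = sum(w * b_lat for w, (b_lat, _, _) in zip(weights, beacons)) / total
    lon = sum(w * b_lon for w, (_, b_lon, _) in zip(weights, beacons)) / total
    return lat, lon

print(estimate_position(beacons))   # roughly (37.871, -122.260)
```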

The Internet is amazing for letting us communicate with people around the world around shared interests. We should see the opportunity for networking technology to also facilitate communications, including conversations about privacy, with those nearby.


Some end notes that my head wants to let go of: There is some prior art here that I don't want to dismiss or pass over, I just think we should push it further. A couple examples:

  • Google folks have developed broadcast URLs that they call The Physical Web so that real-life places can share a Web page about them (over mDNS or Bluetooth Low Energy) and I hope one day we can get a link to the presenter's current slide using networking rather than everyone taking a picture of a projected URL and awkwardly typing it into our laptops later.
  • The Occupy movement showed an interest in geographically-located Web services, including forums and chatrooms that operate over WiFi but not connected to the Internet. Occupy Here:
    Anyone within range of an Occupy.here wifi router, with a web-capable smartphone or laptop, can join the network “OCCUPY.HERE,” load the locally-hosted website http://occupy.here, and use the message board to connect with other users nearby.

Getting a little further afield but still related, it would be helpful if the network provider could communicate directly with the subscriber using the expressive capability of the Web. Lacking this capability, we've seen frustrating abuses of interception: captive portals redirect and impersonate Web traffic; ISPs insert bandwidth warnings as JavaScript insecurely transplanted into HTTP pages. Why not instead provide a way for the network to push a message to the client, not by pretending to be a server you happen to connect to around that same time, but just as a clearly separate message? ICMP control messages are an existing but underused technology.

by nick@npdoty.name at January 11, 2019 06:33 AM

January 09, 2019

Ph.D. student

computational institutions as non-narrative collective action

Nils Gilman recently pointed to a book chapter that confirms the need for “official futures” in capitalist institutions.

Nils indulged me in a brief exchange that helped me better grasp at a bothersome puzzle.

There is a certain class of intellectuals that insist on the primacy of narratives as a mode of human experience. These tend to be, not too surprisingly, writers and other forms of storytellers.

There is a different class of intellectuals that insists on the primacy of statistics. Statistics does not make it easy to tell stories because it is largely about the complexity of hypotheses and our lack of confidence in them.

The narrative/statistic divide could be seen as a divide between academic disciplines. It has often been taken to be, I believe wrongly, the crux of the “technology ethics” debate.

I questioned Nils as to whether his generalization stood up to statistically driven allocation of resources; i.e., those decisions made explicitly on probabilistic judgments. He argued that in the end, management and collective action require consensus around narrative.

In other words, what keeps narratives at the center of human activity is that (a) humans are in the loop, and (b) humans are collectively in the loop.

The idea that communication is necessary for collective action is one I used to put great stock in when studying Habermas. For Habermas, consensus, and especially linguistic consensus, is how humanity moves together. Habermas contrasted this mode of knowledge aimed at consensus and collective action with technical knowledge, which is aimed at efficiency. Habermas envisioned a society ruled by communicative rationality, deliberative democracy; following this line of reasoning, this communicative rationality would need to be a narrative rationality. Even if this rationality is not universal, it might, in Habermas’s later conception of governance, be shared by a responsible elite. Lawyers and a judiciary, for example.

The puzzle that recurs again and again in my work has been the challenge of communicating how technology has become an alternative form of collective action. The claim made by some that technologists are a social “other” makes more sense if one sees them (us) as organizing around non-narrative principles of collective behavior.

It is I believe beyond serious dispute that well-constructed, statistically based collective decision-making processes perform better than many alternatives. In the field of future predictions, Philip Tetlock’s work on superforecasting teams and prior work on expert political judgment has long stood as an empirical challenge to the supposed primacy of narrative-based forecasting. This challenge has not been taken up; it seems rather one-sided. One reason for this may be because the rationale for the effectiveness of these techniques rests ultimately in the science of statistics.

It is now common to insist that Artificial Intelligence should be seen as a sociotechnical system and not as a technological artifact. I wholeheartedly agree with this position. However, it is sometimes implied that to understand AI as a sociotechnical system, one must understand it in narrative terms. This is an error; it would imply that the collective actions taken to build an AI system and the technology itself are held together by narrative communication.

But if the whole purpose of building an AI system is to collectively act in a way that is more effective because of its facility with the nuances of probability, then the narrative lens will miss the point. The promise and threat of AI is that it delivers a different, often more effective form of collective action or institution. I’ve suggested that “computational institution” might be the best way to refer to such a thing.

by Sebastian Benthall at January 09, 2019 03:54 PM

January 08, 2019

MIMS 2012

My Yardstick for Empathy

A different perspective. Photo by Jamie Street on Unsplash

How do you know if you’re being empathetic? It’s easy to throw the term around, but difficult to actually apply. This is important to understand in my chosen field of design, but can also help anyone improve their interactions with other people.

My yardstick for being empathetic is imagining myself making the same decisions, in the same situation, that another person made.

If I look at someone’s behavior and think, “That doesn’t make sense,” or “Why did they do that?!” then I’m not being empathetic. I’m missing a piece of context — about their knowledge, experiences, skills, emotional state, environment, etc. — that led them to do what they did. When I feel that way, I push myself to keep searching for the missing piece that will make their actions become the only rational ones to take.

Is this always possible? No. Even armed with the same knowledge, operating in the same environment, and possessing the same skills as another person, I will occasionally make different decisions than them. Every individual is unique, and interprets and acts on stimuli differently.

Even so, imagining myself behaving the same as another person is what I strive for. That’s my yardstick for empathy.


If you want to learn more about empathy and how to apply it to your work and personal life, I highly recommend Practical Empathy by Indi Young.

by Jeff Zych at January 08, 2019 08:22 PM

January 04, 2019

MIMS 2012

Books Read 2018

In 2018 I read 23 books, which is a solid 9 more than last year’s paltry 14, and 1 more than 2016. I credit the improvement to the 4-month sabbatical I took in the spring. Not working really frees up time 😄

For the last 2 years I said I needed to read more fiction since I only read 3 in 2016 and 2 in 2017. So how did I do? I’m proud to say I managed to read 7 fiction books this year (if you can count My Dad Wrote a Porno as “fiction”…). My reading still skews heavily to non-fiction, and specifically design, but that’s what I’m passionate about and it helps me professionally, so I’m ok with it.

I also apparently didn’t finish any books in January or February. I thought this might have been a mistake at first, but when I looked back on that time I realized it’s because I was wrapping things up at Optimizely, and reading both Quicksilver by Neal Stephenson and Story by Robert McKee at the same time, which are long books that took a while to work through.

Highlights

Story: Substance, Structure, Style, and the Principles of Screenwriting

by Robert McKee

I’ve read next to nothing about writing stories before, but Robert McKee’s primer on the subject is excellent. Even though I’m not a fiction author, I found his principles for writing compelling narratives valuable beyond just the domain of screenwriting.

Handstyle Lettering

Published and edited by Victionary

There wasn’t much to “read” in this book, but it was full of beautiful hand-lettered pieces that continue to inspire me to be a better letterer.

The Baroque Cycle

by Neal Stephenson

Neal Stephenson’s Baroque Cycle is a broad, staggering, 3-volume and 2,500+ page opus of historical science fiction, making it no small feat to complete (I read the first 2 this year, and am almost done with the 3rd volume). It takes place during the scientific revolution of the 17th and 18th centuries when the world transitioned out of feudal rule towards a more rational and merit-based society that we would recognize as modern. It weaves together a story between fictional and non-fictional characters, including Newton, Leibniz, Hooke, Wren, royalty, and other persons-of-quality. Although the series can be slow and byzantine at times, Stephenson makes up for it with his attention to detail and the sheer amount of research and effort he put into accurately capturing the time period and bringing the story to life. Even just having the audacity to put yourself in Newton’s head to speak from his perspective, much less to do so convincingly, makes the series worth the effort.

Good Strategy, Bad Strategy

by Richard P. Rumelt

Strategy is a fuzzy concept, but Rumelt makes it concrete and approachable with many examples of good and bad strategy. Read my full notes here. Highly recommended.

Bird by Bird: Some Instructions on Writing and Life

by Anne Lamott

A great little meditation on the writing process (and life!), sprinkled with useful tips and tricks throughout.

Creative Selection: Inside Apple’s design process

by Ken Kocienda

Ken Kocienda was a software engineer during the “golden age of Steve Jobs,” and provides a fascinating insight into the company’s design process. I’m still chewing on what I read (and hope to publish more thoughts soon), but it’s striking how different it is from any process I’ve ever seen at any company, and different from best practices written about in books. It’s basically all built around Steve Jobs’ exacting taste, with designers and developers demoing their work to Steve with the hope of earning his approval. Very difficult to replicate, but the results speak for themselves.

Ogilvy on Advertising

by David Ogilvy

I hadn’t read much about advertising before, but Ogilvy’s book on the subject is great. It’s full of practical advice on how to write compelling headlines and ads that sell. Read my notes here.

Full List of Books Read

  • Story: Substance, Structure, Style, and the Principles of Screenwriting by Robert McKee (3/7/18)
  • The Color of Pixar by Tia Kratter (3/18/18)
  • Conversational Design by Erika Hall (3/27/18)
  • Quicksilver by Neal Stephenson (4/3/18)
  • Handstyle Lettering published and edited by Victionary (4/24/18)
  • Biomimicry: Innovation Inspired by Nature by Janine M. Benyus (5/4/18)
  • Design is Storytelling by Ellen Lupton (5/11/18)
  • Trip by Tao Lin (5/20/18)
  • Good Strategy, Bad Strategy: The Difference and Why it Matters by Richard P. Rumelt (5/27/18)
  • Bird by Bird: Some Instructions on Writing and Life by Anne Lamott (6/10/18)
  • The Inmates are Running the Asylum by Alan Cooper (6/13/18)
  • It Chooses You by Miranda July (6/13/18)
  • String Theory by David Foster Wallace (6/22/18)
  • Invisible Cities by Italo Calvino (6/28/18)
  • My Dad Wrote a Porno by Jamie Morton, James Cooper, Alice Levine, and Rocky Flintstone (7/1/18)
  • The User Experience Team of One by Leah Buley (7/8/18)
  • Change by Design by Tim Brown (9/3/18)
  • Darkness at Noon by Arthur Koestler (9/16/2018)
  • Creative Selection: Inside Apple’s design process during the golden age of Steve Jobs by Ken Kocienda (9/20/18)
  • The Confusion by Neal Stephenson (9/26/18)
  • How to Change Your Mind by Michael Pollan (10/27/18)
  • Ogilvy on Advertising by David Ogilvy (11/11/18)
  • Draft No. 4: On the Writing Process by John McPhee (11/14/18)

by Jeff Zych at January 04, 2019 12:26 AM

December 30, 2018

Ph.D. student

State regulation and/or corporate self-regulation

The dust from the recent debates over regulation versus industry self-regulation in the data/tech/AI industry appears to be settling. The smart money is on regulation and self-regulation being complementary for attaining the goal of an industry dominated by responsible actors. This trajectory leads to centralized corporate power that is led from the top; it is a Hamiltonian, not Jeffersonian, solution, in Pasquale’s terms.

I am personally not inclined towards this solution. But I have been convinced to see it differently after a conversation today about environmentally sustainable supply chains in food manufacturing. Nestle, for example, has been internally changing its sourcing practices toward more sustainable chocolate. It’s able to finance this change from its profits, and when it does change its internal policy, it operates on a scale that’s meaningful. It is able to make this transition in part because non-profits, NGOs, and farmers’ cooperatives laid the groundwork for sustainable sourcing external to the company. This lowers the barriers to having Nestle switch over to new sources–they have already been subsidized through philanthropy and international aid investments.

Supply chain decisions, ‘make-or-buy’ decisions, are at the heart of transaction cost economics (TCE) and critical to the constitution of institutions in general. What this story about sustainable sourcing tells us is that the configuration of private, public, and civil society institutions is complex, and that there are prospects for agency and change in the reconfiguration of those relationships. This is no different in the ‘tech sector’.

However, this theory of economic and political change is not popular; it does not have broad intellectual or media appeal. Why?

One reason may be because while it is a critical part of social structure, much of the supply chain is in the private sector, and hence is opaque. This is not a matter of transparency or interpretability of algorithms. This is about the fact that private institutions, by virtue of being ‘private’, do not have to report everything that they do and, probably, shouldn’t. But since so much of what is done by the massive private sector is of public import, there’s a danger of the privatization of public functions.

Another reason why this view of political change through the internal policy-making of enormous private corporations is unpopular is that it leaves decision-making up to a very small number of people–the elite managers of those corporations. The real disparity of power involved in private corporate governance means that the popular attitude towards that governance is, more often than not, irrelevant. Even more than political elites, corporate elites are not accountable to a constituency. They are accountable, I suppose, to their shareholders, who have material interests disconnected from political will.

This disconnected shareholder will is one of the main reasons why I’m skeptical about the idea that large corporations and their internal policies are where we should place our hopes for moral leadership. But perhaps what I’m missing is the appropriate intellectual framework for how this will is shaped and what drives these kinds of corporate decisions. I still think TCE might provide insights that I’ve been missing. But I am on the lookout for other sources.

by Sebastian Benthall at December 30, 2018 08:39 PM

December 24, 2018

Ph.D. student

Ordoliberalism and industrial organization

There’s a nice op-ed by Wolfgang Münchau in FT, “The crisis of modern liberalism is down to market forces”.

Among other things, it reintroduces the term “ordoliberalism“, a particular Germanic kind of enlightened liberalism designed to prevent the kind of political collapse that had precipitated the war.

In Münchau’s account, the key insight of ordoliberalism is its attention to questions of social equality, but not through the mechanism of redistribution. Rather, ordoliberal interventions primarily target industrial organization, favoring small to mid-sized companies.

As Germany’s economy remains robust and so far relatively politically stable, it’s interesting that ordoliberalism isn’t discussed more.

Another question that must be asked is to what extent the rise of computational institutions challenges the kind of industrial organization recommended by ordoliberalism. If computation induces corporate concentration, and there are not good policies for addressing that, then that’s due to a deficiency in our understanding of what ‘market forces’ are.

by Sebastian Benthall at December 24, 2018 02:32 PM

December 22, 2018

Ph.D. student

When *shouldn’t* you build a machine learning system?

Luke Stark raises an interesting question, directed at “ML practitioner”:

As an “ML practitioner” in on this discussion, I’ll have a go at it.

In short, one should not build an ML system for making a class of decisions if there is already a better system for making that decision that does not use ML.

An example of a comparable system that does not use ML would be a team of human beings with spreadsheets, or a team of people employed to judge for themselves.

There are a few reasons why a non-ML system could be superior in performance to an ML system:

  • The people involved could have access to more data, in the course of their lives, in more dimensions of variation, than is accessible by the machine learning system.
  • The people might have more sensitized ability to make semantic distinctions, such as in words or images, than an ML system
  • The problem to be solved could be a “wicked problem” that is itself over a very high-dimensional space of options, with very irregular outcomes, such that they are not amenable to various forms of, e.g., linear approximations
  • The people might be judging an aspect of their own social environment, such that the outcome’s validity is socially procedural (as in the outcome of a vote, or of an auction)

These are all fine reasons not to use an ML system. On the other hand, the term “ML” has been extended, as with “AI”, to include many hybrid human-computer systems, which has led to some confusion. So, for example, crowdsourced labels of images provide useful input data to ML systems. This hybrid system might perform semantic judgments over a large scale of data, at a high speed, at a tolerable rate of accuracy. Does this system count as an ML system? Or is it a form of computational institution that rivals other ways of solving the problem, and just so happens to have a machine learning algorithm as part of its process?

Meanwhile, the research frontier of machine learning is all about trying to solve problems that haven’t previously been solved, or solved as well, by alternative kinds of systems. This means there will always be a disconnect between machine learning research, which is trying to expand what it is possible to do with machine learning, and which machine learning systems should, today, be deployed. Sometimes, research is done to develop technology that is not mature enough to deploy.

We should expect that a lot of ML research is done on things that should not ultimately be deployed! That’s because until we do the research, we may not understand the problem well enough to know the consequences of deployment. There’s a real sense in which ML research is about understanding the computational contours of a problem, whereas ML industry practice is about addressing the problems customers have with an efficient solution. Often this solution is a hybrid system in which ML only plays a small part; the use of ML here is really about a change in the institutional structure, not so much a part of what service is being delivered.

On the other hand, there have been a lot of cases–search engines and social media being important ones–where the scale of data and the use of ML for processing have allowed for a qualitatively different form of product or service. These are now the big-deal companies we are constantly talking about. These are pretty clearly cases of successful ML.

by Sebastian Benthall at December 22, 2018 06:47 PM

computational institutions

As the “AI ethics” debate metastasizes in my newsfeed and scholarly circles, I’m struck by the frustrations of technologists and ethicists who seem to be speaking past each other.

While these tensions play out along disciplinary fault-lines, for example, between technologists and science and technology studies (STS), the economic motivations are more often than not below the surface.

I believe this is to some extent a problem of the nomenclature, which is again the function of the disciplinary rifts involved.

Computer scientists work, generally speaking, on the design and analysis of computational systems. Many see their work as bounded by the demands of the portability and formalizability of technology (see Selbst et al., 2019). That’s their job.

This is endlessly unsatisfying to critics of the social impact of technology. STS scholars will insist on changing the subject to “sociotechnical systems”, a term that means something very general: the assemblage of people and artifacts that are not people. This, fairly, removes focus from the computational system and embeds it in a social environment.

A goal of this kind of work seems to be to hold computational systems, as they are deployed and used socially, accountable. It must be said that once this happens, we are no longer talking about the specialized domain of computer science per se. It is a wonder why STS scholars are so often picking fights with computer scientists, when their true beef seems to be with businesses that use and deploy technology.

The AI Now Institute has attempted to rebrand the problem by discussing “AI Systems” as, roughly, those sociotechnical systems that use AI. This is on the one hand more specific–AI is a particular kind of technology, and perhaps it has particular political consequences. But their analysis of AI systems quickly overflows into sweeping claims about “the technology industry”, and it’s clear that most of their recommendations have little to do with AI, and indeed are trying, once again, to change the subject from discussion of AI as a technology (a computer science research domain) to a broader set of social and political issues that do, in fact, have their own disciplines where they have been researched for years.

The problem, really, is not that any particular conversation is not happening, or is being excluded, or is being shut down. The problem is that the engineering focused conversation about AI-as-a-technology has grown very large and become an awkward synecdoche for the rise of major corporations like Google, Apple, Amazon, Facebook, and Netflix. As these corporations fund and motivate a lot of research, there’s a question of who is going to get pieces of the big pie of opportunity these companies represent, either in terms of research grants or impact due to regulation, education, etc.

But there are so many aspects of these corporations that are addressed neither by the term “sociotechnical system”, which is just so broad, nor by “AI System”, which is as broad and rarely means what you’d think it does (that the system uses AI is incidental if not unnecessary; what matters is that it’s a company operating in a core social domain via primarily technological user interfaces). Neither of these gets at the unit of analysis that’s really of interest.

An alternative: “computational institution”. Computational, in the sense of computational cognitive science and computational social science: it denotes the essential role of theory of computation and statistics in explaining the behavior of the phenomenon being studied. “Institution”, in the sense of institutional economics: the unit is a firm, which is comprised of people, their equipment, and their economic relations, to their suppliers and customers. An economic lens would immediately bring into focus “the data heist” and the “role of machines” that Nissenbaum is concerned are being left to the side.

by Sebastian Benthall at December 22, 2018 04:59 PM

December 20, 2018

Ph.D. student

Tensions of a Digitally-Connected World in Cricket Wireless’ Holiday Ad Campaign

In the spirit of taking a break over the holidays, this is more of a fun post with some very rough thoughts (though inspired by some of my prior work on paying attention to and critiquing narratives and futures portrayed by tech advertising). The basic version is that the Cricket Wireless 2018 holiday ad, Four the Holidays (made by ad company Psyop), portrays a narrative that makes a slight critique of an always-connected world and suggests that physical face-to-face interaction is a more enjoyable experience for friends than digital sharing. While perhaps an over-simplistic critique of mobile technology use, the twin messages of “buy a wireless phone plan to connect with friends” and “try to disconnect to spend time with friends” highlight important tensions and contradictions present in everyday digital life.

But let’s look at the ad in a little more detail!

Last month, while streaming Canadian curling matches (it’s more fun than you might think, case in point, I’ve blogged about the sport’s own controversy with broom technology) there was a short Cricket ad playing with a holiday jingle. And I’m generally inclined to pay attention to an ad with a good jingle. Looking it up online brought up a 3 minute long short film version expanding upon the 15 second commercial (embedded above), which I’ll describe and analyze below.

It starts with Cricket’s animated characters Ramon (the green blob with hair), Dusty (the orange fuzzy ball), Chip (the blue square), and Rose (the green oblong shape) on a Hollywood set, “filming” the aforementioned commercial, singing their jingle:

The four, the merrier! Cricket keeps us share-ier!

Four lines of unlimited data, for a hundred bucks a month!

After their shoot is over, Dusty wants the group to watch fireworks from the Cricket water tower (which is really the Warner Brothers Studio water tower, though maybe we should call it Chekhov’s water tower in this instance) on New Year’s Eve. Alas, the crew has other plans, and everyone flies to their holiday destinations: Ramon to Mexico, Dusty to Canada, Chip to New York, and Rose to Aspen.

The video then shows each character enjoying the holidays in their respective locations with their smartphones. Ramon uses his phone to take pictures of food shared on a family table; Rose uses hers to take selfies on a ski lift.

The first hint that there might be a message critiquing an always-connected world is when the ad shows Dusty in a snowed-in, remote Canadian cabin. Presumably this tells us that he gets a cell signal up there, but in this scene, he is not using his phone. Rather, he’s making cookies with his two (human) nieces (not sure how that works, but I’ll suspend my disbelief), highlighting a face-to-face familial interaction using a traditional holiday group activity.

The second hint that something might not be quite right is the Dutch angle establishing shot of New York City in the next scene. The non-horizontal horizon line (which also evokes the off-balance establishing shot of New York from an Avengers: Infinity War trailer) visually puts the scene off balance. But the moment quickly passes, as we see Chip on the streets of New York taking Instagram selfies.

2 Dutch angles of New York

Dutch angle of New York from Cricket Wireless’ “Four the Holidays” (left) and Marvel’s Avengers Infinity War (right)

Then comes a rapid montage of photos and smiling selfies that the group is sending and sharing with each other, in a sort of digital self-presentation utopia. But as the short film has been hinting at, this utopia is not reflective of the characters’ lived experience.

The video cuts to Dusty, skating alone on a frozen pond. He successfully completes a trick, but then realizes that he has no one to share the moment with. He then sings “The four the merrier, Cricket keeps us share-ier” in a minor key as he re-envisions the clouds in the sky as the forms of the four friends. The minor key and Dusty’s singing show skepticism in the lyrics’ claim that being share-ier is indeed merrier.

The minor key continues, as Ramon sings while envisioning a set of holiday lights as the four friends, and Rose sees a department store window display as the four friends. Chip attends a party where the Cricket commercial (from the start of the video) airs on a TV, but is still lonely. Chip then hails a cab, dramatically stating in a deep voice “Take me home.”

In the last scene, Chip sits atop the Cricket water tower (or, Chekhov’s water tower returns!) at 11:57pm on New Year’s Eve, staring alone at his phone, discontent. This is the clearest signal about the lack of fulfillment he finds from his phone, and by extension, the digitally mediated connection with his friends.

Immediately this is juxtaposed with Ramon singing with his guitar from the other side of the water tower, still in the minor key. Chip hears him and immediately becomes happier, and the music shifts to a major key as Rose and Dusty enter as the tempo picks up, and the drums and orchestra of instruments join in. And the commercial ends with the four of them watching New Year’s fireworks together. It’s worth noting the lyrics at the end:

Ramon: The four the merrier…

Chip [spoken]: Ramon?! You’re here!

Rose: There’s something in the air-ier

All: That helps us connect, all the season through. The four, the merrier

Dusty: One’s a little harrier (So hairy!)

All: The holidays are better, the holidays are better, the holidays are better with your crew.

Nothing here is explicitly about Cricket wireless, or the value of being digitally connected. It’s also worth noting that the phone that Chip was previously staring at is nowhere to be found after he sees Ramon. There is some ambiguous use of the word “connect,” which could refer to both a face-to-face interaction or a digitally mediated one, but the tone of the scene and emotional storyline bringing the four friends physically together seems to suggest that connect refers to the value of face-to-face interaction.

So what might this all mean (beyond the fact that I’ve watched this commercial too many times and have the music stuck in my head)? Perhaps the larger and more important point is that the commercial/short film is emblematic of a series of tensions around connection and disconnection in today’s society. Being digitally connected is seen as a positive that allows for greater opportunity (and greater work output), but at the same time discontent is reflected in culture and media, ranging from articles on tech addiction, to guides on grayscaling iPhones to combat color stimulation, to disconnection camps. There’s also a moralizing force behind these tensions: to be a good employee/student/friend/family member/etc, we are told that we must be digitally connected and always-on, but at the same time, we are told that we must also be dis-connected or interact face-to-face in order to be good subjects.

In many ways, the tensions expressed in this video — an advertisement for a wireless provider trying to encourage customers to sign up for their wireless plans, while presenting a story highlighting the need to digitally disconnect — parallels the tensions that Ellie Harmon and Melissa Mazmanian find in their analysis of media discourse of smartphones: that there is both a push for individuals to integrate the smartphone into everyday life, and to dis-integrate the smartphone from everyday life. What is fascinating to me here is that this video from Cricket exhibits both of those ideas at the same time. As Harmon and Mazmanian write,

The stories that circulate about the smartphone in American culture matter. They matter for how individuals experience the device, the ways that designers envision future technologies, and the ways that researchers frame their questions.

While Four the Holidays doesn’t tell the most complex or nuanced story about connectivity and smartphone use, the narrative that Cricket and Psyop created veers away from a utopian imagining of the world with tech, and instead begins to reflect some of the inherent tensions and contradictions of smartphone use and mobile connectivity that are experienced as a part of everyday life.

by Richmond at December 20, 2018 05:36 AM

December 19, 2018

Ph.D. student

The politics of AI ethics is a seductive diversion from fixing our broken capitalist system

There is a lot of heat these days in the tech policy and ethics discourse. There is an enormous amount of valuable work being done on all fronts. And yet there is also sometimes bitter disciplinary infighting and political intrigue about who has the moral high ground.

The smartest thing I’ve read on this recently is Irina Raicu’s “False Dilemmas” piece, where she argues:

  • “Tech ethics” research, including research exploring the space of ethics in algorithm design, is really code for industry self-regulation
  • Industry self-regulation and state regulation are complementary
  • Any claims that “the field” is dominated by one perspective or agenda or another are overstated

All this sounds very sane, but it doesn’t exactly explain why there’s all this heated discussion in the first place. I think Luke Stark gets it right: the problem is mostly capitalism, and it’s impolite to say so.

But what does it mean to say “the problem is mostly capitalism”? And why is it impolite to say it?

To say “the problem [with technology ethics and policy] is capitalism” is to note that most if not all of the social problems we associate with today’s technology have been problems with technology ever since the industrial revolution. For example, James Beniger‘s The Control Revolution, Horkheimer‘s Eclipse of Reason, and so on all speak to the tight link that there has always been between engineering and the capitalist economy as a whole. The link has persisted through the recent iterations of recognizing first data science, then later artificial intelligence, as disruptive triumphs of engineering with a variety of problematic social effects. These are old problems.

It’s impolite to say this because it cuts down on the urgency that might drive political action. More generally, it’s an embarrassment to anybody in the business of talking as if they just discovered something, which is what journalists and many academics do. The buzz of novelty is what gets people’s attention.

It also suggests that the blame for how technology has gone wrong lies with capitalists, meaning, venture capitalists, financiers, and early stage employees with stock options. But also, since it’s the 21st century, pension funds and university endowments are just as much a part of the capitalist investing system as anybody else. In capitalism, if you are saving, you are investing. Lots of people have a diffuse interest in preserving capitalism in some form.

There’s a lot of interesting work to be done on financial regulation, but it has very little to do with, say, science and technology studies and consumer products. So to acknowledge that the problem with technology is capitalism changes the subject to something remote and far more politically awkward than to say the problem is technology or technologists.

As I’ve argued elsewhere, a lot of what’s happening with technology ethics can be thought of as an extension of what Nancy Fraser called progressive neoliberalism: the alliance of neoliberalism with progressive political movements. That alliance is still hegemonic in the smart, critical, academic and advocacy scene. Neoliberalism, or what is today perhaps better characterized as finance capitalism or surveillance capitalism, is what causes the money to be invested in projects that design and deploy technology in certain ways. It is a system of economic distribution that is still hegemonic.

Because it’s hegemonic, it’s impolite to say so. So instead a lot of the technology criticism gets framed in terms of the next available moral compass, which is progressivism. Progressivism is a system of distribution of recognition. It calls for recognizing people according to their demographic identities and, because the two are correlated in sensitive ways, their professional identities. Nancy Fraser’s insight is that neoliberalism and progressivism have been closely allied for many years. One way that progressivism is allied with neoliberalism is that progressivism serves as a moral smokescreen for problems that are in part caused by neoliberalism, preventing an effective, actionable critique of the root cause of many technology-related problems.

Progressivism encourages political conflict to be articulated as an ‘us vs. them’ problem of populations and their attitudes, rather than as a problem of institutions and their design. This “us versus them” framing is nowhere stated more baldly than in the 2018 AI Now Report:

The AI accountability gap is growing: The technology scandals of 2018 have shown that the gap between those who develop and profit from AI—and those most likely to suffer the consequences of its negative effects—is growing larger, not smaller. There are several reasons for this, including a lack of government regulation, a highly concentrated AI sector, insufficient governance structures within technology companies, power asymmetries between companies and the people they serve, and a stark cultural divide between the engineering cohort responsible for technical research, and the vastly diverse populations where AI systems are deployed. (Emphasis mine)

There are several institutional reforms called for in the report, but because its focus is on a particular sector that it constructs as “the technology industry”, composed of many “AI systems”, it cannot address broader economic issues such as unfair taxation or gerrymandering. Discussion of the overall economy is absent from the report; it is not the cause of anything. Rather, the root cause is a schism between kinds of people. The moral thrust of this claim hinges on the implied progressivism: the AI/tech people, who are developing and profiting, are a culture apart. The victims are “diverse”, and yet paradoxically unified in their culture as not-the-developers. This framing depends on the appeal of progressivism as a unifying culture whose moral force is due in large part to its diversity. The AI developer culture is a threat in part because it is separate from diverse people–code for its being white and male.

This thread continues as various critical perspectives are cited throughout the report. For example:

A second problem relates to the deeper assumptions and worldviews of the designers of ethical codes in the technology industry. In response to the proliferation of corporate ethics initiatives, Greene et al. undertook a systematic critical review of high-profile “vision statements for ethical AI.” One of their findings was that these statements tend to adopt a technologically deterministic worldview, one where ethical agency and decision making was delegated to experts, “a narrow circle of who can or should adjudicate ethical concerns around AI/ML” on behalf of the rest of us. These statements often assert that AI promises both great benefits and risks to a universal humanity, without acknowledgement of more specific risks to marginalized populations. Rather than asking fundamental ethical and political questions about whether AI systems should be built, these documents implicitly frame technological progress as inevitable, calling for better building.

That systematic critical reviews of corporate policies express self-serving views that ultimately promote the legitimacy of the corporate efforts is a surprise to no one; it is no more a surprise than the fact that critical research institutes staffed by lawyers and soft social scientists write reports recommending that their expertise is vitally important for society and justice. As has been the case in every major technology and ethics scandal for years, the first thing the commentariat does is publish a lot of pieces justifying their own positions and, if they are brave, arguing that other people are getting too much attention or money. But since everybody in either business depends on capitalist finance in one way or another, the economic system is not subject to critique. In other words, one can’t argue that industrial visions of ‘ethical AI’ are favorable to building new AI products because they are written in service to capitalist investors who profit from the sale of new AI products. Rather, one must argue that they are written in this way because the authors have a weird technocratic worldview that isn’t diverse enough. One can’t argue that commercial AI products neglect marginal populations because these populations have less purchasing power; one has to argue that the marginal populations are not represented or recognized enough.

And yet, the report paradoxically both repeatedly claims that AI developers are culturally and politically out of touch and lauds the internal protests at companies like Google that have exposed wrongdoing within those corporations. The actions of “technology industry” employees belie the idea that the problem is mainly cultural; there is a managerial profit-making impulse that is, in large, stable companies in particular, distinct from that of the rank-and-file engineer. This can be explained in terms of corporate incentives and so on, and indeed the report does in places call for whistleblower protections and labor organizing. But these calls for change cut against and contradict other politically loaded themes.

There are many different arguments contained in the long report; it is hard to find a reasonable position that has been completely omitted. But as a comprehensive survey of recent work on ethics and regulation in AI, its biases and blind spots are indicative of the larger debate. The report concludes with a call for a change in the intellectual basis for considering AI and its impact:

It is imperative that the balance of power shifts back in the public’s favor. This will require significant structural change that goes well beyond a focus on technical systems, including a willingness to alter the standard operational assumptions that govern the modern AI industry players. The current focus on discrete technical fixes to systems should expand to draw on socially-engaged disciplines, histories, and strategies capable of providing a deeper understanding of the various social contexts that shape the development and use of AI systems.

As more universities turn their focus to the study of AI’s social implications, computer science and engineering can no longer be the unquestioned center, but should collaborate more equally with social and humanistic disciplines, as well as with civil society organizations and affected communities. (Emphasis mine)

The “technology ethics” field is often construed, in this report but also in the broader conversation, as one of tension between computer science on the one hand, and socially engaged and humanistic disciplines on the other. For example, Selbst et al.’s “Fairness and Abstraction in Sociotechnical Systems” presents a thorough account of the pitfalls of computer science’s approach to fairness in machine learning, and proposes a Science and Technology Studies (STS) perspective instead. The refrain is that by considering more social context, more nuance, and so on, STS and humanistic disciplines avoid the problems that engineers, who try to provide portable, formal solutions, don’t want to address. As the AI Now report frames it, a benefit of the humanistic approach is that it brings the diverse non-AI populations to the table, shifting the balance of power back to the public. STS and related disciplines claim the status of relevant expertise in matters of technology that is somehow not the kind of expertise that is alienating or inaccessible to the public, unlike engineering, which allegedly dominates the higher education system.

I am personally baffled by these arguments; so often they appear to conflate academic disciplines with business practices in ways that most practitioners I engage with would not endorse. (Try asking an engineer how much they learned in school, versus on the job, about what it’s like to work in a corporate setting.) But beyond the strange extrapolation from academic disciplinary disputes (which are so often about the internal bureaucracies of universities that it is, I’d argue after learning the hard way, unwise to take them seriously from either an intellectual or political perspective), there is also a profound absence of some fields from the debate, as framed in these reports.

I’m referring to the quantitative social sciences, such as economics and quantitative sociology, or what might more generally be converging on computational social science. These are the disciplines that one would need to use to understand the large-scale, systemic impact of technology on people, including the ways costs and benefits are distributed. These disciplines deal with social systems, technology included, in a systematic way that can be used to predict the impact of policy. (There is a long tradition within economics of studying the relationship between people, goods, and capital that never once requires the term “sociotechnical”.) They can also connect, through applications of business and finance, the ways that capital flows and investment drive technology design decisions and corporate competition.

But these fields are awkwardly placed in technology ethics and politics. They don’t fit into the engineering vs. humanities dichotomy that entrances so many graduate students in this field. They often invoke mathematics, which makes them another form of suspicious, alien, insufficiently diverse expertise. And yet, it may be that these fields are the only ones that can correctly diagnose the problems caused by technology in society. In a sense, the progressive framing of the problems of technology makes technology’s ills a problem of social context because it is unequipped to address them as a problem of economic context, and it wouldn’t want to know that it is an economic problem anyway, for two somewhat opposed reasons: (a) acknowledging the underlying economic problems is taboo under hegemonic neoliberalism, and (b) it upsets the progressive view that the more popularly accessible (and, by virtue of how they are generated and constructed, more diverse) humanistic fields need to be recognized as much as fields of narrow expertise. There is no credence given to the idea that narrow and mathematized expertise might actually be especially well-suited to understand what the hell is going on, and that this is precisely why members of these fields are so highly sought after by investors to work at their companies. (Consider, for example, who would be best positioned to analyze the “full stack supply chain” of artificial intelligence systems, as is called for by the AI Now report: sociologists, electrical engineers trained in the power use and design of computer chips, or management science/operations research types whose job is to optimize production given the many inputs and contingencies of chip manufacture?)

At the end of the day, the problem with the “technology ethics” debate is a dialectic cycle whereby (a) basic research is done by engineers, (b) that basic research is developed in a corporate setting as a product funded by capitalists, (c) that product raises political hackles and makes the corporations a lot of money, (d) humanities scholars escalate the political hackles, (e) basic researchers try to invent some new basic research because the politics have created more funding opportunities, (f) corporations do some PR work trying to CYA and engage in self-regulation to avoid litigation, and (g) humanities scholars, loath to cede the moral high ground, insist the scientific research is inadequate and that the corporate PR is bull. But this cycle is not necessarily productive. Rather, it sustains itself as part of a larger capitalist system that is bigger than any of these debates, structures its terms, and controls all sides of the dialog. Meanwhile the experts on how that larger system works are silent or ignored.

References

Fraser, Nancy. “Progressive neoliberalism versus reactionary populism: A choice that feminists should refuse.” NORA-Nordic Journal of Feminist and Gender Research 24.4 (2016): 281-284.

Greene, Daniel, Anna Lauren Hoffmann, and Luke Stark. “Better, Nicer, Clearer, Fairer: A Critical Assessment of the Movement for Ethical Artificial Intelligence and Machine Learning.” Hawaii International Conference on System Sciences (HICSS), Maui, 2019.

Raicu, Irina. “False Dilemmas”. 2018.

Selbst, Andrew D., et al. “Fairness and Abstraction in Sociotechnical Systems.” ACM Conference on Fairness, Accountability, and Transparency (FAT*). 2018.

Whittaker, Meredith et al. “AI Now Report 2018”. 2018.

by Sebastian Benthall at December 19, 2018 04:54 AM

December 14, 2018

Ph.D. student

The secret to social forms has been in institutional economics all along?

A long-standing mystery for me has been about the ontology of social forms (1) (2): under what conditions is it right to call a particular assemblage of people a thing, and why? Most people don’t worry about this; in literatures I’m familiar with it’s easy to take a sociotechnical complex or assemblage, or a company, or whatever, as a basic unit of analysis.

A lot of the trickiness comes from thinking about this as a problem of identifying social structure (Sawyer, 2000; Cederman, 2005). This implies that people are in some sense together and obeying shared norms, and it raises questions about whether those norms exist in their own heads or not, and so on. So far I haven’t seen a lot that really nails it.

But what if the answer has been lurking in institutional economics all along? The “theory of the firm” is essentially a question of why a particular social form–the firm–exists as opposed to a bunch of disorganized transactions. The answers that have come up are quite good.

Take for example Holmstrom (1982), who argues that in a situation where collective outcomes depend on individual efforts, individuals will be tempted to free-ride. That makes it beneficial to have somebody monitor the activities of the other people and have their utility be tied to the net success of the organization. That person becomes the owner of the company, in a capitalist firm.
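A compressed sketch of the free-rider logic (my stylized gloss on Holmstrom’s team-production argument, not his full model): with $n$ team members each exerting effort $e_i$ at cost $c(e_i)$ to produce a joint output $x = f(e_1,\dots,e_n)$ that is shared equally, each member solves

\[
\max_{e_i} \; \tfrac{1}{n} f(e_1,\dots,e_n) - c(e_i)
\quad\Longrightarrow\quad
\tfrac{1}{n}\,\frac{\partial f}{\partial e_i} = c'(e_i),
\]

whereas the team-optimal effort satisfies $\partial f / \partial e_i = c'(e_i)$. Each member bears the full cost of their own effort but captures only $1/n$ of its marginal product, so effort is undersupplied. A monitor whose payoff is tied to the residual output, rather than to an equal share, faces the full marginal incentive, which is the role the owner plays.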

What’s nice about this example is that it explains social structure based on an efficiency argument; we would expect organizations shaped like this to be bigger and command more resources than others that are less well organized. And indeed, we have many enormous hierarchical organizations in the wild to observe!

Another theory of the firm is Williamson’s transaction cost economics (TCE) theory, which is largely about the make-or-buy decision. If the transaction between a business and its supplier has “asset specificity”, meaning that the asset being traded is specific to the two parties and their transaction, then any investment from either party will induce a kind of ‘lock-in’ or ‘switching cost’ or, in Williamson’s language, a ‘bilateral dependence’. The more of that dependence, the more a free market relationship between the two parties will expose them to opportunistic hazards. Hence, complex contracts, or in the extreme case outright ownership and internalization, tie the firms together.

I’d argue: bilateral dependence and the complex ‘contracts’ that connect entities are very much the stuff of “social forms”. Cooperation between people is valuable; the relation between people who cooperate is valuable as a consequence; and so both parties are ‘structurated’ (to mangle a Giddens term) individually into maintaining the reality of the relation!

References

Cederman, L.E., 2005. Computational models of social forms: Advancing generative process theory. American Journal of Sociology, 110(4), pp. 864-893.

Holmstrom, Bengt. “Moral hazard in teams.” The Bell Journal of Economics (1982): 324-340.

Sawyer, R. Keith. “Simulating emergence and downward causation in small groups.” Multi-agent-based simulation. Springer Berlin Heidelberg, 2000. 49-67.

Williamson, Oliver E. “Transaction cost economics.” Handbook of new institutional economics. Springer, Berlin, Heidelberg, 2008. 41-65.

by Sebastian Benthall at December 14, 2018 04:04 AM

December 09, 2018

Ph.D. student

Transaction cost economics and privacy: looking at Hoofnagle and Whittington’s “Free”

As I’ve been reading about transaction cost economics (TCE) and independently scrutinizing the business model of search engines, it stands to reason that I should look to the key paper holding down the connection between TCE and privacy, Hoofnagle and Whittington’s “Free: Accounting for the Costs of the Internet’s Most Popular Price” (2014).

I want to preface the topic by saying I stand by what I wrote earlier: that at the heart of what’s going on with search engines, you have a trade of attention; it requires imagining that the user has attention-time as a scarce resource. The user has a query and has the option to find material relevant to the query in a variety of ways (like going to a library). Often (!) they will do so in a way that costs them as little attention as possible: they use a search engine, which gives an almost instant and often high-quality response; they are also shown advertisements which consume some small amount of their attention, but less than they would expend searching through other means. Advertisers pay the search engine for this exposure to the user’s attention, which funds the service that is “free”, in dollars (but not in attention), to the users.

Hoofnagle and Whittington make a very different argument about what’s going on with “free” web services, which includes free search engines. They argue that the claim that these web services are “free” is deceptive because the user may incur costs after the transaction on account of potential uses of their personal data. An example:

The freemium business model Anderson refers to is popular among industries online. Among them, online games provide examples of free services with hidden costs. By prefacing play with the disclosure of personal identification, the firms that own and operate games can contact and monitor each person in ways that are difficult for the consumer to realize or foresee. This is the case for many games, including Disney’s “Club Penguin,” an entertainment website for children. After providing personal information to the firm, consumers of Club Penguin receive limited exposure to basic game features and can see numerous opportunities to enrich their play with additional features. In order to enrich the free service, consumers must buy all sort of enhancements, such as an upgraded igloo or pets for one’s penguin. Disney, like others in the industry, places financial value on the number of consumers it identifies, the personal information they provide, and the extent to which Disney can track consumer activity in order to modify the game and thus increase the rate of conversion of consumers from free players to paying customers.

There are a number of claims here. Let’s enumerate them:

  1. This is an example of a ‘free’ service with hidden costs to users.
  2. The consumer doesn’t know what the game company will do with their personal information.
  3. In fact, the game will use the personal information to personalize pitches for in-game purchases that ‘enrich’ the free service.
  4. The goal of the company is to convert free players to paying customers.

Working backwards, claim (4) is totally true. The company wants to make money by getting its customers to pay, and it will use personal information to make paying attractive to the customers (3). But this does not mean that the customer is always unwitting. Maybe children don’t understand the business model when they begin playing Club Penguin, but especially today parents certainly do. App Stores, for example, now label apps when they have “in-app purchases”, which is a pretty strong signal. Perhaps this is a recent change due to some saber rattling by the FTC, which, to be fair, would count as a triumph for the authors if this article had influence on getting it to happen. On the other hand, this is a very simple form of customer notice.

I am not totally confident that even if (2), (3), and (4) are true, they entail (1): that there are “hidden costs” to free services. Elsewhere, Hoofnagle and Whittington raise more convincing examples of “costs” of releasing PII, including being denied a job and resolving identity theft. But being convincingly sold an upgraded igloo for your digital penguin seems so trivial. Even if it’s personalized, how could it be a hidden cost? It’s a separate transaction, no? Do you or do you not buy the igloo?

Parsing this through requires, perhaps, a deeper look at TCE. According to TCE, agents are boundedly rational (they can’t know everything) and opportunistic (they will make an advantageous decision in the moment). Meanwhile, the world is complicated. These conditions imply that there’s a lot of uncertainty about future behavior, as agents will act strategically in ways that they can’t themselves predict. Nevertheless, agents engage in contracts with some kinds of obligations in them in the course of a transaction. TCE’s point is that these contracts are always incomplete, meaning that there are always uncertainties left unresolved in contracts that will need to be negotiated in certain contingent cases. All these costs of drafting, negotiating, and safeguarding the agreement are transaction costs.

Take an example of software contracting, which I happen to know about from personal experience. A software vendor gets a contract from a client to do some customization. The client and the vendor negotiate some sort of scope of work ex ante. But always (!), the client doesn’t actually know what they want, and if the vendor delivers on the specification literally, the client doesn’t like it. Then begins the ex post negotiation as the client tries to get the vendor to tweak the system into something more usable.

Software contracting often resolves this by getting off the fixed-cost contracting model and onto a cost-and-materials contract that allows billing by hours of developer time. Alternatively, the vendor can internalize the costs into the contract by inflating the cost “estimates” to cover for contingencies. In general, this all amounts to having more contract and a stronger relationship between the client and vendor, a “bilateral dependency” which TCE sees as a natural evolution of the incomplete contract under several common conditions, like “asset specificity”, which means that the asset is specialized to a particular transaction (or the two agents involved in it). Another term for this is lock-in, or the presence of high switching costs, though this way of thinking about it reintroduces the idea of a classical market for essentially comparable goods and services that TCE is designed to mitigate against. This explains how the technical dependencies of an organization become baked in more or less constitutionally as part of the organization, leading to the robustness of the installed base of a computing platform over time.

This ebb and flow of contract negotiation with software vendors was a bit unsettling to me when I first encountered it on the job, but I think it’s safe to say that most people working in the industry accept this as How Things Work. Perhaps it’s the continued influence of orthodox economics that makes this all seem inefficient somehow, and TCE is the better way to conceptualize things, one that makes more sense of reality.

But back to the Penguins…

Hoofnagle and Whittington make the case that sharing PII with a service that then personalizes its offerings to you creates a kind of bilateral dependence between service and user. They also argue that loss of privacy, due to the many possible uses of this personal information (some nefarious), is a hidden cost: an ex post transaction cost that becomes a hazard because it has not been factored into the price ex ante. The fact that this data is valuable to the platform/service for paying its production costs, which is not part of the “free” transaction, is an indication that this data is a lot more valuable than consumers think it is.

I am still on the fence about this.

I can’t get over the feeling that successfully selling a user a personalized, upgraded digital igloo is such an absurd example of a “hidden cost” that it belies the whole argument that these services have hidden costs.

Splitting hairs perhaps, it seems reasonable to say that Club Penguin has a free version, which is negotiated as one transaction. Then, conditional on the first transaction, it offers personalized igloos for real dollars. This purchase, if engaged in, would be another, different transaction, not an ex post renegotiation of the original contract with Disney. This small difference changes the cost of the igloo from a hidden transaction cost into a normal, transparent cost. So it’s no big deal!

Does the use of PII create a bilateral dependence between Disney and the users of Club Penguin? Yes, in a sense. Any application of attention to an information service, learning how to use it and getting it to be part of your life, is in a sense a bilateral dependence with a switching cost. But there are so many other free games to play on the internet that these costs seem hardly hidden. They could just be understood as part of the game. Meanwhile, we are basically unconcerned with Disney’s “dependence” on the consumer data, because Disney can get new users easily (unless the user is a “whale”, who actually pays the company). And whatever “dependence” Disney has on particular users is a hidden cost for Disney, not for the user, and who cares about Disney.

The cases of identity theft or job loss are strange cases that seem to have more to do with freaky data reuse than with what’s going on in a particular transaction. Purpose binding notices and restrictions, which are becoming the norm through generalized GDPR compliance, seem adequate to deal with these cases.

So, I have two conclusions:

(1) Maybe TCE is the right lens for making an economic argument for why purpose binding restrictions are a good idea. They make transactions with platforms less incomplete, avoiding the moral hazard of ex post uses of data that have asymmetrically unknown effects on users.

(2) This TCE analysis of platforms doesn’t address the explanatorily powerful point that attention is part of the trade. In addition to being concretely what the user is “giving up” to the platform and directly explaining monetization in some circumstances, the fact that attention is “sticky” and creates some amount of asset-specific learning is a feature of the information economy more generally. Maybe it needs a closer look.

References

Hoofnagle, Chris Jay, and Jan Whittington. “Free: accounting for the costs of the internet’s most popular price.” UCLA L. Rev. 61 (2013): 606.

by Sebastian Benthall at December 09, 2018 09:01 PM

December 06, 2018

Ph.D. student

Data isn’t labor because using search engines is really easy

A theme I’ve heard raised in a couple of places recently, including Arrieta-Ibarra et al.’s “Should We Treat Data As Labor?” and the AI Now 2018 Report, is that there is something wrong with how “data”, particularly data “produced” by people on the web, is conceptualized as part of the economy. Creating data, the argument goes, requires labor. And as the product of labor, it should be protected according to the values and practices of labor movements in the past. In particular, the current uses of data in, say, targeted advertising, social media, and search, are exploitative; the idea that consumers ‘pay’ for these services with their data is misleading and ultimately unfair to the consumer. Somehow the value created by the data should be reapportioned back to the user.

This is a sexy and popular argument among a certain subset of intellectuals who care about these things. I believe the core emotional appeal of this proposal is this: it is well known that a few search engine and social media companies, namely Google and Facebook, are rich. If the value added by user data were in part returned to the users, then the users, who compared to Google and Facebook are not rich, would get something they otherwise would not get. I.e., the benefit of recognizing the labor involved in creating data is the redistribution of surplus to The Rest of Us.

I don’t have a problem personally with that redistributive impulse. However, I don’t think the “data is labor” argument actually makes much sense.

Why not? Well, let’s take the example of a search engine. Here is the transaction between a user and a search engine:

  • Alice types a query, “avocado toast recipes”, into the search engine. This submits data to the company computers.
  • The company computers use that data to generate a list of results that it deems relevant to that query.
  • Alice sees the results, and maybe clicks on one or two of them, if they are good, in the process of navigating to the thing she was looking for in the first place.
  • The search engine records that click as well, in order to better calibrate how to respond to others making that query.

We might forget that the search engine is providing Alice a service and isn’t just a ubiquitous part of the infrastructure we should take for granted. The search engine has provided Alice with relevant search results. What this does is (dramatically) reduce Alice’s search costs; had she tried to find the relevant URL by asking her friends, organically surfing the web, or using the library, who knows what she would have found or how long it would have taken her. But we can assume that Alice is using the search engine because it gets her more relevant results, faster.

It is not clear how Alice could get this thing she wants without going through the motions of typing and clicking and submitting data. These actions all seem like a bare minimum of what is necessary to conduct this kind of transaction. Similarly, when I go to a grocery store and buy vegetables, I have to get out my credit card and swipe it at the machine. This creates data–the data about my credit card transaction. But I would never argue that my hidden labor at the credit card machine must be recognized to avoid exploitation by the credit card companies, who then use that information to go about their business. That would be insane.

Indeed, it is a principle of user interface design that the most compelling user interfaces are those that require the least effort from their users. Using search engines is really, really easy because they are designed that way. The fact that oodles of data are collected from a person without that person exerting much effort may be problematic in a lot of ways. But it’s not problematic because it’s laborious for the user; it is designed and compelling precisely because it is labor-saving. The smart home device industry has taken this even further, building voice-activated products for people who would rather not use their hands to input data. That is, if anything, less labor for the user, but more data and more processing on the automated part of the transaction. That the data means more work for the company and less work for the user indicates that data is not the same thing as user labor.

There is a version of this argument that brings up feminism. Women’s labor, feminists point out, has long been insufficiently recognized and not properly remunerated. For example, domestic labor traditionally performed by women has been taken for granted, and emotional labor (the work of controlling one’s emotions on the job), which has often been feminized, has not been taken seriously enough. This is a problem, and the social cause of recognizing women’s labor and rewarding it is, ceteris paribus, a great thing. But, and I know I’m on dicey ground here, so bear with me, this does not mean that everything women do that they are not paid to do is unrecognized labor in the sense that is relevant for feminist critiques. Case in point, both men and women use credit cards to buy things, and make telephone calls, and drive vehicles through toll booths, and use search engines, and do any number of things that generate “data”, and in most of these cases it is not remunerated directly; but this lack of remuneration isn’t gendered. I would say, perhaps controversially, that the feminist critique does not actually apply to the general case of user-generated data much at all! (Though it may apply in specific cases that I haven’t thought of.)

So in conclusion, data isn’t labor, and labor isn’t data. They are different things. We may want a better, more just, political outcome with respect to the distribution of surplus from the technology economy. But trying to get there through an analogy between data and labor is a kind of incoherent way to go about it. We should come up with a better, different way.

So what’s a better alternative? If the revenue streams of search engines are any indication, then it would seem that users “pay” for search engines through being exposed to advertising. So the “resource” that users are giving up in order to use the search engine is attention, or mental time; hence the term, attention economy.

Framing the user cost of search engines in terms of attention does not easily lend itself to an argument for economic reform. Why? Because search engines are already saving people a lot of that attention by making it so easy to look stuff up. Really the transaction looks like:

  • Alice pays some attention to Gob (the search engine).
  • Gob gives Alice some good search results back in return, and then…
  • Gob passes on some of Alice’s attention through to Bob, the advertiser, in return for money.

So Alice gives up attention but gets back search results and the advertisement. Gob gets money. Bob gets attention. The “data” that matters is not the data transmitted from Alice’s computer up to Gob. Rather, the valuable data is the data that Alice receives through her eyes: of this data, the search results are positively valued, the advertisement is negatively valued, but the value of the bundled good is net positive.
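To make the claim that the bundle is net positive a bit more concrete (an illustrative inequality of my own, not anything from the cited literature): let $v$ be the value Alice gets from the results, $c_{ad}$ the attention she spends on the ads, and $c_{alt}$ the attention she would spend finding the answer some other way. Alice uses Gob whenever

\[
v - c_{ad} > \max(0,\; v - c_{alt}),
\]

that is, whenever the results-plus-ads bundle is worth more than nothing and more than the best alternative discovery method. Since $c_{ad} \ll c_{alt}$ for a decent search engine, the condition is easy to satisfy, which is the sense in which the bundle is net positive for Alice even though the ad itself is a cost.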

If there is something unjust about this economic situation, it has to be due to the way consumers’ attention is being managed by Gob. Interestingly, those who have studied the value of ‘free’ services in attentional terms have chalked up a substantial consumer surplus due to saved attention (Brynjolfsson and Oh, 2012). This appears to be the perspective of management scientists, who tend to be pro-business, and is not a point repeated often by legal scholars, who tend to be more litigious in outlook. For example, legal scholarship has detailed how attention could be abused through digital market manipulation (Calo, 2013).

Ironically for data-as-labor theorists, the search-engine-as-liberator-of-attention argument could be read as the view that what people get from using search engines is more time, or more ability to do other things with their time. In other words, we would use a search engine instead of some other, more laborious discovery mechanism precisely because it would cost us net negative labor. That absolutely throws a wrench in any argument that the users of search engines should be rewarded on dignity of labor grounds. Instead, what’s happened is that search engines are ubiquitous because consumers have undergone a phase transition in their willingness to work to discover things, and now very happily use search engines which, on the whole, seem like a pretty good deal! (The cost of being-advertised-to is small compared to the benefits of the search results.)

If we start seeing search engines as a compelling labor-saving device rather than an exploiter of laborious clickwork, then some of the disregard consumers have for privacy on search engines becomes more understandable. People are willing to give up their data, even if they would rather not, because search engines are saving them so much time. The privacy harms that come as a consequence, then, can be seen as externalities to what is essentially a healthy transaction, rather than a perverse matter of a business model that is evil to the bone.

This is, I wager, on the whole a common sense view, one that I’d momentarily forgotten because of my intellectual milieu but now am ashamed to have overlooked. It is, on the whole, far more optimistic than other attempts to characterize the zeitgeist of the new technology economy.

Somehow, this rubric for understanding the digital economy appears to have fallen out of fashion. Davenport and Beck (2001) wrote a business book declaring attention to be “the new currency of business”, which if the prior analysis is correct makes more sense than data being the new currency (or oil) of business. The term appears to have originated in an article by Goldhaber (1997). Ironically, the term appears to have had no uptake in the economics literature, despite it being the key to everything! The concept was understood, however, by Herbert Simon, in 1971 (see also Terranova, 2012):

In an information-rich world, the wealth of information means a dearth of something else: a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.

(A digitized version of this essay, which amazingly appears to be set by a typewriter and then hand-edited (by Simon himself?) can be found here.)

This is where I bottom out–the discovery that the line of thought I’ve been on all day starts with Herbert Simon, that the sciences of the artificial are not new, they are just forgotten (because of the glut of other information) and exhaustingly hyped. The attention economy discovered by Simon explains why each year we are surrounded with new theories about how to organize ourselves with technology, when perhaps the wisest perspectives on these topics are ones that will not hype themselves because their authors cannot tweet from the grave.

References

Arrieta-Ibarra, Imanol, et al. “Should We Treat Data as Labor? Moving Beyond ‘Free’.” AEA Papers and Proceedings. Vol. 108. 2018.

Brynjolfsson, Erik, and JooHee Oh. “The attention economy: measuring the value of free digital services on the Internet.” (2012).

Calo, Ryan. “Digital market manipulation.” Geo. Wash. L. Rev. 82 (2013): 995.

Davenport, Thomas H., and John C. Beck. The attention economy: Understanding the new currency of business. Harvard Business Press, 2001.

Goldhaber, Michael H. “The attention economy and the net.” First Monday 2.4 (1997).

Simon, Herbert A. “Designing organizations for an information-rich world.” (1971): 37-72.

Terranova, Tiziana. “Attention, economy and the brain.” Culture Machine 13 (2012).

by Sebastian Benthall at December 06, 2018 08:23 PM

December 04, 2018

Ph.D. student

the make-or-buy decision (TCE) in software and cybersecurity

The paradigmatic case of transaction cost economics (TCE) is the make-or-buy decision. A firm, F, needs something, C. Do they make it in-house or do they buy it from somewhere else?

If the firm makes it in-house, they will incur some bureaucratic overhead costs in addition to the costs of production. But they will also be able to specialize C for their purposes. They can institute their own internal quality controls. And so on.

If the firm buys it on the open market from some other firm, say, G, they don’t pay the overhead costs. They do lose the benefits of specialization, and the quality controls are only those based on economic competitive pressure on suppliers.

There is an intermediate option: a contract between F and G that establishes an ongoing relationship between the two firms. This contract creates a field in which C can be specialized for F, and there can be assurances of quality, while the overhead is distributed efficiently between F and G. A toy sketch of the comparison follows below.
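Here is that sketch, just to make the structure of the trade-off concrete (the scoring function and every number below are hypothetical; TCE itself makes this argument qualitatively, not with a formula like this):

# Toy model of the make / buy / contract decision in TCE terms.
# All parameters are invented for illustration.

def expected_cost(production, overhead, hazard_prob, hazard_cost, specialization_benefit):
    """Expected cost to firm F of obtaining component C under one governance mode."""
    return production + overhead + hazard_prob * hazard_cost - specialization_benefit

options = {
    # Make in-house: bureaucratic overhead, but specialization and low hazard.
    "make": expected_cost(production=100, overhead=30, hazard_prob=0.05,
                          hazard_cost=50, specialization_benefit=25),
    # Buy on the spot market: cheap, no overhead, but generic and hazard-prone
    # when the component is specific to F's needs.
    "buy": expected_cost(production=80, overhead=0, hazard_prob=0.40,
                         hazard_cost=50, specialization_benefit=0),
    # Ongoing contract with G: some overhead for drafting and safeguarding the
    # agreement, partial specialization, reduced hazard.
    "contract": expected_cost(production=85, overhead=10, hazard_prob=0.15,
                              hazard_cost=50, specialization_benefit=15),
}

for mode, cost in sorted(options.items(), key=lambda kv: kv[1]):
    print(f"{mode:>8}: expected cost {cost:.1f}")

With these made-up numbers the contract comes out cheapest; raise the spot market’s hazard probability or shrink the in-house specialization benefit and the ranking flips, which is the kind of comparative reasoning TCE is after.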

This situation is both extremely common in business practice and not well handled by neoclassical, orthodox economics. It’s the case that TCE is tremendously preoccupied with.


My background and research is in the software industry, which is rife with cases like these.

Developers are constantly faced with a decision to make-or-buy software components. In principle, they can develop any component themselves. In practice, this is rarely cost-effective.

In software, open source components are a prevalent solution to this problem. This can be thought of as a very strange market where all the prices are zero. The most popular open source libraries are very generic, having little “asset specificity” in TCE terms.

The lack of contract between developers and open source components/communities is sometimes seen as a source of hazard in using open source components. The recent event-stream hack, where an upstream component was injected with malicious code by a developer who had taken over maintaining the package, illustrates the problems of outsourcing technical dependencies without a contract. In this case, the quality problem is manifest as a supply chain cybersecurity problem.

In Williamson’s analysis, these kinds of hazards are what drive firms away from purchasing on spot markets and towards contracting or in-house development. In practice, open source support companies fill the role of a responsible entity G that firm F can build a relationship with.

by Sebastian Benthall at December 04, 2018 04:51 PM

December 03, 2018

MIMS 2012

My Beliefs About Design

Photo by Spencer Backman on Unsplash

  • Design doesn’t own the customer experience. A great customer experience is the emergent outcome of the contributions of every department.
  • Design is not the center of the universe. Design is one function of many at an organization.
  • Other departments have more customer contact than you. Listen to them.
  • Don’t hand your work down from on high and expect everyone to worship its genius. You need to bring people along for the ride so that they see how you got to your solution, and can get there themselves.
  • Everyone can improve the customer’s experience, not just designers. Foster an environment where everyone applies a user-centered mindset to their work.
  • There’s no perfect, one-size-fits-all design process. Skilled designers have a variety of tools in their tool belt, and know when to use each one.
  • Done is better than perfect.
  • No design is perfect. Always be iterating.
  • Don’t fall in love with your designs. Be willing to kill your darlings.
  • You should always feel a little uncomfortable showing your work to peers. If you don’t, you’ve waited too long.
  • The only thing that matters to customers is what ships. Not your prototypes, wireframes, user journeys, or any other artifact of the design process.
  • The only true measure of your design’s success is the response from customers.
  • Stay curious. Regularly seek out new ideas, experiences, and perspectives.
  • Stay humble. You don’t know what’s best just because the word “designer” is in your title.
  • Don’t hide behind jargon and the cloak of the “creative.”
  • Great design is rooted in empathy. Empathy not just for the end user, but also for your coworkers, company, and society.
  • Having empathy for your customers means actually talking to them.
  • Don’t automatically ignore someone’s feedback because they’re more junior than you, or don’t have “designer” in their title. Don’t automatically listen to someone’s feedback because they’re more senior than you.
  • Design needs to be aligned to the needs of the business, and deliver measurable business value. Don’t design for design’s sake.

by Jeff Zych at December 03, 2018 09:12 PM

Ph.D. student

Williamson on four injunctions for good economics

Williamson (2008) (pdf) concludes with a description of four injunctions for doing good economics, which I will quote verbatim.

Robert Solow’s prescription for doing good economics is set out in three injunctions: keep it simple; get it right; make it plausible (2001, p. 111). Keeping it simple entails stripping away the inessentials and going for the main case (the jugular). Getting it right “includes translating economic concepts into accurate mathematics (or diagrams, or words) and making sure that further logical operations are correctly performed and verified” (Solow, 2001, p. 112). Making it plausible entails describing human actors in (reasonably) veridical ways and maintaining meaningful contact with the phenomena of interest (contractual or otherwise).

To this, moreover, I would add a fourth injunction: derive refutable implications to which the relevant (often microanalytic) data are brought to bear. Nicholas Georgescu-Roegen has a felicitous way of putting it: “The purpose of science in general is not prediction, but knowledge for its own sake,” yet prediction is “the touchstone of scientific knowledge” (1971, p. 37).

Why the fourth injunction? This is necessitated by the need to choose among alternative theories that purport to deal with the same phenomenon—say vertical integration—and (more or less) satisfy the first three injunctions. Thus assume that all of the models are tractable, that the logic of each hangs together, and that agreement cannot be reached as to what constitutes veridicality and meaningful contact with the phenomena. Does each candidate theory then have equal claims for our attention? Or should we be more demanding? This is where refutable implications and empirical testing come in: ask each would-be theory to stand up and be counted.

Why more economists are not insistent upon deriving refutable implications and submitting these to empirical tests is a puzzle. One possibility is that the world of theory is set apart and has a life of its own. A second possibility is that some economists do not agree that refutable implications and testing are important. Another is that some theories are truly fanciful and their protagonists would be discomfited by disclosure. A fourth is that the refutable implications of favored theories are contradicted by the data. And perhaps there are still other reasons. Be that as it may, a multiplicity of theories, some of which are vacuous and others of which are fanciful, is an embarrassment to the pragmatically oriented members of the tribe. Among this subset, insistence upon the fourth injunction—derive refutable implications and submit these to the data—is growing.

References

Williamson, Oliver E. “Transaction cost economics.” Handbook of new institutional economics. Springer, Berlin, Heidelberg, 2008. 41-65.

by Sebastian Benthall at December 03, 2018 04:41 PM

December 02, 2018

Ph.D. student

Discovering transaction cost economics (TCE)

I’m in the process of discovering transaction cost economics (TCE), the branch of economics devoted to the study of transaction costs, which include bargaining and search costs. Oliver Williamson, who is a professor at UC Berkeley, won the Nobel Prize for his work on TCE in 2009. I’m starting with the Williamson, 2008 article (in the References) which seems like a late-stage overview of what is a large body of work.

Personally, this is yet another time when I’ve discovered that the answers, or the proper theoretical language for understanding something I am struggling with, have simply been Somewhere Else all along. Delight and frustration are pretty much evening each other out at this point.

Why is TCE so critical (to me)?

  • I think the real story about how the Internet and AI have changed things, which is the topic constantly reiterated in so many policy and HCI studies about platforms, is that they reduced search costs. However, it’s hard to make the case for that without a respectable theorization of search costs and how they matter to the economy. This, I think, is what transaction cost economics is about.
  • You may recall I wrote my doctoral dissertation about “data economics” on the presumption (which was, truly, presumptuous) that a proper treatment of the role of data in the economy had not yet been done. This was due mainly to the deficiencies of the discussion of information in neoclassical economic theory. But perhaps I was a fool, because it may be that this missing-link work on information economics has been in transaction cost economics all along! Interestingly, Pat Bajari, who is Chief Economist at Amazon, has done some TCE work, suggesting that like Hal Varian’s economics, this is stuff that actually works in a business context, which is more or less the epistemic standard you want economics to meet. (I would argue that economics should be seen, foremost, as a discipline of social engineering.)
  • A whole other line of research I’ve worked on over the years has been trying to understand the software supply chain, especially with respect to open source software (Benthall 2016; Benthall, 2017). That’s a tricky topic because the ideas of “supply” and “chain” in that domain are both highly metaphorical and essentially inaccurate. Yet there are clearly profound questions about the relationships between sociotechnical organizations, their internal and external complexity, and so on to be found there, along with (and this is really what’s exciting about it) ample empirical basis to support arguments about it, just by the nature of it. Well, it turns out that the paradigmatic case for transaction cost economics is vertical integration, or the “make-or-buy” decision wherein a firm decides to (A) purchase something from an open market, (B) produce it in-house, or (C) (and this is the case that transaction cost economics really tries to develop) engage with the supplier in a contract which creates an ongoing and secure relationship between them. Labor contracts are all, for reasons that I may go into later, of this (C) kind.

So, here comes TCE, with its firm roots in organization theory, Hayekian theories of the market, Coase’s and other theories of the firm, and firm emphasis on the supply chain relation between sociotechnical organizations. And I HAVEN’T STUDIED IT. There is even solid work on its relation to privacy done by Whittington and Hoofnagle (2011; 2013). How did I not know about this? Again, if I were not so delighted, I would be livid.

Please expect a long series of posts as I read through the literature on TCE and try to apply it to various cases of interest.

References

Benthall, S. (2017) Assessing Software Supply Chain Risk Using Public Data. IEEE STC 2017 Software Technology Conference.

Benthall, S., Pinney, T., Herz, J., Plummer, K. (2016) An Ecological Approach to Software Supply Chain Risk Management. Proceedings of the 15th Python in Science Conference. p. 136-142. Ed. Sebastian Benthall and Scott Rostrup.

Hoofnagle, Chris Jay, and Jan Whittington. “Free: accounting for the costs of the internet’s most popular price.” UCLA L. Rev. 61 (2013): 606.

Whittington, Jan, and Chris Jay Hoofnagle. “Unpacking Privacy’s Price.” NCL Rev. 90 (2011): 1327.

Williamson, Oliver E. “Transaction cost economics.” Handbook of new institutional economics. Springer, Berlin, Heidelberg, 2008. 41-65.

by Sebastian Benthall at December 02, 2018 10:31 PM

November 30, 2018

adjunct professor

Amsterdam Privacy Conference 2018

Opening keynote talk on The Tethered Economy, 87(4) Geo. Wash. L. Rev. ___ (2019) (with Aaron Perzanowski and Aniket Kesari), Amsterdam Privacy Conference, Oct. 2018.

by chris at November 30, 2018 05:55 AM

November 29, 2018

Ph.D. student

For fairness in machine learning, we need to consider the unfairness of racial categorization

Pre-prints of papers accepted to this coming 2019 Fairness, Accountability, and Transparency conference are floating around Twitter. From the looks of it, many of these papers add a wealth of historical and political context, which I feel is a big improvement.

A noteworthy paper, in this regard, is Hutchinson and Mitchell’s “50 Years of Test (Un)fairness: Lessons for Machine Learning”, which puts recent ‘fairness in machine learning’ work in the context of very analogous debates from the 60’s and 70’s that concerned the use of testing that could be biased due to cultural factors.

I like this paper a lot, in part because it is very thorough and in part because it tees up a line of argument that’s dear to me. Hutchinson and Mitchell raise the question of how to properly think about fairness in machine learning when the protected categories invoked by nondiscrimination law are themselves social constructs.

Some work on practically assessing fairness in ML has tackled the problem of using race as a construct. This echoes concerns in the testing literature that stem back to at least 1966: “one stumbles immediately over the scientific difficulty of establishing clear yardsticks by which people can be classified into convenient racial categories” [30]. Recent approaches have used Fitzpatrick skin type or unsupervised clustering to avoid racial categorizations [7, 55]. We note that the testing literature of the 1960s and 1970s frequently uses the phrase “cultural fairness” when referring to parity between blacks and whites.

They conclude that this is one of the areas where there can be a lot more useful work:

This short review of historical connections in fairness suggest several concrete steps forward for future research in ML fairness: Diving more deeply into the question of how subgroups are defined, suggested as early as 1966 [30], including questioning whether subgroups should be treated as discrete categories at all, and how intersectionality can be modeled. This might include, for example, how to quantify fairness along one dimension (e.g., age) conditioned on another dimension (e.g., skin tone), as recent work has begun to address [27, 39].

This is all very cool to read, because this is precisely the topic that Bruce Haynes and I address in our FAT* paper, “Racial categories in machine learning” (arXiv link). The problem we confront in this paper is that the racial categories we are used to using in the United States (White, Black, Asian) originate in the white supremacy that was enshrined into the Constitution when it was formed and perpetuated since then through the legal system (with some countervailing activity during the Civil Rights Movement, for example). This puts “fair machine learning” researchers in a bind: either they can use these categories, which have always been about perpetuating social inequality, or they can ignore the categories and reproduce the patterns of social inequality that prevail in fact because of the history of race.

In the paper, we propose a third option. First, rather than reify racial categories, we propose breaking race down into the kinds of personal features that get inscribed with racial meaning. Phenotype properties like skin type and ocular folds are one such set of features. Another set consists of events that indicate position in social class, such as being arrested or receiving welfare. Another consists of facts about the national and geographic origin of one’s ancestors. These facts about a person are clearly relevant to how racial distinctions are made, but are themselves more granular and multidimensional than race.

The next step is to detect race-like categories by looking at who is segregated from whom. We propose an unsupervised machine learning technique that works with the distribution of the phenotype, class, and ancestry features across spatial tracts (when considering where people physically live) or across a social network (when considering people’s professional networks, for example). Principal component analysis can identify which race-like dimensions capture the greatest amounts of spatial and social separation. We hypothesize that these dimensions will encode the ways racial categorization has shaped the social structure in tangible ways; these effects may include both politically recognized forms of discrimination and forms of discrimination that have not yet been surfaced. These dimensions can then be used to classify people in race-like ways as input to fairness interventions in machine learning.
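To make that unsupervised step concrete, here is a minimal sketch in Python; the tract-by-feature matrix, the number of components, and all numbers below are hypothetical stand-ins rather than the actual data or code from the paper.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical input: one row per spatial tract (or network community), one
# column per phenotype, class, or ancestry feature measured in that tract.
rng = np.random.default_rng(42)
tract_features = rng.random((200, 12))  # stand-in for real tract-level data

# Standardize the features, then find the dimensions along which tracts
# separate the most; these are candidate "race-like" dimensions.
X = StandardScaler().fit_transform(tract_features)
pca = PCA(n_components=3)
race_like_scores = pca.fit_transform(X)

print("variance explained:", np.round(pca.explained_variance_ratio_, 3))
# race_like_scores could then feed a fairness intervention in place of
# self-reported or census race categories.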

A key part of our proposal is that race-like classification depends on the empirical distribution of persons in physical and social space, and so is not fixed. This operationalizes the way that race is socially and politically constructed without reifying the categories in terms that reproduce their white supremacist origins.

I’m quite stoked about this research, though obviously it raises a lot of serious challenges in terms of validation.

by Sebastian Benthall at November 29, 2018 03:05 PM

November 27, 2018

Ph.D. student

directions to migrate your WebFaction site to HTTPS

Hiya friends using WebFaction,

Securing the Web, even our little websites, is important — to set a good example, to maintain the confidentiality and integrity of our visitors’ traffic, to get the best Google search ranking. While secure Web connections had been difficult and/or costly in the past, migrating a site to HTTPS has more recently become fairly straightforward and costs $0 a year. It may get even easier in the future, but for now, the following steps should do the trick.

Hope this helps, and please let me know if you have any issues,
Nick

P.S. Yes, other friends, I recommend WebFaction as a host; I’ve been very happy with them. Services are reasonably priced and easy to use and I can SSH into a server and install stuff. Sign up via this affiliate link and maybe I get a discount on my service or something.

P.S. And really, let me know if and when you have issues. Encrypting access to your website has gotten easier, but it needs to become much easier still, and one part of that is knowing which parts of the process prove to be the most cumbersome. I’ll make sure your feedback gets to the appropriate people who can, for realsies, make changes as necessary to standards and implementations.

Updated 27 November 2018: As of Fall 2018, WebFaction's control panel now handles installing and renewing Let's Encrypt certificates, and that functionality also, by default, breaks the scripts described below (you'll likely start getting email errors regarding a 404 error in loading .well-known/acme-challenge). I recommend using WebFaction's Let's Encrypt support; review their simple one-button documentation. This blog post contains the full documentation in case it still proves useful, but if you want to run these scripts, you'll also want to review this issue regarding nginx configuration.

Updated 16 July 2016: to fix the cron job command, which may not have always worked depending on environment variables

Updated 2 December 2016: to use new letsencrypt-webfaction design, which uses WebFaction's API and doesn't require emails and waiting for manual certificate installation.


WebFaction now supports installing and renewing certificates with Let's Encrypt just by clicking a button in the control panel, which makes many of the steps below unnecessary; the configuring and testing, though, are still things you have to do manually in pretty much any case. While the full instructions are still included here, you should mostly only need to follow my directions for Create a secure version of your website in the WebFaction Control Panel, Test your website over HTTPS, and Redirect your HTTP site. You should be able to complete all of this in an hour some evening.

Create a secure version of your website in the WebFaction Control Panel

Log in to the WebFaction Control Panel, choose the “DOMAINS/WEBSITES” tab and then click “Websites”.

Click “Add new website” to create one that will correspond to one of your existing websites. I suggest choosing a name like existingname-secure. Choose “Encrypted website (https)”. For Domains, testing will be easiest if you choose both your custom domain and a subdomain of yourusername.webfactional.com. (If you don’t have one of those subdomains set up, switch to the Domains tab and add it real quick.) So, for my site, I chose npdoty.name and npdoty.npd.webfactional.com.

Finally, for “Contents”, click “Re-use an existing application” and select whatever application (or multiple applications) you’re currently using for your http:// site.

Click “Save” and this step is done. This shouldn’t affect your existing site one whit.

Test to make sure your site works over HTTPS

Now you can test how your site works over HTTPS, even before you’ve created any certificates, by going to https://subdomain.yourusername.webfactional.com in your browser. Hopefully everything will load smoothly, but it’s reasonably likely that you’ll have some mixed content issues. The debug console of your browser should show them to you: that’s Apple-Option-K in Firefox or Apple-Option-J in Chrome. You may see some warnings like this, telling you that an image, a stylesheet or a script is being requested over HTTP instead of HTTPS:

Mixed Content: The page at ‘https://npdoty.name/’ was loaded over HTTPS, but requested an insecure image ‘http://example.com/blah.jpg’. This content should also be served over HTTPS.

Change these URLs so that they point to https://example.com/blah.jpg (you could also use a scheme-relative URL, like //example.com/blah.jpg) and update the files on the webserver and re-test.
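If you have many pages to check, a quick script can help spot insecure references before the browser does. This is just a rough helper I'm sketching here, not part of WebFaction's tooling, using only the Python standard library; replace the example URL with a page of your own.

from html.parser import HTMLParser
from urllib.request import urlopen

class InsecureResourceFinder(HTMLParser):
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # src attributes (images, scripts, iframes) and stylesheet hrefs trigger
        # mixed content warnings; ordinary <a href> links do not.
        url = attrs.get("src") or (attrs.get("href") if tag == "link" else None)
        if url and url.startswith("http://"):
            print(f"<{tag}> loads insecure resource: {url}")

page = urlopen("https://npdoty.npd.webfactional.com/").read()
InsecureResourceFinder().feed(page.decode("utf-8", errors="replace"))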

Good job! Now, https://subdomain.yourusername.webfactional.com should work just fine, but https://yourcustomdomain.com shows a really scary message. You need a proper certificate.

Get a free certificate for your domain

Let’s Encrypt is a new, free, automated certificate authority from a bunch of wonderful people. But getting it to set up certificates on WebFaction is a little tricky, so we’ll use the letsencrypt-webfaction utility (thanks, will-in-wi!).

SSH into the server with ssh yourusername@yourusername.webfactional.com.

To install, run this command:

GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib gem2.2 install letsencrypt_webfaction

(Run the same command to upgrade; necessary if you followed these instructions before Fall 2016.)

For convenience, you can add this as a function to make it easier to call. Edit ~/.bash_profile to include:

function letsencrypt_webfaction {
    PATH=$PATH:$GEM_HOME/bin GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction $*
}

Now, let’s test the certificate creation process. You’ll need your email address, the domain you're getting a certificate for, the path to the files for the root of your website on the server (e.g. /home/yourusername/webapps/sitename/), and the WebFaction username and password you use to log in. Filling those in as appropriate, run this command:

letsencrypt_webfaction --letsencrypt_account_email you@example.com --domains yourcustomdomain.com --public /home/yourusername/webapps/sitename/ --username webfaction_username --password webfaction_password

If all went well, you’ll see nothing on the command line. To confirm that the certificate was created successfully, check the SSL certificates tab on the WebFaction Control Panel. ("Aren't these more properly called TLS certificates?" Yes. So it goes.) You should see a certificate listed that is valid for your domain yourcustomdomain.com; click on it and you can see the expiry date and a bunch of gobbledygook which actually is the contents of the certificate.

To actually apply that certificate, head back to the Websites tab, select the -secure version of your website from the list and in the Security section, choose the certificate you just created from the dropdown menu.

Test your website over HTTPS

This time you get to test it for real. Load https://yourcustomdomain.com in your browser. (You may need to force refresh to get the new certificate.) Hopefully it loads smoothly and without any mixed content warnings. Congrats, your site is available over HTTPS!

You are not done. You might think you are done, but if you think so, you are wrong.

Set up automatic renewal of your certificates

Certificates from Let’s Encrypt expire in no more than 90 days. (Why? There are two good reasons.) Your certificates aren’t truly set up until you’ve set them up to renew automatically. You do not want to do this manually every few months; you will forget, I promise.

Cron lets us run code on WebFaction’s server automatically on a regular schedule. If you haven’t set up a cron job before, it’s just a fancy way of editing a special text file. Run this command:

EDITOR=nano crontab -e

If you haven’t done this before, this file will be empty, and you’ll want to test it to see how it works. Paste the following line of code exactly, and then hit Ctrl-O and Ctrl-X to save and exit.

* * * * * echo "cron is running" >> $HOME/logs/user/cron.log 2>&1

This will output to that log every single minute; not a good cron job to have in general, but a handy test. Wait a few minutes and check ~/logs/user/cron.log to make sure it’s working.

Rather than including our username and password in our cron job, we'll set up a configuration file with those details. Create a file config.yml, perhaps at the location ~/le_certs. (If necessary, mkdir le_certs, touch le_certs/config.yml, nano le_certs/config.yml.) In this file, paste the following, and then customize with your details:

letsencrypt_account_email: 'you@example.com'
api_url: 'https://api.webfaction.com/'
username: 'webfaction_username'
password: 'webfaction_password'

(Ctrl-O and Ctrl-X to save and close it.) Now, let’s edit the crontab to remove the test line and add the renewal line, being sure to fill in your domain name, the path to your website’s directory, and the path to the configuration file you just created:

0 4 15 */2 * PATH=$PATH:$GEM_HOME/bin GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib /usr/local/bin/ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction --domains example.com --public /home/yourusername/webapps/sitename/ --config /home/yourusername/le_certs/config.yml >> $HOME/logs/user/cron.log 2>&1

You’ll probably want to create the line in a text editor on your computer and then copy and paste it to make sure you get all the substitutions right. Paths must be fully specified, as above; don't use ~ for your home directory. Ctrl-O and Ctrl-X to save and close it. Check with crontab -l that it looks correct. As a test to make sure the config file setup is correct, you can run the command part directly; if it works, you shouldn't see any error messages on the command line. (Copy and paste the line below, making the same substitutions as you just did for the crontab.)

PATH=$PATH:$GEM_HOME/bin GEM_HOME=$HOME/.letsencrypt_webfaction/gems RUBYLIB=$GEM_HOME/lib /usr/local/bin/ruby2.2 $HOME/.letsencrypt_webfaction/gems/bin/letsencrypt_webfaction --domains example.com --public /home/yourusername/webapps/sitename/ --config /home/yourusername/le_certs/config.yml

With that cron job configured, you'll automatically get a new certificate at 4am on the 15th of alternating months (January, March, May, July, September, November). New certificates every two months is fine, though one day in the future we might change this to get a new certificate every few days; before then WebFaction will have taken over the renewal process anyway. Debugging cron jobs can be tricky (I've had to update the command in this post once already); I recommend adding an alert to your calendar for the day after the first time this renewal is supposed to happen, to remind yourself to confirm that it worked. If it didn't work, any error messages should be stored in the cron.log file.
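If you'd rather confirm from a script than from a browser that renewal actually produced a fresh certificate, a short Python check like this one (with your own domain substituted for the placeholder) reads the expiry date off the certificate your site is currently serving:

import socket
import ssl
from datetime import datetime, timezone

hostname = "yourcustomdomain.com"  # placeholder; use your own domain
context = ssl.create_default_context()
with socket.create_connection((hostname, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        cert = tls.getpeercert()

expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(cert["notAfter"]),
                                 tz=timezone.utc)
print(f"{hostname} certificate expires {expires:%Y-%m-%d}")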

Redirect your HTTP site (optional, but recommended)

Now you’re serving your website in parallel via http:// and https://. You can keep doing that for a while, but everyone who follows old links to the HTTP site won’t get the added security, so it’s best to start permanently re-directing the HTTP version to HTTPS.

WebFaction has very good documentation on how to do this, and I won’t duplicate it all here. In short, you’ll create a new static application named “redirect”, which just has a .htaccess file with, for example, the following:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
RewriteCond %{HTTP:X-Forwarded-SSL} !on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

This particular variation will both redirect any URLs that have www to the “naked” domain and make all requests HTTPS. And in the Control Panel, make the redirect application the only one on the HTTP version of your site. You can re-use the “redirect” application for different domains.

Test to make sure it’s working! http://yourcustomdomain.com, http://www.yourcustomdomain.com, https://www.yourcustomdomain.com and https://yourcustomdomain.com should all end up at https://yourcustomdomain.com. (You may need to force refresh a couple of times.)

by nick@npdoty.name at November 27, 2018 09:01 PM

November 26, 2018

Ph.D. student

Is competition good for cybersecurity?

A question that keeps coming up in various forms, for example in response to recent events around the ‘trade war’ between the U.S. and China and its impact on technology companies, is whether market competition is good or bad for cybersecurity.

Here is a simple argument for why competition could be good for cybersecurity: the security of technical products is a quality that consumers value. Market competition is what gets producers to make higher-quality products at lower cost. Therefore, competition is good for security.

Here is an argument for why competition could be bad for cybersecurity: security is a hard thing for any consumer to evaluate; since most can’t, we have an information asymmetry and therefore a ‘market for lemons’ kind of market failure. Therefore, competition is bad for security, and it would be better to have a well-regulated monopoly.

This argument echoes, though it doesn’t exactly parallel, some of the arguments in Pasquale’s work on Hamiltonians and Jeffersonians in technology platform regulation.
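To make the ‘market for lemons’ intuition above concrete, here is a toy simulation; the uniform quality distribution and the linear cost assumption are made up, and it is only meant to show the unraveling mechanism, not to model any real security market.

import numpy as np

rng = np.random.default_rng(1)
quality = rng.uniform(0, 1, size=1000)  # hidden security quality of each product
cost = 0.8 * quality                    # assume higher security costs more to produce

on_market = np.ones_like(quality, dtype=bool)
for round_number in range(10):
    # Buyers can't observe quality, so they pay only for the average quality on offer.
    price = quality[on_market].mean()
    # Producers whose costs exceed that price drop out, dragging average quality down.
    on_market &= cost <= price
    print(f"round {round_number}: price={price:.3f}, products left={on_market.sum()}")

Under these assumptions the willingness to pay and the average quality ratchet downward together, which is the sense in which an unobservable quality like security can be driven out of a competitive market.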

by Sebastian Benthall at November 26, 2018 12:36 AM

November 18, 2018

Ph.D. student

“the privatization of public functions”

An emerging theme from the conference on Trade Secrets and Algorithmic Systems was that legal scholars have become concerned about the privatization of public functions. For example, the use of proprietary risk assessment tools instead of the discretion of judges who are supposed to be publicly accountable is a problem. More generally, use of “trade secrecy” in court settings to prevent inquiry into software systems is bogus and moves more societal control into the realm of private ordering.

Many remedies were proposed. Most involved some kind of disclosure and audit to experts. The most extreme form of disclosure is making the software and, where it’s a matter of public record, training data publicly available.

It is striking to me to be encountering the call for government use of open source systems because…this is not a new issue. The conversation about federal use of open source software was alive and well over five years ago. Then, the arguments were about vendor lock-in; now, they are about accountability of AI. But the essential problem of whether core governing logic should be available to public scrutiny, and the effects of its privatization, have been the same.

If we are concerned with the reliability of a closed and large-scale decision-making process of any kind, we are dealing with problems of credibility, opacity, and complexity. The prospects of an efficient market for these kinds of systems are dim. These market conditions are the conditions of sustainability of open source infrastructure. Failures in sustainability are manifest as software vulnerabilities, which are one of the key reasons why governments are warned against OSS now, though measuring and comparing OSS vulnerabilities against proprietary ones is methodologically fraught.

by Sebastian Benthall at November 18, 2018 05:25 PM

November 16, 2018

Ph.D. student

Trade secrecy, “an FDA for algorithms”, and software bills of materials (SBOMs) #SecretAlgos

At the Conference on Trade Secrets and Algorithmic Systems at NYU today, the target of most critiques is the use of trade secrecy by proprietary technology providers to prevent courts and the public from seeing the inner workings of algorithms that determine people’s credit scores, health care, criminal sentencing, and so on. The overarching theme is that sometimes companies will use trade secrecy to hide the ways that their software is bad, and that that is a problem.

In one panel, the question came up of whether an “FDA for Algorithms” is on the table, referring to the Food and Drug Administration’s approval process for pharmaceuticals. It was not dealt with in much depth, which is too bad, because it is a nice example of how government oversight of potentially dangerous technology can be managed in a way that respects trade secrecy.

According to this article, when filing for FDA approval, a company can declare some of its ingredients to be trade secrets. The upshot is that those trade secrets are not subject to FOIA requests. However, these ingredients are still considered when the FDA grants approval.

It so happens that in the cybersecurity policy conversation (more so than in privacy) the question of openness of “ingredients” to inspection has been coming up in a serious way. NTIA has been hosting multistakeholder meetings about standards and policy around Software Component Transparency. In particular they are encouraging standardization of Software Bills of Materials (SBOMs) like the Linux Foundation’s Software Package Data Exchange (SPDX). SPDX (and SBOMs more generally) describe the “ingredients” in a software package at a coarser level of resolution than the full source code, but at a level specific enough to be useful for security audits.

It’s possible that a similar method could be used for algorithmic audits with fairness (i.e., nondiscrimination compliance) and privacy (i.e., information sharing with third parties) in mind. Particular components could be audited (perhaps in a way that protects trade secrecy), and then those components could be listed as “ingredients” by other vendors.
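To illustrate the level of granularity involved, here is a schematic, entirely hypothetical “ingredients list” in the spirit of an SBOM, sketched as a Python data structure rather than actual SPDX syntax, and extended with the kind of per-component audit fields imagined above.

# Hypothetical SBOM-style ingredient list: coarser than source code, but
# specific enough to support security, fairness, or privacy audits.
software_bill_of_materials = {
    "product": "example-credit-scoring-service",   # made-up product
    "version": "2.3.1",
    "components": [
        {"name": "openssl", "version": "1.1.1", "license": "OpenSSL",
         "audit": {"known_vulnerabilities_reviewed": True}},
        {"name": "risk-model-runtime", "version": "0.9", "license": "proprietary",
         "audit": {"nondiscrimination_review": "2018-Q3",
                   "third_party_data_sharing": False}},
    ],
}

for component in software_bill_of_materials["components"]:
    print(component["name"], component["version"], component["audit"])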

by Sebastian Benthall at November 16, 2018 08:56 PM

The paradox of ‘data markets’

We often hear that companies are “selling our data”, or that we are “paying for services” with our data. Data brokers literally buy and sell data about people. There are also expensive data sources and data sets for sale. There are, undoubtedly, one or more data markets.

We know that classically, perfect competition in markets depends on perfect information. Buyers and sellers on the market need to have equal and instantaneous access to information about utility curves and prices in order for the market to price things efficiently.

Since the bread and butter of the data market is information asymmetry, we know that data markets can never be perfectly competitive. If they were, the data market would cease to exist, because the perfect-information condition would entail that there is nothing left to buy and sell.

Data markets therefore have to be imperfectly competitive. But since these are the markets that perfect information in other markets might depend on, this imperfection is viral. The vicissitudes of the data market are the vicissitudes of the economy in general.

The upshot is that the challenges of information economics are not only those that appear in special sectors like insurance markets. They are at the heart of all economic activity, and there are no equilibrium guarantees.

by Sebastian Benthall at November 16, 2018 04:50 PM

November 15, 2018

Center for Technology, Society & Policy

Using Crowdsourcing to address Disparities in Police Reported Data: Addressing Challenges in Technology and Community Engagement

This is a project update from a CTSP project from 2017: Assessing Race and Income Disparities in Crowdsourced Safety Data Collection (with Kate Beck, Aditya Medury, and Jesus M. Barajas)

Project Update

This work has led to the development of Street Story, a community engagement tool that collects street safety information from the public, through UC Berkeley SafeTREC.

The tool collects qualitative and quantitative information, and then creates maps and tables that can be publicly viewed and downloaded. The Street Story program aims to collect information that can create a fuller picture of transportation safety issues, and make community-provided information publicly accessible.

 

The Problem

Low-income groups, people with disabilities, seniors and racial minorities are at higher risk of being injured while walking and biking, but experts have limited information on what these groups need to reduce these disparities. Transportation agencies typically rely on statistics about transportation crashes aggregated from police reports to decide where to make safety improvements. However, police-reported data is limited in a number of ways. First, crashes involving pedestrians or cyclists are significantly under-reported to police, with reports finding that up to 60% of pedestrian and bicycle crashes go unreported. Second, some demographic groups, including low-income groups, people of color and undocumented immigrants, have histories of contentious relationships with police. Therefore, they may be less likely to report crashes to the police when they do occur. Third, crash data doesn’t include locations where near misses have happened, or locations where individuals feel unsafe but an issue has not yet happened. In other words, the data allow professionals to react to safety issues, but don’t necessarily allow them to be proactive about them.

One solution to improve and augment the data agencies use to make decisions and allocate resources is to provide a way for people to report transportation safety issues themselves. Some public agencies and private firms are developing apps and websites where people can report issues for this purpose. But one concern is that the people who are likely to use these crowdsourcing platforms are those who have access to smart phones or the internet and who trust that government agencies will use the data to make changes, biasing the data toward the needs of these privileged groups.

Our Initial Research Plan

We chose to examine whether crowdsourced traffic safety data reflected similar patterns of underreporting and potential bias as police-reported safety data. To do this, we created an online mapping tool that people could use to report traffic crashes, near-misses and general safety issues. We planned to work with a city to release this tool to and collect data from the general public, then work directly with a historically marginalized community, under-represented in police-reported data, to target data collection in a high-need neighborhood. We planned to reduce barriers to entry for this community, including meeting the participants in person to explain the tool, providing them with in-person and online training, providing participants with cell phones, and compensating their data plans for the month. By crowdsourcing data from the general public and from this specific community, we planned to analyze whether there were any differences in the types of information reported by different demographics.

This plan seemed to work well with the research question and with community engagement best practices. However, we came up against a number of challenges with our research plan. Although many municipal agencies and community organizations found the work we were doing interesting and were working to address transportation safety issues similar to those we were focusing on, many organizations and agencies seemed daunted by the prospect of using technology to address underlying issues of under-reporting. Finally, we found that a year was not enough time to build trusting relationships with the organizations and agencies we had hoped to work with. Nevertheless, we were able to release a web-based mapping tool to collect some crowdsourced safety data from the public.

Changing our Research Plan

To better understand how more well-integrated digital crowdsourcing platforms perform, we pivoted our research project to explore how different neighborhoods engage with government platforms to report non-emergency service needs. We assumed some of these non-emergency services would mirror the negative perceptions of bicycle and pedestrian safety we were interested in collecting via our crowdsourcing safety platform. The City of Oakland relies on SeeClickFix, a smartphone app, to allow residents to request service for several types of issues: infrastructure issues, such as potholes, damaged sidewalks, or malfunctioning traffic signals; and non-infrastructure issues such as illegal dumping or graffiti. The city also provides phone, web, and email-based platforms for reporting the same types of service requests. These alternative platforms are collectively known as 311 services. We looked at 45,744 SeeClickFix-reports and 35,271 311-reports made between January 2013 and May 2016. We classified Oakland neighborhoods by status as community of concern. In the city of Oakland, 69 neighborhoods meet the definition for communities of concern, while 43 do not. Because we did not have data on the characteristics of each person reporting a service request, we made the assumption that people reporting requests also lived in the neighborhood where the request was needed.

How did communities of concern interact with the SeeClickFix and 311 platforms to report service needs? Our analysis highlighted two main takeaways. First, we found that communities of concern were more engaged in reporting than other communities, but had different reporting dynamics based on the type of issue they were reporting. About 70 percent of service issues came from communities of concern, even though they represent only about 60 percent of the communities in Oakland. They were nearly twice as likely to use SeeClickFix than to report via the 311 platforms overall, but only for non-infrastructure issues. Second, we found that even though communities of concern were more engaged, the level of engagement was not equal for everyone in those communities. For example, neighborhoods with higher proportions of limited-English proficient households were less likely to report any type of incident by 311 or SeeClickFix.
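As a rough illustration of the kind of comparison involved, the following sketch assumes a hypothetical table of service requests with made-up column names and rows; the real analysis used the roughly 81,000 SeeClickFix and 311 reports described above.

import pandas as pd

# Hypothetical stand-in rows; the real data had one row per service request.
requests = pd.DataFrame({
    "platform": ["seeclickfix", "311", "seeclickfix", "311"],
    "issue_type": ["non-infrastructure", "infrastructure",
                   "infrastructure", "non-infrastructure"],
    "community_of_concern": [True, True, False, False],
})

# Share of reports coming from communities of concern, by platform and issue type.
shares = (requests
          .groupby(["platform", "issue_type"])["community_of_concern"]
          .mean())
print(shares)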

Preliminary Findings from Crowdsourcing Transportation Safety Data

We deployed the online tool in August 2017. The crowdsourcing platform was aimed at collecting transportation safety-related concerns pertaining to pedestrian and bicycle crashes, near misses, perceptions of safety, and incidents of crime while walking and bicycling in the Bay Area. We disseminated the link to the crowdsourcing platform primarily through Twitter and some email lists. Examples of organizations that were contacted through Twitter-based outreach and also subsequently interacted with the tweet (through likes and retweets) include Transform Oakland, Silicon Valley Bike Coalition, Walk Bike Livermore, California Walks, Streetsblog CA, and Oakland Built. By December 2017, we had received 290 responses from 105 respondents. Half of the responses corresponded to perceptions of traffic safety concerns (“I feel unsafe walking/cycling here”), while 34% corresponded to near misses (“I almost got into a crash but avoided it”). In comparison, 12% of responses reported an actual pedestrian or bicycle crash, and 4% of incidents reported a crime while walking or bicycling. The sample size of the responses is too small to report any statistical differences.

Figure 1 shows the spatial patterns of the responses in the Bay Area aggregated to census tracts. Most of the responses were concentrated in Oakland and Berkeley. Oakland was specifically targeted as part of the outreach efforts since it has significant income and racial/ethnic diversity.

Figure 1 Spatial Distribution of the Crowdsourcing Survey Responses


 

In order to assess the disparities in the crowdsourced data collection, we compared responses between census tracts that are classified as communities of concern or not. A community of concern (COC), as defined by the Metropolitan Transportation Commission, a regional planning agency, is a census tract that ranks highly on several markers of marginalization, including proportion of racial minorities, low-income households, limited-English speakers, and households without vehicles, among others.

Table 1 shows the comparison between the census tracts that received at least one crowdsourcing survey response. The average number of responses received in COCs versus non-COCs across the entire Bay Area was similar and statistically indistinguishable. However, when focusing on Oakland-based tracts, the results reveal that the average number of crowdsourced responses in non-COCs was statistically higher. To assess how the trends of self-reported pedestrian/cyclist concerns compare with police-reported crashes, an assessment of pedestrian- and bicycle-related police-reported crashes (from 2013-2016) shows that more police-reported pedestrian/bicycle crashes were observed on average in COCs across the Bay Area as well as in Oakland. The difference in trends between the crowdsourced concerns and the police-reported crashes suggests that either walking/cycling concerns are greater in non-COCs (and thus underrepresented in police crashes), or that participation from COCs is relatively underrepresented.

Table 1 Comparison of crowdsourced concerns and police-reported pedestrian/bicycle crashes in census tracts that received at least 1 response


Table 2 compares the self-reported income and race/ethnicity characteristics of the respondents with the locations where the responses were reported. For reference purposes, the Bay Area’s median household income in 2015 was estimated to be $85,000 (Source: http://www.vitalsigns.mtc.ca.gov/income), and the Bay Area’s population was estimated to be 58% White per the 2010 Census (Source: http://www.bayareacensus.ca.gov/bayarea.htm).

Table 2 Distribution of all Bay Area responses based on the location of response and the self-reported income and race/ethnicity of respondents

The results reveal that White, medium-to-high income respondents reported more walking/cycling-related safety issues in our survey, and more so in non-COCs. This trend is also consistent with the definition of COCs, which tend to have a higher representation of low-income people and people of color. However, if digital crowdsourcing without widespread community outreach is more likely to attract responses from medium-to-high income groups, and, more importantly, if those respondents only live, work, or play in a small portion of the region being investigated, the aggregated results will reflect a biased picture of a region’s transportation safety concerns. Thus, while the scalability of digital crowdsourcing provides an opportunity for capturing underrepresented transportation concerns, it may require greater collaboration with low-income, diverse neighborhoods to ensure uniform adoption of the platform.

Lessons Learned

From our attempts to work directly with community groups and agencies and our subsequent decision to change our research focus, we learned a number of lessons:

  1. Develop a research plan in partnership with communities and agencies. This would have ensured that we began with a research plan that community groups and agencies were better able to partner with us on, and that the partners were on board with the topic of interest and the methods we hoped to use.
  2. Recognize the time it takes to build relationships. We found that building relationships with agencies and communities was more time intensive and took longer than we had hoped. These groups often have limitations on the time they can dedicate to unfunded projects. Next time, we should plan for this in our initial research plan.
  3. Use existing data sources to supplement research. We found that using SeeClickFix and 311 data was a way to collect and analyze information that added context to our research question. Although the data did not have all the demographic information we had hoped to analyze, this data source added context to the data we collected.
  4. Speak in a language that the general public understands. We found that when we used the term self-reporting, rather than crowdsourcing, in talking to potential partners and to members of the public, these individuals were more willing to accept the use of technology to collect safety information from the public as legitimate. Using vocabulary and phrasing that people are familiar with is crucial when attempting to use technology to benefit the social good.

by Daniel Griffin at November 15, 2018 05:44 PM

Ph.D. student

The Crevasse: a meditation on accountability of firms in the face of opacity as the complexity of scale

To recap:

(A1) Beneath corporate secrecy and user technical illiteracy, a fundamental source of opacity in “algorithms” and “machine learning” is the complexity of scale, especially scale of data inputs. (Burrell, 2016)

(A2) The opacity of the operation of companies using consumer data makes those consumers unable to engage with them as informed market actors. The consequence has been a “free fall” of market failure (Strandburg, 2013).

(A3) Ironically, this “free” fall has been “free” (zero price) for consumers; they appear to get something for nothing without knowing what has been given up or changed as a consequence (Hoofnagle and Whittington, 2013).

Comments:

(B1) The above line of argument conflates “algorithms”, “machine learning”, “data”, and “tech companies”, as is common in the broad discourse. That this conflation is possible speaks to the ignorance of the scholarly position on these topics, an ignorance that is implied by corporate secrecy, technical illiteracy, and complexity of scale simultaneously. We can, if we choose, distinguish between these factors analytically. But because, from the standpoint of the discourse, the internals are unknown, the general indication of a ‘black box’ organization is intuitively compelling.

(B1a) Giving in to the lazy conflation is an error because it prevents informed and effective praxis. If we do not distinguish between a corporate entity and its multiple internal human departments and technical subsystems, then we may confuse ourselves into thinking that a fair and interpretable algorithm can give us a fair and interpretable tech company. Nothing about the former guarantees the latter because tech companies operate in a larger operational field.

(B2) Opacity as the complexity of scale, a property of the functioning of machine learning algorithms, is also a property of the functioning of sociotechnical organizations more broadly. Universities, for example, are often opaque to themselves, because of their own internal complexity and scale. This is because the mathematics governing opacity as a function of complexity and scale are the same in both technical and sociotechnical systems (Benthall, 2016).

(B3) If we discuss the complexity of firms, as opposed to the complexity of algorithms, we should conclude that firms that are complex due to scale of operations and data inputs (including number of customers) will be opaque and therefore have a strategic advantage in the market against less complex market actors (consumers) with stiffer bounds on rationality.

(B4) In other words, big, complex, data rich firms will be smarter than individual consumers and outmaneuver them in the market. That’s not just “tech companies”. It’s part of the MO of every firm to do this. Corporate entities are “artificial general intelligences” and they compete in a complex ecosystem in which consumers are a small and vulnerable part.

Twist:

(C1) Another source of opacity in data is that the meaning of data comes from the causal context that generates it. (Benthall, 2018)

(C2) Learning causal structure from observational data is hard, both in terms of being data-intensive and being computationally complex (NP). (c.f. Friedman et al., 1998)

(C3) Internal complexity, for a firm, is not sufficient to be “all-knowing” about the data that is coming into it; the firm has epistemic challenges of secrecy, illiteracy, and scale with respect to external complexity.

(C4) This is why many applications of machine learning are overrated and so many “AI” products kind of suck.

(C5) There is, in fact, an epistemic crevasse between all autonomous entities, each containing its own complexity and constituting a larger ecological field that is the external/being/environment for any other autonomy.

To do:

The most promising direction based on this analysis is a deeper read into transaction cost economics as a ‘theory of the firm’. This is where the formalization of the idea that what the Internet changed most are search costs (a kind of transaction cost) should be.

It would be nice if those insights could be expressed in the mathematics of “AI”.

There’s still a deep idea in here that I haven’t yet found the articulation for, something to do with autopoiesis.

References

Benthall, Sebastian. (2016) The Human is the Data Science. Workshop on Developing a Research Agenda for Human-Centered Data Science. Computer Supported Cooperative Work 2016. (link)

Sebastian Benthall. Context, Causality, and Information Flow: Implications for Privacy Engineering, Security, and Data Economics. Ph.D. dissertation. Advisors: John Chuang and Deirdre Mulligan. University of California, Berkeley. 2018.

Burrell, Jenna. “How the machine ‘thinks’: Understanding opacity in machine learning algorithms.” Big Data & Society 3.1 (2016): 2053951715622512.

Friedman, Nir, Kevin Murphy, and Stuart Russell. “Learning the structure of dynamic probabilistic networks.” Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc., 1998.

Hoofnagle, Chris Jay, and Jan Whittington. “Free: accounting for the costs of the internet’s most popular price.” UCLA L. Rev. 61 (2013): 606.

Strandburg, Katherine J. “Free fall: The online market’s consumer preference disconnect.” U. Chi. Legal F. (2013): 95.

by Sebastian Benthall at November 15, 2018 04:10 PM

open source sustainability and autonomy, revisited

Some recent chats with Chris Holdgraf and colleagues at NYU interested in “critical digital infrastructure” have gotten me thinking again about the sustainability and autonomy of open source projects.

I’ll admit to having had naive views about this topic in the past. Certainly, doing empirical data science work on open source software projects has given me a firmer perspective on things. Here are what I feel are the hardest-earned insights on the matter:

  • There is tremendous heterogeneity in open source software projects. Almost all quantitative features of these projects fall in log-normal distributions (a minimal sketch of this kind of distributional check follows this list). This suggests that the keys to open source software success are myriad and exogenous (how the technology fits in the larger ecosystem, how outside funding and recognition is accomplished, …) rather than endogenous (community policies, etc.). While many open source projects start as hobby and unpaid academic projects, those that go on to be successful find one or more funding sources. This funding is an exogenous factor.
  • The most significant exogenous factor in an open source software project’s success is the industrial organization of private tech companies. Developing an open technology is part of the strategic repertoire of these companies: for example, to undermine the position of a monopolist, developing an open source alternative decreases barriers to market entry and allows for a more competitive field in that sector. Another example: Google funded Mozilla for so long arguably to deflect antitrust action over Google Chrome.
  • There is some truth to Chris Kelty’s idea of open source communities as recursive publics, cultures that have autonomy that can assert political independence at the boundaries of other political forces. This autonomy comes from: the way developers of OSS get specific and valuable human capital in the process of working with the software and their communities; the way institutions begin to depend on OSS as part of their technical stack, creating an installed base; and how many different institutions may support the same project, creating competition for the scarce human capital of the developers. Essentially, at the point where the software and the skills needed to deploy it effectively and the community of people with those skills is self-organized, the OSS community has gained some economic and political autonomy. Often this autonomy will manifest itself in some kind of formal organization, whether a foundation, a non-profit, or a company like Redhat or Canonical or Enthought. If the community is large and diverse enough it may have multiple organizations supporting it. This is in principle good for the autonomy of the project but may also reflect political tensions that can lead to a schism or fork.
  • In general, since OSS development is internally most often very fluid, with the primary regulatory mechanism being the fork, the shape of OSS communities is more determined by exogenous factors than endogenous ones. When exogenous demand for the technology rises, the OSS community can find itself with a ‘surplus’, which can be channeled into autonomous operations.
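As promised above, here is a minimal sketch of the kind of distributional check behind the log-normal observation; it uses synthetic data in place of real per-project metrics, and the parameter values are arbitrary.

import numpy as np
from scipy import stats

# Synthetic stand-in for a per-project metric such as commits per contributor.
rng = np.random.default_rng(0)
commits_per_contributor = rng.lognormal(mean=2.0, sigma=1.2, size=5000)

# Fit a log-normal (location fixed at zero) and check the fit.
shape, loc, scale = stats.lognorm.fit(commits_per_contributor, floc=0)
ks_stat, p_value = stats.kstest(commits_per_contributor, "lognorm",
                                args=(shape, loc, scale))
print(f"fitted sigma={shape:.2f}, median={scale:.1f}, KS p-value={p_value:.3f}")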

by Sebastian Benthall at November 15, 2018 02:45 PM

November 13, 2018

MIMS 2012

How to Write Effective Advertisements, according to David Ogilvy

David Ogilvy is known as the “Father of Advertising.” He earned that moniker by pioneering the use of research to come up with effective ads and measure their impact. This was decades before the internet and the deluge of data we have available to us now. I can only imagine how much more potent he would be today.

He breaks down his methods in his book Ogilvy on Advertising, which is just as relevant today as it was when it was written in 1983. Since I’ve found his techniques useful, I’m publishing my notes here so I can easily refer back to them and share them.

How to Write Headlines That Sell

Headlines are the most important part of your advertisements. According to research done by Ogilvy, “five times as many people read the headlines as read the body copy. It follows that unless your headline sells your product, you have wasted 90 per cent of your money.”

  • Promise a benefit. Make sure the benefit is important to your customer. For example, “whiter wash, more miles per gallon, freedom from pimples, fewer cavities.”
    • Make it persuasive, and make it unique. Persuasive headlines that aren’t unique, which your competitors can claim, aren’t effective.
  • Make it specific. Use percentages, time elapsed, dollars saved.
  • Personalize it to your audience, such as the city they’re in. (Or the words in their search query)
  • Include the brand and product name.
  • Make it as long or as short as it needs to be. Ogilvy’s research found that, “headlines with more than ten words get less readership than short headlines. On the other hand, a study of retail advertisements found that headlines of ten words sell more merchandise than short headlines. Conclusion: if you need a long headline, go ahead and write one, and if you want a short headline, that’s all right too.”
  • Make it clear and to the point, not clever or tricky.
  • Don’t use superlatives like, “Our product is the best in the world.” Market researcher George Gallup calls this “Brag and Boast.” They convince nobody.

Ideas for Headlines

  • Headlines that contain news are surefire. The news can be announcing a new product, or a new way to use an existing product. “And don’t scorn tried-and-true words like amazing, introducing, now, suddenly.”
  • Include information that’s useful to the reader, provided the information involves your product.
  • Try including a quote, such as from an expert or customers.

How to Write Persuasive Body Copy

According to Ogilvy, body copy is seldom read by more than 10% of people. But the 10% who read it are prospects. What you say determines the success of your ad, so it’s worth spending the time to get it right.

  • Address readers directly, as if you are speaking to them. “One human being to another, second person singular.”
  • Write short sentences and short paragraphs. Avoid complicated words. Use plain, everyday language.
  • Don’t write long-winded, philosophical essays. “Tell your reader what your product will do for him or her, and tell it with specifics.”
  • Write your copy in the form of a story. The headline can be a hook.
  • Avoid analogies. People often misunderstand them.
  • Just like with headlines, stay away from superlatives like, “Our product is the best in the world.”
  • Use testimonials from customers or experts (also known as “social proof”). Avoid celebrity testimonials. Most people forget the product and remember the celebrity. Further, people assume the celebrity has been bought, which is usually true.
  • Coupons and special offers work.
  • Always include the price of your products. “You may see a necklace in a jeweler’s window, but you don’t consider buying it because the price is not shown and you are too shy to go in and ask. It is the same way with advertisements. When the price of the product is left out, people have a way of turning the page.”
  • Long copy sells more than short. “I believe, without any research to support me, that advertisements with long copy convey the impression that you have something important to say, whether people read the copy or not.”
  • Stick to the facts about what your product is and can do.
  • Make the first paragraph a grabber to draw people into reading your copy.
  • Sub-headlines make copy more readable and scannable.
  • People often skip from the headline to the coupon to see the offer, so make the coupons mini-ads, complete with brand name, promise, and a mini photo of the product.
  • To keep prospects on the hook, try “limited edition,” “limited supply,” “last time at this price,” or “special price for promptness.”

Suggestions for Images

After headlines, images are the most important part of advertisements. They draw people in. Here’s what makes imagery effective:

  • The best images arouse the viewer’s curiosity. They look at it and ask, “What’s going on here?” This leads them to read the copy to find out. This is called “Story Appeal.”
  • If you don’t have a good story to tell, make your product the subject.
  • Show the end result of using your product. Before-and-after photographs are highly effective.
  • Photographs attract more readers, are more believable, and better remembered than illustrations.
  • Human faces that are larger than life size repel readers. Don’t use them.
  • Historical subjects bore people.
  • If your picture includes people, it’s most effective if it uses people your audience can identify with. Doctors if you’re trying to sell to doctors, men if you’re trying to appeal to men, and so on.
  • Include captions under your photographs. More people read captions than body copy, so make the caption a mini-advertisement.

Layout

  • KISS – Keep It Simple, Stupid.
  • “Readers look first at the illustration, then at the headline, then at the copy. So put these elements in that order.” This also follows the normal order of scanning.
  • More people read captions of images than body copy, so always include a caption under it. Captions should be mini-advertisements, so include the brand name and promise.

A Few More Tips for Effective Ads

These are some other principles I picked up from the book, which can be useful in many different types of ads.

  • Demonstrations of how well your product works are effective. Try coming up with a demonstration that your reader can perform.
  • Don’t name competitors. The ad is less believable and more confusing. People often think the competitor is the hero.
  • Problem-solution is a tried-and-true ad technique.
  • Give people a reason why they should buy.
  • Emotion can be highly effective. Nostalgia, charm, sentimentality, etc. Consumers need a rational excuse to justify their emotional decisions.
  • Cartoons don’t sell well to adults.
  • The most successful products and services are differentiated from their competitors. This is most effective if you can differentiate via low cost or highest quality. A differentiator doesn’t need to be relevant to the product’s performance, however, to be effective. For example, Owens-Corning differentiated their insulation by advertising the color of the product, which has nothing to do with how the product performs.

Ogilvy’s principles are surprisingly evergreen, despite the technological changes. Towards the end of the book he quotes Bill Bernbach, another advertising giant, on why this is:

Human nature hasn’t changed for a billion years. It won’t even vary in the next billion years. Only the superficial things have changed. It is fashionable to talk about changing man. A communicator must be concerned with unchanging man – what compulsions drive him, what instincts dominate his every action, even though his language too often camouflages what really motivates him. For if you know these things about a man, you can touch him at the core of his being. One thing is unchangingly sure. The creative man with an insight into human nature, with the artistry to touch and move people, will succeed. Without them he will fail.

Human nature hasn’t changed much, indeed.


Get the book here: Ogilvy on Advertising

by Jeff Zych at November 13, 2018 06:07 AM

November 12, 2018

Ph.D. student

What proportion of data protection violations are due to “dark data” flows?

“Data protection” refers to the aspect of privacy that is concerned with the use and misuse of personal data by those that process it. Though the details are widely debated, scholars continue to converge (e.g.) on an ideal of data protection consisting of alignment between the purposes for which the data processor will use the data and the expectations of the user, along with collection limitations that reduce exposure to misuse. Through its extraterritorial enforcement mechanism, the GDPR has threatened to make these standards global.

The implication of these trends is that there will be a global field of data flows regulated by these kinds of rules. Many of the large and important actors that process user data can be held accountable to the law. Privacy violations by these actors will be due to a failure to act within the bounds of the law that applies to them.

On the other hand, there is also cybercrime, an economy of data theft and information flows that exists “outside the law”.

I wonder what proportion of data protection violations are due to dark data flows–flows of personal data that are handled by organizations operating outside of any effective regulation.

I’m trying to draw an analogy to a global phenomenon that I know little about but which strikes me as perhaps more pressing than data protection: the interrelated problems of money laundering, off-shore finance, and dark money contributions to election campaigns. While surely oversimplifying the issue, my impression is that the network of financial flows can be divided into those that are more and less regulated by effective global law. Wealth seeks out these opportunities in the dark corners.

How much personal data flows in these dark networks? And how much is it responsible for privacy violations around the world? Versus how much is data protection effectively in the domain of accountable organizations (that may just make mistakes here and there)? Or is the dichotomy false, with truly no firm boundary between licit and illicit data flow networks?

by Sebastian Benthall at November 12, 2018 01:37 PM

November 11, 2018

Ph.D. student

Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks [Talk]

This blog post is a version of a talk I gave at the 2018 ACM Computer Supported Cooperative Work and Social Computing (CSCW) Conference based on a paper written with Deirdre Mulligan, Ellen Van Wyk, John Chuang, and James Pierce, entitled Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks, which was honored with a best paper award. Find out more on our project page, our summary blog post, or download the paper: [PDF link] [ACM link]

In the work described in our paper, we created a set of conceptual speculative designs to explore privacy issues around emerging biosensing technologies, technologies that sense human bodies. We then used these designs to help elicit discussions about privacy with students training to be technologists. We argue that this approach can be useful for Values in Design and Privacy by Design research and practice.

DHS slide (image from publicintelligence.net). Note the middle bullet point in the middle column – “avoids all privacy issues.”

Let me start with a motivating example, which I’ve discussed in previous talks. In 2007, the US Department of Homeland Security proposed a program to try to predict criminal behavior in advance of the crime itself, using thermal sensing, computer vision, eye tracking, gait sensing, and other physiological signals. And supposedly it would “avoid all privacy issues.” But it seems pretty clear that privacy was not fully thought through in this project. Homeland Security projects actually do go through privacy impact assessments, and I would guess that this one would have passed: the assessment would find that the system doesn’t store the biosensed data and conclude that privacy is protected. But while that might address one conception of privacy related to storing data, there are other conceptions of privacy at play. There are still questions here about consent and movement in public space, about data use and collection, and about fairness and privacy from algorithmic bias.

While that particular imagined future hasn’t come to fruition, a lot of these types of sensors are now becoming available as consumer devices, used in applications ranging from health and quantified self, to interpersonal interactions, to tracking and monitoring. And it often seems like privacy isn’t fully thought through before new sensing devices and services are publicly announced or released.

A lot of existing privacy approaches, like privacy impact assessments, are deductive, checklist-based, or assume that privacy problems are already known and well-defined in advance, which often isn’t the case. Furthermore, the term “design” in discussions of Privacy by Design is often seen as a way of providing solutions to problems identified by law, rather than as a generative set of practices useful for understanding what privacy issues might need to be considered in the first place. We argue that speculative design-inspired approaches can help explore and define problem spaces of privacy in inductive, situated, and contextual ways.

Design and Research Approach

We created a design workbook of speculative designs. Workbooks are collections of conceptual designs drawn together to allow designers to explore and reflect on a design space. Speculative design is a practice of using design to ask social questions, by creating conceptual designs or artifacts that help create or suggest a fictional world. We can create speculative designs to explore different configurations of the world and to imagine and understand possible alternative futures, which helps us think through issues that have relevance in the present. So rather than starting by trying to find design solutions for privacy, we wanted to use design workbooks and speculative designs together to create a collection of designs to help us explore what the problem space of privacy might look like with emerging biosensing technologies.


A sampling of the conceptual designs we created as part of our design workbook

In our prior work, we created a design workbook to do this exploration and reflection. Inspired by recent research, science fiction, and trends from the technology industry, we created a couple dozen fictional products, interfaces, and webpages of biosensing technologies. These included smart camera-enabled neighborhood watch systems, advanced surveillance systems, implantable tracking devices, and non-contact remote sensors that detect people’s heartrates. This process is documented in a paper from Designing Interactive Systems. These were created as part of a self-reflective exercise, for us as design researchers to explore the problem space of privacy. However, we wanted to know how non-researchers, particularly technology practitioners, might discuss privacy in relation to these conceptual designs.

A note on how we’re approaching privacy and values.  Following other values in design work and privacy research, we want to avoid providing a single universalizing definition of privacy as a social value. We recognize privacy as inherently multiple – something that is situated and differs within different contexts and situations.

Our goal was to use our workbook as a way to elicit values reflections and discussion about privacy from our participants – rather than looking for “stakeholder values” to generate design requirements for privacy solutions. In other words, we were interested in how technologists-in-training would use privacy and other values to make sense of the designs.

Growing regulatory calls for “Privacy by Design” suggest that privacy should be embedded into all aspects of the design process, and at least partially done by designers and engineers. Because of this, the ability for technology professionals to surface, discuss, and address privacy and related values is vital. We wanted to know how people training for those jobs might use privacy to discuss their reactions to these designs. We conducted an interview study, recruiting 10 graduate students from a West Coast US University who are training to go into technology professions, most of whom had prior tech industry experience via prior jobs or internships. At the start of the interview, we gave them a physical copy of the designs and explained that the designs were conceptual, but didn’t tell them that the designs were initially made to think about privacy issues. In the following slides, I’ll show a few examples of the speculative design concepts we showed – you can see more of them in the paper. And then I’ll discuss the ways in which participants used values to make sense of or react to some of the designs.

Design examples

 

 This design depicts an imagined surveillance system for public spaces like airports that automatically assigns threat statuses to people by color-coding them. We intentionally left it ambiguous how the design makes its color-coding determinations to try to invite questions about how the system classifies people.


Conceptual TruWork design – “An integrated solution for your office or workplace!”

In our designs, we also began to iterate on ideas relating to tracking implants, and different types of social contexts they could be used in. Here’s a scenario advertising a workplace implantable tracking device called TruWork. Employers can subscribe to the service and make their employees implant these devices to keep track of their whereabouts and work activities to improve efficiency.


Conceptual CoupleTrack infographic depicting an implantable tracking chip for couples

We also re-imagined the implant as “coupletrack,” an implantable tracking chip for couples to use, as shown in this infographic.

Findings

We found that participants centered values in their discussions when looking at the designs – predominantly privacy, but also related values such as trust, fairness, security, and due process. We found eight themes of how participants interacted with the designs in ways that surfaced discussion of values, but I’ll highlight three here: Imagining the designs as real; seeing one’s self as multiple users; and seeing one’s self as a technology professional. The rest are discussed in more detail in the paper.

Imagining the Designs as Real


Conceptual product page for a small, hidden, wearable camera

Even though participants were aware that the designs were imagined, some of them treated the designs as seemingly real by thinking about long-term effects in the fictional world of the design. This design (pictured above) is an easily hideable, wearable, live-streaming HD camera. One participant imagined what could happen to social norms if these became widely adopted, saying “If anyone can do it, then the definition of wrong-doing would be questioned, would be scrutinized.” He suggests that previously unmonitored activities would become open for surveillance and tracking, like “are the nannies picking up my children at the right time or not? The definition of wrong-doing will be challenged”. Participants became actively involved in fleshing out and creating the worlds in which these designs might exist. This reflection is also interesting because it begins to consider some secondary implications of widespread adoption, highlighting potential changes in social norms with increasing data collection.

Seeing One’s Self as Multiple Users

Second, participants took multiple user subject positions in relation to the designs. One participant read the webpage for TruWork and laughed at the design’s claim to create a “happier, more efficient workplace,” saying, “This is again, positioned to the person who would be doing the tracking, not the person who would be tracked.”  She notes that the website is really aimed at the employer. She then imagines herself as an employee using the system, saying:

If I called in sick to work, it shouldn’t actually matter if I’m really sick. […] There’s lots of reasons why I might not wanna say, “This is why I’m not coming to work.” The idea that someone can check up on what I said—it’s not fair.

This participant put herself in the position of both an employer using the system and an employee subject to it, bringing up issues of workplace surveillance and fairness. This allowed participants to see the values implications of the designs from different subject positions or stakeholder viewpoints.

Seeing One’s Self as a Technology Professional

Third, participants also looked at the designs through the lens of being a technology practitioner, relating the designs to their own professional practices. Looking at the design that automatically flags and detects supposedly suspicious people, one participant reflected on his self-identification as a data scientist and the values implications of predicting criminal behavior with data when he said:

the creepy thing, the bad thing is, like—and I am a data scientist, so it’s probably bad for me too, but—the data science is predicting, like Minority Report… [and then half-jokingly says] …Basically, you don’t hire data scientists.

Here he began to reflect on how his practices as a data scientist might be implicated in this product’s creepiness – that his initial propensity to use the data to predict whether subjects are criminals might not be a good way to approach this problem, and might have implications for due process.

Another participant compared the CoupleTrack design to a project he was working on. He said:

[CoupleTrack] is very similar to our idea. […] except ours is not embedded in your skin. It’s like an IOT charm which people [in relationships] carry around. […] It’s voluntary, and that makes all the difference. You can choose to keep it or not to keep it.

In comparing the fictional CoupleTrack product to the product he’s working on in his own technical practice, the value of consent, and how one might revoke consent, became very clear to this participant. Again, we thought it was compelling that the designs led some participants to begin reflecting on the privacy implications in their own technical practices.

Reflections and Takeaways

Given the workbooks’ ability to help elicit reflections on and discussion of privacy in multiple ways, we see this approach as useful for future Values in Design and Privacy by Design work.

The speculative workbooks helped open up discussions about values, similar to some of what Katie Shilton identifies as “values levers,” activities that foreground values, and cause them to be viewed as relevant and useful to design. Participants’ seeing themselves as users to reflect on privacy harms is similar to prior work showing how self-testing can lead to discussion of values. Participants looking at the designs from multiple subject positions evokes value sensitive design’s foregrounding of multiple stakeholder perspectives. Participants reflected on the designs both from stakeholder subject positions and through the lenses of their professional practices as technology practitioners in training.

While Shilton identifies a range of people who might surface values discussions, we see the workbook as an actor to help surface values discussions. By depicting some provocative designs that raised some visceral and affective reactions, the workbooks brought attention to questions about potential sociotechnical configurations of biosensing technologies. Future values in design work might consider creating and sharing speculative design workbooks for eliciting values reflections with experts and technology practitioners.

More specifically, with this project’s focus on privacy, we think that this approach might be useful for “Privacy by Design”, particularly for technologists trying to surface discussions about the nature of the privacy problem at play for an emerging technology. We analyzed participants’ responses using Mulligan et al’s privacy analytic framework. The paper discusses this in more detail, but the important thing is that participants went beyond just saying privacy and other values are important to think about. They began to grapple with specific, situated, and contextual aspects of privacy – such as considering different ways to consent to data collection, or noting different types of harms that might emerge when the same technology is used in a workplace setting compared to an intimate relationship. Privacy professionals are looking for tools to help them “look around corners,” to help understand what new types of problems related to privacy might occur in emerging technologies and contexts. This provides a potential new tool for privacy professionals in addition to many of the current top-down, checklist approaches–which assume that the concepts of privacy at play are well known in advance. Speculative design practices can be particularly useful here – not to predict the future, but in helping to open and explore the space of possibilities.

Thank you to my collaborators, our participants, and the anonymous reviewers.

Paper citation: Richmond Y. Wong, Deirdre K. Mulligan, Ellen Van Wyk, James Pierce, and John Chuang. 2017. Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 111 (December 2017), 26 pages. DOI: https://doi.org/10.1145/3134746

by Richmond at November 11, 2018 11:06 PM

November 07, 2018

Ph.D. student

the resilience of agonistic control centers of global trade

This post is merely notes; I’m fairly confident that I don’t know what I’m writing about. However, I want to learn more. Please recommend anything that could fill me in about this! I owe most of this to discussion with a colleague who I’m not sure would like to be acknowledged.

Following the logic of James Beniger, an increasingly integrated global economy requires more points of information integration and control.

Bourgeois (in the sense of ‘capitalist’) legal institutions exist precisely for the purpose of arbitrating between merchants.

Hence, on the one hand we would expect international trade law to be Habermasian. However, international trade need not rest on a foundation of German idealism (which increasingly strikes me as the core of European law). Rather, it is an evolved mechanism.

A key part of this mechanism, as I’ve heard, is that it is decentered. Multiple countries compete to be the sites of transnational arbitration, much like multiple nations compete to be tax havens. Sovereignty and discretion are factors of production in the economy of control.

This means, effectively, that one cannot defeat capitalism by chopping off its head. It is rather much more like a hydra: the “heads” are the creation of two-sided markets. These heads have no internalized sense of the public good. Rather, they are optimized to be attractive to the transnational corporations in bilateral negotiation. The plaintiffs and defendants in these cases are corporations and states–social forms and institutions of complexity far beyond that of any individual person. This is where, so to speak, the AIs clash.

by Sebastian Benthall at November 07, 2018 01:38 PM

October 31, 2018

Ph.D. student

Best Practices Team Challenges

By Stuart Geiger and Dan Sholler, based on a conversation with Aaron Culich, Ciera Martinez, Fernando Hoces, Francois Lanusse, Kellie Ottoboni, Marla Stuart, Maryam Vareth, Sara Stoudt, and Stéfan van der Walt. This post first appeared on the BIDS Blog.

This post is a summary of the first BIDS Best Practices lunch, in which we bring people together from across the Berkeley campus and beyond to discuss a particular challenge or issue in doing data-intensive research. The goal of the series is to informally share experiences and ideas on how to do data science well (or at least better) from many disciplines and contexts. The topic for this week was doing data-intensive research in teams, labs, and other groups. For this first meeting, we focused on just identifying and diagnosing the many different kinds of challenges. In future meetings, we will dive deeper into some of these specific issues and try to identify best practices for dealing with them.

We began planning for this series by reviewing many of the published papers and series around “best practices” in scientific computing (e.g. Wilson et al, 2014), “good enough practices” (Wilson et al, 2017) and PLOS Computational Biology’s “ten simple rules” series (e.g. Sandve et al, 2013; Goodman et al, 2014). We also see this series as an intellectual successor to the collection of case studies in reproducible research published by several BIDS fellows (Kitzes, Turek, and Deniz, 2018). One reason we chose to focus on issues with doing data science in teams and groups is that many of us felt we understood how to practice data-intensive research well individually, but struggled with how to do so in teams and groups.

Compute and data challenges

Getting on the same stack

Some of the major challenges in doing data-intensive research in teams are around technology use, particularly using the same tools. Today’s computational researchers have an overwhelming number of options to choose from in terms of programming languages, software libraries, data formats, operating systems, compute infrastructures, version control systems, collaboration platforms, and more. One of the major challenges we discussed was that members of a team have often been trained to work with different technologies, which also often come with their own ways of working on a problem. Getting everyone on the same technical stack often takes far more time than anticipated, and new members can spend much time learning to work in a new stack.

One of the biggest divides our group had experienced was in the choice of programming languages, as many of us were more comfortable with either R or Python. These programming languages have their own extensive software libraries, like the tidyverse vs. the numpy/pandas/matplotlib stack. There are also many different software environments to choose from at various layers of the stack, from development environments like Jupyter notebooks versus RStudio and RMarkdown to the many options for package and dependency management. While most of the people in the room were committed to open source languages and environments, many people are trained to use proprietary software like MATLAB or SPSS, which raises an additional challenge in teams and groups.

Another major issue is where the actual computing and data storage will take place. Members of a team often come in knowing how to run code on their own laptops, but there are many options for groups to work, including a lab’s own shared physical server, campus clusters, national grid/supercomputer infrastructures, corporate cloud services, and more.

Workflow and pipeline management

Getting everyone to use an interoperable software and hardware environment is as much of a social challenge as it is a technical one, and we had a great discussion about whether a group leader should (or could) require members to use the same language, environment, or infrastructure. One of the technical solutions to this issue — working in staged data analysis pipelines — comes with its own set of challenges. With staged pipelines, data processing and analysis tasks are separated into modular tasks that an individual can solve in their own way, then output their work to a standardized file for the next stage of the pipeline to take as input.

The ideal end goal is often imagined to be a fully-automated (or ‘one click’) data processing and analysis pipeline, but this is difficult to achieve and maintain in practice. Several people in our group said they personally spend substantial amounts of time setting up these pipelines and making sure that each person’s piece works with everyone else’s. Even in groups that had formalized detailed data management plans, a common theme was that someone had to constantly make sure that team members were actually following these standards so that the pipeline kept running.
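
To make the staged-pipeline idea concrete, here is a minimal sketch in Python (the file names, stages, and pandas-based implementation are hypothetical illustrations, not any particular group’s setup): each stage reads a standardized file written by the previous stage and writes one for the next, so individual members can implement their own stage however they like.

    # Minimal staged-pipeline sketch; file names and stages are hypothetical.
    import pandas as pd

    def clean(raw_path="data/raw.csv", out_path="data/clean.csv"):
        """Stage 1: one team member owns cleaning; the output format is the team's contract."""
        df = pd.read_csv(raw_path)
        df = df.dropna()  # whatever cleaning rules this stage's owner chooses
        df.to_csv(out_path, index=False)
        return out_path

    def summarize(clean_path="data/clean.csv", out_path="data/summary.csv"):
        """Stage 2: another member consumes only the standardized clean file."""
        df = pd.read_csv(clean_path)
        df.describe().to_csv(out_path)
        return out_path

    if __name__ == "__main__":
        # A 'one click' run of the whole pipeline; each stage can also be run alone.
        summarize(clean())

The point of the sketch is the interface: as long as the intermediate files keep their agreed format, each stage can be rewritten in its owner’s preferred tools without touching the rest of the pipeline.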

External handoffs to and from the team

Many of the research projects we discussed involved not only handoffs between members of the team, but also handoffs between the team and external groups. The “raw” data a team begins with is often the final output of another research team, government agency, or company. In these cases, our group discussed issues that ranged from technical to social, from data formats that are technically difficult to integrate at scale (like Excel spreadsheets) to not having adequate documentation to be able to interpret what the data actually means. Similarly, teams often must deliver data to external partners, who may have very different needs, expectations, and standards than the team has for itself. Finally, some teams have sensitive data privacy issues and requirements, which makes collaboration even more difficult. How can these external relationships be managed in mutually beneficial ways?
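
As one illustration of managing an incoming handoff, here is a minimal sketch (the file name, sheet, and expected columns are hypothetical) of checking an external Excel delivery against the schema the team expects before it enters the pipeline; reading .xlsx files with pandas requires the openpyxl package.

    # Minimal handoff-validation sketch; file name and expected columns are hypothetical.
    import pandas as pd

    EXPECTED_COLUMNS = {"site_id", "date", "measurement"}

    def load_external_handoff(path="partner_delivery.xlsx", sheet_name=0):
        df = pd.read_excel(path, sheet_name=sheet_name)  # requires openpyxl for .xlsx
        missing = EXPECTED_COLUMNS - set(df.columns)
        if missing:
            # Fail loudly at the boundary so undocumented format changes are
            # caught before they propagate into the analysis.
            raise ValueError(f"handoff is missing expected columns: {sorted(missing)}")
        return df

Checks like this do not solve the social side of the relationship, but they make the team’s expectations explicit and surface surprises at the point of handoff rather than deep inside the analysis.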

Team management challenges

Beyond technical challenges, a number of management issues face research groups aspiring to implement best practices for data-intensive research. Our discussion highlighted the difficulties of composing a well-balanced team, of dealing with fluid membership, and of fostering generative coordination and communication among group members.

Composing a well-balanced team

Data-intensive research groups require a team with varied expertise. A consequence of varied expertise is varied capabilities and end goals, so project leads must devote attention to managing team composition. Whereas one or two members might be capable of carrying out tasks across the various stages of research, others might specialize in a particular area. How, then, can research groups ensure that the departure of any one member would not collapse the project, and that the team holds the necessary expertise to accomplish the shared research goal? Furthermore, some members may participate simply to acquire skills, while others seek to establish or build an academic track record. How might groups achieve alignment between personal and team goals?

Dealing with voluntary and fluid membership

A practical management problem also relates to the quasi-voluntary and fluid nature of research groups. Research groups rely extensively on students and postdocs, with the expectation that they join the team temporarily to gain new skills and experience, then leave. Turnover becomes a problem when processes, practices, and tacit institutional knowledge are difficult to standardize or document. What strategies might project leads employ to alleviate the difficulties associated with voluntary, fluid membership?

Fostering coordination and communication

The issues of team composition and voluntary or fluid membership raise a third challenge: fostering open communication among group members. Previous research and guidelines for managing teams (Edmondson, 1999; Google re:Work, 2017) emphasize the vital role of psychological safety in ensuring that team members share knowledge and collaborate effectively. Adequate psychological safety ensures that team members are comfortable speaking up about their ideas and welcoming of others’ feedback. Yet fostering psychological safety is a difficult task when research groups comprise members with various levels of expertise, career experience, and, increasingly, communities of practice (as in the case of data scientists working with domain experts). How can projects establish avenues for open communication between diverse members?

Not abandoning best practices when deadlines loom

One of the major issues that resonated across our group was the tendency for a team to stop following various best practices when deadlines rapidly approach. In the rush to do everything that is needed to get a publication submitted, it is easy to accrue what software engineers call “technical debt.” For example, substantial “collaboration debt” or “reproducibility debt” can be foisted on a team when a member works outside of the established workflow to produce a figure or fails to document their changes to analysis code. These stressful moments can also be difficult for the team’s psychological safety, particularly if there is an expectation to work late hours to make the deadline.

Concluding thoughts and plans

Are there universal best practices for all cases and contexts?

At the conclusion of our first substantive meeting, we began to evaluate topics for future discussions that might help us identify potential solutions to the challenges faced by data-intensive research groups. In doing so, we were quickly confronted with the diversity of technologies, research agendas, disciplinary norms, team compositions, governance structures, and other factors that characterize scientific research groups. Are solutions that work for large teams appropriate for smaller teams? Do cross-institutional or inter-disciplinary teams face different problems than those working in the same institution or discipline? Are solutions that work in astronomy or physics appropriate for ecology or the social sciences? Dealing with such diversity and contextuality, then, might require adjusting our line of inquiry to the following question: At what level should we attempt to generalize best practices?

Our future plans

The differences within and between research groups are meaningful and deserve adequate attention, but commonalities do exist. This semester, our group will aggregate and develop input from a diverse community of practitioners to construct sets of thoughtful, grounded recommendations. For example, we’ll aim to provide recommendations on issues such as how to build and maintain pipelines and workflows, as well as strategies for achieving diversity and inclusion in teams. In our next post, we’ll offer some insights on how to manage the common problem of perpetual turnover in team membership. On all topics, we welcome feedback and recommendations.

Combatting impostor syndrome

Finally, many people who attended told us afterwards how positive and valuable it was to share these kinds of issues and experiences, particularly for combatting the “impostor syndrome” that many of us often feel. We typically only present the final end-product of research. Even sharing one’s final code and data in perfectly reproducible pipelines can still hide all the messy, complex, and challenging work that goes into the research process. People deeply appreciated hearing others talk openly about the difficulties and challenges that come with doing data-intensive research and how they tried to deal with them. The format of sharing challenges followed by strategies for dealing with those challenges may be a meta-level best practice for this kind of work, versus the more standard approach of listing more abstract rules and principles. Through these kinds of conversations, we hope to continue to shed light on the doing of data science in ways that will be constructive and generative across the many fields, areas, and contexts in which we all work.

by R. Stuart Geiger at October 31, 2018 07:00 AM

October 23, 2018

Ph.D. student

For a more ethical Silicon Valley, we need a wiser economics of data

Kara Swisher’s NYT op-ed about the dubious ethics of Silicon Valley and Nitasha Tiku’s WIRED article reviewing books with alternative (and perhaps more cynical than otherwise stated) stories about the rise of Silicon Valley have generated discussion and buzz among the tech commentariat.

One point of debate is whether the focus should be on “ethics” or on something more substantively defined, such as human rights. Another point is whether the emphasis should be on “ethics” or on something more substantively enforced, like laws which impose penalties of up to 4% of global annual turnover, referring of course to the GDPR.

While I’m sympathetic to the European approach (laws enforcing human rights with real teeth), I think there is something naive about it. We have not yet seen whether it’s ever really possible to comply with the GDPR; it could wind up being a kind of heavy tax on Big Tech companies operating in the EU, but one that doesn’t truly change how people’s data are used. In any case, the broad principles of European privacy are based on individual human dignity, and so they do not take into account the ways that corporations are social structures, i.e. sociotechnical organizations that transcend individual people. The European regulations address the problem of individual privacy while leaving mystified the question of why the current corporate organization of the world’s personal information is what it is. This sets up the fight over ‘technology ethics’ to be a political conflict between different kinds of actors whose positions are defined as much by their social habitus as by their intellectual reasons.

My own (unpopular!) view is that the solution to our problems of technology ethics is going to have to rely on a better adapted technology economics. We often forget today that economics was originally a branch of moral philosophy. Adam Smith wrote The Theory of Moral Sentiments (1759) before An Inquiry into the Nature and Causes of the Wealth of Nations (1776). Since then the main purpose of economics has been to intellectually grasp the major changes to society due to production, trade, markets, and so on in order to better steer policy and business strategy towards more fruitful equilibria. The discipline has a bad reputation among many “critical” scholars due to its role in supporting neoliberal ideology and policies, but it must be noted that this ideology and policy work is not entirely cynical; it was a successful centrist hegemony for some time. Now that it is under threat, partly due to the successes of the big tech companies that benefited under its regime, it’s worth considering what new lessons we have to learn to steer the economy in an improved direction.

The difference between an economic approach to the problems of the tech economy and either an ‘ethics’ or a ‘law’ based approach is that it inherently acknowledges that there are a wide variety of strategic actors co-creating social outcomes. Individual “ethics” will not be able to settle the outcomes of the economy because the outcomes depend on collective and uncoordinated actions. A fundamentally decent person may still do harm to others due to their own bounded rationality; “the road to hell is paved with good intentions”. Meanwhile, regulatory law is not the same as command; it is at best a way of setting the rules of a game that will be played, faithfully or not, by many others. Putting regulations in place without a good sense of how the game will play out differently because of them is just as irresponsible as implementing a sweeping business practice without thinking through the results, if not more so because the relationship between the state and citizens is coercive, not voluntary as the relationship between businesses and customers is.

Perhaps the biggest obstacle to shifting the debate about technology ethics to one about technology economics is that it requires a change in register. It drains the conversation of the pathos which is so instrumental in surfacing it as an important political topic. Sound analysis often ruins parties like this. Nevertheless, it must be done if we are to progress towards a more just solution to the crises technology gives us today.

by Sebastian Benthall at October 23, 2018 03:04 PM

October 17, 2018

Ph.D. student

Engaging Technologists to Reflect on Privacy Using Design Workbooks

This post summarizes a research paper, Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks, co-authored with Deirdre Mulligan, Ellen Van Wyk, John Chuang, and James Pierce. The paper will be presented at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW) on Monday November 5th (in the afternoon Privacy in Social Media session). Full paper available here.

Recent wearable and sensing devices, such as Google Glass, Strava, and internet-connected toys, have raised questions about ways in which privacy and other social values might be implicated by their development, use, and adoption. At the same time, legal, policy, and technical advocates for “privacy by design” have suggested that privacy should be embedded into all aspects of the design process, rather than being addressed after a product is released, or being treated as just a legal issue. If privacy is to be addressed through technical design processes, the ability for technology professionals to surface, discuss, and address privacy and other social values becomes vital.

Companies and technologists already use a range of tools and practices to help address privacy, including privacy engineering practices and making privacy policies more readable and usable. But many existing privacy mitigation tools are either deductive or assume that privacy problems are already known and well-defined in advance. However, we often don’t have privacy concerns well-conceptualized in advance when creating systems. Our research shows that design approaches (drawing on a set of techniques called speculative design and design fiction) can help better explore, define, and perhaps even anticipate what we mean by “privacy” in a given situation. Rather than trying to look at a single, abstract, universal definition of privacy, these methods help us think about privacy as relations among people, technologies, and institutions in different types of contexts and situations.

Creating Design Workbooks

We created a set of design workbooks — collections of design proposals or conceptual designs, drawn together to allow designers to investigate, explore, reflect on, and expand a design space. We drew on speculative design practices: in brief, our goal was to create a set of slightly provocative conceptual designs to help engage people in reflections or discussions about privacy (rather than propose specific solutions to problems posed by privacy).

A set of sketches that comprise the design workbook

Inspired by science fiction, technology research, and trends from the technology industry, we created a couple dozen fictional products, interfaces, and webpages of biosensing technologies, or technologies that sense people. These included smart camera-enabled neighborhood watch systems, advanced surveillance systems, implantable tracking devices, and non-contact remote sensors that detect people’s heartrates. In earlier design work, we reflected on how putting the same technologies in different types of situations, scenarios, and social contexts would vary the types of privacy concerns that emerged (such as the different types of privacy concerns that would emerge if advanced miniature cameras were used by the police, by political advocates, or by the general public). However, we wanted to see how non-researchers might react to and discuss the conceptual designs.

How Did Technologists-In-Training View the Designs?

Through a series of interviews, we shared our workbook of designs with masters students in an information technology program who were training to go into the tech industry. We found several ways in which they brought up privacy-related issues while interacting with the workbooks, and highlight three of those ways here.

TruWork — A product webpage for a fictional system that uses an implanted chip allowing employers to keep track of employees’ location, activities, and health, 24/7.

First, our interviewees discussed privacy by taking on multiple user subject positions in relation to the designs. For instance, one participant looked at the fictional TruWork workplace implant design by imagining herself in the positions of an employer using the system and an employee using the system, noting how the product’s claim of creating a “happier, more efficient workplace,” was a value proposition aimed at the employer rather than the employee. While the system promises to tell employers whether or not their employees are lying about why they need a sick day, the participant noted that there might be many reasons why an employee might need to take a sick day, and those reasons should be private from their employer. These reflections are valuable, as prior work has documented how considering the viewpoints of direct and indirect stakeholders is important for considering social values in design practices.

CoupleTrack — an advertising graphic for a fictional system that uses an implanted chip, which people in a relationship wear in order to keep track of each other’s location and activities.

A second way privacy reflections emerged was when participants discussed the designs in relation to their professional technical practices. One participant compared the fictional CoupleTrack implant to a wearable device for couples that he was building, in order to discuss different ways in which consent to data collection can be obtained and revoked. CoupleTrack’s embedded nature makes it much more difficult to revoke consent, while a wearable device can be more easily removed. This is useful because we’re looking for ways workbooks of speculative designs can help technologists discuss privacy in ways that they can relate back to their own technical practices.

Airport Tracking System — a sketch of an interface for a fictional system that automatically detects and flags “suspicious people” by color-coding people in surveillance camera footage.

A third theme that we found was that participants discussed and compared multiple ways in which a design could be configured or implemented. Our designs tend to describe products’ functions but do not specify technical implementation details, allowing participants to imagine multiple implementations. For example, a participant looking at the fictional automatic airport tracking and flagging system discussed the privacy implication of two possible implementations: one where the system only identifies and flags people with a prior criminal history (which might create extra burdens for people who have already served their time for a crime and have been released from prison); and one where the system uses behavioral predictors to try to identify “suspicious” behavior (which might go against a notion of “innocent until proven guilty”). The designs were useful at provoking conversations about the privacy and values implications of different design decisions.

Thinking About Privacy and Social Values Implications of Technologies

This work provides a case study showing how design workbooks and speculative design can be useful for thinking about the social values implications of technology, particularly privacy. In the time since we’ve made these designs, some (sometimes eerily) similar technologies have been developed or released, such as workers at a Swedish company embedding RFID chips in their hands, or Logitech’s Circle Camera.

But our design work isn’t meant to predict the future. Instead, what we tried to do is take some technologies that are emerging or on the near horizon, and think seriously about ways in which they might get adopted, or used and misused, or interact with existing social systems — such as the workplace, or government surveillance, or school systems. How might privacy and other values be at stake in those contexts and situations? We aim for these designs to help shed light on the space of possibilities, in an effort to help technologists make more socially informed design decisions in the present.

We find it compelling that our design workbooks helped technologists-in-training discuss emerging technologies in relation to everyday, situated contexts. These workbooks don’t depict far-off speculative science fiction with flying cars and spaceships. Rather, they imagine future uses of technologies by having someone look at a product website, an amazon.com page, or an interface, and think about the real and diverse ways in which people might experience those technology products. Using these techniques that focus on the potential adoptions and uses of emerging technologies in everyday contexts helps raise issues which might not be immediately obvious if we only think about positive social implications of technologies, and it also helps surface issues that we might not see if we only think about social implications of technologies in terms of “worst case scenarios” or dystopias.

Paper Citation:

Richmond Y. Wong, Deirdre K. Mulligan, Ellen Van Wyk, James Pierce, and John Chuang. 2017. Eliciting Values Reflections by Engaging Privacy Futures Using Design Workbooks. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 111 (December 2017), 26 pages. DOI: https://doi.org/10.1145/3134746


This post is crossposted with the ACM CSCW Blog

by Richmond at October 17, 2018 04:40 PM

October 15, 2018

Ph.D. student

Privacy of practicing high-level martial artists (BJJ, CI)

Continuing my somewhat lazy “ethnographic” study of Brazilian Jiu Jitsu: an occurrence the other day illustrates something about BJJ that reflects privacy as contextual integrity.

Spencer (2016) has accounted for the changes in martial arts culture, and especially Brazilian Jiu Jitsu, due to the proliferation of video on-line. Social media is now a major vector for skill acquisition in BJJ. It is also, in my gym, part of the social experience. A few dedicated accounts on social media platforms share images and video from practice. There is a group chat where gym members cheer each other on, share BJJ culture (memes, tips), and communicate with the instructors.

Several members have been taking pictures and videos of others in practice and sharing them to the group chat. These are generally met with enthusiastic acclaim and acceptance. The instructors have also been inviting in very experienced (black belt) players for one-off classes. These classes are opportunities for the less experienced folks to see another perspective on the game. Because it is a complex sport, there are a wide variety of styles and in general it is exciting and beneficial to see moves and attitudes of masters besides the ones we normally train with.

After some videos of a new guest instructor were posted to the group chat, one of the permanent instructors (“A”) asked that this not be done:

A: “As a general rule of etiquette, you need permission from a black belt and esp if two black belts are rolling to record them training, be it drilling not [sic] rolling live.”

A: “Whether you post it somewhere or not, you need permission from both to record then [sic] training.”

B: “Heard”

C: “That’s totally fine by me, but im not really sure why…?”

B: “I’m thinking it’s a respect thing.”

A: “Black belt may not want footage of him rolling or training. as a general rule if two black belts are training together it’s not to be recorded unless expressly asked. if they’re teaching, that’s how they pay their bills so you need permission to record them teaching. So either way, you need permission to record a black belt.”

A: “I’m just clarifying for everyone in class on etiquette, and for visiting other schools. Unless told by X, Y, [other gym staff], etc., or given permission at a school you’re visiting, you’re not to record black belts and visiting upper belts while rolling and potentially even just regular training or class. Some schools take it very seriously.”

C: “OK! Totally fine!”

D: “[thumbs up emoji] gots it :)”

D: “totally makes sense”

A few observations on this exchange.

First, there is the intriguing point that for martial arts black belts teaching, their instruction is part of their livelihood. The knowledge of the expert martial arts practitioner is hard-earned and valuable “intellectual property”, and it is exchanged through being observed. Training at a gym with high-rank players is a privilege that lower ranks pay for. The use of video recording has changed the economy of martial arts training. This has in many ways opened up the sport; it also opens up potential opportunities for the black belt in producing training videos.

Second, this is framed as etiquette, not as a legal obligation. I’m not sure what the law would say about recordings in this case. It’s interesting that, as a point of etiquette, it applies only to videos of high belt players. Recording low belt players doesn’t seem to be a problem according to the agreement in the discussion. (I personally asked not to be recorded at one point at the gym, when an instructor explicitly asked to record in order to create demo videos. This was out of embarrassment at my own poor skills; I was also feeling badly because I was injured at the time. This sort of consideration does not, it seems, currently operate as privacy etiquette within the BJJ community. Perhaps these norms are currently being negotiated or are otherwise in flux.)

Third, there is a sense in which high rank in BJJ comes with authority and privileges that do not require any justification. The “trainings are livelihood” argument does not apply directly to general practice rolls, so the argument is not airtight. There is something else about the authority and gravitas of the black belt that is being preserved here. There is a sense of earned respect. Somehow this translates into a different form of privacy (information flow) norm.

References

Spencer, D. C. (2016). From many masters to many Students: YouTube, Brazilian Jiu Jitsu, and communities of practice. Jomec Journal, (5).

by Sebastian Benthall at October 15, 2018 05:11 PM

September 27, 2018

Center for Technology, Society & Policy

CTSP Alumni Updates

We’re thrilled to highlight some recent updates from our fellows:

Gracen Brilmyer, now a PhD student at UCLA, has published a single-authored work in one of the leading journals in archival studies, Archival Science: “Archival Assemblages: Applying Disability Studies’ Political/Relational Model to Archival Description”. They have presented their work on archives, disability, and justice at a number of events over the past two years, including The Archival Education and Research Initiative (AERI), the Allied Media Conference, the International Communications Association (ICA) Preconference, and Disability as Spectacle, and their research will be presented at the upcoming Community Informatics Research Network (CIRN) conference.

CTSP Funded Project 2016: Vision Archive


Originating in the 2017 project “Assessing Race and Income Disparities in Crowdsourced Safety Data Collection” done by Fellows Kate Beck, Aditya Medury, and Jesus Barajas, the Safe Transportation and Research Center will launch a new project, Street Story, in October 2018. Street Story is an online platform that allows community groups and agencies to collect community input about transportation collisions, near-misses, general hazards and safe locations to travel. The platform will be available throughout California and is funded through the California Office of Traffic Safety.

CTSP Funded Project 2017: Assessing Race and Income Disparities in Crowdsourced Safety Data Collection


Fellow Roel Dobbe has begun a postdoctoral scholar position at the new AI Now Institute. Inspired by his 2018 CTSP project, he has co-authored a position paper with Sarah Dean, Tom Gilbert and Nitin Kohli titled A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics.

CTSP Funded Project 2018: Unpacking the Black Box of Machine Learning Processes


We are also looking forward to a CTSP Fellow-filled Computer Supported Cooperative Work (CSCW) conference in November this year! CTSP-affiliated papers include:

We also look forward to seeing CTSP affiliates presenting other work, including 2018 Fellows Richmond Wong, Noura Howell, Sarah Fox, and more!

 

by Anne Jonas at September 27, 2018 10:40 PM

September 25, 2018

Center for Technology, Society & Policy

October 25th: Digital Security Crash Course

Thursday, October 25, 5-7pm, followed by reception

UC Berkeley, South Hall Room 210

Open to the public!

RSVP is required.

Understanding how to protect your personal digital security is more important than ever. Confused about two-factor authentication options? Which messaging app is the most secure? What happens if you forget your password manager password, or lose the phone you use for two-factor authentication? How do you keep your private material from being shared or stolen? And how do you help your friends and family consider the potential dangers and work to prevent harm, especially given increased threats to vulnerable communities and unprecedented data breaches?

Whether you are concerned about snooping family and friends, bullies and exes who are out to hack and harass you, thieves who want to impersonate you and steal your funds, or government and corporate spying, we can help you with this fun, straightforward training in how to protect your information and communications.

Join us for a couple hours of discussion and hands-on set-up. We’ll go over various scenarios you might want to protect against, talk about good tools and best practices, and explore trade-offs between usability and security. This training is designed for people at all levels of expertise, and for those who want both personal and professional digital security protection.

Refreshments and hardware keys provided! Bring your laptop or other digital device. Take home a hardware key and better digital security practices.

This crash course is sponsored by the Center for Technology, Society & Policy and generously funded by the Charles Koch Foundation. Jessy Irwin will be our facilitator and guide. Jessy is Head of Security at Tendermint, where she excels at translating complex cybersecurity problems into relatable terms, and is responsible for developing, maintaining and delivering comprehensive security strategy that supports and enables the needs of her organization and its people. Prior to her role at Tendermint, she worked to solve security obstacles for non-expert users as a strategic advisor, security executive and former Security Empress at 1Password. She regularly writes and presents about human-centric security, and believes that people should not have to become experts in technology, security or privacy to be safe online.

RSVP here!

by Daniel Griffin at September 25, 2018 05:25 PM

September 09, 2018

Ph.D. student

Brazilian Jiu Jitsu (BJJ) and the sociology of martial knowledge

Maybe 15 months ago, I started training in Brazilian Jiu Jitsu (BJJ), a martial art that focuses on grappling and ground-fighting. Matches are won through points based on position (e.g., “mount”, where you are sitting on somebody else) and through submission, when a player taps out due to hyperextension under a joint lock or asphyxiation by choking. I recommend it heartily to anybody as a fascinating, smart workout that also has a vibrant and supportive community around it.

One of the impressive aspects of BJJ, which differentiates it from many other martial arts, is its emphasis on live drilling and sparring (“rolling”), which can take up a third or more of a training session. In the context of sparring, there is opportunity for experimentation and rapid feedback about technique. In addition to being good fun and practice, regular sparring continually reaffirms the hierarchical ranking of skill. As in some other martial arts, rank is awarded as different colored “belts”–white, blue, purple, brown, black. Intermediary progress is given as “stripes” on the belt. White belts can spar with higher belts; more often than not, when they do so they get submitted.

BJJ also has tournaments, which allow players from different dojos to compete against each other. I attended my first tournament in August and thought it was a great experience. There is nothing like meeting a stranger for the first time and then engaging them in single combat to kindle a profound respect for the value of sportsmanship. Off the mat, I’ve had some of the most courteous encounters I have ever had in New York City.

At tournaments, hundreds of contestants are divided into brackets. The brackets are determined by belt (white, blue, etc.), weight (up to 155 lbs, up to 170 lbs, etc.), sex (men and women), and age (kids age groups, adult, 30+ adult). There is an “absolute” bracket for those who would rise above the division of weight classes. There are “gi” and “no gi” variants of BJJ; the former requires wearing special uniform of jacket and pants, which are used in many techniques.

Overall, it is an efficient system for training a skill.


The few readers of this blog will recall that for some time I studied the sociology of science and engineering, especially through the lens of Bourdieu’s Science of Science and Reflexivity. This was in turn a reaction to a somewhat startling exposure to the sociology of science and education, an intellectual encounter that I never intended to have. I have been interested for a long time in the foundations of science. It was a rude shock, and one that I mostly regret, to have gone to grad school to become a better data scientist and find myself having to engage with the work of Bruno Latour. I did not know how to respond intellectually to the attack on scientific legitimacy on the basis that its self-understanding is insufficiently sociological until encountering Bourdieu, who refuted the Latourian critique and provides a clear-sighted view of how social structure undergirds scientific objectivity, when it works. Better was my encounter with Jean Lave, who introduced me to more phenomenological methods for understanding education through her class and works (Chaiklin and Lave, 1996). This made me more aware of the role of apprenticeship as well as the nuances of culture, framing, context, and purpose in education. Had I not encountered this work, I would likely never have found my way to Contextual Integrity, which draws more abstract themes about privacy from such subtle observations.

Now it’s impossible for me to do something as productive and enjoyable as BJJ without considering it through these kinds of lenses. One day I would like to do more formal work along these lines, but as has been my habit I have a few notes to jot down at the moment.

The first point, which is a minor one, is that there is something objectively known by experienced BJJ players, and that this knowledge is quintessentially grounded in intersubjective experience. The sparring encounter is the site at which technique is tested and knowledge is confirmed. Sparring simulates conditions of a fight for survival; indeed, if a choke is allowed to progress, a combatant can lose consciousness on the mat. This recalls Hegel’s observation that it is in single combat that a human being is forced to see the limits of their own solipsism. When the Other can kill you, that is an Other that you must see as, in some sense, equivalent in metaphysical status to oneself. This is a sadly forgotten truth in almost every formal academic environment I’ve found myself in, and that, I would argue, is why there is so much bullshit in academia. But now I digress.

The second point, which is perhaps more significant, is that BJJ has figured out how to be an inclusive field of knowledge despite the pervasive and ongoing politics of what I have called, in another post, body agonism. We are at a point where political conflict in the United States and elsewhere seems to be at root about the fact that people have different kinds of bodies, and these differences are upsetting for liberalism. How can we have a functioning liberal society when, for example, some people have male bodies and other people have female bodies? It’s an absurd question, perhaps, but nevertheless it seems to be the question of the day. It is certainly a question that plagues academic politics.

BJJ provides a wealth of interesting case studies in how to deal productively with body agonism. BJJ is an unarmed martial art. The fact that there are different body types is an intrinsic aspect of the sport. Interestingly, in the dojo practices I've seen, trainings are co-ed and all body types (e.g., weight classes) train together. This leads to a dynamic and irregular practice environment that perhaps is better for teaching BJJ as a practical form of self-defense. Anecdotally, self-defense is an important motivation for why women especially are interested in BJJ, and in the context of a gym, sparring with men is a way to safely gain practical skill in defending against male assailants. On the other hand, as far as ranking progress is concerned, different bodies are considered in relation to other similar bodies through the tournament bracket system. While I know a badass 40-year-old who submitted two college kids in the last tournament, that was extra. For the purposes of measuring my improvement in the discipline, I will be in the 30+ men's bracket, compared with other guys approximately my weight. The general sense within the community is that progress in BJJ is a function of time spent practicing (something like the mantra that it takes 10,000 hours to master something), not any other intrinsic talent. Some people who are more dedicated to their training advance faster, and others advance slower.

Training in BJJ has been a positive experience for me, and I often wonder whether other social systems could be more like BJJ. There are important lessons to be learned from it, as it is a mental discipline, full of subtlety and intellectual play, in its own right.

References

Bourdieu, Pierre. Science of science and reflexivity. Polity, 2004.

Chaiklin, Seth, and Jean Lave, eds. Understanding practice: Perspectives on activity and context. Cambridge University Press, 1996.

by Sebastian Benthall at September 09, 2018 02:08 PM

September 08, 2018

Ph.D. student

On Hill’s work on ‘Greater Male Variability Hypothesis’ (GMVH)

I'm writing in response to Ted Hill's recent piece describing the acceptance and subsequent removal of a paper about the 'Greater Male Variability Hypothesis', the controversial idea that there is more variability in male intelligence than in female intelligence, i.e. "that there are more idiots and more geniuses among men than among women."

I have no reason to doubt Hill's account of events–his collaboration, the paper's acceptance at a journal, and the mysterious political barriers to publication–and I assume it for the purposes of this post. If the account is somehow refuted by later developments, I'll stand corrected.

The few of you who have followed this blog for some time will know that I’ve devoted some energy to understanding the controversy around gender and STEM. One post, criticizing how Donna Haraway, widely used in Science and Technology Studies, can be read as implying that women should not become ‘hard scientists’ in the mathematical mode, has gotten a lot of hits (and some pushback). Hill’s piece makes me revisit the issue.

The paper itself is quite dry and the following quote is its main thesis:

SELECTIVITY-VARIABILITY PRINCIPLE. In a species with two sexes A and B, both of which are needed for reproduction, suppose that sex A is relatively selective, i.e., will mate only with a top tier (less than half) of B candidates. Then from one generation to the next, among subpopulations of B with comparable average attributes, those with greater variability will tend to prevail over those with lesser variability. Conversely, if A is relatively non-selective, accepting all but a bottom fraction (less than half) of the opposite sex, then subpopulations of B with lesser variability will tend to prevail over those with comparable means and greater variability.

This mathematical thesis is supported in the paper by computational simulations and mathematical proofs. From this, one can get the GMVH if one assumes that: (a) (human) males are less selective in their choice of (human) females when choosing to mate, and (b) traits that drive variability in intelligence are intergenerationally heritable, whether biologically or culturally. While not uncontroversial, neither of these are crazy ideas. In fact, if they weren’t both widely accepted, then we wouldn’t be having this conversation.
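Hill's own support for the principle comes from proofs and simulations that I won't reproduce here, but the mechanism is easy to sketch. Below is a toy simulation in the spirit of the principle, written in Python; the population size, the selectivity cutoff, the normal distributions, and the rule that offspring inherit their parent's subpopulation are all illustrative assumptions of mine, not details taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate(generations=50, pop_size=10_000, top_fraction=0.4,
                 sd_high=1.5, sd_low=0.5):
        # Two subpopulations of sex B share a mean attribute of 0 but differ in
        # variability. Each generation, the selective sex A mates only with the
        # top `top_fraction` of B; offspring inherit their parent's subpopulation.
        # Returns the final share of the high-variability subpopulation.
        share_high = 0.5
        for _ in range(generations):
            n_high = int(pop_size * share_high)
            n_low = pop_size - n_high
            attrs = np.concatenate([rng.normal(0.0, sd_high, n_high),
                                    rng.normal(0.0, sd_low, n_low)])
            labels = np.concatenate([np.ones(n_high), np.zeros(n_low)])
            cutoff = np.quantile(attrs, 1 - top_fraction)  # A accepts only B above this
            share_high = labels[attrs >= cutoff].mean()    # next generation's mix
        return share_high

    print(simulate())                  # selective A: the more variable subpopulation takes over
    print(simulate(top_fraction=0.9))  # non-selective A: the less variable subpopulation takes over

With these made-up parameters the first call drifts towards 1 and the second towards 0, which is the qualitative pattern the principle describes; nothing here is empirically calibrated, and the sketch says nothing about assumptions (a) and (b).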

Is this the kind of result that should be published? This is the controversy. I am less interested in the truth or falsehood of broad implications of the mathematical work than I am in the arguments for why the mathematical work should not be published (in a mathematics journal).

As far as I can tell from Hill’s account and also from conversations and cultural osmosis on the matter, there are a number of reasons why research of this kind should not be published.

The first reason might be that there are errors in the mathematical or simulation work. In other words, the Selectivity-Variability Principle may be false, and falsely supported. If that is the case, then the reviewers should have rejected the paper on those grounds. However, the principle is intuitively plausible and the reviewers accepted it. Few of Hill's critics (though there were some) attacked the piece on mathematical grounds. Rather, the objections were of a social and political nature. I want to focus on these latter objections, though if there is a mathematical refutation of the Selectivity-Variability Principle that I'm not aware of, I'll stand corrected.

The crux of the problem seems to be this: the two assumptions (a) and (b) are both so plausible that publishing a defense of (c) the Selectivity-Variability Principle would imply (d) the Greater Male Variability Hypothesis (GMVH). And if GMVH is true, then (e) there is a reason why more of the celebrated high-end of the STEM professions are male. It is because at the high-end, we’re looking at the thin tails of the human distribution, and the male tail is longer. (It is also longer at the low end, but nobody cares about the low end.)

The argument goes that if this claim (e) were widely known by aspiring women in STEM fields, then they would be discouraged from pursuing these promising careers, because "women have a lesser chance to succeed in mathematics at the very top end", which would be a biased, sexist view. (e) could be used to defend the idea that (f) normatively, there's nothing wrong with men having most of the success at the top end of mathematics, though there is a big is/ought distinction there.

My concern with this argument is that it assumes, at its heart, the idea that women aspiring to be STEM professionals are emotionally vulnerable to being dissuaded by this kind of mathematical argument, even when it is neither an empirical case (it is a mathematical model, not empirically confirmed within the paper) nor a reflection on the capacity of any particular woman, and especially not after she has been selected for by the myriad social sorting mechanisms available. The argument that GMVH is professionally discouraging assumes many other hypotheses about human professional motivation, for example, the idea that it is only worth taking on a profession if one can expect to have a higher-than-average chance of achieving extremely high relative standing in that field. Given that extremely high relative standing in any field is going to be rare, it's hard to say this is a good motivation for any profession, for men or for women, in the long run. In general, those who extrapolate from population-level gender tendencies to individual cases are committing the ecological fallacy. It is ironic that, under the assumption of the critics, potential female entrants into STEM might be screened out precisely because of their inability to understand a mathematical abstraction, along with its limitations and questionable applicability, through a cloud of political tension. Whereas if one were really interested in teaching mathematics in an equitable way, that would require teaching the capacity to see through political tension to the precise form of a mathematical abstraction. That is precisely what top performance in STEM fields should be about, and it should be unflinchingly encouraged as part of the educational process for both men and women.

My point, really, is this: the argument that publishing and discussing GMVH is detrimental to the career aspirations of women, because of how individual women will internalize the result, depends on a host of sexist assumptions that are as pernicious as, if not more pernicious than, GMVH itself. It is based on the idea that women as a whole need special protection from mathematical ideas in order to pursue careers in mathematics, which is self-defeating crazy talk if I've ever heard it. The whole point of academic publication is to enable a debate of defeasible positions on their intellectual merits. In the case of mathematics research, the standards of merit are especially clear. If there's a problem with Hill's model, that's a great opportunity for another, better model, on a topic that is clearly politically and socially relevant. (If the reviewers ignored a lot of prior work that settled the scientific relevance of the question, then that's a different story. One gathers that is not what happened.)

As a caveat, there are other vectors through which GMVH could lead to bias against women pursuing STEM careers. For example, it could bias their less smart families or colleagues into believing less in their potential on the basis of their sex. But GMVH is about the variance, not the mean, of mathematical ability. So the only population that it’s relevant to is that in the very top tier of performers. That nuance is itself probably beyond the reach of most people who do not have at least some training in STEM, and indeed if somebody is reasoning from GMVH to an assumption about women’s competency in math then they are almost certainly conflating it with a dumber hypothesis about population means which is otherwise irrelevant.
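The variance-versus-mean nuance is easier to see with a number attached. Here is a purely hypothetical back-of-the-envelope calculation, with invented figures not taken from Hill or from any empirical study: two groups with identical means, one with a standard deviation just 10% larger.

    from math import erfc, sqrt

    def upper_tail(threshold, sd):
        # P(X > threshold) for X ~ Normal(mean=0, standard deviation=sd)
        return 0.5 * erfc(threshold / (sd * sqrt(2)))

    p_a = upper_tail(4.0, 1.00)   # group A: sd = 1.00
    p_b = upper_tail(4.0, 1.10)   # group B: sd = 1.10 (10% more variable)

    print(f"A: {p_a:.1e}  B: {p_b:.1e}  ratio B/A: {p_b / p_a:.1f}")

Four standard deviations out, the more variable group is over-represented by a factor of roughly four, while near the middle of the distribution the two groups look essentially the same. That is why the hypothesis, whatever its truth, licenses no inference about the typical member of either group.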

This is perhaps the most baffling thing about this debate: that it boils down to a very rarefied form of elite conflict. "Should a respected mathematics journal publish a paper that implies that there is greater variance in mathematical ability between sexes based on their selectivity and therefore…" is a sentence that already selects for a very small segment of the population, a population that should know better than to censor a mathematical proof rather than take it as an opportunity to educate people about STEM and why it is an interesting field. Nobody is objecting to the publication of support for GMVH on the grounds that it implies that more men are grossly incompetent and stupid than women, and it's worth considering why that is. If our first reaction to GMVH is "but does this mean no woman can ever be the very best?", we are showing that our concerns lie with who gets to be on top, not with the welfare of those at the bottom.

by Sebastian Benthall at September 08, 2018 09:06 PM

September 07, 2018

Ph.D. student

Note on Austin’s “Cyber Policy in China”: on the emphasis on ‘ethics’

Greg Austin's "Cyber Policy in China" (2014) was recommended to me as a good, recent work on the subject. I am not sure what I was expecting–something about facts and numbers, how companies are being regulated, etc. Just looking at the preface, it looks like this book is about something else.

The preface frames the book in the discourse, beginning in the 20th century, about the "information society". It explicitly mentions the UN's World Summit on the Information Society (WSIS) as a touchstone of international consensus about what the information society is: a society "where everyone can create, access, utilise and share information and knowledge" to "achieve their full potential" in "improving their quality of life". It is "people-centered".

In Chinese, the word for information society is xinxi shehui. (Please forgive me: I've got little to no understanding of the Chinese language, and that includes not knowing how to put the appropriate diacritics into transliterations of Chinese terms.) It is related to a term, "informatization" (xinxihua), that is compared to industrialization. It means "the historical process by which information technology is fully used, information resources are developed and utilized, the exchange of information and knowledge sharing are promoted, the quality of economic growth is improved, and the transformation of economic and social development is promoted". Austin's interesting point is that this is "less people-centered than the UN vision and more in the mould of the materialist and technocratic traditions that Chinese Communists have preferred."

This is an interesting statement on the difference between policy articulations by the United Nations and the CCP. It does not come as a surprise.

What did come as a surprise is how Austin chooses to orient his book.

On the assumption that outcomes in the information society are ethically determined, the analytical framework used in the book revolves around ideal policy values for achieving an advanced information society. This framework is derived from a study of ethics. Thus, the analysis is not presented as a work of social science (be that political science, industry policy or strategic studies). It is more an effort to situate the values of China's leaders within an ethical framework implied by their acceptance of the ambition to become an advanced information society.

This comes as a surprise to me because what I expected from a book titled "Cyber Policy in China" is really something more like industry policy or strategic studies. I was not ready for, and am frankly a bit disappointed by, the idea that this is really a work of applied philosophy.

Why? I do love philosophy as a discipline and have studied it carefully for many years. I've written and published about ethics and technological design. But my conclusion after so much study is that "the assumption that outcomes in the information society are ethically determined" is totally incorrect. I have been situated for some time in discussions of "technology ethics" and my main conclusion from them is that (a) "ethics" in this space are more often than not an attempt to universalize what are more narrow political and economic interests, and that (b) "ethics" are constantly getting compromised by economic motivations as well as the mundane difficulty of getting information technology to work as it is intended to in a narrow, functionally defined way. The real world is much bigger and more complex than any particular ethical lens can take in. Attempts to define technological change in terms of "ethics" are almost always a political maneuver, for good or for ill, of some kind that reduces the real complexity of technological development into a soundbite. A true ethical analysis of cyber policy would need to address industrial policy and strategic aspects, as this is what drives the "cyber" part of it.

The irony is that there is something terribly un-emic about this approach. By Austin's own admission, CCP cyber policy is motivated by material concerns about the distribution of technology and economic growth. Austin could have approached China's cyber policy in the technocratic terms in which its leaders see themselves. But instead Austin's approach is "human-centered", with a focus on leaders and their values. I already doubt the research on anthropological grounds because of the distance between the researcher and the subjects.

So I'm not sure what to do about this book. The preface makes it sound like it belongs to a genre of scholarship that reads well, and maybe does important ideological translation work, but does not provide something like scientific knowledge of China's cyber policy, which is what I'm most interested in. Perhaps I should move on, or take other recommendations for reading on this topic.

by Sebastian Benthall at September 07, 2018 03:59 PM

September 04, 2018

Center for Technology, Society & Policy

Backstage Decisions, Front-stage Experts: Interviewing Genome-Editing Scientists

by Santiago Molina and Gordon Pherribo, CTSP Fellows

This is the first in a series of posts on the project “Democratizing” Technology: Expertise and Innovation in Genetic Engineering

When we think about who is making decisions that will impact the future health and wellbeing of society, one would hope that these individuals would wield their expertise in a way that addresses the social and economic issues affecting our communities. Scientists often fill this role: for example, an ecologist advising a state environmental committee on river water redistribution [1], a geologist consulting for an architectural team building a skyscraper [2], an oncologist discussing the best treatment options based on the patient’s diagnosis and values [3] or an economist brought in by a city government to help develop a strategy for allocating grants to elementary schools. Part of the general contract between technical experts and their democracies is that they inform relevant actors so that decisions are made with the strongest possible factual basis.

The examples above describe scientists going outside the boundaries of their disciplines to present to people outside of the scientific community "on stage" [4]. But what about decisions made by scientists behind the scenes about new technologies that could affect more than daily laboratory life? In the 1970s, genetic engineers used their technical expertise to make a call about an exciting new technology, recombinant DNA (rDNA). This technology allowed scientists to mix and add DNA from different organisms, later giving rise to engineered bacteria that could produce insulin and eventually transgenic crops. The expert decision making process and outcome, in this case, had little to do with the possibility of commercializing biotechnology or the economic impacts of GMO seed monopolies. This happened before the patenting of whole biological organisms [5], and the use of rDNA in plants in 1982. Instead, the emerging issues surrounding rDNA were dealt with as a technical issue of containment. Researchers wanted to ensure that anything tinkered with genetically stayed not just inside the lab, but inside specially marked and isolated rooms in the lab, eventually giving rise to the well-established institution of biosafety. A technical fix, for a technical issue.

Today, scientists are similarly engaged in a process of expert decision making around another exciting new technology, the CRISPR-Cas9 system. This technology allows scientists to make highly specific changes, “edits”, to the DNA of virtually any organism. Following the original publication that showed that CRISPR-Cas9 could be used to modify DNA in a “programmable” way, scientists have developed the system into a laboratory toolbox and laboratories across the life sciences are using it to tinker away at bacteria, butterflies, corn, frogs, fruit flies, human liver cells, nematodes, and many other organisms. Maybe because most people do not have strong feelings about nematodes, most of the attention in both popular news coverage and in expert circles about this technology has had to do with whether modifications that could affect human offspring (i.e. germline editing) are moral.  

We have been interviewing faculty members directly engaged in these critical conversations about the potential benefits and risks of new genome editing technologies. As we continue to analyze these interviews, we want to better understand the nature of these backstage conversations and learn how the experiences and professional development activities of these experts influenced their decision-making. In subsequent posts we'll be sharing some of our findings from these interviews, which so far have highlighted the role of a wide range of technical experiences and skills for the individuals engaged in these discussions, the strength of personal social connections and reputation in getting a seat at the table, and the dynamic nature of expert decision making.

[1]  Scoville, C. (2017). “We Need Social Scientists!” The Allure and Assumptions of Economistic Optimization in Applied Environmental Science. Science as Culture, 26(4), 468-480.

[2] Wildermuth and Dineen (2017) “How ready will Bay Area be for next Quake?” SF Chronicle. Available online at: https://www.sfchronicle.com/news/article/How-ready-will-Bay-Area-be-for-next-big-quake-12216401.php

[3] Sprangers, M. A., & Aaronson, N. K. (1992). The role of health care providers and significant others in evaluating the quality of life of patients with chronic disease: a review. Journal of clinical epidemiology, 45(7), 743-760.

[4] Hilgartner, S. (2000). Science on stage: Expert advice as public drama. Stanford University Press.

[5] Diamond v Chakrabarty was in 1980, upheld first whole-scale organism patent (bacterium that could digest crude oil).

by Anne Jonas at September 04, 2018 06:18 PM

September 03, 2018

Ph.D. student

How trade protection can increase labor wages (the Stolper-Samuelson theorem)

I'm continuing a look into trade policy using Corden's (1997) book on the topic.

Picking up where the last post left off, I'm operating on the assumption that any reader is familiar with the arguments for free trade that are an extension of the arguments for laissez-faire markets. I will assume that these arguments are true as far as they go: that the economy grows with free trade, that tariffs create a dead-weight loss, that subsidies are expensive, but that both tariffs and subsidies do shift the market towards import-competing domestic production.

The question raised by Corden is why, despite its deleterious effects on the economy as a whole, protectionism enjoys political support by some sectors of the economy. He hints, earlier in Chapter 5, that this may be due to income distribution effects. He clarifies this with reference to an answer to this question that was given as early as 1941 by Stolper and Samuelson; their result is now celebrated as the Stolper-Samuelson theorem.

The mathematics of the theorem can be read in many places. Like any economic model, it depends on some assumptions that may or may not be the case. Its main advantage is that it articulates how it is possible for protectionism to benefit a class of the population, and not just in relative but in absolute terms. It does this by modeling the returns to different factors of production, which classically have been labor, land, and capital.

Roughly, the argument goes like this. Suppose an economy has two commodities, one for import and one for export. Suppose that the imported good is produced with a higher labor-to-land ratio than the export good. Suppose a protectionist policy increases the amount of the import good produced relative to the export good. Then the return on labor will increase (because the expanding sector demands relatively more labor), and the return on land will decrease (because it demands relatively less land). Wages will increase and rent on land will decrease.
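Stolper and Samuelson do this with algebra, but the zero-profit logic can be illustrated numerically. In the sketch below the unit input requirements and the prices are numbers I made up for illustration; the only structural assumption carried over from the argument above is that the import-competing good is the labor-intensive one.

    import numpy as np

    # Unit factor requirements (rows: goods; columns: labor, land).
    A = np.array([[3.0, 1.0],    # import-competing good: labor-intensive
                  [1.0, 3.0]])   # export good: land-intensive

    def factor_prices(p_import, p_export):
        # Zero-profit conditions: each good's price equals the cost of the factors
        # used to make one unit of it, i.e. A @ [wage, rent] = [p_import, p_export].
        return np.linalg.solve(A, np.array([p_import, p_export]))

    w0, r0 = factor_prices(4.0, 4.0)   # free-trade prices
    w1, r1 = factor_prices(4.4, 4.0)   # a 10% tariff raises the import good's price

    print(f"wage: {w0:.2f} -> {w1:.2f}")   # 1.00 -> 1.15: up 15%, more than the 10% price rise
    print(f"rent: {r0:.2f} -> {r1:.2f}")   # 1.00 -> 0.95: falls even though no price fell

Because the wage rises by more than either good's price, labor gains in real terms no matter which good workers consume, while landowners lose; this "magnification effect" is the sense in which protection can raise wages absolutely, not just relatively.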

These breakdowns of the economy into "factors of production" feel very old school. You rarely see economists discuss the economy in these terms now, which is itself interesting. One reason why (and I am only speculating here) is that these models clarify how laborers, land-owners, and capital-owners have different political interests in economic intervention, and that can lead to the kind of thinking that was flushed out of the American academy during the McCarthy era. Another reason may be that "capital" has changed meaning from being about ownership of machine goods into being about having liquid funds available for financial investment.

I’m interested in these kinds of models today partly because I’m interested in the political interests in various policies, and also because I’m interested in particular in the economics of supply chain logistics. The “factors of production” approach is a crude way to model the ‘supply chain’ in a broad sense, but one that has proven to be an effective source of insights in the past.

References

Corden, W. Max. “Trade policy and economic welfare.” OUP Catalogue (1997).

Stolper, Wolfgang F., and Paul A. Samuelson. “Protection and real wages.” The Review of Economic Studies 9.1 (1941): 58-73.

by Sebastian Benthall at September 03, 2018 05:55 PM

August 30, 2018

Ph.D. student

trade policy and income distribution effects

And now for something completely different

I am going to start researching trade policy, meaning policies around trade between different countries; imports and exports. Why?

  • It is politically relevant in the U.S. today.
  • It is a key component to national cybersecurity strategy, both defensive and offensive, which hinges in many cases on supply chain issues.
  • It maybe ought to be a component of national tech regulation and privacy policy, if e-commerce is seen as a trade activity. (This could be seen as 'cybersecurity' policy, more broadly writ).
  • Formal models from trade policy may be informative in other domains as well.

In general, years of life experience and study have taught me that economics, however much it is maligned, is a wise and fundamental social science without which any other understanding of politics and society is incomplete, especially when considering the role of technology in society.

Plenty of good reasons! Onward!

As a starting point, I'm working through Max Corden's Trade policy and economic welfare (1997), which appears to be a well-regarded text on the subject. In it, he sets out to describe a normative theory of trade policy. Here are two notable points based on a first perusal.

1. (from Chapter 1, “Introduction”) Corden identifies three “stages of thought” about trade policy. The first is the discovery of the benefits of free trade with the great original economists Adam Smith and David Ricardo. Here, the new appreciation of free trade was simultaneous with the new appreciation of the free market in general. “Indeed, the case for free trade was really a special case of the argument for laissez-faire.”

In the second phase, laissez-faire policies came into question. These policies may not lead to full employment, and the income distribution effects (which Corden takes seriously throughout the book, by the way) may not be desirable. Parallel to this, the argument for free trade was challenged. Some of these challenges were endorsed by John Stuart Mill. One argument is that tariffs might be necessary to protect “infant industries”.

As time went on, the favorability of free trade more or less tracked the favorability of laissez-faire. Both were popular in Western Europe and failed to get traction in most other countries (almost all of which were ‘developing’).

Corden traces the third stage of thought to Meade's (1955) Trade and welfare. "In the third stage the link between the case for free trade and the case for laissez-faire was broken." The normative case for free trade, in this stage, did not depend on a normative case for laissez-faire, but existed despite normative reasons for government intervention in the economy. The point made in this approach, called the theory of domestic distortions, is that it is generally better for the kinds of government intervention made to solve domestic problems to be domestic interventions, not trade interventions.

This third stage came with a much more sophisticated toolkit for comparing the effects of different kinds of policies, which is the subject of exposition for a large part of Corden’s book.

2. (from Chapter 5, "Protection and Income Distribution") Corden devotes at least one whole chapter to an aspect of the trade policy discussion that is very rarely addressed in, say, the mainstream business press. This is the fact that trade policy can have an effect on internal income distribution, and that this has been throughout history a major source of the political momentum for protectionist policies. This explains why the domestic politics of protectionism and free trade can be so heated and are really often independent of arguments about the effect of trade policy on the economy as a whole, which, it must be said, few people realize they have a real stake in.

Corden’s examples involve the creation of fledgling industries under the conditions of war, which often cut off foreign supplies. When the war ends, those businesses that flourished during war exert political pressure to protect themselves from erosion from market forces. “Thus the Napoleonic Wars cut off supplies of corn (wheat) to Britain from the Continent and led to expansion of acreage and higher prices of corn. When the war was over, the Corn Law of 1815 was designed to maintain prices, with an import prohibition as long as the domestic price was below a certain level.” It goes almost without saying that this served the interests of a section of the community, the domestic corn farmers, and not of others. This is what Corden means by an “income distribution effect”.

“Any history book will show that these income distribution effects are the very stuff of politics. The great free trade versus protection controversies of the nineteenth century in Great Britain and in the United States brought out the conflicting interests of different sections of the community. It was the debate about the effects of the Corn Laws which really stimulated the beginnings of the modern theory of international trade.”

Extending this argument a bit, one might say that a major reason why economics gets such a bad rap as a social science is that nobody really cares about Pareto optimality except for those sections of the economy that are well served by a policy that can be justified as being Pareto optimal (in practice, this would seem to be correlated with how much somebody has invested in mutual funds, as these track economic growth). The “stuff of politics” is people using political institutions to change their income outcomes, and the potential for this makes trade policy a very divisive topic.

Implication for future research:

The two key takeaways for trade policy in cybersecurity are:

1) The trade policy discussion need not remain within the narrow frame of free trade versus protectionism, but rather a more nuanced set of policy analysis tools should be brought to bear on the problem, and

2) An outcome of these policy analyses should be the identification not just of total effects on the economy, or security posture, or what have you, but on the particular effects on different sections of the economy and population.

References

Corden, W. Max. “Trade policy and economic welfare.” OUP Catalogue (1997).

Meade, James Edward. Trade and welfare. Vol. 2. Oxford University Press, 1955.

by Sebastian Benthall at August 30, 2018 08:49 PM

August 21, 2018

Center for Technology, Society & Policy

Standing up for truth in the age of disinformation

Professor Deirdre K. Mulligan and PhD student (and CTSP Co-Director) Daniel Griffin have an op-ed in The Guardian considering how Google might consider its human rights obligations in the face of state censorship demands: If Google goes to China, will it tell the truth about Tiananmen Square?

The op-ed advances a line of argument developed in a recent article of theirs in the Georgetown Law Technology Review: “Rescripting Search to Respect the Right to Truth”

by Daniel Griffin at August 21, 2018 10:28 PM

August 16, 2018

Center for Technology, Society & Policy

Social Impact Un-Pitch Day 2018

On Thursday, October 4th at 5:30pm the Center for Technology, Society & Policy (CTSP) and the School of Information’s Information Management Student Association (IMSA) are co-hosting their third annual Social Impact Un-Pitch Day!

Join CTSP and IMSA to brainstorm ideas for projects that address the challenges of technology, society, and policy. We welcome students, community organizations, local municipal partners, faculty, and campus initiatives to discuss discrete problems that project teams can take on over the course of this academic year. Teams will be encouraged to apply to CTSP to fund their projects.

Location: Room 202, in South Hall.

RSVP here!

Agenda

  • 5:40 Introductions from IMSA and CTSP
  • 5:45 Example Projects
  • 5:50 Sharing Un-Pitches

We've increased the time for Un-Pitches! (Still 3 minutes per Un-Pitch)

  • 6:40 Mixer (with snacks and refreshments)

 

Un-Pitches

Un-Pitches are meant to be informal and brief introductions of yourself, your idea, or your organization's problem situation. Un-pitches can include designing technology, research, policy recommendations, and more. Students and social impact representatives will be given 3 minutes to present their Un-Pitch. In order to un-pitch, please share 1-3 slides as a PDF and/or a description of fewer than 500 words at this email: ctsp@berkeley.edu. You can share slides and/or a description of your ideas even if you aren't able to attend. Deadline to share materials: midnight October 1st, 2018.

Funding Opportunities

The next application round for fellows will open in November. CTSP’s fellowship program will provide small grants to individuals and small teams of fellows for 2019. CTSP also has a recurring offer of small project support.

Prior Projects & Collaborations

Here are several examples of projects that members of the I School community have pursued as MIMS final projects or CTSP Fellow projects (see more projects from 2016, 2017, and 2018).

 

Skills & Interests of Students

The above projects demonstrate a range of interests and skills of the I School community. Students here and more broadly on the UC Berkeley campus are interested and skilled in all aspects of where information and technology meets people—from design and data science, to user research and information policy.

RSVP here!

by Daniel Griffin at August 16, 2018 03:51 AM

August 30th, 5:30pm: Habeas Data Panel Discussion

Location: South Hall Rm 202

Time: 5:30-7pm (followed by light refreshments)

CTSP’s first event of the semester!

Co-Sponsored with the Center for Long-Term Cybersecurity

Please join us for a panel discussion featuring award-winning tech reporter Cyrus Farivar, whose new book, Habeas Data, explores how the explosive growth of surveillance technology has outpaced our understanding of the ethics, mores, and laws of privacy. Habeas Data explores ten historic court decisions that defined our privacy rights and matches them against the capabilities of modern technology. Mitch Kapor, co-founder, Electronic Frontier Foundation, said the book was “Essential reading for anyone concerned with how technology has overrun privacy.”

The panel will be moderated by 2017 and 2018 CTSP Fellow Steve Trush, a MIMS 2018 graduate and now a Research Fellow at the Center for Long-Term Cybersecurity (CLTC). He was on a CTSP project starting in 2017 that provided a report to the Oakland Privacy Advisory Commission—read an East Bay Express write-up on their work here.

The panelists will discuss what public governance models can help local governments protect the privacy of citizens—and what role citizen technologists can play in shaping these models. The discussion will showcase the ongoing collaboration between the UC Berkeley School of Information and the Oakland Privacy Advisory Commission (OPAC). Attendees will learn how they can get involved in addressing issues of governance, privacy, fairness, and justice related to state surveillance.

Panel:

  • Cyrus Farivar, Author, Habeas Data: Privacy vs. the Rise of Surveillance Tech
  • Deirdre Mulligan, Associate Professor in the School of Information at UC Berkeley, Faculty Director, UC Berkeley Center for Law & Technology
  • Catherine Crump, Assistant Clinical Professor of Law, UC Berkeley; Director, Samuelson Law, Technology & Public Policy Clinic.
  • Camille Ochoa, Coordinator, Grassroots Advocacy; Electronic Frontier Foundation
  • Moderated by Steve Trush, Research Fellow, UC Berkeley Center for Long-Term Cybersecurity

The panel will be followed by a reception with light refreshments. Building is wheelchair accessible – wheelchair users can enter through the ground floor level and take the elevator to the second floor.

This event will not be taped or live-streamed.

RSVP here to attend.

 

Panelist Bios:

Cyrus [“suh-ROOS”] Farivar is a Senior Tech Policy Reporter at Ars Technica, and is also an author and radio producer. His second book, Habeas Data, about the legal cases over the last 50 years that have had an outsized impact on surveillance and privacy law in America, is out now from Melville House. His first book, The Internet of Elsewhere—about the history and effects of the Internet on different countries around the world, including Senegal, Iran, Estonia and South Korea—was published in April 2011. He previously was the Sci-Tech Editor, and host of “Spectrum” at Deutsche Welle English, Germany’s international broadcaster. He has also reported for the Canadian Broadcasting Corporation, National Public Radio, Public Radio International, The Economist, Wired, The New York Times and many others. His PGP key and other secure channels are available here.

Deirdre K. Mulligan is an Associate Professor in the School of Information at UC Berkeley, a faculty Director of the Berkeley Center for Law & Technology, and an affiliated faculty on the Center for Long-Term Cybersecurity.  Mulligan’s research explores legal and technical means of protecting values such as privacy, freedom of expression, and fairness in emerging technical systems.  Her book, Privacy on the Ground: Driving Corporate Behavior in the United States and Europe, a study of privacy practices in large corporations in five countries, conducted with UC Berkeley Law Prof. Kenneth Bamberger was recently published by MIT Press. Mulligan and  Bamberger received the 2016 International Association of Privacy Professionals Leadership Award for their research contributions to the field of privacy protection.

Catherine Crump: Catherine Crump is an Assistant Clinical Professor of Law and Director of the Samuelson Law, Technology & Public Policy Clinic. An experienced litigator specializing in constitutional matters, she has represented a broad range of clients seeking to vindicate their First and Fourth Amendment rights. She also has extensive experience litigating to compel the disclosure of government records under the Freedom of Information Act. Professor Crump’s primary interest is the impact of new technologies on civil liberties. Representative matters include serving as counsel in the ACLU’s challenge to the National Security Agency’s mass collection of Americans’ call records; representing artists, media outlets and others challenging a federal internet censorship law, and representing a variety of clients seeking to invalidate the government’s policy of conducting suspicionless searches of laptops and other electronic devices at the international border.

Prior to coming to Berkeley, Professor Crump served as a staff attorney at the ACLU for nearly nine years. Before that, she was a law clerk for Judge M. Margaret McKeown at the United States Court of Appeals for the Ninth Circuit.

Camille Ochoa: Camille promotes the Electronic Frontier Foundation’s grassroots advocacy initiative (the Electronic Frontier Alliance) and coordinates outreach to student groups, community groups, and hacker spaces throughout the country. She has very strong opinions about food deserts, the school-to-prison pipeline, educational apartheid in America, the takeover of our food system by chemical companies, the general takeover of everything in American life by large conglomerates, and the right to not be spied on by governments or corporations.

by Daniel Griffin at August 16, 2018 03:50 AM

August 12, 2018

Ph.D. student

“the politicization of the social” and “politics of identity” in Omi and Winant, Cha. 6

A confusing debate in my corner of the intellectual Internet is about (a) whether the progressive left has a coherent intellectual stance that can be articulated, (b) what to call this stance, (c) whether the right-wing critics of this stance have the intellectual credentials to refer to it and thereby land any kind of rhetorical punch. What may be true is that both “sides” reflect social movements more than they reflect coherent philosophies as such, and so trying to bridge between them intellectually is fruitless.

Happily, I have been reading through Omi and Winant, which among other things outlines a history of what I think of as the progressive left, or the "social justice", "identity politics" movement in the United States. They address this in their Chapter 6: "The Great Transformation". They use "the Great Transformation" to refer to "racial upsurges" in the 1950s and 1960s.

They are, as far as I can tell, the only people who ever use “The Great Transformation” to refer to this period. I don’t think it is going to stick. They name it this because they see this period as a great victorious period for democracy in the United States. Omi and Winant refer to previous periods in the United States as “racial despotism”, meaning that the state was actively treating nonwhites as second class citizens and preventing them from engaging in democracy in a real way. “Racial democracy”, which would involve true integration across race lines, is an ideal future or political trajectory that was approached during the Great Transformation but not realized fully.

The story of the civil rights movements in the mid-20th century are textbook material and I won’t repeat Omi and Winant’s account, which is interesting for a lot of reasons. One reason why it is interesting is how explicitly influenced by Gramsci their analysis is. As the “despotic” elements of United States power structures fade, the racial order is maintained less by coercion and more by consent. A power disparity in social order maintained by consent is a hegemony, in Gramscian theory.

They explain the Great Transformation as being due to two factors. One was the decline of the ethnicity paradigm of race, which had perhaps naively assumed that racial conflicts could be resolved through assimilation and recognition of ethnic differences without addressing the politically entrenched mechanisms of racial stratification.

The other factor was the rise of new social movements characterized by, in alliance with second-wave feminism, the politicization of the social, whereby social identity and demographic categories were made part of the public political discourse, rather than something private. This is the birth of “politics of identity”, or “identity politics”, for short. These were the original social justice warriors. And they attained some real political victories.

The reason why these social movements are not exactly normalized today is that there was a conservative reaction to resist changes in the 70's. The way Omi and Winant tell it, the "colorblind ideology" of the early 00's was the culmination of a kind of political truce between "racial despotism" and "racial democracy"–a "racial hegemony". Gilman has called this "racial liberalism".

So what does this mean for identity politics today? It means it has its roots in political activism which was once very radical. It really is influenced by Marxism, as these movements were. It means that its co-option by the right is not actually new, as “reverse racism” was one of the inventions of the groups that originally resisted the Civil Rights movement in the 70’s. What’s new is the crisis of hegemony, not the constituent political elements that were its polar extremes, which have been around for decades.

What it also means is that identity politics has been, from its start, a tool for political mobilization. It is not a philosophy of knowledge or about how to live the good life or a world view in a richer sense. It serves a particular instrumental purpose. Omi and Winant talk about how the politics of identity is "attractive", that it is a contagion. These are positive terms for them; they are impressed at how anti-racism spreads. These days I am often referred to Phillips' report, "The Oxygen of Amplification", which is about preventing the spread of extremist views by reducing the amount of reporting on them in 'disgust'. It is only fair to point out that identity politics as a left-wing innovation was at one point an "extremist" view, and that proponents of that view do use media effectively to spread it. This is just how media-based organizing tactics work, now.

by Sebastian Benthall at August 12, 2018 03:44 AM

August 07, 2018

Ph.D. student

Racial projects and racism (Omi and Winant, 2014; Jeong case study)

Following up on earlier posts on Omi and Winant, I’ve gotten to the part where they discuss racial projects and racism.

Because I use Twitter, I have not been able to avoid the discussion of Sarah Jeong’s tweets. I think it provides a useful case study in Omi and Winant’s terminology. I am not a journalist or particularly with-it person, so I have encountered this media event mainly through articles about it. Here are some.

To recap, for Omi and Winant, race is a “master category” of social organization, but nevertheless one that is unstable and politically contested. The continuity of racial classification is due to a historical, mutually reinforcing process that includes both social structures that control the distribution of resources and social meanings and identities that have been acquired by properties of people’s bodies. The fact that race is sustained through this historical and semiotically rich structuration (to adopt a term from Giddens), means that

“To identify an individual or group racially is to locate them within a socially and historically demarcated set of demographic and cultural boundaries, state activities, “life-chances”, and tropes of identity/difference/(in)equality.

“We cannot understand how racial representations set up patterns of residential segregation, for example, without considering how segregation reciprocally shapes and reinforces the meaning of race itself.”

This is totally plausible. Identifying the way that racial classification depends on a relationship between meaning and social structure opens the possibility of human political agency in the (re)definition of race. Omi and Winant’s term for these racial acts is racial projects.

A racial project is simultaneously an interpretation, representation, or explanation of racial identities and meanings, and an effort to organize and distribute resources (economic, political, cultural) along particular racial lines.
… Racial projects connect the meaning of race in discourse and ideology with the way that social structures are racially organized.

"Racial project" is a broad category that can include both large state and institutional interventions and individual actions, "even the decision to wear dreadlocks". What makes them racial projects is how they reflect and respond to broader patterns of race, whether to reproduce it or to subvert it. Prevailing stereotypes are one of the main ways we can "read" the racial meanings of society, and so the perpetuation or subversion of stereotypes is a form of "racial project". Racial projects are often in contest with each other; the racial formation process is the interaction and accumulation of these projects.

Racial project is a useful category partly because it is key to Omi and Winant's definition of racism. They acknowledge that the term itself is subject to "enormous debate", at times inflated to be meaningless and at other times deflated to be too narrow. They believe the definition of racism as "racial hate" is too narrow, though it has gained legal traction as a category, as when "hate crimes" are considered an offense with enhanced sentencing, or universities institute codes against "hate speech". I've read "racial animus" as another term that means something similar, though perhaps more subtle, than 'racial hate'.

The narrow definition of racism as racial hate is rejected due to an argument O&W attribute to David Theo Goldberg (1997), which is that by narrowly focusing on "crimes of passion" (I would gloss this more broadly as 'psychological states'), the interpretation of racism misses the ideologies, policies, and practices that "normalize and reproduce racial inequality and domination". In other words, racism, as a term, has to reference the social structure that is race in order to be adequate.

Omi and Winant define racism thus:

A racial project can be defined as racist if it creates or reproduces structures of domination based on racial significance and identities.

A key implication of their argument is that not all racial projects are racist. Recall that Omi and Winant are very critical of colorblindness as (they allege) a political hegemony. They want to make room for racial solidarity and agency despite the hierarchical nature of race as a social fact. This allows them to answer two important questions.

Are there anti-racist projects? Yes. "[w]e define anti-racist projects as those that undo or resist structures of domination based on racial significations and identities."

Note that the two definitions are not exactly parallel in construction. To "create and reproduce structure" is not entirely the opposite of "undo or resist structure". Given O&W's ontology, and the fact that racial structure is always the accumulation of a long history of racial projects, projects that have been performed by (bluntly) both the right and the left, and given that social structure is not homogeneous across location (consider how race is different in the United States and in Brazil, or different in New York City and in Dallas), and given that an act of resistance is also an act of creation, implicitly, one could easily get confused trying to apply these definitions. The key word, "domination", is not defined precisely, and everything hinges on this. It's clear from the writing that Omi and Winant subscribe to the "left" view of how racial domination works; this orients their definition of racism concretely. But they also note that the political agency of people of color in the United States over the past hundred years or so has gained them political power. Isn't the key to being racist having power? This leads O&W to the second question, which is

Can Groups of Color Advance Racist Projects? O&W's answer is, yes, they can. There are exceptions to the hierarchy of white supremacy, and in these exceptions there can be racial conflicts in which a group of color is racist. Their example is in cases where blacks and Latinos are in contest over resources. O&W do not go so far as to say that it is possible to be racist against white people, because they believe all racial relations are shaped by the overarching power of white supremacy.

Case Study: Jeong’s tweets

That is the setup. So what about Sarah Jeong? Well, she wrote some tweets mocking white people, and specifically white men, in 2014, which was by the way the heyday of obscene group conflict on Twitter. That was the year of Gamergate. A whole year of tweets that are probably best forgotten. She compared white people to goblins; she compared them to dogs. She said she wished ill on white men. As has been pointed out, if any other group besides white men were talked about, her tweets would be seen as undeniably racist, etc. They are, truth be told, similar rhetorically to the kinds of tweets that the left media have been so appalled at for some time.

They have surfaced again because Jeong was hired by the New York Times, and right wing activists (or maybe just trolls, I’m a little unclear about which) surfaced the old tweets. In the political climate of 2018, when Internet racism feels like it’s gotten terribly real, these struck a chord and triggered some reflection.

What should we make of these tweets, in light of racial formation theory?

First, we should acknowledge that the New York Times has some really great lawyers working for it. Their statement was that, at the time, (a) Jeong was being harassed, (b) she responded in the same rhetorical manner as the harassment, and (c) that's regrettable, but also, it's long past and not so bad. Sarah Jeong's own statement makes this point, acknowledges that the tweets may be hurtful out of context, and that she didn't mean them the way others could take them. "Harassment" is actually a relatively neutral term; you can harass somebody, legally speaking, on the basis of their race without invoking a reaction from anti-racist sociologists. This is all perfectly sensible, IMO, and the case is pretty much closed.

But that’s not where the discussion on the Internet ended. Why? Because the online media is where the contest of racial formation is happening.

We can ask: Were Sarah Jeong’s tweets a racial project? The answer seems to be, yes, they were. It was a representation of racial identity (whiteness) “to organize and distribute resources (economic, political, cultural) along particular racial lines”. Jeong is a journalist and scholar, and these arguments are happening in social media, which are always-already part of the capitalist attention economy. Jeong’s success is partly due to her confrontation of on-line harassers and responses to right-wing media figures. And her activity is the kind that rallies attention along racial lines–anti-racist, racist, etc.

Confusingly, the language she used in these tweets reads as hateful. “Dumbass fucking white people marking up the internet with their opinions like dogs pissing on fire hydrants” does, reasonably, sound like it expresses some racial animus. If we were to accept the definition of racism as merely the possession of ill will towards a race, which seems to be Andrew Sullivan’s definition, then we would have to say those were racist tweets.

We could invoke a defense here. Were the tweets satire? Did Jeong not actually have any ill will towards white people? One might wonder, similarly, whether 4chan anti-Semites are actually anti-Semitic or just trolling. The whole question of who is just trolling and who should be taken seriously on the Internet is such an interesting one. But it’s one I had to walk away from long ago after the heat got turned up on me one time. So it goes.

What everyone knows is at stake, though, is the contention that the ‘racial animus’ definition is not the real definition of racism, but rather that something like O&W’s definition is. By their account, (a) a racial project is only racist if it aligns with structures of racial domination, and (b) the structure of racial domination is a white supremacist one. Ergo, by this account, Jeong’s tweets are not racist, because insulting white people does not create or reproduce structures of white supremacist domination.

It’s worth pointing out that there are two different definitions of a word here and that neither one is inherently more correct of a definition. I’m hesitant to label the former definition “right” and the latter definition “left” because there’s nothing about the former definition that would make you, say, not want to abolish the cradle-to-prison system or any number of other real, institutional reforms. But the latter definition is favored by progressives, who have a fairly coherent world view. O&W’s theorizing is consistent with it. The helpful thing about this worldview is that it makes it difficult to complain about progressive rhetorical tactics without getting mired into a theoretical debate about their definitions, which makes it an excellent ideology for getting into fights on the Internet. This is largely what Andrew Sullivan was getting at in his critique.

What Jeong and the NYT seem to get, which some others don't, is that comments that insult an entire race can be hurtful and bothersome even if they are not racist in the progressive sense of the term. It is not clear what we should call a racial project that is hurtful and bothersome to white people if we do not call it racist. A difficulty with the progressive definition of racism is that agreement on the application of the term is going to depend on agreement about what the dominant racial structures are. What we've learned in the past few years is that the left-wing view of what these racial structures are is not as widely shared as it was believed to be. For example, there are far more people who believe in anti-Semitic conspiracies, in which the dominant race is the Jews, active in American political life than was supposed. Given O&W's definition of racism, if it were, factually, the case that Jews ran the world, then anti-Semitic comments would not be racist in the meaningful sense.

Which means that the progressive definition of racism, to be effective, depends on widespread agreement about white supremacist hegemony, which is a much, much more complicated thing to try to persuade somebody of than a particular person’s racial animus.

A number of people have been dismissing any negative reaction to the resurfacing of Jeong’s tweets, taking the opportunity to disparage that reaction as misguided and backwards. As far as I can tell, there is an argument that Jeong’s tweets are actually anti-racist. This article argues that casually disparaging white men is just something anti-racists do lightly to call attention to the dominant social structures and also the despicable behavior of some white men. Naturally, these comments are meant humorously, and are not intended to refer to all white men (to assume they do is to distract from the structural issues at stake). They are jokes that should be celebrated, because progressives already won this argument over #notallmen, also in 2014. Understood properly as progressive, anti-racist, social justice idiom, there is nothing offensive about Jeong’s tweets.

I am probably in a minority on this one, but I do not agree with this assessment, for a number of reasons.

First, the idea that you can have a private, in-group conversation on Twitter is absurd.

Second, the idea that a whole community of people casually expresses racial animus because of representative examples of wrongdoing by members of a social class can be alarming, whether it’s Trump voters talking about Mexicans or anti-racists talking about white people. That alarm, as an emotional reaction, is a reality whether or not the dominant racial structures are being reproduced or challenged.

Third, I’m not convinced that, as a racial project, tweets simply insulting white people really count as “anti-racist” in a substantive sense. Anti-racist projects are “those that undo or resist structures of domination based on racial significations and identities.” Is saying “white men are bullshit” undoing a structure of domination? I’m pretty sure any white supremacist structures of domination have survived that attack. Does it resist white supremacist domination? The thrust of wise sociology of race is that what’s more important than the social meanings are the institutional structures that maintain racial inequality. Even if this statement has a meaning that is degrading to white people, it doesn’t seem to be doing any work of reorganizing resources around (anti-)racial lines. It’s just a crass insult. It may well have actually backfired, or had an effect on the racial organization of attention that neither harmed nor supported white supremacy, but rather just made its manifestation on the Internet more toxic (in response to other, much greater, toxicity, of course).

I suppose what I’m arguing for is greater nuance than either the “left” or “right” position has offered on this case. I’m saying that it is possible to engage in a racial project that is neither racist nor anti-racist. You could have a racial project that is amusingly absurd, or toxic, or cleverly insightful. Moreover, there is a complex of ethical responsibilities and principles that intersects with racial projects but is not contained by the logic of race. There are greater standards of decency that can be invoked. These are not simply constraints on etiquette. They also are relevant to the contest of racial projects and their outcomes.

by Sebastian Benthall at August 07, 2018 07:48 PM

August 05, 2018

Ph.D. student

From social movements to business standards

Matt Levine has a recent piece discussing how discovering the history of sexual harassment complaints about a company’s leadership is becoming part of standard due diligence before an acquisition. Implicitly, the threat of liability, and presumably the costs of a public relations scandal, are material to the value of the company being acquired.

Perhaps relatedly, the National Venture Capital Association has added to its Model Legal Documents a slew of policies related to harassment and discrimination, codes of conduct, attracting and retaining diverse talent, and family friendly policies. Rumor has it that venture capitalists will now encourage companies they invest in to adopt these tested versions of the policies, much as an organization would adopt a tested and well-understood technical standard.

I have in various researcher roles studied social movements and political change, but these studies have left me with the conclusion that changes to culture are rarely self-propelled, but rather are often due to more fundamental changes in demographics or institutions. State legislation is very slow to move and limited in its range, and so often trails behind other amassing of power and will.

Corporate self-regulation, on the other hand, through standards, contracts, due diligence, and the like, seems to be quite adaptive. This is leading me to the conclusion that a best-kept secret of cultural change is that some of its main drivers are actually deeply embedded in corporate law. Corporate law has the reputation of being a dry subject that sucks recent law grads into soulless careers. But what if that wasn’t what corporate law was? What if corporate law was really where the action is?

In broader terms, the adaptivity of corporate policy to changing demographics and social needs perhaps explains the paradox of “progressive neoliberalism”, or the idea that the emerging professional business class seems to be socially liberal, whether or not it is fiscally conservative. Professional culture requires, due to antidiscrimination law and other policies, the compliance of its employees with a standard of ‘political correctness’. People can’t be hostile to each other in the workplace or else they will get fired, and they especially can’t be hostile to anybody on the basis of their being part of a protected category. This was enshrined in law long ago. Part of the role of educational institutions is to teach students a coherent story about why these rules are what they are and how they are not just legally mandated, but morally compelling. So the professional class has an ideology of inclusivity because it must.

by Sebastian Benthall at August 05, 2018 08:14 PM

July 30, 2018

Ph.D. student

How the Internet changed everything: a grand theory of AI, etc.

I have read many a think piece and critical take about AI, the Internet, and so on. I offer a new theory of What Happened, the best I can come up with based on my research and observations to date.

Consider this article, “The death of Don Draper”, as a story that represents the changes that occur more broadly. In this story, advertising was once a creative field that any company with capital could hire out to increase their chances of getting noticed and purchased, albeit in a noisy way. Because everything was very uncertain, those that could afford it blew a lot of money on it (“Half of advertising is useless; the problem is knowing which half”).

A similar story could be told about access to the news–dominated by big budgets that hid quality–and political candidates–whose activities were largely not exposed to scrutiny and could follow a similarly noisy pattern of hype and success.

Then along came the Internet and targeted advertising, which did a number of things:

  • It reduced search costs for people looking for particular products, because Google searches the web and Amazon indexes all the products (and because of lots of smaller versions of Google and Amazon).
  • It reduced the uncertainty of advertising effectiveness because it allowed for fine-grained measurement of conversion metrics. This reduced the search costs of producers to advertisers, and from advertisers to audiences.
  • It reduced the search costs of people finding alternative media and political interest groups, leading to a reorganization of culture. The media and cultural landscape could more precisely reflect the exogenous factors of social difference.
  • It reduced the cost of finding people based on their wealth, social influence, and so on, implicitly creating a kind of ‘social credit system’ distributed across various web services. (Gandy, 1993; Fourcade and Healy, 2016)

What happens when you reduce search costs in markets? Robert Jensen’s (2007) study of the introduction of mobile phones to fish markets in Kerala is illustrative here. Fish prices were very noisy due to bad communication until mobile phones were introduced. After that, the prices stabilized, owing to swifter communication between fishermen and markets. Suddenly able to preempt prices rather than be subject to their vagaries, fishermen could then choose to go to the market that would give them the best price.

Reducing search costs makes markets more efficient and larger. In doing so, it increases inequality, because whereas a lot of lower-quality goods and services can survive in a noisy economy, when consumers are more informed and more efficient at searching, they can cut out less useful services. They can then standardize on “the best” option available, which can be produced with economies of scale. So inefficient, noisy parts of the economy were squeezed out and the surplus amassed in the hands of a few big intermediaries, whom we now see as Big Tech leveraging AI.
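
To make the winner-take-most intuition concrete, here is a toy simulation of my own (it is not part of Jensen's study or the original post; the seller qualities, buyer counts, and sampling rule are all assumptions for the sake of illustration). Each buyer samples k sellers and buys from the best one sampled; cheaper search means a larger k.

```python
import random
from collections import Counter

def top_market_share(num_sellers=50, num_buyers=10_000, k=1, seed=0):
    """Share of all purchases captured by the single best-selling seller."""
    rng = random.Random(seed)
    quality = [rng.random() for _ in range(num_sellers)]   # fixed seller qualities
    sales = Counter()
    for _ in range(num_buyers):
        sampled = rng.sample(range(num_sellers), k)        # sellers this buyer finds
        winner = max(sampled, key=lambda s: quality[s])    # buyer picks the best found
        sales[winner] += 1
    return max(sales.values()) / num_buyers

print(top_market_share(k=1))   # costly search: sales spread thinly (~1/50 per seller)
print(top_market_share(k=20))  # cheap search: a handful of top sellers dominate
```

With k=1, noise protects mediocre sellers; with k=20, the highest-quality sellers capture most of the market, which is the concentration dynamic described above.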

Is AI an appropriate term? I have always liked this definition of AI: “Anything that humans still do better than computers.” Most recently I’ve seen this restated in an interview with Andrew Moore, quoted by Zachary Lipton:

Artificial intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.

That is what these technical platforms do: they dramatically reduce search costs. “Searching” for people, products, and information is something that used to require human intelligence. Now it is assisted by computers. And whether or not the average user knows what they are doing when they search (Mulligan and Griffin, 2018), as a commercial function, the panoply of search engines, recommendation systems, and auctions that occupy the central places in the information economy outperform human intelligence largely by virtue of having access to more data, and hence a broader perspective, than any individual human could ever attain.

The comparison between the Google search engine and a human’s intelligence is therefore ill-posed. The kinds of functions tech platforms are performing are things that have only ever been solved by human organizations, especially bureaucratic ones. And while the digital user interfaces of these services hide the people “inside” the machines, we know that of course there’s an enormous amount of ongoing human labor involved in the creation and maintenance of any successful “AI” that’s in production.

In conclusion, the Internet changed everything for a mundane reason that could have been predicted from neoclassical economic theory. It reduced search costs, creating economic efficiency and inequality, by allowing for new kinds of organizations based on broad digital connectivity. “AI” is a distraction from these accomplishments, as is most “critical” reaction to these developments, which does not do justice to the facts of the matter because, by taking up a humanistic lens, it tends not to address how decisions by individual humans and changes to their experience are due to large-scale aggregate processes and strategic behaviors by businesses.

References

Gandy Jr., Oscar H. The Panoptic Sort: A Political Economy of Personal Information. Boulder, CO: Westview Press, 1993.

Fourcade, Marion, and Kieran Healy. “Seeing like a market.” Socio-Economic Review 15.1 (2016): 9-29.

Jensen, Robert. “The digital provide: Information (technology), market performance, and welfare in the South Indian fisheries sector.” The quarterly journal of economics 122.3 (2007): 879-924.

Mulligan, Deirdre K., and Daniel S. Griffin. “Rescripting Search to Respect the Right to Truth.” Georgetown Law Technology Review 2 (2018): 557.

by Sebastian Benthall at July 30, 2018 09:40 PM

July 10, 2018

Ph.D. student

search engines and authoritarian threats

I’ve been intrigued by Daniel Griffin’s tweets lately, which have been about situating some upcoming work of his and Deirdre Mulligan’s regarding the experience of using search engines. There is a lively discussion lately about the experience of those searching for information and the way they respond to misinformation or extremism that they discover through organic use of search engines and media recommendation systems. This is apparently how the concern around “fake news” has developed in the HCI and STS world since it became an issue shortly after the 2016 election.

I do not have much to add to this discussion directly. Consumer misuse of search engines is, to me, analogous to consumer misuse of other forms of print media. I would assume the best solution to it is education in the complete sense, and the problems with the U.S. education system are, despite all good intentions, not HCI problems.

Wearing my privacy researcher hat, however, I have become interested in a different aspect of search engines and the politics around them that is less obvious to the consumer and therefore less popularly discussed, but I fear is more pernicious precisely because it is not part of the general imaginary around search. This is the aspect that concerns the tracking of search engine activity, and what it means for this activity to be in the hands of not just benevolent organizations such as Google, but also malevolent organizations such as Bizarro World Google*.

Here is the scenario, so to speak: for whatever reason, we begin to see ourselves in a more adversarial relationship with search engines. I mean “search engine” here in the broad sense, including Siri, Alexa, Google News, YouTube, Bing, Baidu, Yandex, and all the more minor search engines embedded in web services and appliances that do something more focused than crawl the whole web. By “search engine” I mean the entire UX paradigm of the query into the vast unknown of semantic and semiotic space that contemporary information access depends on. In all these cases, the user is at a systematic disadvantage in the sense that their query is a data point among many others. The task of the search engine is to predict the desired response to the query and provide it. In return, the search engine gets the query, tied to the identity of the user. That is one piece of a larger mosaic; to be a search engine is to have a picture of a population and their interests and the mandate to categorize and understand those people.

In Western neoliberal political systems, the central function of the search engine is realized as a commercial transaction facilitating other commercial transactions. My “search” is a consumer service; I “pay” for this search by giving my query to the adjoined advertising function, which allows other commercial providers to “search” for me, indirectly, through the ad auction platform. It is a market with more than just two sides. There is the consumer who wants information and may be tempted by other information. There are the primary content providers, who satisfy consumer content demand directly. And there are secondary content providers who want to intrude on consumer attention in a systematic and successful way. The commercial, ad-enabled search engine reduces transaction costs for the consumer’s search and sells a fraction of that attentional surplus to the advertisers. If the engine strikes the right balance, the consumer is happy enough with the trade.
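
As a minimal sketch of my own of the auction step gestured at above (generic second-price logic, not a claim about any particular engine's actual mechanism; the advertiser names and bids are made up), the engine sells one slot of attention per query, and the winner pays roughly the runner-up's bid:

```python
def run_ad_auction(bids):
    """bids: advertiser -> bid (dollars) for this query. Returns (winner, price paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, top_bid = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else top_bid   # second-price rule
    return winner, price

# One query's worth of consumer attention is sold; the consumer sees the winner's ad.
print(run_ad_auction({"shoe_store": 1.20, "sneaker_brand": 0.90, "local_mall": 0.40}))
# -> ('shoe_store', 0.9)
```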

Part of the success of commercial search engines is the promise of privacy, in the sense that the consumer’s queries are entrusted secretly to the engine, and this data is not leaked or sold. Wise people know not to put into email things that they would not want, in the worst case, exposed to the public. Unwise people are more common than wise people, and ill-considered emails are written all the time. Most unwise people do not come to harm because of this, because privacy in email is a de facto standard; it is the very security of email that makes the possibility of its being leaked alarming.

So too with search engine queries. “Ask me anything,” suggests the search engine, “I won’t tell.” “Well, I will reveal your data in an aggregate way; I’ll expose you to selective advertising. But I’m a trusted intermediary. You won’t come to any harms besides exposure to a few ads.”

That is all a safe assumption until it isn’t, at which point we must reconsider the role of the search engine. Suppose that, instead of living in a neoliberal democracy where the free search for information was sanctioned as necessary for the operation of a free market, we lived in an authoritarian country organized around the principle that disloyalty to the state should be crushed.

Under these conditions, the transition of a society into one that depends for its access to information on search engines is quite troubling. The act of looking for information is a political signal. Suppose you are looking for information about an extremist, subversive ideology. To do so is to flag yourself as a potential threat to the state. Suppose that you are looking for information about a morally dubious activity. To do so is to make yourself vulnerable to kompromat.

Under an authoritarian regime, curiosity and free thought are a problem, and a problem that is readily identified by one’s search queries. Further, an authoritarian regime benefits if the risks of searching for the ‘wrong’ thing are widely known, since it suppresses inquiry. Hence, the very vaguely announced and, in fact, implausible-to-implement Social Credit System in China does not need to exist to be effective; people need only believe it exists for it to have a chilling and organizing effect on behavior. That is the lesson of the Foucauldian panopticon: it doesn’t need a guard sitting in it to function.

Do we have a word for this function of search engines in an authoritarian system? We haven’t needed one in our liberal democracy, which perhaps we take for granted. “Censorship” does not apply, because what’s at stake is not speech but the ability to listen and learn. “Surveillance” is too general. It doesn’t capture the specific constraints on acquiring information, on being curious. What is the right term for this threat? What is the term for the corresponding liberty?

I’ll conclude with a chilling thought: when at war, all states are authoritarian, to somebody. Every state has an extremist, subversive ideology that it watches out for and tries in one way or another to suppress. Our search queries are always of strategic or tactical interest to somebody. Search engine policies are always an issue of national security, in one way or another.

by Sebastian Benthall at July 10, 2018 11:52 PM

Ph.D. student

Exploring Implications of Everyday Brain-Computer Interface Adoption through Design Fiction

This blog post is a version of a talk I gave at the 2018 ACM Designing Interactive Systems (DIS) Conference based on a paper written with Nick Merrill and John Chuang, entitled When BCIs have APIs: Design Fictions of Everyday Brain-Computer Interface Adoption. Find out more on our project page, or download the paper: [PDF link] [ACM link]

In recent years, brain computer interfaces, or BCIs, have shifted from far-off science fiction, to medical research, to the realm of consumer-grade devices that can sense brainwaves and EEG signals. Brain computer interfaces have also featured more prominently in corporate and public imaginations, such as Elon Musk’s project that has been said to create a global shared brain, or fears that BCIs will result in thought control.

Most of these narratives and imaginings about BCIs tend to be utopian, or dystopian, imagining radical technological or social change. However, we instead aim to imagine futures that are not radically different from our own. In our project, we use design fiction to ask: how can we graft brain computer interfaces onto the everyday and mundane worlds we already live in? How can we explore how BCI uses, benefits, and labor practices may not be evenly distributed when they get adopted?

Brain computer interfaces allow the control of a computer from neural output. In recent years, several consumer-grade brain-computer interface devices have come to market. One example is the Neurable – it’s a headset used as an input device for virtual reality systems. It detects when a user recognizes an object that they want to select. It uses a phenomenon called the P300 – when a person either recognizes a stimulus, or receives a stimulus they are not expecting, electrical activity in their brain spikes approximately 300 milliseconds after the stimulus. This electrical spike can be detected by an EEG, and by several consumer BCI devices such as the Neurable. Applications utilizing the P300 phenomenon include hands-free ways to type or click.

Demo video of a text entry system using the P300

Neurable demonstration video

We base our analysis on this already-existing capability of brain computer interfaces, rather than the more fantastical narratives (at least for now) of computers being able to clearly read humans’ inner thoughts and emotions. Instead, we create a set of scenarios that makes use of the P300 phenomenon in new applications, combined with the adoption of consumer-grade BCIs by new groups and social systems.

Stories about BCI’s hypothetical future as a device to make life easier for “everyone” abound, particularly in Silicon Valley, as shown in recent research. These tend to be very totalizing accounts, neglecting the nuance of multiple everyday experiences. However, past research shows that the introductions of new digital technologies end up unevenly shaping practices and arrangements of power and work – from the introduction of computers in workplaces in the 1980s, to the introduction of email, to forms of labor enabled by algorithms and digital platforms. We use a set of design fictions to interrogate these potential arrangements in BCI systems, situated in different types of workers’ everyday experiences.

Design Fictions

Design fiction is a practice of creating conceptual designs or artifacts that help create a fictional reality. We can use design fiction to ask questions about possible configurations of the world and to think through issues that have relevance and implications for present realities. (I’ve written more about design fiction in prior blog posts).

We build on Lindley et al.’s proposal to use design fiction to study the “implications for adoption” of emerging technologies. They argue that design fiction can “create plausible, mundane, and speculative futures, within which today’s emerging technologies may be prototyped as if they are domesticated and situated,” which we can then analyze with a range of lenses, such as those from science and technology studies. For us, this lets us think about technologies beyond ideal use cases. It lets us be attuned to the experiences of power and inequalities that people experience today, and interrogate how emerging technologies might be taken up, reused, and reinterpreted in a variety of existing social relations and systems of power.

To explore this, we thus created a set of interconnected design fictions that exist within the same fictional universe, showing different sites of adoptions and interactions. We build on Coulton et al.’s insight that design fiction can be a “world-building” exercise; design fictions can simultaneously exist in the same imagined world and provide multiple “entry points” into that world.

We created 4 design fictions that exist in the same world: (1) a README for a fictional BCI API, (2) a programmer’s question on StackOverflow who is working with the API, (3) an internal business memo from an online dating company, (4) a set of forum posts by crowdworkers who use BCIs to do content moderation tasks. These are downloadable at our project page if you want to see them in more detail.  (I’ll also note that we conducted our work in the United States, and that our authorship of these fictions, as well as interpretations and analysis are informed by this sociocultural context.)

Design Fiction 1: README documentation of an API for identifying P300 spikes in a stream of EEG signals

First, this is README documentation of an API for identifying P300 spikes in a stream of EEG signals. The P300, or “oddball,” response is a real phenomenon: a spike in brain activity when a person is either surprised or sees something that they are looking for. This fictional API helps identify those spikes in EEG data. We made this fiction in the form of a GitHub page to emphasize the everyday nature of this documentation, from the viewpoint of a software developer. In the fiction, the algorithms underlying this API come from a specific set of training data from a controlled environment in a university research lab. The API discloses and openly links to the data that its algorithms were trained on.
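
As a rough illustration of the kind of call such an API might expose (the function name, parameters, and the simple baseline-and-threshold rule are my own assumptions for this sketch, not the artifact from the paper; real P300 detection typically averages over many trials or uses a trained classifier), a single-channel version could look like this:

```python
import numpy as np

def detect_p300(eeg, stimulus_onsets, fs=256, window=(0.25, 0.45), threshold_uv=5.0):
    """eeg: one EEG channel in microvolts; stimulus_onsets: onset times in seconds.
    Flags each stimulus whose mean amplitude in the post-onset window, relative to
    a 100 ms pre-stimulus baseline, exceeds the threshold."""
    flags = []
    for onset in stimulus_onsets:
        i = int(onset * fs)                      # sample index of the stimulus onset
        baseline = eeg[max(0, i - int(0.1 * fs)):i].mean()
        lo, hi = i + int(window[0] * fs), i + int(window[1] * fs)
        flags.append(float(eeg[lo:hi].mean() - baseline) > threshold_uv)
    return flags

# Hypothetical usage: which flashed items elicited a recognition response?
# (Assumes the onset times lie comfortably inside the recording.)
# flagged = detect_p300(eeg_channel, flash_times)
```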

In our creation and analysis of this fiction, for us it surfaces ambiguity and a tension about how generalizable the system’s model of the brain is. The API with a README implies that the system is meant to be generalizable, despite some indications based on its training dataset that it might be more limited. This fiction also gestures more broadly toward the involvement of academic research in larger technical infrastructures. The documentation notes that the API started as a research project by a professor at a University before becoming hosted and maintained by a large tech company. For us, this highlights how collaborations between research and industry may produce artifacts that move into broader contexts. Yet researchers may not be thinking about the potential effects or implications of their technical systems in these broader contexts.

Design Fiction 2: A question on StackOverflow

Second, a developer, Jay, is working with the BCI API to develop a tool for content moderation. He asks a question on Stack Overflow, a real website for developers to ask and answer technical questions. He questions the API’s applicability beyond lab-based stimuli, asking “do these ‘lab’ P300 responses really apply to other things? If you are looking over messages to see if any of them are abusive, will we really see the ‘same’ P300 response?” The answers from other developers suggest that they predominantly believe the API is generalizable to a broader class of tasks, with the most agreed-upon answer saying “The P300 is a general response, and should apply perfectly well to your problem.”

This fiction helps us explore how and where contestation may occur in technical communities, and where discussion of social values or social implications could arise. We imagine the first developer, Jay, as someone who is sensitive to the way the API was trained, and questions its applicability to a new domain. However, he encounters the commenters who believe that physiological signals are always generalizable, and don’t engage in questions of broader applicability. The community’s answers re-enforce notions not just of what the technical artifacts can do, but what the human brain can do. The stack overflow answers draw on a popular, though critiqued, notion of the “brain-as-computer,” framing the brain as a processing unit with generic processes that take inputs and produce outputs. Here, this notion is reinforced in the social realm on Stack Overflow.

Design Fiction 3: An internal business memo for a fictional online dating company

Meanwhile, SparkTheMatch.com, a fictional online dating service, is struggling to moderate and manage inappropriate user content on their platform. SparkTheMatch wants to utilize the P300 signal to tap into people’s tacit “gut feelings” to recognize inappropriate content. They are planning to implement a content moderation process using crowdsourced workers wearing BCIs.

In creating this fiction, we use the memo to provide insight into some of the practices and labor supporting the BCI-assisted review process from the company’s perspective. The memo suggests that the use of BCIs with Mechanical Turk will “help increase efficiency” for crowdworkers while still giving them a fair wage. The crowdworkers sit and watch a stream of flashing content while wearing a BCI, and the P300 response will subconsciously identify when workers recognize supposedly abnormal content. Yet we find it debatable whether or not this process improves the material conditions of the Turk workers. The amount of content they must look at in order to make the supposedly fair wage may not actually be reasonable.

SparkTheMatch employees creating the Mechanical Turk tasks don’t directly interact with the BCI API. Instead they use pre-defined templates created by the company’s IT staff, a much more mediated interaction compared to the programmers and developers reading documentation and posting on Stack Overflow. By this point, the research lab origins of the P300 API underlying the service and questions about its broader applicability are hidden. From the viewpoint of SparkTheMatch staff, the BCI-aspects of their service just “works,” allowing managers to design their workflows around it, obfuscating the inner workings of the P300 API.

Design fiction 4: A crowdworker forum for workers who use BCIs

Fourth, the Mechanical Turk workers who do the SparkTheMatch content moderation work share their experiences on a crowdworker forum. These crowd workers’ experiences and relationships to the P300 API are strikingly different from those of the people and organizations described in the other fictions; notably, the API is something that they do not get to explicitly see. Aspects of the system are blackboxed or hidden away. While one poster discusses some errors that occurred, there is ambiguity about whether the fault lies with the BCI device or the data processing. EEG signals are not easily human-comprehensible, making feedback mechanisms difficult. Other posters blame the user for the errors. This is problematic, given the precariousness of these workers’ positions, as crowd workers tend to have few forms of recourse when encountering problems with tasks.

For us, these forum accounts are interesting because they describe a situation in which the BCI user is not the person who obtains the real benefits of its use. It’s the company SparkTheMatch, not the BCI-end users, that is obtaining the most benefit from BCIs.

Some Emergent Themes and Reflections

From these design fictions, several salient themes arose for us. By looking at BCIs from the perspective of several everyday experiences, we can see different types of work done in relation to BCIs – whether that’s doing software development, being a client for a BCI-service, or using the BCI to conduct work. Our fictions are inspired by others’ research on the existing labor relationships and power dynamics in crowdwork and distributed content moderation (in particular work by scholars Lilly Irani and Sarah T. Roberts). Here we also critique utopian narratives of brain-controlled computing that suggest BCIs will create new efficiencies, seamless interactions, and increased productivity. We investigate a set of questions on the role of technology in shaping and reproducing social and economic inequalities.

Second, we use the design fiction to surface questions about the situatedness of brain sensing, questioning how generalizable and universal physiological signals are. Building on prior accounts of situated actions and extended cognition, we note the specific and the particular should be taken into account in the design of supposedly generalizable BCI systems.

These themes arose iteratively, and were somewhat surprising to us, particularly just how different the BCI system looks from each of the different perspectives in the fictions. We initially set out to create a rather mundane fictional platform or infrastructure, an API for BCIs. With this starting point we brainstormed other types of direct and indirect relationships people might have with our BCI API to create multiple “entry points” into our API’s world. We iterated on various types of relationships and artifacts: there are end-users, but also clients, software engineers, and app developers, each of whom might interact with an API in different ways, directly or indirectly. Through iterations of different scenarios (a BCI-assisted tax filing service was thought of at one point), and through discussions with our colleagues (some of whom posed questions about what labor in higher education might look like with BCIs), we slowly began to think that looking at the work practices implicated in these different relationships and artifacts would be a fruitful way to focus our designs.

Toward “Platform Fictions”

In part, we think that creating design fictions in mundane technical forms like documentation or stack overflow posts might help the artifacts be legible to software engineers and technical researchers. More generally, this leads us to think more about what it might mean to put platforms and infrastructures at the center of design fiction (as well as build on some of the insights from platform studies and infrastructure studies). Adoption and use does not occur in a vacuum. Rather, technologies get adopted into and by existing sociotechnical systems. We can use design fiction to open the “black boxes” of emerging sociotechnical systems. Given that infrastructures are often relegated to the background in everyday use, surfacing and focusing on an infrastructure helps us situate our design fictions in the everyday and mundane, rather than dystopia or utopia.

We find that using a digital infrastructure as a starting point helps surface multiple subject positions in relation to the system at different sites of interaction, beyond those of end-users. From each of these subject positions, we can see where contestation may occur, and how the system looks different. We can also see how assumptions, values, and practices surrounding the system at a particular place and time can be hidden, adapted, or changed by the time the system reaches others. Importantly, we also try to surface ways the system gets used in potentially unintended ways – we don’t think that the academic researchers who developed the API to detect brain signal spikes imagined that it would be used in a system of arguably exploitative crowd labor for content moderation.

Our fictions try to blur clear distinctions that might suggest what happens in “labs” is separate from “the outside world”, instead highlighting their entanglements. Given that much of BCI research currently exists in research labs, we raise this point to argue that BCI researchers and designers should also be concerned about the implications of adoption and application. This helps give us insight into the responsibilities (and complicity) of researchers and builders of technical systems. Some of the recent controversies around Cambridge Analytica’s use of Facebook’s API point to ways in which the building of platforms and infrastructures isn’t neutral, and that it’s incumbent upon designers, developers, and researchers to raise issues related to social concerns and potential inequalities related to adoption and appropriation by others.

Concluding Thoughts

This work isn’t meant to be predictive. The fictions and analysis present our specific viewpoints by focusing on several types of everyday experiences. One can read many themes into our fictions, and we encourage others to do so. But we find that focusing on potential adoptions of an emerging technology in the everyday and mundane helps surface contours of debates that might occur, which might not be immediately obvious when thinking about BCIs – and might not be immediately obvious if we think about social implications in terms of “worst case scenarios” or dystopias. We hope that this work can raise awareness among BCI researchers and designers about social responsibilities they may have for their technology’s adoption and use. In future work, we plan to use these fictions as research probes to understand how technical researchers envision BCI adoptions and their social responsibilities, building on some of our prior projects. And for design researchers, we show that using a fictional platform in design fiction can help raise important social issues about technology adoption and use from multiple perspectives beyond those of end-users, and help surface issues that might arise from unintended or unexpected adoption and use. Using design fiction to interrogate sociotechnical issues present in the everyday can better help us think about the futures we desire.


Crossposted with the UC Berkeley BioSENSE Blog

by Richmond at July 10, 2018 09:56 PM

Ph.D. student

The California Consumer Privacy Act of 2018: a deep dive

I have given the California Consumer Privacy Act of 2018 a close read.

In summary, the act grants consumers a right to request that businesses disclose the categories of information about them that they collect and sell, and gives consumers the right to request that businesses delete their information and to opt out of its sale.

What follows are points I found particularly interesting. Quotations from the Act (that’s what I’ll call it) will be in bold. Questions (meaning, questions that I don’t have an answer to at the time of writing) will be in italics.

Privacy rights

SEC. 2. The Legislature finds and declares that:
(a) In 1972, California voters amended the California Constitution to include the right of privacy among the “inalienable” rights of all people. …

I did not know that. I was under the impression that in the United States, the ‘right to privacy’ was a matter of legal interpretation, derived from other more explicitly protected rights. A right to privacy is enumerated in Article 12 of the Universal Declaration of Human Rights, adopted in 1948 by the United Nations General Assembly. There’s something like a right to privacy in Article 8 of the 1950 European Convention on Human Rights. California appears to have followed their lead on this.

In several places in the Act, it specifies that exceptions may be made in order to be compliant with federal law. Is there an ideological or legal disconnect between privacy in California and privacy nationally? Consider the Snowden/Schrems/Privacy Shield issue: exchanges of European data to the United States are given protections from federal surveillance practices. This presumably means that the U.S. federal government agrees to respect EU privacy rights. Can California negotiate for such treatment from the U.S. government?

These are the rights specifically granted by the Act:

[SEC. 2.] (i) Therefore, it is the intent of the Legislature to further Californians’ right to privacy by giving consumers an effective way to control their personal information, by ensuring the following rights:

(1) The right of Californians to know what personal information is being collected about them.

(2) The right of Californians to know whether their personal information is sold or disclosed and to whom.

(3) The right of Californians to say no to the sale of personal information.

(4) The right of Californians to access their personal information.

(5) The right of Californians to equal service and price, even if they exercise their privacy rights.

It has been only recently that I’ve been attuned to the idea of privacy rights. Perhaps this is because I am from a place that apparently does not have them. A comparison that I believe should be made more often is the comparison of privacy rights to property rights. Clearly privacy rights have become as economically relevant as property rights. But currently, property rights enjoy a widespread acceptance and enforcement that privacy rights do not.

Personal information defined through example categories

“Information” is a notoriously difficult thing to define. The Act gets around the problem of defining “personal information” by repeatedly providing many examples of it. The examples are themselves rather abstract and are implicitly “categories” of personal information. Categorization of personal information is important to the law because under several conditions businesses must disclose the categories of personal information collected, sold, etc. to consumers.

SEC. 2. (e) Many businesses collect personal information from California consumers. They may know where a consumer lives and how many children a consumer has, how fast a consumer drives, a consumer’s personality, sleep habits, biometric and health information, financial information, precise geolocation information, and social networks, to name a few categories.

[1798.140.] (o) (1) “Personal information” means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household. Personal information includes, but is not limited to, the following:

(A) Identifiers such as a real name, alias, postal address, unique personal identifier, online identifier Internet Protocol address, email address, account name, social security number, driver’s license number, passport number, or other similar identifiers.

(B) Any categories of personal information described in subdivision (e) of Section 1798.80.

(C) Characteristics of protected classifications under California or federal law.

(D) Commercial information, including records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies.

Note that protected classifications (1798.140.(o)(1)(C)) include race, which is a socially constructed category (see Omi and Winant on racial formation). The Act appears to be saying that personal information includes the race of the consumer. Contrast this with information as identifiers (see 1798.140.(o)(1)(A)) and information as records (1798.140.(o)(1)(D)). So “personal information” in one case is the property of a person (and a socially constructed one at that); in another case it is a specific syntactic form; in another case it is a document representing some past action. The Act is very ontologically confused.

Other categories of personal information include (continuing this last section):


(E) Biometric information.

(F) Internet or other electronic network activity information, including, but not limited to, browsing history, search history, and information regarding a consumer’s interaction with an Internet Web site, application, or advertisement.

Devices and Internet activity will be discussed in more depth in the next section.


(G) Geolocation data.

(H) Audio, electronic, visual, thermal, olfactory, or similar information.

(I) Professional or employment-related information.

(J) Education information, defined as information that is not publicly available personally identifiable information as defined in the Family Educational Rights and Privacy Act (20 U.S.C. section 1232g, 34 C.F.R. Part 99).

(K) Inferences drawn from any of the information identified in this subdivision to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, preferences, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes.

Given that the main use of information is to support inferences, it is notable that inferences are dealt with here as a special category of information, and that sensitive inferences are those that pertain to behavior and psychology. This may be narrowly interpreted to exclude some kinds of inferences that may be relevant and valuable but not so immediately recognizable as ‘personal’. For example, one could infer from personal information the ‘position’ of a person in an arbitrary multi-dimensional space that compresses everything known about a consumer, and use this representation for targeted interventions (such as advertising). Or one could interpret it broadly: since almost all personal information is relevant to ‘behavior’ in a broad sense, any inference from it is also ‘about behavior’, and therefore protected.

Device behavior

The Act focuses on the rights of consumers and deals somewhat awkwardly with the fact that most information about consumers is collected indirectly, through machines. The Act acknowledges that sometimes devices are used by more than one person (for example, when they are used by a family), but it does not deal easily with other forms of sharing arrangements (e.g., an open Wifi hotspot) and the problems associated with identifying which person a particular device’s activity is “about”.

[1798.140.] (g) “Consumer” means a natural person who is a California resident, as defined in Section 17014 of Title 18 of the California Code of Regulations, as that section read on September 1, 2017, however identified, including by any unique identifier. [SB: italics mine.]

[1798.140.] (x) “Unique identifier” or “Unique personal identifier” means a persistent identifier that can be used to recognize a consumer, a family, or a device that is linked to a consumer or family, over time and across different services, including, but not limited to, a device identifier; an Internet Protocol address; cookies, beacons, pixel tags, mobile ad identifiers, or similar technology; customer number, unique pseudonym, or user alias; telephone numbers, or other forms of persistent or probabilistic identifiers that can be used to identify a particular consumer or device. For purposes of this subdivision, “family” means a custodial parent or guardian and any minor children over which the parent or guardian has custody.

Suppose you are a business that collects traffic information and website behavior connected to IP addresses, but you don’t go through the effort of identifying the ‘consumer’ who is doing the behavior. In fact, you may collect a lot of traffic behavior that is not connected to any particular ‘consumer’ at all, but is rather the activity of a bot or crawler operated by a business. Are you on the hook to disclose personal information to consumers if they ask for their traffic activity? What if they do, or do not, provide their IP address?

Incidentally, while the Act seems comfortable defining a Consumer as a natural person identified by a machine address, it also happily defines a Person as “proprietorship, firm, partnership, joint venture, syndicate, business trust, company, corporation, …” etc. in addition to “an individual”. Note that “personal information” is specifically information about a consumer, not a Person (i.e., business).

This may make you wonder what a Business is, since these are the entities that are bound by the Act.

Businesses and California

The Act mainly details the rights that consumers have with respect to businesses that collect, sell, or lose their information. But what is a business?

[1798.140.] (c) “Business” means:
(1) A sole proprietorship, partnership, limited liability company, corporation, association, or other legal entity that is organized or operated for the profit or financial benefit of its shareholders or other owners, that collects consumers’ personal information, or on the behalf of which such information is collected and that alone, or jointly with others, determines the purposes and means of the processing of consumers’ personal information, that does business in the State of California, and that satisfies one or more of the following thresholds:

(A) Has annual gross revenues in excess of twenty-five million dollars ($25,000,000), as adjusted pursuant to paragraph (5) of subdivision (a) of Section 1798.185.

(B) Alone or in combination, annually buys, receives for the business’ commercial purposes, sells, or shares for commercial purposes, alone or in combination, the personal information of 50,000 or more consumers, households, or devices.

(C) Derives 50 percent or more of its annual revenues from selling consumers’ personal information.

This is not a generic definition of a business, just as the earlier definition of ‘consumer’ is not a generic definition of consumer. This definition of ‘business’ is a sui generis definition for the purposes of consumer privacy protection, as it defines businesses in terms of their collection and use of personal information. The definition explicitly thresholds the applicability of the law to businesses over certain limits.
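
Read literally, the covered-business test is a disjunction of three thresholds. Here is a tiny sketch of my own of that reading (the function and parameter names are mine, and it glosses over the ambiguity discussed below about whether the 50,000 count is limited to Californians):

```python
def is_covered_business(annual_gross_revenue_usd,
                        consumers_households_devices_per_year,
                        share_of_revenue_from_selling_personal_info):
    """A for-profit entity doing business in California is covered if it meets
    ANY one of the three thresholds in 1798.140(c)(1)."""
    return (
        annual_gross_revenue_usd > 25_000_000                      # threshold (A)
        or consumers_households_devices_per_year >= 50_000         # threshold (B)
        or share_of_revenue_from_selling_personal_info >= 0.5      # threshold (C)
    )

# e.g., a small analytics firm with $2M in revenue but data on 60,000 devices:
print(is_covered_business(2_000_000, 60_000, 0.10))  # True, via threshold (B)
```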

There does appear to be a lot of wiggle room and potential for abuse here. Consider: the Mirai botnet had, by one estimate, 2.5 million devices compromised. Say you are a small business that collects site traffic. Suppose the Mirai botnet targets your site with a DDoS attack. Suddenly, your business collects information on millions of devices, and the Act comes into effect. Now you are liable for disclosing consumer information. Is that right?

An alternative reading of this section would recall that the definition (!) of consumer, in this law, is a California resident. So maybe the thresholds in 1798.140.(c)(B) and 1798.140.(c)(C) refer specifically to Californian consumers. Of course, for any particular device, information about where that device’s owner lives is personal information.

Having 50,000 California customers or users is a decent threshold for defining whether or not a business “does business in California”. Given the size and demographics of California, you would expect many of the major Chinese technology companies, Tencent for example, to have 50,000 Californian users. This brings up the question of extraterritorial enforcement, which gave the GDPR so much leverage.

Extraterritoriality and financing

In a nutshell, it looks like the Act is intended to allow Californians to sue foreign companies. How big a deal is this? The penalties for noncompliance are civil penalties and a price per violation (presumably individual violation), not a ratio of profit, but you could imagine them adding up:

[1798.155.] (b) Notwithstanding Section 17206 of the Business and Professions Code, any person, business, or service provider that intentionally violates this title may be liable for a civil penalty of up to seven thousand five hundred dollars ($7,500) for each violation.

(c) Notwithstanding Section 17206 of the Business and Professions Code, any civil penalty assessed pursuant to Section 17206 for a violation of this title, and the proceeds of any settlement of an action brought pursuant to subdivision (a), shall be allocated as follows:

(1) Twenty percent to the Consumer Privacy Fund, created within the General Fund pursuant to subdivision (a) of Section 1798.109, with the intent to fully offset any costs incurred by the state courts and the Attorney General in connection with this title.

(2) Eighty percent to the jurisdiction on whose behalf the action leading to the civil penalty was brought.

(d) It is the intent of the Legislature that the percentages specified in subdivision (c) be adjusted as necessary to ensure that any civil penalties assessed for a violation of this title fully offset any costs incurred by the state courts and the Attorney General in connection with this title, including a sufficient amount to cover any deficit from a prior fiscal year.

1798.160. (a) A special fund to be known as the “Consumer Privacy Fund” is hereby created within the General Fund in the State Treasury, and is available upon appropriation by the Legislature to offset any costs incurred by the state courts in connection with actions brought to enforce this title and any costs incurred by the Attorney General in carrying out the Attorney General’s duties under this title.

(b) Funds transferred to the Consumer Privacy Fund shall be used exclusively to offset any costs incurred by the state courts and the Attorney General in connection with this title. These funds shall not be subject to appropriation or transfer by the Legislature for any other purpose, unless the Director of Finance determines that the funds are in excess of the funding needed to fully offset the costs incurred by the state courts and the Attorney General in connection with this title, in which case the Legislature may appropriate excess funds for other purposes.

So, just to be concrete: suppose a business collects personal information on 50,000 Californians and does not disclose that information. California could then sue that business for $7,500 * 50,000 = $375 million in civil penalties, that then goes into the Consumer Privacy Fund, whose purpose is to cover the cost of further lawsuits. The process funds itself. If it makes any extra money, it can be appropriated for other things.

Meaning, I guess this Act essentially funds a self-sustaining stream of investigations and fines. You could imagine that this starts out with just some lawyers responding to civil complaints. But consider the scope of the Act, and how it means that any business in the world not properly disclosing information about Californians is liable to be fined. Suppose that some kind of blockchain- or botnet-based entity starts committing surveillance in violation of this Act on a large scale. What kind of technical investigative capacity is necessary to enforce this kind of thing worldwide? Does this become a self-funding cybercrime investigative unit? How are foreign actors who are responsible for such things brought to justice?

This is where it’s totally clear that I am not a lawyer. I am still puzzling over the meaning of 1798.155(c)(2), for example.

“Publicly available information”

There are more weird quirks to this Act than I can dig into in this post, but one that deserves mention (as homage to Helen Nissenbaum, among other reasons) is the stipulation about publicly available information, which does not mean what you think it means:

(2) “Personal information” does not include publicly available information. For these purposes, “publicly available” means information that is lawfully made available from federal, state, or local government records, if any conditions associated with such information. “Publicly available” does not mean biometric information collected by a business about a consumer without the consumer’s knowledge. Information is not “publicly available” if that data is used for a purpose that is not compatible with the purpose for which the data is maintained and made available in the government records or for which it is publicly maintained. “Publicly available” does not include consumer information that is deidentified or aggregate consumer information.

The grammatical error in the second sentence (the phrase beginning with “if any conditions” trails off into nowhere…) indicates that this paragraph was hastily written and never finished, as if in response to an afterthought. There’s a lot going on here.

First, the sense of ‘public’ used here is the sense of ‘public institutions’ or the res publica. Amazingly and a bit implausibly, government records are considered publicly available only when they are used for purposes compatible with their maintenance. So if a business takes a public record and uses it differently than was originally intended when it was ‘made available’, it becomes personal information that must be disclosed? As somebody who came out of the Open Data movement, I have to admit I find this baffling. On the other hand, it may be the brilliant solution to privacy in public on the Internet that society has been looking for.

Second, the stipulation that “publicly available” does not mean “biometric information collected by a business about a consumer without the consumer’s knowledge” is surprising. It appears to be written with particular cases in mind–perhaps IoT sensing. But why specifically biometric information, as opposed to other kinds of information collected without consumer knowledge?

There is a lot going on in this paragraph. Oddly, it is not one of the ones explicitly flagged for review and revision in the section soliciting public participation on changes before the Act goes into effect in 2020.

A work in progress

1798.185. (a) On or before January 1, 2020, the Attorney General shall solicit broad public participation to adopt regulations to further the purposes of this title, including, but not limited to, the following areas:

This is a weird law. I suppose it was written and passed to capitalize on a particular political moment and crisis (Sec. 2 specifically mentions Cambridge Analytica as a motivation), drafted to best express its purpose and intent, and given the horizon of 2020 to allow for revisions.

It must be said that there’s nothing in this Act that threatens the business models of American Big Tech companies in any way, since storing consumer information in order to provide derivative ad targeting services is totally fine as long as businesses make the right disclosures, which they are now all doing because of GDPR anyway. There is a sense that this is California taking the opportunity to start the conversation about what U.S. data protection law post-GDPR will look like, which is of course commendable. As a statement of intent, it is great. Where it starts to get funky is in the definitions of its key terms and the underlying theory of privacy behind them. We can anticipate some rockiness there and try to unpack these assumptions before similar policies are adopted in other states.

by Sebastian Benthall at July 10, 2018 03:16 PM

July 09, 2018

Ph.D. student

some moral dilemmas

Here are some moral dilemmas:

  • A firm basis for morality is the Kantian categorical imperative: treat others as ends and not means, with the corollary that one should be able to take the principles of one’s actions and extend them as laws binding all rational beings. Closely associated and important ideas are those concerned with human dignity and rights. However, the great moral issues of today are about social forms (issues around race, gender, etc.), sociotechnical organizations (issues around the role of technology), or totalizing systemic issues (issues around climate change). Morality based on individualism and individual equivalence seems out of place when the main moral difficulties are about body agonism. What is the basis for morality for these kinds of social moral problems?
  • Theodicy has its answer: it’s bounded rationality. Ultimately what makes us different from other people, that which creates our multiplicity, is our distance from each other, in terms of available information. Our disconnection, based on the different loci and foci within complex reality, is precisely that which gives reality its complexity. Dealing with each other’s ignorance is the problem of being a social being. Ignorance is therefore the condition of society. Society is the condition of moral behavior; if there were only one person, there would be no such thing as right or wrong. Therefore, ignorance is a condition of morality. How, then, can morality be known?

by Sebastian Benthall at July 09, 2018 01:53 PM

July 06, 2018

Ph.D. student

On “Racialization” (Omi and Winant, 2014)

Notes on Omi and Winant, 2014, Chapter 4, Section: “Racialization”.

Summary

Race is often seen as either an objective category, or an illusory one.

Viewed objectively, it is seen as a biological property, tied to phenotypic markers and possibly other genetic traits. It is viewed as an ‘essence’. Omi and Winant argue that the concept of ‘mixed-race’ depends on this kind of essentialism, as it implies a kind of blending of essences. This is the view associated with “scientific” racism, most prevalent in the prewar era.

Viewed as an illusion, race is seen as an ideological construct: an epiphenomenon of culture, class, or peoplehood, formed as a kind of “false consciousness”, in the Marxist terminology. This view is associated with certain critics of affirmative action who argue that any racial classification is inherently racist.

Omi and Winant are critical of both perspectives, and argue for an understanding of race as socially real and grounded non-reducibly in phenomic markers but ultimately significant because of the social conflicts and interests constructed around those markers.

They define race as: “a concept that signifies and symbolizes social conflicts and interests by referring to different types of human bodies.”

The visual aspect of race is irreducible, and becomes significant when, for example, it becomes “understood as a manifestation of more profound differences that are situated within racially identified persons: intelligence, athletic ability, temperament, and sexuality, among other traits.” These “understandings”, which it must be said may be fallacious, “become the basis to justify or reinforce social differentiation.”

This process of adding social significance to phenomic markers is, in O&W’s language, racialization, which they define as “the extension of racial meanings to a previously racially unclassified relationship, social practice, or group.” They argue that racialization happens at both macro and micro scales, ranging from the consolidation of the world-system through colonialization to incidents of racial profiling.

Race, then, is a concept that refers to different kinds of bodies by phenotype and the meanings and social practices ascribed to them. When racial concepts are circulated and accepted as ‘social reality’, racial differences are no longer dependent on visual difference alone, but take on a life of their own.

Omi and Winant therefore take a nuanced view of what it means for a category to be socially constructed, and it is a view that has concrete political implications. They consider the question, raised frequently, as to whether “we” can “get past” race, or go beyond it somehow. (Recall that this edition of the book was written during the Obama administration and is largely a critique of the idea, which seems silly now, that his election made the United States “post-racial”).

Omi and Winant see this framing as unrealistically utopian and based on the extreme view that race is “illusory”. It poses race as a problem, a misconception of the past. A more effective position, they claim, would note that race is an element of social structure, not an irregularity in it. “We” cannot naively “get past it”, but also “we” do not need to accept the erroneous conclusion that race is a fixed biological given.

Comments

Omi and Winant’s argument here is mainly one about the ontology of social forms. In my view, this question of social form ontology is one of the “hard problems” remaining in philosophy, perhaps equivalent to if not more difficult than the hard problem of consciousness. So no wonder it is such a fraught issue.

The two poles of thinking about race that they present initially, the essentialist view and the epiphenomenal view, had their heyday in particular historical intellectual movements. Proponents of these positions are still active in popular discourse today, though perhaps it’s fair to say that both extremes are now marginalized out of the intellectual mainstream. Despite nobody really understanding how social construction works, most educated people are probably willing to accept that race is socially constructed in one way or another.

It is striking, then, that Omi and Winant’s view of the mechanism of racialization, which involves the reading of ‘deeper meanings’ into phenomic traits, is essentially a throwback to the objective, essentializing viewpoint. Perhaps there is a kind of cognitive bias, maybe representativeness bias or fundamental attribution bias, which is responsible for the cognitive errors that make racialization possible and persistent.

If so, then the social construction of race would be due as much to the limits of human cognition as to the circulation of concepts. That would explain the temptation to believe that we can ‘get past’ race, because we can always believe in the potential for a society in which people are smarter and are trained out of their basic biases. But Omi and Winant would argue that this is utopian. Perhaps the wisdom of sociology and social science in general is the conservative recognition of the widespread implications of human limitation. As a social expert, one can take the privileged position of noting that social structure is the result of pervasive cognitive error. That pervasive cognitive error is perhaps a more powerful force than the forces developing and propagating social expertise. Whether it is or is not may be the existential question for liberal democracy.

An unanswered question at this point is whether, if race were broadly understood as a function of social structure, it would remain as forceful a structuring element as it is when understood as a biological essence. It is certainly possible that, if understood as socially contingent, the structural power of race will steadily erode through such statistical processes as regression to the mean. In terms of physics, we can ask whether the current state of the human race(s) is at equilibrium, or heading towards an equilibrium, or diverging in a chaotic and path-dependent way. In any of these cases, there is possibly a role to be played by technical infrastructure. In other words, there are many very substantive and difficult social scientific questions at the root of the question of whether and how technical infrastructure plays a role in the social reproduction of race.

by Sebastian Benthall at July 06, 2018 05:40 PM

July 02, 2018

Ph.D. student

“The Theory of Racial Formation”: notes, part 1 (Cha. 4, Omi and Winant, 2014)

Chapter 4 of Omi and Winant (2014) is “The Theory of Racial Formation”. It is where they lay out their theory of race and its formation, synthesizing and improving on theories of race as ethnicity, race as class, and race as nation that they consider earlier in the book.

This rhetorical strategy of presenting the historical development of multiple threads of prior theory before synthesizing them into something new is familiar to me from my work with Helen Nissenbaum on Contextual Integrity. CI is a theory of privacy that advances prior legal and social theories by teasing out their tensions. This seems to be a good way to advance theory through scholarship. It is interesting that the same method of theory building can work in multiple fields. My sense is that what’s going on is that there is an underlying logic to this process which in a less Anglophone world we might call “dialectical”. But I digress.

I have not finished Chapter 4 yet but I wanted to sketch out the outline of it before going into detail. That’s because what Omi and Winant are presenting is a way of understanding the mechanisms behind the reproduction of race that is not simplistically “systemic” but rather breaks the process down into discrete operations. This is a helpful contribution; even if the theory is not entirely accurate, its very specificity elevates the discourse.

So, in brief notes:

For Omi and Winant, race is a way of “making up people”; they attribute this phrase to Ian Hacking but do not develop Hacking’s definition. Their reference to a philosopher of science does situate them in a scholarly sense; it is nice that they seem to acknowledge an implicit hierarchy of theory that places philosophy at the foundation. This is correct.

Race-making is a form of othering, of having a group of people identify another group as outsiders. Othering is a basic and perhaps unavoidable human psychological function; their reference for this is powell and Menendian. (Apparently, john a. powell is one of those people, like danah boyd, who decapitalizes their name.)

Race is of course a social construct that is neither a fixed and coherent category nor something that is “unreal”. That is, presumably, why we need a whole book on the dynamic mechanisms that form it. One reason why race is such a dynamic concept is because (a) it is a way of organizing inequality in society, (b) the people on “top” of the hierarchy implied by racial categories enforce/reproduce that category “downwards”, (c) the people on the “bottom” of the hierarchy implied by racial categories also enforce/reproduce a variation of those categories “upwards” as a form of resistance, and so (d) the state of the racial categories at any particular time is a temporary consequence of conflicting “elite” and “street” variations of it.

This presumes that race is fundamentally about inequality. Omi and Winant believe it is. In fact, they think racial categories are a template for all other social categories that are about inequality. This is what they mean by their claim that race is a master category. It’s “a frame used for organizing all manner of political thought”, particularly political thought about liberation struggles.

I’m not convinced by this point. They develop it with a long discussion of intersectionality that is also unconvincing to me. Historically, they point out that sometimes women’s movements have allied with black power movements, and sometimes they haven’t. They want the reader to think this is interesting; as a data scientist, I see randomness and lack of correlation. They make the poignant and true point that “perhaps at the core of intersectionality practice, as well as theory, is the ‘mixed race’ category. Well, how does it come about that people can be ‘mixed’?” They then drop the point with no further discussion.

Perhaps the book suffers from its being aimed at undergraduates. Omi and Winant are unable to bring up even the most basic explanation for why there are mixed race people: a male person of one race and a female person of a different race have a baby, and that creates a mixed race person (whether or not they are male or female). The basic fact that race is hereditary whereas sex is not is probably really important to the intersectionality between race and sex and the different ways those categories are formed; somehow this point is never mentioned in discussions of intersectionality. Perhaps this is because of the ways this salient difference between race and sex undermines the aim of political solidarity that so much intersectional analysis seems to be going for. Relatedly, contemporary sociological theory seems to have some trouble grasping conventional sexual reproduction, perhaps because it is so sensitized to all the exceptions to it. Still, they drop the ball a bit by bringing this up and not going into any analytic depth about it at all.

Omi and Winant make an intriguing comment, “In legal theory, the sexual contract and racial contract have often been compared”. I don’t know what this is about but I want to know more.

This is all a kind of preamble to their presentation of theory. They start to provide some definitions:

racial formation
The sociohistorical process by which racial identities are created, lived out, transformed, and destroyed.
racialization
How phenomic-corporeal dimensions of bodies acquire meaning in social life.
racial projects
The co-constitutive ways that racial meanings are translated into social structures and become racially signified.
racism
Not defined. A property of racial projects that Omi and Winant will discuss later.
racial politics
Ways that the politics (of a state?) can handle race, including racial despotism, racial democracy, and racial hegemony.

This is a useful breakdown. More detail in the next post.

by Sebastian Benthall at July 02, 2018 03:36 PM

June 27, 2018

MIMS 2012

Notes from “Good Strategy / Bad Strategy”

Strategy has always been a fuzzy concept in my mind. What goes into a strategy? What makes a strategy good or bad? How is it different from vision and goals? Good Strategy / Bad Strategy, by UCLA Anderson School of Management professor Richard P. Rumelt, takes a nebulous concept and makes it concrete. He explains what goes into developing a strategy, what makes a strategy good, and what makes a strategy bad – which makes good strategy even clearer.

As I read the book, I kept underlining passages and scribbling notes in the margins because it’s so full of good information and useful techniques that are just as applicable to my everyday work as they are to running a multi-national corporation. To help me use the concepts I learned, I decided to publish my notes and key takeaways so I can refer back to them later.

The Kernel of Strategy

Strategy is designing a way to deal with a challenge. A good strategy, therefore, must identify the challenge to be overcome, and design a way to overcome it. To do that, the kernel of a good strategy contains three elements: a diagnosis, a guiding policy, and coherent action.

  • A diagnosis defines the challenge. What’s holding you back from reaching your goals? A good diagnosis simplifies the often overwhelming complexity of reality down to a simpler story by identifying certain aspects of the situation as critical. A good diagnosis often uses a metaphor, analogy, or an existing accepted framework to make it simple and understandable, which then suggests a domain of action.
  • A guiding policy is an overall approach chosen to cope with or overcome the obstacles identified in the diagnosis. Like the guardrails on a highway, the guiding policy directs and constrains action in certain directions without defining exactly what shall be done.
  • A set of coherent actions dictate how the guiding policy will be carried out. The actions should be coherent, meaning the use of resources, policies, and maneuvers that are undertaken should be coordinated and support each other (not fight each other, or be independent from one another).

Good Strategy vs. Bad Strategy

  • Good strategy is simple and obvious.
  • Good strategy identifies the key challenge to overcome. Bad strategy fails to identify the nature of the challenge. If you don’t know what the problem is, you can’t evaluate alternative guiding policies or actions to take, and you can’t adjust your strategy as you learn more over time.
  • Good strategy includes actions to take to overcome the challenge. Actions are not “implementation” details; they are the punch in the strategy. Strategy is about how an organization will move forward. Bad strategy lacks actions to take. Bad strategy mistakes goals, ambition, vision, values, and effort for strategy (these things are important, but on their own are not strategy).
  • Good strategy is designed to be coherent – all the actions an organization takes should reinforce and support each other. Leaders must do this deliberately and coordinate action across departments. Bad strategy is just a list of “priorities” that don’t support each other, at best, or actively conflict with each other, undermine each other, and fight for resources, at worst. The rich and powerful can get away with this, but it makes for bad strategy.
    • This was the biggest “ah-ha!” moment for me. All strategy I’ve seen has just been a list of unconnected objectives. Designing a strategy that’s coherent and mutually reinforces itself is a huge step forward in crafting good strategies.
  • Good strategy is about focusing and coordinating efforts to achieve an outcome, which necessarily means saying “No” to some goals, initiatives, and people. Bad strategy is the result of a leader who’s unwilling or unable to say “No.” The reason good strategy looks so simple is because it takes a lot of effort to maintain the coherence of its design by saying “No” to people.
  • Good strategy leverages sources of power to overcome an obstacle. It brings relative strength to bear against relative weakness (more on that below).

How to Identify Bad Strategy

Four Major Hallmarks of Bad Strategy

  • Fluff: A strategy written in gibberish masquerading as strategic concepts is classic bad strategy. It uses abstruse and inflated words to create the illusion of high-level thinking.
  • Failure to face the challenge: A strategy that does not define the challenge to overcome makes it impossible to evaluate, and impossible to improve.
  • Mistaking goals for strategy: Many bad strategies are just statements of desire rather than plans for overcoming obstacles.
  • Bad strategic objectives: A strategic objective is a means to overcoming an obstacle. Strategic objectives are “bad” when they fail to address critical issues or when they are impracticable.

Some Forms of Bad Strategy

  • Dog’s Dinner Objectives: A long list of “things to do,” often mislabeled as “strategies” or “objectives.” These lists usually grow out of planning meetings in which stakeholders state what they would like to accomplish, then they throw these initiatives onto a long list called the “strategic plan” so that no one’s feelings get hurt, and they apply the label “long-term” so that none of them need be done today.
    • In tech-land, I see a lot of companies conflate OKRs (Objectives and Key Results) with strategy. OKRs are an exercise in goal setting and measuring progress towards those goals (which is important), but they don’t replace strategy work. The process typically looks like this: once a year, each department head is asked to come up with their own departmental OKRs, which are supposed to be connected to company goals (increase revenue, decrease costs, etc.). Then each department breaks down their OKRs into sub-OKRs for their teams to carry out, which are then broken down into sub-sub-OKRs for sub-teams and/or specific people, and so on down the chain. This process just perpetuates departmental silos, and the resulting OKRs are rarely cohesive or mutually supportive of each other (if this does happen, it’s usually a happy accident). Department and team leaders often throw dependencies on other departments and teams, which creates extra work those teams often haven’t planned for and that isn’t connected to their own OKRs, which drags down the efficiency and effectiveness of the entire organization. It’s easy for leaders to underestimate this drag since it’s hard to measure, and what isn’t measured isn’t managed.
    • As this book makes clear, setting objectives is not the same as creating a strategy to reach those goals. You still need to do the hard strategy work of making a diagnosis of what obstacle is holding you back, creating a guiding policy for overcoming the obstacle, and breaking that down into coherent actions for the company to take (which shouldn’t be based on what departments, people, or expertise you already have; instead, look at what competencies you need to carry out your strategy, apply existing teams and people to them where they exist, hire where you’re missing expertise, and get rid of competencies that are no longer needed in the strategy). OKRs can be applied at the top layer as company goals to reach, then applied again to the coherent actions (i.e. what’s the objective of each action, and how will you know if you reached it?), and further broken down for teams and people as needed. You still need an actual strategy before you can set OKRs, but most companies conflate the two.
  • Blue Sky Objectives: A blue-sky objective is a simple restatement of the desired state of affairs or of the challenge. It skips over the annoying fact that no one has a clue as to how to get there.
    • For example, “underperformance” isn’t a challenge, it’s a result. It’s a restatement of a goal. The true challenge lies in the reasons for the underperformance. Unless leadership offers a theory of why things haven’t worked in the past (a.k.a. a diagnosis), or why the challenge is difficult, it is hard to generate good strategy.
  • The Unwillingness or Inability to Choose: Any strategy that has universal buy-in signals the absence of choice. Because strategy focuses resources, energy, and attention on some objectives rather than others, a change in strategy will make some people worse off and there will be powerful forces opposed to almost any change in strategy (e.g. a department head who faces losing people, funding, headcount, support, etc., as a result of a change in strategy will most likely be opposed to the change). Therefore, strategy that has universal buy-in often indicates a leader who was unwilling to make a difficult choice as to the guiding policy and actions to take to overcome the obstacles.
    • This is true, but there are ways of mitigating this that he doesn’t discuss, which I talk about in the “Closing Thoughts” section below.
  • Template-style “strategic planning:” Many strategies are developed by following a template of what a “strategy” should look like. Since strategy is somewhat nebulous, leaders are quick to adopt a template they can fill in since they have no other frame of reference for what goes into a strategy.
    • These templates usually take this form:
      • The Vision: Your unique vision of what the org will be like in the future. Often starts with “the best” or “the leading.”
      • The Mission: High-sounding politically correct statement of the purpose of the org.
      • The Values: The company’s values. Make sure they are non-controversial.
      • The Strategies: Fill in some aspirations/goals but call them strategies.
    • This template-style strategy skips over the hard work of identifying the key challenge to overcome, and setting out a guiding policy and actions to overcome the obstacle. It mistakes pious statements of the obvious as if they were decisive insights. The vision, mission, and goals are usually statements that no one would argue against, but that no one is inspired by, either.
    • I found myself alternating between laughing and shaking my head in disbelief because this section is so on the nose.
  • New Thought: This is the belief that you only need to envision success to achieve it, and that thinking about failure will lead to failure. The problem with this belief is that strategy requires you to analyze the situation to understand the problem to be solved, as well as anticipating the actions/reactions of customers and competitors, which requires considering both positive and negative outcomes. Ignoring negative outcomes does not set you up for success or prepare you for the unthinkable to happen. It crowds out critical thinking.

Sources of Power

Good strategy will leverage one or more sources of power to overcome the key obstacles. Rumelt describes a number of sources of power, but the list is not exhaustive:

  • Leverage: Leverage is finding an imbalance in a situation, and exploiting it to produce a disproportionately large payoff. Or, in resource constrained situations (e.g. a startup), it’s using the limited resources at hand to achieve the biggest result (i.e. not trying to do everything at once). Strategic leverage arises from a mixture of anticipating the actions and reactions of competitors and buyers, identifying a pivot point that will magnify the effects of focused effort (e.g. an unmet need of people, an underserved market, your relative strengths/weaknesses, a competence you’ve developed that can be applied to a new context, and so on), and making a concentrated application of effort on only the most critical objectives to get there.
    • This is a lesson in constraints – a company that isn’t rich in resources (i.e. money, people) is forced to find a sustainable business model and strategy, or perish. I see startups avoid making hard choices about what objectives to pursue by taking investor money to hire their way out of deciding what not to do. They can avoid designing a strategy by just throwing spaghetti at the wall and hoping something sticks, and if it doesn’t, going back to the investors for more handouts. “Fail fast,” “Ready, fire, aim,” “Move fast and break things,” etc., are all Silicon Valley versions of this thinking worshiped by the industry. If a company is resource constrained, it’s forced to find a sustainable business model and strategy sooner. VC money has a habit of making companies lazy when it comes to the business fundamentals of strategy and turning a profit.
  • Proximate Objectives: Choose an objective that is close enough at hand to be feasible, i.e. proximate. This doesn’t mean your goal needs to lack ambition, or be easy to reach, or that you’re sandbagging. Rather, you should know enough about the nature of the challenge that the sub-problems to work through are solvable, and it’s a matter of focusing individual minds and energy on the right areas to reach an otherwise unreachable goal. For example, landing a man on the moon by 1969 was a proximate objective because Kennedy knew the technology and science necessary was within reach, and it was a matter of allocating, focusing, and coordinating resources properly.
  • Chain-link Systems: A system has chain-link logic when its performance is limited by its weakest link. In a business context, this typically means each department is dependent on the other such that if one department underperforms, the performance of the entire system will decline. In a strategic setting, this can cause organizations to become stuck, meaning the chain is not made stronger by strengthening one link – you must strengthen the whole chain (and thus becoming un-stuck is its own strategic challenge to overcome). On the flip side, if you design a chain link system, then you can achieve a level of excellence that’s hard for competitors to replicate. For example, IKEA designs its own furniture, builds its own stores, and manages the entire supply chain, which allows it to have lower costs and a superior customer experience. Their system is chain-linked together such that it’s hard for competitors to replicate it without replicating the entire system. IKEA is susceptible to getting stuck, however, if one link of its chain suffers.
  • Design: Good strategy is design – fitting various pieces together so they work as a coherent whole. Creating a guiding policy and actions that are coherent is a source of power since so few companies do this well. As stated above, a lot of strategies aren’t “designed” and instead are just a list of independent or conflicting objectives.
    • The tight integration of a designed strategy comes with a downside, however — it’s narrower in focus, more fragile, and less flexible in responding to change. If you’re a huge company with a lot of resources at your disposal (e.g. Microsoft), a tightly designed strategy could be a hindrance. But in situations where resources are constrained (e.g. a startup grasping for a foothold in the market), or the competitive challenge is high, a well-designed strategy can give you the advantage you need to be successful.
  • Focus: Focus refers to attacking a segment of the market with a product or service that delivers more value to that segment than other players do for the entire market. Doing this requires coordinating policies and objectives across an organization to produce extra power through their interacting and overlapping effects (see design, above), and then applying that power to the right market segment (see leverage, above).
    • This source of power exists in the UX and product world in the form of building for one specific persona who will love your product, capturing a small – but loyal – share of the market, rather than trying to build a product for “everyone” that captures a potentially bigger part of the market but that no one loves or is loyal to (making it susceptible to people switching to competitors). This advice is especially valuable for small companies and startups who are trying to establish themselves.
  • Growth: Growing the size of the business is not a strategy – it is the result of increased demand for your products and services. It is the reward for successful innovation, cleverness, efficiency, and creativity. In business, there is blind faith that growth is good, but that is not the case. Growth itself does not automatically create value.
    • The tech industry has unquestioned faith in growth. VC-backed companies are expected to grow as big as possible, as fast as possible. If you don’t agree, you’re said to lack ambition, and investors won’t fund you. This myth is perpetuated by the tech media. But as Rumelt points out, growth isn’t automatically good. Most companies don’t need to be, and can’t be, as big as companies like Google, Facebook, Apple, and Amazon. Tech companies grow in an artificial way, i.e. spending the money of their investors, not money they’re making from customers. This growth isn’t sustainable, and when they can’t turn a profit they shut down (or get acquired). What could have been a great company, at a smaller size or slower growth rate, now no longer exists. This generally doesn’t harm investors because they only need a handful of big exits out of their entire portfolio, so they pay for the ones that fail off of the profits from the few that actually make it big.
  • Using Advantage: An advantage is the result of differences – an asymmetry between rivals. Knowing your relative strengths and weaknesses, as well as the relative strengths and weaknesses of your competitors, can help you find an advantage. Strengths and weaknesses are “relative” because a strength you have in one context, or against one competitor, may be a weakness in another context, or against a different competitor. You must press where you have advantage and side-step situations in which you do not. You must exploit your rivals’ weaknesses and avoid leading with your own.
    • The most basic advantage is producing at a lower cost than your competitors, or delivering more perceived value than your competitors, or a mix of the two. The difficult part is sustaining an advantage. To do that, you need an “isolating mechanism” that prevents competitors from duplicating it. Isolating mechanisms include patents, reputations, commercial and social relationships, network effects, dramatic economies of scale, and tacit knowledge and skill gained through experience.
    • Once you have an advantage, you should strengthen it by deepening it, broadening it, creating higher demand for your products and services, or strengthening your isolating mechanisms (all explained fully in the book).
  • Dynamics: Dynamics are waves of change that roll through an industry. They are the net result of a myriad of shifts and advances in technology, cost, competition, politics, and buyer perceptions. Such waves of change are largely exogenous – that is, beyond the control of any one organization. If you can see them coming, they are like an earthquake that creates new high ground and levels what had previously been high ground, leaving behind new sources of advantage for you to exploit.
    • There are 5 guideposts to look out for: 1. Rising fixed costs; 2. Deregulation; 3. Predictable Biases; 4. Incumbent Response; and 5. Attractor States (i.e. where an industry “should” go). (All of these are explained fully in the book).
    • Attractor states are especially interesting because he defines them as where an industry “should” end up in the light of technological forces and the structure of demand. By “should,” he means to emphasize an evolution in the direction of efficiency – meeting the needs and demands of buyers as efficiently as possible. They’re different from corporate visions because the attractor state is based on overall efficiency rather than a single company’s desire to capture most of the pie. Attractor states are what pundits and industry analysts write about. There’s no guarantee, however, that the attractor state will ever come to pass. As it relates to strategy, you can anticipate most players to chase the attractor state. This leads many companies to waste resources chasing the wrong vision and to falter as a result (e.g. Cisco rode the wave of “dumb pipes” and “IP everywhere” that AT&T and other telecom companies should have exploited). If you “zig” when other companies “zag”, you can build yourself an advantage.
    • As a strategist, you should seek to do your own analysis of where an industry is going, and create a strategy based on that (rather than what pundits “predict” will happen). Combining your own proprietary knowledge of your customers, technology, and capabilities with industry trends can give you deeper insights that analysts on the outside can’t see. Taking that a step further, you should also look for second-order effects as a result of industry dynamics. For example, the rise of the microprocessor was predicted by many, and largely came true. But what most people didn’t predict was the second-order effect that commoditized microprocessors getting embedded in more products led to increased demand for software, making the ability to write good software a competitive advantage.
  • Inertia: Inertia is an organization’s unwillingness or inability to adapt to changing circumstances. As a strategist, you can exploit this by anticipating that it will take many years for large and well-established competitors to alter their basic functioning. For example, Netflix pushed past Blockbuster because the latter could or would not abandon its focus on retail stores.
  • Entropy: Entropy causes organizations to become less organized and less focused over time. As a strategist, you need to watch out for this in your organization to actively maintain your purpose, form, and methods, even if there are no changes in strategy or competition. You can also use it as a weakness to exploit against your competitors by anticipating that entropy will creep into their business lines. For example, less focused product lines are a sign of entropy. GM’s car lines used to have distinct price points, models, and target buyers, but over time entropy caused each line to creep into each other and overlap, causing declining sales from consumer confusion.

Closing Thoughts

One of the things that surprised me as I read the book is how much overlap there is between doing strategy work and design work – diagnosing the problem, creating multiple potential solutions (i.e. the double diamond), looking at situations from multiple perspectives, weighing tradeoffs in potential solutions, and more. The core of strategy, as he defines it, is identifying and solving problems. Sound familiar? That’s the core of design! He even states, “A master strategist is a designer.”

Rumelt goes on to hold up many examples of winning strategies and advantages from understanding customer needs, behaviors, pain points, and building for a specific customer segment. In other words, doing user-centered design. He doesn’t specifically reference any UX methods, but it was clear to me that the tools of UX work apply to strategy work as well.

The overlap with design doesn’t end there. He has a section about how strategy work is rooted in intuition and subjectivity. There’s no way to prove a strategy is the “best” or “right” one. A strategy is a judgement of a situation and the best path forward. You can say the exact same thing about design as well.

Since a strategy can’t be proven to be right, Rumelt recommends considering a strategy a “hypothesis” that can be tested and refined over time. Leaders should listen for signals that their strategy is or is not working, and make adjustments accordingly. In other words, strategists should iterate on their solutions, same as designers.

Furthermore, this subjectivity causes all kinds of challenges for leaders, such as saying “no” to people, selling people on their version of reality, and so on. He doesn’t talk about how to overcome these challenges, but as I read the book I realized these are issues that designers have to learn how to deal with.

Effective designers have to sell their work to people to get it built. Then they have to be prepared for criticism, feedback, questions, and alternate ideas. Since their work can’t be “proven” to be correct, it’s open to attack from anyone and everyone. If their work gets built and shipped to customers, they still need to be open to it being “wrong” (or at least not perfect), listen to feedback from customers, and iterate further as needed. All of these soft skills are ways of dealing with the problems leaders face when implementing a strategy.

In other words, design work is strategy work. As Rumelt says, “Good strategy is design, and design is about fitting various pieces together so they work as a coherent whole.”


If you enjoyed this post (and I’m assuming you did if you made it this far), then I highly recommend reading the book yourself. I only covered the highlights here, and the book goes into a lot more depth on all of these topics. Enjoy!

by Jeff Zych at June 27, 2018 09:16 PM