
The WGA Strike is a Canary in the Coal Mine for AI Labor Concerns

In a nutshell: Film, television, radio, and media writers are on strike, and their demands include restrictions on AI use. Studios have responded by threatening to use AI systems — trained using the fruits of the labor of the very writers on strike — as AI Strikebreakers. This scenario paints a bleak picture for a wide variety of professions that produce data as a byproduct of a day’s work.

About the Strike

The Writers Guild of America (WGA) is currently on strike. Some of their asks are typical of labor strikes, including better pay and working conditions, but the strike demands have a distinctly 2023 flavor to them — a section titled “Artificial Intelligence”. This section states:

“Regulate use of artificial intelligence on MBA-covered projects: AI can’t write or rewrite literary material; can’t be used as source material; and MBA-covered material can’t be used to train AI.”

WGA members want use restrictions on AI. At the very same time, studios are discussing using AI systems during the strike; in effect, they appear interested in deploying AI as strikebreakers.

On one hand, some AI scripts might be truly terrible (a silver lining in the bleakness: we might see a renaissance in “so-bad-it’s-good” content, though it certainly won’t top my favorite). On the other hand, if you throw enough prompts at ChatGPT, you’re likely to get something decent, and something decent is likely to lower strike leverage.

Why Would ChatGPT Be Able to Write Scripts?

In the cases where AI scripts *are* good, we can explain why: it’s because the past scripts written by writers were good! There’s little doubt that past labor from WGA members will be directly helping to produce any AI-generated scripts.

In addition to actual script content that may have found its way into large collections of pirated data (alongside books like Lord of the Rings and Dune), tweets, blogs, and Reddit posts from writers are also “teaching” ChatGPT. Technically, the exact training data used for the models studios are likely to employ (e.g. ChatGPT) is still secret.

That very same week, we saw a pre-print on arXiv showing strong evidence that GPT-4 has memorized many famous, clearly copyrighted books such as Harry Potter, The Hunger Games, Lord of the Rings, and Dune. Given this addition to the emerging evidence around memorization of copyrighted materials, it’s probably best to start taking a guilty-until-proven-innocent approach: if an LLM has secret training data, it probably leans more towards “gray area” data sources than unambiguously legal ones.
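For the curious, one way researchers probe for this kind of memorization is a “name cloze” test: mask a character name in a short book excerpt and ask the model to fill it in. Here’s a minimal sketch in Python; `query_model` is a hypothetical helper standing in for whatever LLM API you have access to, and the prompt wording is my own assumption rather than the pre-print’s exact setup.

```python
# A minimal sketch of a "name cloze" memorization probe, in the spirit of
# the pre-print mentioned above. `query_model` is a hypothetical helper
# that sends a prompt to an LLM and returns its text completion.

def name_cloze_probe(passage: str, masked_name: str, query_model) -> bool:
    """Mask a character name in a book excerpt and ask the model to fill it in.

    If a model reliably recovers rare, copyrighted character names from
    short excerpts, that's evidence those excerpts were in its training data.
    """
    prompt = (
        "Fill in the single masked name in the following passage. "
        "Reply with the name only.\n\n"
        + passage.replace(masked_name, "[MASK]")
    )
    return query_model(prompt).strip().lower() == masked_name.lower()

def memorization_rate(passages, names, query_model) -> float:
    """Hit rate over many excerpts from one book; far above chance suggests memorization."""
    hits = sum(name_cloze_probe(p, n, query_model) for p, n in zip(passages, names))
    return hits / len(passages)
```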

Machines and Labor

The idea that machines might benefit capitalists at the expense of laborers is not particularly new. However, the machines of “The Second Machine Age” have an especially perverse characteristic. Generative AI truly uses a worker’s past efforts against them, in a way that’s even more extreme than a Jacquard loom (though it’s interesting to note that both the loom and generative AI are really just maps over collective experience).

A technology that directly uses the efforts of writers (alongside efforts from you and me) to reduce their labor power, with no form of recourse (right now), should be concerning to everyone. It’s simply bad for labor power, which has been instrumental in securing better working conditions across the world. Even in fields in which unions play a smaller role, we can imagine a parallel situation in which a generative AI system lowers your individual labor leverage (the basic logic of using a strike as leverage for higher wages is no different than an individual threatening to leave in order to get a raise).

More generally, the idea that we’re all being co-opted into intervening in labor disputes on the side of The Man is one that I think feels “icky” across the political spectrum. But without a fairer playing field in terms of data transparency (we should know when we contributed to an AI system) and data agency (we should only be contributing to AI systems we want to contribute to, at least going forward), this is the nature of generative AI.

Can We Just Shift Some Jobs Around to Address these Concerns?

There’s a position (see e.g. this op-ed) that downplays economic concerns, typically highlighting that past revolutions in technology just resulted in *changes* (some jobs go away, and new jobs spring up). One version of the argument goes, “Cameras didn’t kill painting”. In the generative AI context, people have argued that workers will just need to learn to use AI as part of their jobs.

These counterpoints may hold water (though I’m inclined to believe the general effect of AI and data-dependent computing sans intervention will be increased power concentration), but they offer very little reassurance in the context of a human striker vs. AI strikebreaker dispute. Even if things level off in the long run, so to speak, in the short run generative AI may seriously disrupt the lives of writers and the broader production of films and TV.

Implications for Other Fields

Writing will not be the last field to see this play out. Already, we’ve seen major battles (legal and otherwise) in the area of visual art. The WGA Strike is a canary in the coal mine for a large number of content-based jobs, which includes a decent set of “email and spreadsheet” jobs.

But more generally, any job in which a worker produces tokens that capture important elements of knowledge, skill, and human decision-making may be at some risk from generative AI. OpenAI themselves have published research examining which jobs have the most “exposure” (they estimate that around 80% of the U.S. workforce could see at least some of their tasks affected).

Ultimately, we’re all going to have to ask ourselves: if you were to go on strike — or even just ask your boss for a raise — how much of your work can your boss try (perhaps unsuccessfully) to replace with generative AI outputs?

There’s no doubt in my mind that, in the long run, wholesale job replacement will work very poorly. Eventually, models need to be touched up or retrained, and each touch-up requires a new set of data. In the context of creative work in particular, I’m very confident that a model with no new training data will turn soulless very quickly, not to mention become completely disconnected from news and trends.

Someone out there in the world needs to write scripts, or send your emails, or generative AI will stagnate. Different fields probably have creative outputs that “go stale” at different rates, and the faster they go stale, the more leverage laborers have (I think a comparison with map-making is useful here).
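To make the staleness-as-leverage point concrete, here’s a toy back-of-the-envelope model. Everything in it is an illustrative assumption (the exponential decay form, the rates, the “usable” quality floor), not an empirical claim:

```python
import math

# A toy model of how fast generative AI outputs "go stale" without fresh
# training data. The exponential form and the rates below are illustrative
# assumptions, not empirical estimates.

def output_quality(weeks_without_new_data: float, staleness_rate: float) -> float:
    """Relative quality of generated outputs, decaying exponentially from 1.0."""
    return math.exp(-staleness_rate * weeks_without_new_data)

def weeks_until_stale(staleness_rate: float, usable_floor: float = 0.5) -> float:
    """How long a strike must last before AI outputs drop below a usable floor."""
    return math.log(1 / usable_floor) / staleness_rate

# Fields whose outputs depend heavily on news and trends decay faster,
# so their workers regain leverage sooner.
for field, rate in [("topical TV writing", 0.10), ("evergreen copywriting", 0.01)]:
    print(f"{field}: outputs fall below the floor after ~{weeks_until_stale(rate):.0f} weeks")
```

Under these made-up numbers, a field whose outputs decay ten times faster regains its leverage roughly ten times sooner.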

A Specific Policy Intervention: The Right to Data Strike

In my previous posts and papers, I’ve written at length about a variety of policy and design interventions I’d like to see in the data space. Here, I want to briefly touch on what we might do specifically about AI Strikebreakers.

Workers that produce creative outputs should have the right to withdraw their outputs from training sets. This would mean that a group like the WGA, which intends to strike at some point, might refuse to include any scripts in training data unless AI operators pay for them (this payment might be quite large, and could be used for strike funds so that the strike can last long enough that AI outputs “go stale”).

In other words, just as many workers across the world have a right to strike, workers who create data as a byproduct of their job should have a right to data strike. If we cannot implement such a right, we must recognize that this fundamentally erodes labor power (and therefore we might want to take other actions to account for the societal downstream effects of weakened labor).
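Mechanically, honoring a data strike on the operator side could be as simple as filtering the training corpus against a published registry of withdrawn works before each training run. The sketch below assumes a hypothetical registry format (one content fingerprint per line); a real mechanism would also need to catch near-duplicates, excerpts, and paraphrases, which exact hashing does not:

```python
import hashlib

# A sketch of honoring a data strike on the operator side: before each
# training run, drop any document whose fingerprint appears in a registry
# of withdrawn works. The registry format (one SHA-256 hex digest per line)
# is an assumption for illustration.

def fingerprint(text: str) -> str:
    """Stable content fingerprint for a document."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def load_withdrawal_registry(path: str) -> set[str]:
    """Fingerprints published by striking workers or their union."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

def filter_corpus(documents: list[str], registry: set[str]) -> list[str]:
    """Keep only documents that have not been withdrawn."""
    return [doc for doc in documents if fingerprint(doc) not in registry]
```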

In the long run, most training data should be collected under explicit contracts that make it clear that, in the event of a strike, an AI operator might have to delete one of their models. This would directly solve the AI Strikebreaker problem (though perhaps some groups may negotiate for contracts that do allow for AI use during a strike). In the short term, this probably isn’t realistic.

Something that is possible now: going forward, we should ensure a right for individuals and groups to control where the data they produce, starting tomorrow, flows.
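As a sketch of what such forward-looking control might look like in practice, imagine a machine-readable, robots.txt-style policy published alongside content and checked before ingestion. The file convention, the `ai-training` field, and the default-deny behavior below are all hypothetical; this is one possible mechanism, not an existing standard:

```python
# One possible forward-looking mechanism: a machine-readable, robots.txt-style
# policy published alongside content, checked before ingestion. The
# "ai-training" field and the default-deny behavior are hypothetical.

def may_ingest_for_training(policy_text: str) -> bool:
    """Parse a hypothetical ai.txt-style policy fetched from a content source."""
    for line in policy_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "ai-training":
            return value.strip().lower() == "allow"
    return False  # default-deny: no stated consent means no training use

# Example: a site that permits training use would publish "ai-training: allow".
assert may_ingest_for_training("ai-training: allow") is True
assert may_ingest_for_training("") is False
```

The default-deny choice is the important design decision here: silence should not count as consent to train.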

This intervention would be easier to implement if built on top of transparency requirements, like those being proposed in the EU. Given that this legislation seems fairly likely to pass, the Right to Data Strike may not be so far off.