Scarlett Johansson, OpenAI, and the muddy forensics of AI content theft

In its recent announcement of the GPT-4o update to ChatGPT, OpenAI demonstrated how the assistant now interacts more naturally in voice conversations. (You can see some of that in this clip.) The AI’s voice, known as Sky, sounds a lot like the actress Scarlett Johansson. The questions I’m pondering today: if the voice really was built from Johansson’s, how could she prove it? What does that say about the many other pieces of copyrighted content OpenAI is accused of stealing? And what’s the future for AI built on copyrighted content?

OpenAI backed off after Johansson’s comments

According to Johansson, OpenAI tried to negotiate with her for permission to use her voice in its AI (gift link). Johansson was also the voice of the AI that the main character falls in love with in the movie “Her.” She says she refused to work with OpenAI and was then surprised to hear a voice that sounds just like hers in the company’s demo. OpenAI CEO Sam Altman muddied the waters a bit by recently tweeting the single word “her.” Johansson’s lawyers demanded that OpenAI stop using her likeness without permission.

OpenAI claims that it trained “Sky” on the voice of a different actress whom it hired, not on Johansson’s, and that there was no intent to copy her voice. But the company has removed “Sky” from its list of available voices.

The problem of identifying AI source material

Imagine for a moment that OpenAI had not backed down. How could Johansson prove that they’d used her voice?

Short of subpoenaing the people and records involved in the training, that would be very difficult. Since Sky is not a direct copy of anything Johansson has said or recorded, there’s no straightforward way to prove the voice is hers. You could ask Sky to say something Johansson has said and try to determine whether it sounds the same, but that would still be hard to prove. And if Sky’s voice print doesn’t match Johansson’s, would that be sufficient to exonerate OpenAI of theft?
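
To make the voice-print idea concrete, here’s a minimal sketch of how such a comparison might be run with speaker embeddings. It assumes the open-source resemblyzer library and two hypothetical audio files (the filenames are placeholders, not real evidence), and a high similarity score only says the voices sound alike to one model; it proves nothing about how Sky was trained.

```python
# A sketch, not evidence: compare a "Sky" sample to a Johansson sample using
# speaker embeddings. Assumes the open-source resemblyzer library
# (pip install resemblyzer); the WAV filenames below are hypothetical placeholders.
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav

encoder = VoiceEncoder()  # pretrained speaker-embedding model

# Load and normalize the two recordings being compared.
sky_wav = preprocess_wav("sky_sample.wav")
johansson_wav = preprocess_wav("johansson_sample.wav")

# Each utterance becomes a fixed-length "voice print" vector.
sky_embed = encoder.embed_utterance(sky_wav)
johansson_embed = encoder.embed_utterance(johansson_wav)

# Cosine similarity: closer to 1.0 means the two voices sound more alike to
# this model. It measures perceptual similarity, not provenance.
similarity = np.dot(sky_embed, johansson_embed) / (
    np.linalg.norm(sky_embed) * np.linalg.norm(johansson_embed)
)
print(f"speaker similarity: {similarity:.3f}")
```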

Imagine for a moment that OpenAI created a voice by combining Johansson’s voice with, say, Charlize Theron’s and Jennifer Aniston’s. Could anyone prove where the resulting voice came from? Could those actresses take action against the company?

By cleverly manipulating AI tools, you can sometimes get them to cough up large chunks of text that resemble their original sources. This is the basis for the New York Times’ lawsuit against OpenAI. But you can bet that the AI companies are doing everything possible to prevent future versions from accidentally generating copies of the material on which they are trained.
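
As a toy illustration of that kind of forensics, here’s a sketch of how one might check whether a model’s output reproduces a source verbatim. The two strings below are made-up stand-ins, not actual New York Times or ChatGPT text, and real analyses use far more sophisticated matching.

```python
# Toy overlap check: find the longest word-for-word run that a (hypothetical)
# model output shares with a (hypothetical) source document.
def longest_shared_run(generated: str, source: str) -> str:
    """Return the longest contiguous word sequence from `generated` that also appears in `source`."""
    gen_words = generated.lower().split()
    src_text = " ".join(source.lower().split())
    best = ""
    # Naive scan over every contiguous span of words; fine for a short example.
    for i in range(len(gen_words)):
        for j in range(i + 1, len(gen_words) + 1):
            candidate = " ".join(gen_words[i:j])
            if candidate in src_text and len(candidate) > len(best):
                best = candidate
    return best

source_article = "The quick brown fox jumps over the lazy dog near the riverbank."
model_output = "Witnesses said the quick brown fox jumps over the lazy dog every morning."

run = longest_shared_run(model_output, source_article)
print(f"longest verbatim overlap ({len(run.split())} words): {run!r}")
```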

We’ve seen versions of this scenario before, for example, when content companies sued YouTube for hosting portions of their copyrighted content. YouTube became such a popular utility so quickly — and the copyright situation was sufficiently muddy — that shutting it down wasn’t the best endgame. Instead, the settlement involved putting copyrighted content detection software in place, allowing content companies to request content takedowns, and cutting content owners in on advertising revenues generated from their copyrighted content.

The strategy of the AI companies is likely to be similar:

  • Create tools that are so useful that they become embedded in many everyday tasks, so that no one can imagine working without them.
  • Ensure that any copyright claims are difficult to bring and expensive to litigate.
  • Use methods that make it as hard as possible to identify the source material on which AIs are trained.
  • Respond to individual legal actions from prominent individuals — like Johansson’s — by backing down, but admitting nothing.
  • Negotiate to pay a fee for access to copyrighted material, including by sharing revenues with copyright holders.

This is a common Silicon Valley “disruption” playbook: break rules and negotiate later. Consider Uber’s evasion of taxi licensing rules, Airbnb’s evasion of hotel regulations, and crypto’s disruption of tax regimes. In the end, these companies mostly came to an accommodation with regulators, but only after they’d established their businesses by breaking the rules.

There’s a possible future in which copyright claims bring down the entire AI world. But it’s certainly not the most likely future. AI is not just disruptive; it’s transformative. The copyright owners and their lawyers will end up settling for a piece of the rapidly growing pie based on their work, except for a few who, like Johansson, will opt out.

Count on it.

Comments

  1. Johansson released an album of Tom Waits covers, making her probably a bigger fan of his work than I am. https://www.youtube.com/playlist?list=PLQvL6GLJiX916YBAwEhYQxR2qqnBp8GfQ Doesn’t she (or at least, her lawyers) know that Frito-Lay approached Tom Waits in the late Eighties about using “Step Right Up”?

    Waits told them to go to hell (he’s famously loath to license his work because of his lovely philosophical views), so Frito-Lay hired a “sound-alike” whom the public took for Tom Waits to hawk Doritos, much to Waits’s embarrassment. Waits sued, won two and a half million dollars in a verdict that included punitive damages (in 1990 dollars, roughly $6 million today), and reportedly donated the proceeds to charity.
    https://www.latimes.com/archives/la-xpm-1990-05-09-me-238-story.html

    According to the gift-link article in the Washington Post, the “experts” say copyright law hasn’t caught up yet. If memory serves, Waits based his claims on trademark (false endorsement) and misappropriation of his voice. The IP law is already there and has been for decades. No one should pretend that Scam Altman’s hand-wavey “but it’s AI” defense overrides the existing body of intellectual property precedent.

    Step right up and get owned, everyone’s a winner, bargains galore.

  2. We already know that one of the AI platforms stole the voice of the narrator of the Harry Potter audiobooks and applied it to other audiobooks, using his name as the narrator, without his knowledge or permission. That was reported a few months ago. It’s identity theft, no matter what others may call it, and completely unethical. I wonder how long it will be before we have voice-recognition ID, the way fingerprints have been used for 100+ years, and eye-mapping much more recently, as security measures.

    Talk about job security for intellectual property protection and security professionals …

  3. Scarlett Johansson is well respected, well liked, and known for having a distinctive voice. OpenAI is known for consuming other people’s content to produce chatbot answers without citing sources.