OpenAI Stole Scarlett Johansson's Voice for GPT-4o

A Sam Altman gimmick/obsession gone terribly wrong.

May 21, 2024


Hello Everyone,

This is an Op-Ed about OpenAI, the state of the internet and technological loneliness. The lengths Generative AI startups are going to are not a healthy sign for the future of the internet.

Scarlett Johansson says that OpenAI asked her to be the voice behind ChatGPT — but that when she declined, the company went ahead and created a voice that sounded just like her.

In a better world OpenAI would actually be able to deliver on the dream of many engineers around the movie Her (2013). OpenAI thought it was necessary to explain how it chose the voices of GPT-4o. The original “Sky” voice sure did sound a lot like Scarlett Johansson.

It appears that OpenAI had some kind of Character.AI envy in how it approached the Voice-AI voices behind GPT-4o. With the death of Inflection AI, with its best talent merged into Microsoft, perhaps OpenAI thought it was the one. It probably isn’t the One.

In a statement shared to NPR, Johansson says that she has now been “forced to hire legal counsel” and has sent two letters to OpenAI inquiring how the soundalike ChatGPT voice, known as Sky, was made.

For OpenAI, the optics don’t look good here.

The Battle for AI Companions: This Isn’t Your Granddaughter’s Siri

Over 20 million users are using Character AI worldwide. Only 100 million users use ChatGPT sometimes. That’s not actually very many users. No wonder they made GPT-4o free in that case.

OpenAI has removed a ChatGPT voice that sounded very similar to Scarlett Johansson’s AI chatbot from ‘HER’. It’s all very awkward for OpenAI.

It’s all a little creepy, and it raises questions about whether the development of this kind of technology will prey on human vulnerabilities and the technological loneliness the same Silicon Valley companies are actually responsible for.

OpenAI’s unveiling of GPT-4o has drawn mixed reviews, with the Voice-AI not being as smart as claimed in any number of demos from OpenAI’s Spring update event, which took place one day before Google I/O 2024.

Statement from Scarlett Johansson on the OpenAI situation

Source, X.

OpenAI says Sky voice in ChatGPT will be paused after concerns it sounds too much like Scarlett Johansson

Obviously OpenAI knew what it was doing. Always trying to pretend it’s more sophisticated and advanced than it really is. “Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system,” Johansson writes.

The company says the voices in ChatGPT were from paid voice actors. I have a hard time believing this to be true. According to OpenAI, a final five were selected from an initial pool of 400 and it's “purely a coincidence” the unnamed actress behind the Sky voice has a similar tone to Johansson.

Johansson says she was “shocked, angered and in disbelief” over how “eerily similar” the voice of Sky sounded to herself.

OpenAI has said GPT-4o is the most human sounding Voice-AI in the world. OpenAI is now in a rush to push this free version of ChatGPT to try to scale to more users and get more people interested in its ecosystem.

OpenAI was of course trying to make us all believe that this was it! After many people noticed that one of the voices in OpenAI’s voice-enabled chatbot sounded very much like that of Scarlett Johansson’s disembodied AI companion in Spike Jonze’s 2013 movie “Her,” the company is suspending the voice for the time being. So this is what is supposed to be a distinct step closer to human-aligned AGI? I really don’t understand.

Apparently the voice of “Sky” has been available since OpenAI launched ChatGPT’s voice mode last September. But the connection to Johansson wasn’t as clear until last week when OpenAI demoed an updated AI model that made the voice more expressive.

After a somewhat lacklustre demo of GPT-4o at the event and a flood of videos, I’m supposed to find this product useful? After 18 months of having to explain how ChatGPT’s sole utility for me is helping me to brainstorm? Not the most useful gang of tools and agents in this Generative AI crowd, huh? Where’s the ROI for real companies in this stuff, eh?

Only one Generative AI startup, OpenAI, is even on the path to profitability and making decent revenue, and that’s mostly because its best buddy is Microsoft! It’s not even mainstream. If you say “ChatGPT” to most people on the street, they will just laugh. It’s not exactly a revolution to most consumers, and that’s a problem in the OpenAI/HER narrative.

If OpenAI says:

GPT-4o is our latest step in pushing the boundaries of deep learning, this time in the direction of practical usability.

And one of their voice-actor voices sounds nearly exactly like HER, we have a problem in product-marketing y’all.

Most GPT-4o Users Are Nerdy Young Men

So it makes sense that the Voice-AI is so flirtatious and trying to say things like we might have seen in the (now cringe) 2013 movie.

Do they want me to share my life with their Genius Voice-AI? Like wtf OpenAI! I find it all too disturbing.

Do they want it to help kids with their homework, and be our next date? I don’t get it.

Is that supposed to sound more human?

I guess even OpenAI gets a bit lost in translation sometimes.

GPT-4o does have a noticeable delay and lag; this is not natural, human-sounding AI. And it’s claimed to be a real-time, natural-sounding and emotion-mimicking AI? Wait, what?

Sam Altman even wrote a blog post about his GPT-4o event; well, he probably got AI to do it, with some subliminal marketing messages too. OpenAI appears to want to be the messenger of the AGI lovers and friends that we don’t have time for, since we are so addicted to our mobile devices and apps from the previous generation of Silicon Valley. There’s something so unhinged about this.

The internet is now being flooded with fake AI-generated content, and most of the reviews and applause for GPT-4o aren’t even real, with engagement coming from more bots. We’re riding into the sunset of a zombie internet, where OpenAI thinks it gets to inherit a lot of the future. But what if that doesn’t actually occur?

So it’s no wonder that there aren’t actual people to interact with on social media, and apparently the solution is to interact with apps like OpenAI’s? With something that isn’t even as smart as the AI operating system in a cult-favorite movie that’s 11 years old?

OpenAI is going to face more lawsuits I guess. But what does it even say about the state of the internet? Between the AI overviews and cloning celebrities, you could make the argument that Advertising has broken the American model of the internet. OpenAI seems to be complicit.

Listen, I love science fiction as much as the next guy, but this technology isn’t even at the point of AI companionship. It’s not fit for purpose. It’s not even ready, after all the promises OpenAI has been making. After stealing a voice Sam Altman admired.

I honestly don’t care about the benchmarks of this thing.

38% of webpages from 2013 have disappeared from the Internet

The internet is recycling AI slop now, and most of the web pages back from when the HER movie came out, well they don’t exist any longer. It’s a big 404 error. But why should people stay on a fake internet or meet with imperfect mirrors for friendship, psychological help or even help with their homework? Isn’t there a real person around? Anybody who actually cares?

Maybe the “dead internet theory” is more true than it once seemed. The theory has an explanation: AI and bot-generated content has surpassed the human-generated internet. The bot traffic on X is really wild, both inflating traffic to ads and making it look like posts have more activity than they do, but from the comments you can ascertain that not many of them are real people.

Sam Altman trying to cheat Scarlett Johansson is so ludicrous! But the backdrop is a silent epidemic. Technological loneliness is only going to get worse in this context. A third of Americans say that they feel lonelier than ever before. This is according to a 2023 study conducted by A/B Consulting in partnership with the VC firm Maveron.

The truth is this harms productivity; it doesn’t enhance it. ChatGPT might actually be more harmful than helpful.

OpenAI or its dumbed-down version of AGI won’t solve or help the world.

Does anybody who is seriously lonely care that GPT-4o is omnimodal or can have a more human sounding voice? There is something very wrong with society in 2024.

The next internet based on Generative AI might be even more nefarious than the last. This isn’t just a Babel Project, this is about how the next generation is being brainwashed. It’s not just about what social media apps did to kids and are still doing to youth (and all of us), but what AI companions and chatbots are doing to us. But oddly, I don’t even hear that story as much.

What about the plagiarism of human connection? Can you mimic intimacy as BigTech profits from the attempt? Apparently OpenAI thinks it’s a good goal to strive for.

Illustration by Nicolás Ortega, for the Atlantic, 2022, from Coenraet Decker, 1679.

Is GPT-4o supposed to replace a person in some cases? Is that a healthy way of building an interface? Why would you want to replace a ‘median human’ at a task or their job? Is Sam Altman secretly a sociopath? I mean, it wouldn’t surprise me.

I was dumbfounded for years about how terrible Alexa, Google Assistant or Siri felt, and how bad their voices were. But is this the answer?

“The updated voice can mimic a wider range of human emotions, and allows the user to interrupt. It chatted with users with fewer delays, and identified an OpenAI executive’s emotion based on a video chat where he was grinning.”

Do I want any of these petty machine learning engineers to feel like Gods in their imitative tools? Sam Altman’s psyche seems devoid of ethical reasoning. It’s all so very disturbing. OpenAI should bear some consequences for its tone-deaf attempt.

The GPT-4o demo had such an uncanny similarity to Johansson’s assistant in Her that it led to headlines and even a Saturday Night Live joke about the comparison. This isn’t good PR for OpenAI.

There have been lots of comparisons between GPT-4o and the 2013 movie “Her,” in which a man falls in love with his A.I. assistant, voiced by Scarlett Johansson. But isn’t there something morbid about a culture that glorifies technological loneliness and then treats the pill, or in this case the chatbot, as the righteous solution? This isn’t a religion, you know.

When the Verge thinks it has to write tech-optimism propaganda for OpenAI’s marketing team, we have a problem. That’s not journalism, yo.

The address for ChatGPT has changed, moving from chat.openai.com to chatgpt.com, suggesting a significant commitment to AI as a product rather than an experiment. But are we the product, again? Something just seems off.

OpenAI feels increasingly dystopian, increasingly the sort of company whose products I want to keep as far as possible from me and my life.

GPT-4o is 2x faster, half the price, and has 5x higher rate limits compared to GPT-4 Turbo. But for the consumer, this feels like a hack. And not the good kind.

On May 13th, 2024, Sam Altman made his most important Tweet ever:

It said only one word: “her.”

Statement from Scarlett Johansson on the OpenAI situation

“Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people.
After much consideration and for personal reasons, I declined the offer. Nine months later, my friends, family and the general public all noted how much the newest system named “Sky” sounded like me.
When I heard the released demo, I was shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference. Mr. Altman even insinuated that the similarity was intentional, tweeting a single word “her” - a reference to the film in which I voiced a chat system, Samantha, who forms an intimate relationship with a human.
Two days before the ChatGPT 4.0 demo was released, Mr. Altman contacted my agent, asking me to reconsider. Before we could connect, the system was out there.
As a result of their actions, I was forced to hire legal counsel, who wrote two letters to Mr. Altman and OpenAI, setting out what they had done and asking them to detail the exact process by which they created the “Sky” voice. Consequently, OpenAI reluctantly agreed to take down the “Sky” voice.
In a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity. I look forward to resolution in the form of transparency and the passage of appropriate legislation to help ensure that individual rights are protected.”

Scarlett Johansson and OpenAI’s Flirty Sky Voice

How did OpenAI clone her voice while claiming they use voice-actors? The Verge is struggling so much, it needs to make TikTok/IG Videos about it (Credit: The Verge).

I find the story mostly hilarious, but there are dystopian Black Mirror angles here as well. What do you think? Why didn’t OpenAI just partner with ElevenLabs, who actually seem to know what they are doing with voices? It just makes no sense.

