2 companies, 1 technology, opposite bets
Three weeks, two letters, zero consensus on whether a voice should be human. Here's the question that settles it.
Two letters reached me in the space of a few weeks. I didn’t go looking for either.
I’m a paying Blinkist subscriber, and Solid Gold is a fairly large (and growing) Spotify for Authors user, so both companies’ letters landed in my inbox.
Two companies in the same business… getting books into your ears.
And they reached opposite conclusions about the same technology.
On May 21, at its Investor Day, Spotify told authors it was leaning into AI narration. Its new Audiobook Creation Tools use ElevenLabs voice models to turn a finished manuscript into a complete audiobook. The framing was abundance: a new door, a wider audience, discovery “through a conversation about exactly the kind of story you’re in the mood for.”
Blinkist went the other way. After two years of quietly shipping AI-narrated Blinks, it’s pulling the plug. “No AI-narration, no matter how polished, can deliver the same depth that a human narrator can,” the company wrote. New Blinks will be human-read. The old AI ones get a label, like a nutrition warning.
So which is it? Is AI narration the future of audio or a failed experiment? The honest answer, the actual state of this emerging nation, is that nobody knows yet. But the split isn’t random. It tracks what each company is selling.
Blinkist sells curation. You don’t go there for the book; you go for the fifteen-minute distillation of it, performed by someone who knows how to land an insight. The narrator isn’t reading the product. The narrator is part of the product. Strip the human out and you’re left with a robot reading the study notes. No wonder it felt hollow. Their whole pitch is that a person made choices on your behalf.
Spotify sells a marketplace. Its problem isn’t depth, it’s supply. Most self-published authors will never afford a studio and a voice actor, so their books don’t exist in audio at all. For them, an ElevenLabs read isn’t a downgrade from human narration. It’s the difference between an audiobook and no audiobook. AI doesn’t dilute the catalogue; it fills shelves that were empty.
Same technology, opposite verdicts, and each is right for its own economics. That’s worth sitting with, because the boldest takes on AI insist there’s a single answer coming. There isn’t. The technology is an input, and an input is only as good or bad as the thing it’s an input to.
A couple of things to watch.
First, the labels. Blinkist is going to mark its AI Blinks so listeners can choose. Spotify, notably, puts the review step on the author (you can fix pronunciation before publishing) but says less about whether the listener will know the narration is AI. Disclosure is becoming a battleground too. Quietly, “made by AI” is turning into information consumers expect to have.
Second, the quiet part. Blinkist tested AI narration for two years and only announced it when killing it. Spotify announced before launch, with the caveats built in: “specifics may change,” “closed beta,” “still ahead of us.” One company is doing damage control; the other is managing expectations. Read both letters again and you’ll notice neither is really about technology. They’re about trust, and who’s allowed to spend yours.
And the form says as much as the words. Both came by email, but that’s where they part ways. Blinkist’s email was the letter itself, with a photograph of the person who wrote it, the whole message delivered to me. Spotify’s was a nudge with a link (go read the announcement if you feel like it) pointing to a clinical web page published under no one’s name. The company defending the human touch came to me as a person. The one automating it sent a URL.
The comfortable story is that AI either wins everywhere or fails everywhere. May 2026 says something more boring and more true: it depends. It depends on whether you’re selling the thing or selling the curation of the thing. Blinkist decided the human was the point. Spotify decided the human was the bottleneck. Both can be right. And watching a streaming giant and a niche specialist bet against each other this publicly is, frankly, a grab-the-popcorn moment.
From where we sit
Except we don’t get to just watch.
We run a studio that produces a lot of podcasts and audiobooks, and we live on both sides of that line every week.
Some jobs are purely functional: on-hold messaging, training-video narration, the budget work where nobody’s buying the voice, just the information. For those, we’ll often deploy an AI clone of a voice we already work with. The talent still gets paid (less, and without doing any fresh work), the turnaround collapses, and the listener loses nothing, because there was never a performance to lose. That’s Spotify’s logic, applied one booth at a time.
But a lot of what we do is author-read prescriptive non-fiction: business strategy, change management, the books where someone is asking you to do something differently. There, the voice isn’t decoration. It’s evidence. An author who reads their own book is staking their name on every sentence, and the listener hears it: this person believes what they wrote enough to say it out loud. You can’t clone that. The whole point is that it’s them. For this kind of book, that’s table stakes. That’s Blinkist’s logic, and we’d take that bet every time the goal is trust rather than throughput.
So the question we ask before any project isn’t “human or AI?”
It’s… “what is this voice doing?”
Sometimes a voice is simply a delivery system for words. The information is the point, and the voice carries it from page to ear. When that’s the job, fidelity isn’t what you’re optimising for. Price and speed are. (A pleasant voice is a bonus, not the brief.) You want it good enough, fast, and cheap, and increasingly that’s AI: a voice clone turns a day in the booth into a few minutes at a greatly reduced rate. Nothing is lost, because nothing was being promised.
But sometimes the voice isn’t simply carrying the words. It’s carrying the person.
When an author reads their own prescriptive book, the message isn’t only what they say. It’s that they showed up to say it. The voice is the proof that someone stands behind the page. Clone it and you haven’t cut a corner; you’ve removed the thing the listener came for. An AI version of an author saying “trust me” is a recording of a promise nobody actually made.
That’s the whole rule, and it’s not complicated. A voice either delivers content or it delivers conviction. Spotify is solving for content. Blinkist remembered it was selling conviction. They were never in disagreement. They were answering different questions.
So before your next audiobook, ask the question we ask: what is this voice doing? Carrying the words, or carrying you? If you already know the answer, you’re most of the way there. If you’re not sure… that’s exactly the conversation we love.
#SoundAdvice #BeHeard
P.S. Yes, we help some of our authors clone their own voices for the budget work, and we put others behind the mic for days on end to read every word themselves. Both, on purpose. The art is knowing which book is which. ;-)




