Discussion about this post

Neural Foundry

Spot-on demonstration of where we are on Mori's curve right now. The prosody issue you mention is critical: AI voices nail lexical stress but often miss the pragmatic layer (where humans modulate pitch and timing based on discourse context, not just word meaning). What's wild is how the valley depth varies with listener familiarity with synth voices. People who interact with voice agents daily seem to have shifted their threshold; they're less bothered by mismatches that would've triggered revulsion a year ago. Adaptation is happening faster than the technology is improving.
