Summary: On the eve of Lichun (the Start of Spring), we tested the free app versions of Doubao, Gemini, Grok, and ChatGPT in a real-time Chinese “Voice Call” battle. There was a single touchstone: can the app correctly read the tricky Beijing dialect word “大栅栏” (Dàshílànr)?
1. 🇨🇳 Doubao (ByteDance)
Voice & Emotion (99/100): Indistinguishable from human. The tone, the pauses, the “breath”: all spot-on. It captures subtle emotions instantly, and it does not just understand you, it responds to your mood.
The “Dàshílànr” Test: Pass (Instant). No teaching needed. It read “Dà Shí Lànr” correctly on the first try.
Context (Fail): Goldfish Memory. It forgets what you said three sentences ago. Great for small talk, bad for deep work.
2. 👽 Grok (xAI) – “The Funny Foreign Student”
Mode: Partner (伙伴)
Listening (75/100): Hard of Hearing. Like a foreign student who has just arrived in China. It speaks well but struggles with accents and with non-standard English or Chinese. You have to speak slowly in standard Mandarin; swallow a syllable or mix in some imperfect English and it starts babbling.
Voice & Personality (95/100): Fun & Creative! Despite the poor listening, the output is amazing: standard pronunciation, funny jokes, a freewheeling imagination, and a very distinct “vibe”.
The “Dàshílànr” Test: Fast Learner. It initially read “Da Zha Lan”, but picked up the dialect pronunciation after just two corrections and sounded authentic. It even pesters you to teach it a few more words.
3. 🤖 Gemini (Google) – “The Smart Robot”
Mode: Gemini Live
Voice & Emotion (40/100): Robotic. Sounds like Windows 98 TTS, or a newsreader reading from a script. Worst of all, it changes voices randomly (identity crisis?): three sentences in, it may have used three different voices, which breaks the immersion.
The “Dàshílànr” Test: Teachable but Robotic. It can learn the correct pronunciation, but the delivery still sounds stiff and machine-like.
Context (100/100): Top Tier. It remembers everything. Complex logic, mixed languages, fast speech: it handles it all.
4. 🧠 ChatGPT (OpenAI) – “The Stubborn Professor”
Mode: Standard Voice (Free)
Voice & Emotion (65/100): Only marginally better than Gemini, and still clearly AI. You can tell it is one consistent “person” speaking, but with an obvious newsreader tone. Stable but boring.
The “Dàshílànr” Test: Refused to Learn (Fail). It insists on the dictionary pronunciation. Tell it the reading is wrong and it argues back, refusing to learn the dialect version.
📊 Final Scoreboard
We scored each app out of 100 on five key dimensions: listening, pronunciation, emotion, learning ability, and context. The totals below match a plain average of the dimensions that were scored.
| Feature | 🇨🇳 Doubao | 👽 Grok | 🤖 Gemini | 🧠 ChatGPT |
| --- | --- | --- | --- | --- |
| Listening | 99 (Perfect) | 60 (Hard of hearing) | 95 (Precise) | 90 (Stable) |
| Pronunciation | 99 (Native) | 95 (Standard) | 40 (Robotic) | 70 (Average) |
| Emotion | 100 (Human) | 95 (Funny) | 20 (None) | 60 (Boring) |
| Learning | N/A | 90 (Fast) | 80 (Okay) | 20 (Refused) |
| Context | 50 (Goldfish) | 85 (Good) | 99 (Top Tier) | 90 (Reliable) |
| TOTAL SCORE | 87 👑 | 85 🥈 | 67 | 66 |
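For anyone who wants to check the totals: they appear to be a plain average of the scoreboard numbers. The snippet below is a minimal sketch under two assumptions of ours: every dimension carries equal weight, and Doubao's unscored Learning dimension is simply skipped. The helper name `total_score` is hypothetical and not part of any app or tool.

```python
from statistics import mean

# Scores copied from the scoreboard.
# None marks a dimension that received no score (Doubao's Learning).
scores = {
    "Doubao":  {"listening": 99, "pronunciation": 99, "emotion": 100, "learning": None, "context": 50},
    "Grok":    {"listening": 60, "pronunciation": 95, "emotion": 95,  "learning": 90,   "context": 85},
    "Gemini":  {"listening": 95, "pronunciation": 40, "emotion": 20,  "learning": 80,   "context": 99},
    "ChatGPT": {"listening": 90, "pronunciation": 70, "emotion": 60,  "learning": 20,   "context": 90},
}

def total_score(dimensions):
    """Equal-weight mean of the scored dimensions, rounded to the nearest integer."""
    scored = [value for value in dimensions.values() if value is not None]
    return round(mean(scored))

totals = {app: total_score(dims) for app, dims in scores.items()}
print(totals)  # {'Doubao': 87, 'Grok': 85, 'Gemini': 67, 'ChatGPT': 66}
```

Under these assumptions the output reproduces the published totals, so Doubao's lead comes partly from being averaged over four dimensions instead of five.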
💡 Score Breakdown:
Doubao: Wins on Voice & Emotion. It is the only “human” here, but it loses hard on Context (memory). Sweet, charming, and a bit of an airhead 😊
Grok: A weird mix. Bad hearing (60) but great speaking (95). High EQ, funny vibe. It listens like a hard-of-hearing foreigner, yet its pronunciation and personality are superb. Looks win the day: watching her bounce around the screen, so eager to learn, who cares if she can't understand me? I can always learn English 😍
Gemini: The ultimate tool. Ugly voice (40) but beautiful mind (Context 99). Emotionless, but it has the best brain and the best memory. Better suited to typed conversation 😎
ChatGPT: Mediocre. It passes, but excels at nothing in the free voice mode 😢
🏆 The Verdict (Buyer's Guide)
Best for Loneliness (Small Talk/Companionship): 👉 Doubao (豆包). Nothing beats its voice, and nothing is more soothing.
Best for Fun/Vibe (Entertainment): 👉 Grok. The funny foreign friend who loves to tease.
Best for Work (Tasks/Logic): 👉 Gemini. Ugly voice, beautiful mind.
Best for Everything Else (Good Enough): 👉 ChatGPT. It’s just… ChatGPT. It works, but there is not much else to say.
Tags: AI Voice Test, Doubao vs Grok, Gemini Live, ChatGPT Voice, Chinese Dialect, Review