The Shocking Truth About AI Interpreting: WHO Study Reveals Why It's a Disaster Waiting to Happen

June 27, 2025

Event organizers and corporate clients, listen carefully: If you're considering AI interpreting to save costs at your next international conference or business summit, a groundbreaking World Health Organization (WHO) report delivers an urgent warning. Based on the most rigorous evaluation ever conducted on AI interpreting in professional settings, the findings are unequivocal: AI interpreting is dangerously unreliable and poses unacceptable risks to your reputation, speaker relationships, and event integrity.

The WHO's Unprecedented Test: Rigor Meets Real-World Challenges

The WHO didn't test AI in a lab. They put it through the wringer using real speeches from the 2024 World Health Assembly (WHA) – high-stakes political and technical statements in all 6 UN languages (Arabic, Chinese, English, French, Russian, Spanish). They evaluated 90 interpretations across these language combinations using the industry-leading platform Wordly (other major providers such as KUDO and Interprefy did not participate).

The test simulated brutal real-world conditions:

  • Extreme speech challenges: Fast talkers (up to 185 words per minute!), complex jargon, numbers, cultural references, names, emotional content, and code-switching (a speaker switching between languages mid-speech).
  • Professional grading: Scored by WHO linguists and University of Geneva experts using UN interpreter exam standards, focusing on accuracy, logic, and reputational risk. Passing grade: 75%.
  • "Reputational Risk" (RR) as a deal-breaker: Any error that could ridicule a speaker, misrepresent a country/organization, or cause diplomatic fallout instantly failed the interpretation – even if the overall score was high.
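
The pass/fail rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as this article describes it, not the WHO's actual scoring tool; the function name `passes`, its parameters, and the sample values are assumptions made for the example.

```python
PASSING_SCORE = 75  # passing grade in percent, as stated above


def passes(score_percent: float, reputational_risks: int) -> bool:
    """Return True only if the score meets the pass mark and no RR error occurred.

    Hypothetical helper: the name and signature are illustrative,
    not taken from the report.
    """
    if reputational_risks > 0:
        # Any reputational-risk error fails the interpretation outright,
        # regardless of the overall score.
        return False
    return score_percent >= PASSING_SCORE


# Made-up sample values, for illustration only.
print(passes(75, 0))  # meets the pass mark with no RR error
print(passes(90, 1))  # high score, but one RR error
print(passes(46, 0))  # no RR error, but below the pass mark
```

The point of the sketch is that the RR check comes first: a single reputational-risk error overrides even an excellent score.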

The Jaw-Dropping Results: Failure on an Industrial Scale

The findings weren't just disappointing; they were catastrophic:

  1. Overall Failure Rate: About 99%. Only ONE out of 90 interpretations passed (English to French, barely, at 75%). The other 89 failed.
  2. Abysmal Average Score: 46%. Far below the passing mark of 75%. This is not a "B-" effort; it's a systemic collapse.
  3. Chinese: The Weakest Link. Interpreting into Chinese scored the lowest average (40%). Even interpreting from Chinese only managed around 45%. This is critical for events involving Mandarin speakers, as WHO noted these were prepared, clear speeches – far easier than spontaneous discussions common in business meetings.
  4. Reputational Risks in EVERY Speech. 100% of speeches contained at least one major reputational risk error. Some had up to 9 RRs per speech! Interpreting into English had the highest total RR count (46).
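
For readers who want to check the headline figure, the failure rate follows directly from the two counts quoted above. A minimal sketch, where the variable names are illustrative:

```python
total_interpretations = 90  # figure quoted above
passed = 1                  # figure quoted above

failed = total_interpretations - passed
failure_rate = failed / total_interpretations * 100

print(f"{failed} of {total_interpretations} failed: {failure_rate:.1f}%")
# prints "89 of 90 failed: 98.9%"
```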

Why AI Fails Spectacularly: Stunning Examples of Disaster

The report details horrifying errors that go beyond minor glitches – they are brand and relationship killers:

  • Diplomatic Nightmares: "Hamas" (referenced in a Spanish statement on terrorism) was translated into Arabic as "United States" (الولايات المتحدة), and the transcription said "US". The potential for an international incident is obvious. "Brunei Darussalam" (a country) was interpreted from Chinese into Arabic as "the brunette Russel" (برونيت راسل).
  • Offensive Misrepresentations & Ridicule: "Joy Bangla!" (Bangladesh's national slogan) was interpreted by AI as the female name of the meeting Chairperson in gendered languages (e.g., "la présidente Joy Bangla" in French). An error like this ridicules the speaker and may offend the Chair. Dr. Moeti, WHO's African Regional Director, was consistently misgendered as male and referred to in Arabic as "our African".
  • Meaningless Gibberish & Medical Mayhem: Chinese "stratified health" became "airplane health" in all languages. "Transmission of polio" (Arabic) became "transportation" (due to similar Arabic words). "Hepatitis" (French) became "Ebola" in Arabic.
  • Critical Data & Name Failures: Figures and dates were routinely butchered, especially with zeros ("70% reduction" became "to 70%" in French/Russian – reversing the meaning). Country names like "Greece" became "Chris", "Haiti" became "Heidy".

Imagine this happening to your CEO, your keynote speaker, or a government official at your event. The damage to relationships and reputation would be severe and potentially irreversible.

The Silent Danger: "No Complaints" Does NOT Mean Success

Some event organizers in places like Taiwan have claimed, "We used AI, and no one complained!" The WHO report exposes why this is dangerously misleading:

  1. Most Audience Members Are Monolingual: They hear smooth-sounding speech and assume it's correct. They have no way to know if the CEO's key message, the technical data, or the country name was just horribly mangled. They accept nonsense as fact.
  2. AI Sounds Deceptively "Good": While monotonous, AI delivery often lacks obvious robotic glitches. This masks the rampant semantic errors happening underneath – making it more dangerous, not less. The meaning is wrong, but it sounds okay.
  3. The Speaker Knows (And Will Blame YOU): Your VIP speaker may well understand the target language, or will have advisors who do. When they hear their words turned into nonsense or offensive errors, they won't blame the AI. They will blame your organization for unprofessionalism and disrespect. Their silence during the event doesn't mean they are satisfied afterwards.

The Unambiguous Verdict & Critical Warning

The WHO's conclusion is crystal clear and severe:

"AI interpretation is still at an experimental stage and is not fit for use in WHO meetings with external stakeholders... [It] may be used in internal meetings involving WHO staff only."

For Event Organizers & Corporate Clients: The Stakes Are Too High.

  • DO NOT USE AI INTERPRETING for ANY high-stakes communication: This includes government/NGO events, media-broadcast sessions, legal/medical/technical discussions, and any event with VIPs, foreign dignitaries, or international partners.
  • Beware the False Promise of "Accessibility": Offering incomprehensible or offensive AI output isn't inclusive; it's exclusionary and damaging. True language access requires professional human accuracy and cultural understanding.
  • Cost Saving ≠ Risk Mitigation: The potential cost of a single catastrophic misinterpretation (lost contracts, damaged relationships, PR disasters, diplomatic fallout) dwarfs the savings on human interpreters. Ask yourself: "If our speaker is ridiculed or misrepresented on a global stage, who pays the price?"
  • AI is an Assistant, NOT a Replacement: AI might help human interpreters with transcripts or glossary support, but it cannot replicate human judgment, context understanding, cultural nuance, or error correction in real time.

The Bottom Line: Protect Your Event, Your Speakers, and Your Reputation

The WHO study, the most comprehensive of its kind, delivers a stark warning based on hard data from real-world, high-level meetings: AI interpreting technology is fundamentally not ready for professional use. Its error rate is catastrophically high, and the risks of reputational damage, diplomatic blunders, and speaker humiliation are immense and ever-present.

Do not gamble your event's success and your organization's reputation on experimental technology. Insist on professional human interpreters who possess the expertise, judgment, and cultural sensitivity essential for accurate and respectful communication. The stakes are simply too high to settle for anything less. Based on the WHO's findings, expecting reliable, safe AI interpreting within the next 5 years is dangerously optimistic. Human expertise remains irreplaceable.

The full report is available here: https://tinyurl.com/who-ai-report.

Contact us

If you need interpretation or translation services between English and Chinese (Mandarin), or a quotation for conference interpreting services in any language, simply contact me by phone or email.

Dr. Bernard Song