
Translation apps are now so ubiquitous that it is easy to overlook their limits. We use them to get by in unfamiliar places, from ordering off foreign menus to finding our way around new cities. In many everyday situations, a bad translation is merely inconvenient. In medicine, it can mean a missed diagnosis, a wrong dose or, in the worst cases, a fatal outcome.
Just as most of us would rather see a doctor who excelled in their training than one who only scraped through, we should be reluctant to trust the “bare pass” version of translation when lives are on the line. In a study of Google Translate on written discharge instructions, researchers at the University of California, San Francisco found that 92% of Spanish and 81% of Chinese translations were accurate, but that 2% of Spanish and 8% of Chinese sentences had the potential for clinically significant harm. In one case, an instruction to “hold the kidney medicine” (that is, to stop taking a drug that could cause kidney damage) was mistranslated as “keep the medication” or “keep taking the medication”.
A recent comparative review of professional translators, Google Translate and ChatGPT‑4 for discharge instructions in Spanish, Brazilian Portuguese and Haitian Creole reached a similar conclusion: despite improved performance, critical errors persist, especially in low‑resource languages, and machine systems are inappropriate for unsupervised clinical use.
A rough translation might help a traveller. In medicine, relying on consumer‑grade translation apps to render consent forms or dosage instructions is a gamble with lives.
What the latest evidence shows
In late 2025, Irish researchers published a scoping review in BJGP Open on AI‑mediated communication with refugee and migrant patients in general practice, drawing on five international studies. They found that clinicians frequently used tools such as Google Translate to fill interpreter gaps, despite limited evidence on their safety, concerns about translation errors in complex consultations and a lack of clear guidance on appropriate use.
Separate UK research by the Chartered Institute of Linguists and the University of Bristol suggests that around a third of frontline public‑service workers already use machine or AI translation at work, often in face‑to‑face encounters and on personal devices via public browser interfaces, with little institutional oversight or training. Commentators argue this reflects an emerging policy vacuum, particularly given existing NHS advice against relying on machine translation for clinical decision‑making. The risks include clinically significant mistranslations, “hallucinated” text and breaches of patient confidentiality. Such failures can distort treatment and exacerbate health inequalities for non‑English speakers, some of whom perceive machine translation as disrespectful.
NHS England agrees. Its 2025 Improvement framework: community language translation and interpreting services warns that reliance on translation apps in community services, however convenient, increases the risk of medical error and undermines informed consent. It acknowledges that many services rely on these tools “not out of choice but necessity”, and urges that AI serve only as a last resort for low‑risk administrative tasks, accompanied by national safety standards, staff training and renewed investment in professional interpreters.
Why generic AI is unsuitable in clinical settings
The failings described in this research are not outliers; they follow from how general‑purpose translation systems work.
Consumer translation tools optimise for speed and fluency, predicting plausible wording from vast amounts of general text. That may suffice for everyday communication, but not for nuanced exchanges such as explaining chemotherapy risks, capturing safeguarding disclosures or discussing end‑of‑life choices. Accuracy also declines for low‑resource languages, which tend to be spoken by patients who already carry a higher illness burden and have poorer access to care.
Accountability is another issue. The Irish review reports that clinicians were unsure who would be held responsible if an AI‑mediated conversation led to harm: the individual doctor, the practice, or the technology provider. That goes against the grain of established clinical governance. Without defined safety standards, indemnity cover and incident‑reporting pathways, unregulated use of consumer apps does not align with professional codes and duty of care.
Part of the answer is to mandate human‑centred design: AI confined to “augmented intelligence” for administrative tasks, with a human in the loop to validate anything that bears on clinical care. Without that safeguard, safety in high‑stakes settings cannot be assumed.
Another factor is data protection. Consumer translation services typically process content on external servers, governed by commercial terms and conditions. Patients are rarely told how their words might be stored, repurposed or used to train models. NHS England’s framework explicitly warns that poor‑quality or partial translations jeopardise both confidentiality and informed consent: if patients cannot understand what is being done with their data, they cannot meaningfully consent to it.
Health advocacy group National Voices’ report, Community Languages: Translation and Interpreting Services, highlights a systemic failure in the UK’s healthcare system to provide adequate language support. It argues that language barriers are not just a “communication hurdle” but a risk to patient safety and health equity, driving delayed diagnoses, missed appointments and – most critically – treatment performed without meaningful consent. It cites harrowing accounts of women undergoing procedures they did not understand and patients signing forms they could not read. When services substitute professional interpreters with unregulated apps, they effectively shift clinical risk onto those with the least power to challenge it.
This is not only about AI accuracy; it’s a mismatch between a technology designed for convenience and a domain that demands rigour, accountability and equity.
Language access is a matter of patient safety. NHS England’s community‑language framework places trained interpreters and professional translation at the centre of safe care, while National Voices goes further, calling for language needs to be systematically logged, translated information to be routine and tailored, interpreter quality to be regulated, and communities to influence how such services work.
AI has a vital, yet strictly governed, supporting role here. It can draft letters or appointment reminders for clinicians to check, but it cannot stand in for a trained interpreter.
Any deployment requires strict boundaries: clear organisational sign‑off, defined clinical‑safety and information‑governance approval, and a clear process for flagging and investigating errors.