In this session, we will explore how we evaluated the translation quality of Google’s Gemma model using the MQM framework and a human-in-the-loop review process.
The case study walks through how LLM-generated translations were assessed using a structured error typology, how linguistic quality was benchmarked, and how AI-enhanced workflows can combine automated generation with professional post-editing and evaluation.
We’ll discuss:
How MQM works in real-world AI evaluation
What kinds of errors LLMs produce across languages
Where AI performs well — and where it still struggles
How to design scalable human-in-the-loop evaluation workflows
What this means for localization vendors and enterprise buyers
The session is based on a real case study conducted by Alconost’s MT evaluation team using our MQM evaluation tool.
Full case study:
https://alconost.mt/mqm-tool/case-studies/translategemma/