TranslateGemma Quality Evaluation / Stress Test (feat. Alex Murauski)

Nimdzi LIVE! | Episode 144 | March 03, 2026 | 01:00:52

Show Notes

In this session, we will explore how we evaluated the translation quality of Google’s TranslateGemma model using the MQM (Multidimensional Quality Metrics) framework and a human-in-the-loop review process.

The case study walks through how LLM-generated translations were assessed against a structured error typology, how linguistic quality was benchmarked, and how AI-enhanced workflows can combine automated generation with professional post-editing and evaluation.
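
As background for how penalty-based MQM scoring typically works, here is a minimal sketch in Python. The severity weights (minor=1, major=5, critical=10) and the per-100-words normalization are common conventions used purely as assumptions here; they are not necessarily the configuration of Alconost's MQM tool or of this case study.

```python
from dataclasses import dataclass

# Illustrative severity weights; real MQM deployments vary.
SEVERITY_WEIGHTS = {"minor": 1.0, "major": 5.0, "critical": 10.0}

@dataclass
class MQMError:
    category: str  # e.g. "accuracy/mistranslation", "fluency/grammar"
    severity: str  # "minor", "major", or "critical"

def mqm_score(errors: list[MQMError], word_count: int) -> float:
    """0-100 quality score: 100 minus penalty points per 100 source
    words. The normalization is an assumption for illustration."""
    penalty = sum(SEVERITY_WEIGHTS[e.severity] for e in errors)
    return 100.0 - 100.0 * penalty / word_count

# Example: 250 evaluated words, two minor errors and one major error
errors = [
    MQMError("fluency/grammar", "minor"),
    MQMError("terminology/inconsistency", "minor"),
    MQMError("accuracy/mistranslation", "major"),
]
print(f"{mqm_score(errors, word_count=250):.1f}")  # 97.2
```

Running the example yields 97.2, i.e. 7 penalty points over 250 words, or 2.8 points per 100 words deducted from a perfect score.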

We’ll discuss:

How MQM works in real-world AI evaluation

What kinds of errors LLMs produce across languages

Where AI performs well — and where it still struggles

How to design scalable human-in-the-loop evaluation workflows (see the sketch after this list)

What this means for localization vendors and enterprise buyers
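
On the workflow-design point above, one common scaling pattern is to triage machine output with an automatic quality-estimation (QE) score, so that human MQM annotation concentrates on the segments most likely to contain errors. The sketch below is hypothetical; the threshold, sampling rate, and function names are assumptions, not details drawn from the case study.

```python
import random

# Hypothetical triage thresholds; tune against your own QE model.
QE_THRESHOLD = 0.80     # below this, always send to a human reviewer
SPOT_CHECK_RATE = 0.10  # fraction of "passing" segments still sampled

def route_segment(qe_score: float) -> str:
    """Decide whether a segment needs full human MQM annotation."""
    if qe_score < QE_THRESHOLD:
        return "human_review"  # likely errors: full MQM pass
    if random.random() < SPOT_CHECK_RATE:
        return "spot_check"    # random audit keeps the QE model honest
    return "auto_accept"

# Example: triage a small batch of (segment id, QE score) pairs
batch = [("seg-001", 0.95), ("seg-002", 0.62), ("seg-003", 0.88)]
for seg_id, score in batch:
    print(seg_id, route_segment(score))
```

The random spot checks matter: they keep comparing the QE model against human judgment on segments it would otherwise auto-accept, catching drift before it silently degrades quality.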

The session is based on a real case study conducted by Alconost’s MT evaluation team using our MQM evaluation tool.

Full case:
https://alconost.mt/mqm-tool/case-studies/translategemma/

Other Episodes

Episode 2 | March 16, 2021 | 00:56:42

Process-driven revenue growth for LSPs (feat. John Flannery)

POP UP LIVE EVENT with Tucker Johnson and John E. Flannery. Sales is not magic. Sales is a process. Today we talk with John...


Episode 102 | November 17, 2023 | 00:53:07

Bridge the gap: AI to enhance translation workflows feat. Joe and Helena

Unlock the potential of AI in translation workflows for faster, more efficient, and error-free results by learning to customize and train AI models to...


Episode 36 | September 28, 2021 | 01:11:45

Start Localization at the Design Stage

Design-stage localization is a powerful way to continuously release fully localized products like mobile apps, web apps, and games. Because just like localization, design...
