TranslateGemma Quality Evaluation / Stress Test feat. Alex Murauski

Nimdzi LIVE!

Episode 144 | March 03, 2026 | 01:00:52

Show Notes

In this session, we will explore how we evaluated the translation quality of Google’s Gemma model using the MQM framework and a human-in-the-loop review process.

The case study walks through how LLM-generated translations were assessed using a structured error typology, how linguistic quality was benchmarked, and how AI-enhanced workflows can combine automated generation with professional post-editing and evaluation.

We’ll discuss:

How MQM works in real-world AI evaluation

What kinds of errors LLMs produce across languages

Where AI performs well — and where it still struggles

How to design scalable human-in-the-loop evaluation workflows

What this means for localization vendors and enterprise buyers

The session is based on a real case study conducted by Alconost’s MT evaluation team using our MQM evaluation tool.
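To make the MQM mechanics concrete, here is a minimal sketch of how an MQM-style quality score is typically computed from annotated errors. The severity weights (minor = 1, major = 5, critical = 25) and the per-100-words normalization follow common MQM practice; the weights, categories, and normalization used in Alconost's actual tool may differ.

```python
from dataclasses import dataclass

# Common MQM severity weights (assumed defaults; tools may configure their own)
SEVERITY_WEIGHTS = {"neutral": 0, "minor": 1, "major": 5, "critical": 25}

@dataclass
class Error:
    category: str   # e.g. "accuracy/mistranslation", "fluency/grammar"
    severity: str   # "neutral" | "minor" | "major" | "critical"

def mqm_score(errors, word_count, per_words=100):
    """Return a quality score out of 100: penalties normalized per `per_words` words."""
    penalty = sum(SEVERITY_WEIGHTS[e.severity] for e in errors)
    return 100 - (penalty / word_count) * per_words

# Example: a 250-word segment with one major and one minor error
errors = [
    Error("accuracy/mistranslation", "major"),
    Error("fluency/punctuation", "minor"),
]
print(round(mqm_score(errors, word_count=250), 2))  # 97.6
```

The key design point is that MQM separates *what* went wrong (the error typology) from *how bad* it is (the severity weight), so the same annotations can be re-scored under different weighting schemes when benchmarking engines against each other.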

Full case:
https://alconost.mt/mqm-tool/case-studies/translategemma/

Other Episodes

Episode 98

September 29, 2023 00:55:15

Working Globally Is The New Normal feat. Giulia Tarditi

More of us are working remotely than ever before. This has profoundly impacted who your colleagues are, where they're from, where they're based, and...


Episode 59

August 13, 2022 00:55:53

RESULTS: DEMT Evaluation Report (feat. Anne-Maj van der Meer, Amir Kamran, & Achim Ruopp)

The results are in: TAUS Data-Enhanced Machine Translation (DeMT™) improves MT output of the major MT engines with an average of 25% in BLEU...


Episode 95

September 27, 2023 00:47:19

Shifting the Mindset feat. Sasan Banava

When it comes to getting the job done, mindset is key. And Sasan Banava knows all about it. Join our next episode with Sasan...
