Machine Translation. We can do better!


TAUS recently launched DeMT™. This article provides context and more information on this recipe for improved MT.

MT has come a long way. After seventy years of research, the technology is now moving into production. And yet we are missing out on the full opportunity, because developers are preoccupied with the idea that massive models will magically solve the remaining problems, and because operators in the translation industry are slow to develop new MT-centric translation strategies. This article is an appeal to everyone involved in the translation ecosystem to come off the fence and realize the full benefits of MT. We can do better!

From grassroots to MT industrialization

Gartner, in their AI Landscape 2020 blog article, declared industrialization and democratization of AI the two dominant trends for AI in 2021. All the hard work and experimentation with models and data by the early adopters and MT gurus has finally paid off. We are shifting from a bottom-up grassroots movement to a top-down directive from the executive suites. Enterprise-wide adoption of MT becomes part of the AI or digital transformation program and now falls under the responsibility of the CIO or CTO. The technology is there to simply translate everything, launch new platforms and talk to many more users in their own languages. But then, too often, we realize that the quality of automated translation is not on par with enterprise production requirements in terms of reliability and trust. The technology may be there, but the delivery is not yet good enough. We can do better, if we set the right priorities in our strategies.

The big-tech MT developers

Most MT is sourced from the big tech companies Amazon, Google and Microsoft. They have the scale and the capital to develop the massive models that return the amazing results we have seen in the last few years and that have been the driving force behind the industrialization of MT. The rapid improvements have been feeding an unprecedented optimism on the West Coast of the United States that the remaining issues, such as hallucinations, catastrophic errors, and domains and languages not yet covered by the technology, will all go away within five to ten years.

One disturbing factor, though, is that the massive MT models are black boxes. Even the researchers who train them can't tell exactly why one performs better than another. The model work is glamorous and cool, but the intellectual insight that would allow us to reproduce bugs and remove them is hard to get. Getting models to work in production requires more data engineering than research. Well-focused data engineering can bring in the nuances that are required for robust performance in a real-world domain. The problem is that most researchers like to do the model work, not the data work. (See this Google Research article: Data Cascades in High-Stakes AI.)

Many MT platforms provide customization features allowing users to upload translation data and take care of their own data engineering work. These features, however, as TAUS found out, require a lot of experimentation and experience.* In-domain training data has an unpredictable, often low and sometimes even negative impact on the performance of the engines. It seems that the big tech companies treat their customization features as stop-gap measures for the time it takes until human parity is reached. Five to ten years?

The big tech MT developers can do better to support and facilitate the industrialization of MT. Here is how:

  • Don’t bet the future entirely on the brute force of the massive models.
  • Improve your customization features to better support your business customers in building production-ready engines.

MT users

The technology breakthroughs of the past years have caused a rise in the adoption of MT. Nothing spectacular or revolutionary, though. The MT engines are simply plugged into the existing workflows and used as complementary sources of translation matches. Translators see their tasks shifting more and more to post-editing. The new technology is mainly used to help the business drive for continuous efficiency gains and lower word rates, very much in the tradition of thirty years of leveraging translation memories.

What we miss in the translation industry overall is blue-sky thinking. Putting aside a few start-up innovators, most of the actors in the translation industry have taken a defensive approach towards MT technology. The result is a generally negative sentiment, with emphasis on cost reductions, compromises in translation quality, disruption in the workforce and pessimistic perspectives on the future of the industry. The problem is that we are all so deeply rooted in our traditions that we can’t see beyond the present.

In their Market Guide for AI-Enabled Translation Services (June 2022), Gartner recommends that companies divide content into “tiers” of “acceptable translation quality” and develop new end-to-end workflows that take into account the automation enabled by MT technology. Some start-up innovators have done exactly that, by putting MT technology at the core of a brand-new real-time multilingual business solution.

MT technology can be a force multiplier for those operators in the translation industry that are capable of shifting from a defensive to a proactive approach.

MT users, LSPs and enterprises can do better to support and facilitate the industrialization of MT. Here is how:

  • Focus on data engineering. Do not accept that the quality output of, among others, the Amazon, Google, Microsoft and Systran engines is as good as it can get. Significant improvements can be made using core competencies such as domain knowledge and linguistic expertise.
  • Design end-to-end MT-centric workflows. Do not think of MT as just an add-on to your current process and workflow but make it the core of new solutions serving new customers, translating content that was never translated before.
  • Provide new opportunities for linguists. Post-editing is not the end-game. Create new perspectives by leveraging intellectual insights for better automation.

TAUS recipe for better MT

TAUS has been an industry advocate for translation automation since 2005. We have developed a unique recipe for better MT, as outlined below.

1. Evaluate

The first step in every MT project is to measure and evaluate the translation quality. Most MT users only measure and compare the baseline engines. TAUS takes the evaluation a step further: we train and customize different MT engines and then select the engine with the maximum achievable quality in the customer’s domain. See TAUS DeMT Evaluate.
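The engine-selection step above can be sketched in a few lines. The metric and the sample data below are illustrative assumptions only (a simplified BLEU over a toy test set, not TAUS’s actual evaluation methodology): score each candidate engine’s output against in-domain reference translations and keep the best-scoring engine.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(candidates, references, max_n=2):
    """Simplified corpus-level BLEU: geometric mean of clipped n-gram
    precisions up to max_n, with a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        matched = total = 0
        for cand, ref in zip(candidates, references):
            c, r = ngrams(cand.split(), n), ngrams(ref.split(), n)
            matched += sum((c & r).values())  # clipped n-gram matches
            total += sum(c.values())
        precisions.append(matched / total if total else 0.0)
    if min(precisions) == 0.0:
        return 0.0
    cand_len = sum(len(c.split()) for c in candidates)
    ref_len = sum(len(r.split()) for r in references)
    bp = 1.0 if cand_len >= ref_len else math.exp(1 - ref_len / cand_len)
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Toy in-domain test set and hypothetical engine outputs.
references = [
    "the patient shows no adverse reaction",
    "take one tablet twice daily",
]
engine_outputs = {
    "engine_a": ["the patient shows no adverse reaction",
                 "take one tablet twice a day"],
    "engine_b": ["patient has no bad reaction",
                 "eat one pill two times daily"],
}
best = max(engine_outputs, key=lambda e: corpus_bleu(engine_outputs[e], references))
```

In practice an evaluation would use a mature metric implementation and a much larger, domain-representative test set, but the selection logic stays the same.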

2. Build

The second step is the creation of in-domain customer-specific training datasets, using a context-based ranking technique. Language data are sourced from the TAUS Data Marketplace, from the customer’s repositories or created on the Human Language Project platform. Advanced automatic cleaning features are applied. See TAUS DeMT Build.
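TAUS does not publish the details of its context-based ranking technique, but the general idea can be sketched: score candidate sentence pairs by lexical similarity of the source side to an in-domain seed corpus, after a basic automatic cleaning pass. All function names, sample data and thresholds below are illustrative assumptions.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def clean(pairs, max_ratio=2.0):
    """Drop empty pairs and pairs with an extreme source/target length ratio."""
    kept = []
    for src, tgt in pairs:
        ls, lt = len(src.split()), len(tgt.split())
        if ls and lt and max(ls, lt) / min(ls, lt) <= max_ratio:
            kept.append((src, tgt))
    return kept

def rank_by_domain(pairs, seed_corpus):
    """Order (source, target) pairs by similarity of the source side to an
    in-domain seed corpus; the most in-domain pairs come first."""
    seed_vec = tf_vector(" ".join(seed_corpus))
    return sorted(pairs, key=lambda p: cosine(tf_vector(p[0]), seed_vec),
                  reverse=True)

# Hypothetical candidate pairs and a medical-domain seed corpus.
pairs = [
    ("the court hereby rules", "la cour statue"),
    ("the patient takes the tablet", "le patient prend le comprime"),
    ("noise", ""),
]
seed = ["the patient shows adverse reaction", "administer the tablet daily"]
ranked = rank_by_domain(clean(pairs), seed)
```

A production pipeline would use stronger similarity signals (embeddings, language-model scores) and far more aggressive cleaning, but the shape of the selection step is the same: filter, score against the domain, keep the top-ranked pairs for training.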

3. Translate

The third step is generating the improved machine translation. Demonstrated improvements range from 11% to 25% over the baseline engines from Amazon, Google and Microsoft. In many cases, this brings the quality up to a level equal to human translation or post-edited MT. Some customers refer to DeMT™ Translate as ‘zero-shot localization’, meaning that translated content goes directly to customers without post-editing. TAUS offers DeMT Translate via an API to LSPs and enterprises as a white-label product.
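The DeMT Translate API schema is not public, so the endpoint URL and field names below are purely hypothetical. The sketch only illustrates how an LSP might wrap a white-label translation API behind a thin client that builds authenticated JSON requests.

```python
import json
from urllib import request

# Hypothetical endpoint: the real DeMT Translate API schema is not public.
API_URL = "https://api.example.com/v1/translate"

def build_translate_request(text, source_lang, target_lang, api_key):
    """Build an authenticated POST request for a white-label translation
    API (field names are illustrative assumptions)."""
    payload = json.dumps({
        "text": text,
        "source": source_lang,
        "target": target_lang,
    }).encode("utf-8")
    return request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# The request would be sent with urllib.request.urlopen(req) in production.
req = build_translate_request("Hello world", "en", "de", "demo-key")
```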

* MT customization features require a lot of experimentation and experience. See TAUS DeMT™ Evaluation Report and contact a TAUS expert to learn how to best work with MT customization.


Jaap van der Meer founded TAUS in 2004. He is a language industry pioneer and visionary, who started his first translation company, INK, in The Netherlands in 1980. Jaap is a regular speaker at conferences and author of many articles about technologies, translation and globalization trends.
