Customization of Amazon Active Custom Translate with TAUS Data
19 Jan 2022
15 minute read
An independent BLEU score analysis on customization of Amazon Active Custom Translate with domain-specific TAUS datasets.

Online machine translation engines provide easy access to high-quality machine translations. They are optimized for content like news articles and social media posts that users of online platforms frequently translate.

Businesses often need to translate text that has a different style and covers a specific topic. For enterprise use, online machine translation engines offer customization via sets of pre-existing translations that reflect the desired style and topic. This data is often called “parallel data”. TAUS makes such customization data available through the TAUS Data Marketplace and provides all relevant data processing services.

Polyglot Technology LLC independently evaluated the quality of machine translation output from Amazon Translate customized with TAUS Data (using Amazon Translate Active Custom Translation) compared to non-customized Amazon Translate.

Customizing Amazon Translate with TAUS Data improved the BLEU score on every test set: by more than 6 BLEU points on average and by at least 2 BLEU points.

These are significant improvements that demonstrate the advantage of Amazon Translate customized through Active Custom Translation over non-customized Amazon Translate in the Ecommerce, Medical/Pharma and Financial domains.
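To make the "BLEU points" comparison concrete, here is a minimal sketch of how a BLEU gain between a baseline engine and a customized engine could be computed against the same reference translations. It is a simplified, unsmoothed, single-reference BLEU with whitespace tokenization. The function names `corpus_bleu` and `bleu_gain` are hypothetical, and this is not the tooling or tokenization used in the evaluation described here. For reportable scores, an established implementation should be used instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU (0-100), one reference per segment.

    Assumptions for illustration only: whitespace tokenization, no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts, ref_counts = ngrams(hyp_tok, n), ngrams(ref_tok, n)
            # Counter intersection clips each count by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    if min(matches) == 0:
        return 0.0  # without smoothing, any zero precision yields BLEU 0

    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


def bleu_gain(baseline, customized, references):
    """BLEU-point difference: customized output minus baseline output."""
    return corpus_bleu(customized, references) - corpus_bleu(baseline, references)
```

Calling `bleu_gain(baseline_output, customized_output, reference_translations)` on one test set gives that test set's improvement in BLEU points. Averaging these gains across test sets, or taking their minimum, corresponds to the kind of "on average" and "at a minimum" figures quoted above.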

The evaluated datasets are also used in the TAUS Data-Enhanced Machine Translation (DEMT) service, an end-to-end solution for those who want customized MT output for their specific domains without going through the MT training process themselves. With TAUS DEMT, BLEU scores have been shown to increase by 15.3% on average in the Ecommerce, Medical and Financial domains.



After graduating in computer science, Achim Ruopp worked as a translator, which opened his eyes to the myriad possibilities of using computers in human language translation. He has been involved in enabling computers to process different languages, and in the translation business, ever since. After deepening his knowledge with a master’s in computational linguistics, he participated in a wide range of projects in machine translation research and in the practical integration of machine translation into the human translation process. Achim's goal in sharing his knowledge, experience and the latest developments in the field of machine translation is to break down barriers in cross-language communication. Polyglot Technology LLC helps customers succeed with machine translation by enabling them to make the best use of the data available to them, by assessing machine translation quality independently of MT vendors, and by advising customers on how best to integrate the technology with people and processes.
