Case Studies
MT QUALITY ESTIMATION

TAUS Estimate API as the Ultimate Risk Management Solution for a Global Technology Corporation

Our client is one of the largest technology companies in the world. Based on sample texts from the client, TAUS generated a large dataset and customized a quality prediction model that achieved an accuracy rate of 85%.

MT QUALITY ESTIMATION

Unlocking Efficiency: Leveraging DeMT Estimate API to Optimize MT Workflows

MotionPoint, a global technology solutions company, partnered with TAUS to determine whether Machine Translation Quality Estimation (MTQE) models could be used to remove human post-editing (PE) from certain machine translation (MT) workflows.

MT QUALITY ESTIMATION

DeMT™ Estimate API enables saving up to 76% in lead time and costs

Yamagata Europe, a leading language service provider, partnered with TAUS to streamline its translation process for a major automotive client. By implementing a customized Machine Translation Quality Estimation (MTQE) model, the company reduced post-editing (PE) effort by up to 76% and gained valuable insights into translation quality, saving both time and costs.

SPEECH DATA

Speech Data Collection to Increase Performance & Diversity in Voice-based AI Systems

For a multinational technology corporation, TAUS curated a diverse team of workers who created over 1,400 hours of speech data in English (GB) across nine specific dialects, with no repeat submissions from any one person. Quality in speech data is closely tied to the diversity of accents and demographics of the community that provides the data. That’s where the TAUS Human Language Project Platform can help.

MACHINE TRANSLATION

Enabling 15% Increase in Number of Perfect Translations for ING Hubs Poland

Our client is ING Hubs Poland, a leading multinational banking and financial services corporation. The TAUS datasets increased the number of translations rated perfect by human testers by 15%, and the output of the engine trained with TAUS datasets is expected to be better than that of the untrained engine 95% of the time in the Anti-Money Laundering (AML) and Human Resources (HR) domains.

MACHINE TRANSLATION

Customizing MT in a Narrow Domain with 19% Quality Improvement

TAUS provided 172,980 segments of training data in the FR-DE language pair, covering a very specific area of the broader legal domain, to Custom MT, one of the newest leading MT services companies. Custom MT measured a 19% increase in BLEU score (+7.23 BLEU points) in the output for the French-German language pair.

MT CUSTOMIZATION

Customization of Amazon Active Custom Translate with TAUS Data

Polyglot Technology LLC independently evaluated the quality of machine translation output from Amazon Translate customized with TAUS Data against the non-customized output. Customizing Amazon Translate with TAUS Data improved the BLEU score on every test set, by more than 6 BLEU points on average and by at least 2 BLEU points.

MACHINE TRANSLATION

Improving Adaptive MT Output by 22% in BLEU Score Across Five Languages

TAUS provided Pangeanic with 1.8 million words of MT training data for the English to Spanish, German, Polish, Russian, and Chinese language pairs. Using this data, Pangeanic built COVID-19 domain-specific NMT models. On average, a 22% BLEU score improvement was achieved, with a 50% increase for the English-Russian language pair.

DATA ANNOTATION

Data Annotation to Optimize Searchability in E-Commerce

For our client, a multinational e-commerce corporation, TAUS formed a community of 200+ contributors, selected for their affinity with product categories ranging from make-up to collectible coins, to annotate data in several European languages. The annotated data was to be used to train the client's high-tech ML system and optimize webshop functionalities such as searchability.

DATA GENERATION

Domain-Specific Training Data Generation for SYSTRAN

SYSTRAN, a leading AI-based translation technology company, partnered with TAUS to use the TAUS Corona datasets to produce twelve translation models. After training with these datasets, the SYSTRAN engines improved by 18% on average across all twelve language pairs compared with the SYSTRAN baseline engines.

MACHINE TRANSLATION

Corona Datasets Used by Google, Naver Labs and University of Catalonia

Google, Naver Labs and the University of Catalonia used the corona-specific datasets provided by TAUS to build MT models. TAUS provided six datasets containing a total of 3,403,681 segments as part of an industry collaboration effort that it initiated.

Have a similar challenge you'd like to address? Let's partner to tailor a project that advances your machine learning efforts.