Polyglot Technology LLC
TAUS asked Achim Ruopp, owner of Polyglot Technology LLC, to independently evaluate the machine translation quality of Amazon Translate customized with TAUS Data (using Amazon Translate Active Custom Translation) compared to non-customized Amazon Translate.
The BLEU Score Automatic Metric
TAUS Test Set
TAUS selects the machine translation customization data by querying its large repository of high-quality translation data with a domain-specific text. The resulting customization dataset is then split at random into a larger training set for Amazon Translate Active Custom Translation and a smaller 2,000-sentence test set, which was provided to Polyglot Technology for evaluation with the BLEU score.
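The random split described above can be sketched as follows. This is a minimal illustration, not TAUS's actual tooling: the function name `split_dataset`, the fixed `seed`, and the representation of the dataset as a list of (source, target) sentence pairs are all assumptions made for the example.

```python
import random


def split_dataset(pairs, test_size=2000, seed=0):
    """Randomly split sentence pairs into a training set and a held-out test set.

    `pairs` is assumed to be a list of (source, target) tuples; the name,
    the default seed, and the data layout are hypothetical.
    """
    if test_size >= len(pairs):
        raise ValueError("test_size must be smaller than the dataset")
    shuffled = list(pairs)                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)       # seeded shuffle for reproducibility
    test_set = shuffled[:test_size]             # smaller held-out test set
    training_set = shuffled[test_size:]         # larger set used for customization
    return training_set, test_set


# Toy data standing in for the customization dataset.
pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(10000)]
training_set, test_set = split_dataset(pairs)
print(len(training_set), len(test_set))
```

Keeping the test sentences out of the training set matters here: the BLEU comparison is only meaningful if the customized engine has not seen the test translations.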
Summary Evaluation Results
The customization of Amazon Translate with TAUS Data improved the BLEU score for all language pairs (as evaluated with the test sets):
- by more than 6 BLEU points, or 15.3% on average
- by 2 BLEU points at a minimum
For an even more detailed evaluation including analysis of most improved translations, please see the individual evaluation reports for each language pair and domain available on request from firstname.lastname@example.org.
Figure 1: BLEU Scores for the TAUS Test Sets for the E-Commerce Domain
TAUS Matching Data
Interpreting BLEU Scores
The paragraphs in this section are adapted from Google AutoML Translate's documentation page on evaluation, which is licensed under the Creative Commons 4.0 Attribution License.
BLEU (BiLingual Evaluation Understudy) is a metric for automatically evaluating machine-translated text. The BLEU score is a number between zero and one that measures the similarity of the machine-translated text to a set of high-quality reference translations. A value of 0 means that the machine-translated output has no overlap with the reference translations (low quality), while a value of 1 means there is perfect overlap with the reference translations (high quality).
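To make the 0-to-1 range concrete, here is a simplified sketch of a corpus-level BLEU computation. It is an illustration only and not the tool used in this evaluation (the report does not name one): it assumes whitespace tokenization, a single reference per sentence, n-grams up to length 4, and no smoothing. The function names `ngram_counts` and `corpus_bleu` are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified BLEU in [0, 1]: one reference per hypothesis, no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp_tokens, n)
            ref_counts = ngram_counts(ref_tokens, n)
            # Clipped matches: an n-gram counts at most as often as in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no overlap at some n-gram order
    # Geometric mean of the n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


reference = ["the cat sat on the mat"]
print(corpus_bleu(["the cat sat on the mat"], reference))        # perfect overlap
print(corpus_bleu(["the cat sat on the mat today"], reference))  # partial overlap
print(corpus_bleu(["completely different words here"], reference))  # no overlap
```

BLEU is often reported multiplied by 100; assuming that convention applies here, a gain of 6 BLEU points corresponds to a 0.06 increase on the 0-to-1 scale.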