Powering Automated Translation in Times of Crisis
1. Information Availability
2. Sufficient Language Data
3. In-Domain Data
Rapid Domain Adaptation for Machine Translation with Monolingual Data for Google
Training of Neural Machine Translation Systems by the Open University of Catalonia
Antoni Oliver González, Director of the Translation and Technologies Postgraduate Degree Course, explains that they have been using these corpora along with other available medical corpora and glossaries to train neural machine translation systems. These systems are used to translate abstracts of scientific papers about COVID-19.
Development of a Multilingual Neural Machine Translation Model for Biomedical Data by Naver Labs Europe
Vassilina Nikoulina from the Naver Labs Europe Natural Language Processing Group explains that they have used these corpora in a multilingual, multi-domain neural machine translation model that is specialized for biomedical data and enables translation into English from five languages (French, German, Italian, Spanish, and Korean). The TAUS Corona Crisis Corpora were used in combination with other corpora.
TAUS Estimate API as the Ultimate Risk Management Solution for a Global Technology Corporation
Based on examples of texts from one of the largest technology companies in the world, TAUS generated a large dataset and customized a quality prediction model. The accuracy rate achieved was 85%.
Speech Data Collection to Increase Performance & Diversity in Voice-based AI Systems
For a multinational technology corporation, TAUS curated a diverse team of workers who created over 1,400 hours of English (GB) speech data in nine specific dialects, with no repeat submissions from any one person.
Customization of Amazon Active Custom Translate with TAUS Data
Customizing Amazon Translate with TAUS data improved the BLEU score on every test set, by more than 6 BLEU points on average and by at least 2 BLEU points.