High-performance Search Technology
Corpus Optimization and Cleaning
JP Barraza, CIO at SYSTRAN
The training datasets mentioned are available in the TAUS Data Library.
Talk to our Data Experts, who can help you find the right type of data for your next project. Niche domains or rare languages? We have a large suite of services to generate your dataset.
TAUS Estimate API as the Ultimate Risk Management Solution for a Global Technology Corporation
Using text samples from one of the largest technology companies in the world, TAUS generated a large dataset and customized a quality prediction model, which achieved an accuracy rate of 85%.
Speech Data Collection to Increase Performance & Diversity in Voice-based AI Systems
For a multinational technology corporation, TAUS assembled a diverse team of workers who created over 1,400 hours of English (GB) speech data in nine specific dialects, with no repeat submissions from any one person.
Customization of Amazon Active Custom Translate with TAUS Data
Customizing Amazon Translate with TAUS Data improved the BLEU score on every test set, by more than 6 BLEU points on average and by at least 2 BLEU points.