Loic Dufresne de Virel, Localization Strategist at Intel

“As a founding member of TAUS, we are very pleased with the thought leadership that TAUS has demonstrated with the creation of TAUS DATA, and its explosive growth. I’m looking forward to an even brighter future where we might add multilingual speech data to our current corpus of text data and help jumpstart further development in the area of multilingual speech processing.”

Peter Bourgonje, Infor

“When working with SMT, there’s no data like more data. And TAUS has a whole lot of it!”

Henry Wang, UTH International

“We have been working with TAUS since early 2015. The TAUS data are of high quality and therefore of great value to us. The TAUS personnel always respond quickly to our questions and I really enjoy working with them. We hope that we can establish a close partnership with TAUS in the near future.”

Spence Green, Stanford University

"TAUS data helped us replicate the commercial translation environment. We are not typically able to experiment with domains like software interface and product manual text. By helping align academic research with industry, TAUS is enabling the development of more readily applicable translation technology and the reporting of more interpretable results."

Dr. Daniel Cer, eBay

"We've built our own in-house English to Russian translation system using data from TAUS in combination with our own internal parallel data. We're very pleased with the combined results since merging our own data with the TAUS data produces much better performance than using either dataset in isolation."

Prof. Khalil Sima'an, University of Amsterdam

"With my team we are using the TAUS repository for its size, but also for its diversity of domains of language use. Domain diversity could be perceived as a difficulty for training statistical models. However, in our collaboration with TAUS in the DatAptor project we perceive this as a potential added value for MT users who are interested in extracting out systems from the repository for specific translation tasks for which they have little training data. Our recent work on latent domain models shows how to exploit a rather tiny sample of training data for harvesting multiple orders of magnitude of relevant translation statistics from the TAUS repository in order to build better MT systems. We obtain substantial improvements in MT quality, but we realize that these improvements are as big as the domain diversity that our methods have access to, and the TAUS repository is the most diverse available data we are aware of in this respect."

Maria Pia Montoro, Intrasoft International S.A.

"I use TAUS Search almost every day for technical translations, and thanks to it I can choose the right term by checking the context (always reliable) and be sure I have selected the right term even without being an expert. I just want to tell you: please keep going, don’t stop. I couldn’t work without your precious tool."

Prof. Dr. Jan Hajic, Charles University, Prague (Czech Republic)

"TAUS parallel data from the domain of medicine was successfully used to train English-Czech, English-German and English-French statistical machine translation systems for the Khresmoi project, which aims to develop a multi-lingual multi-modal search and access system for biomedical information and documents."

Prof. François Yvon, University Paris Sud (France)

"I am deeply convinced that the data collected and distributed by TAUS is a major asset for the development of accurate and usable statistical MT engines, and also for future progress of research in SMT, notably due to the high quality of the translations that are collected, but also due to the quality of the associated meta-data. I think that only with the availability of thematically and linguistically coherent data, will we be in a position to deliver high-quality systems for the industry."

Paula Shannon, Lionbridge

"We joined TAUS Data because it's the first of its kind and because of the prominence of other like-minded members. We believe great ideas will spring from this group."