November 5, 2020, Amsterdam - Today, TAUS is going live with the first release of the Data Marketplace. The Data Marketplace is a platform for all stakeholders in the global language and AI industries to trade, clean, cluster and curate language data. The development of the Data Marketplace is a collaborative project between TAUS, Translated, and FBK Trento, co-financed by the European Union under the Connecting Europe Facility program. It aims to disintermediate the language data supply chain by connecting data producers and consumers directly.
The Data Marketplace is built on top of the TAUS Data Cloud, which was launched in 2008. What is new in the Data Marketplace is the business model, which provides a free market for language data trading, and the features that help users get the most useful, high-quality language data. The Data Marketplace allows owners and producers of data to monetize their data, and it allows trainers and developers of MT and AI systems to shop for and buy high-quality data tuned to their required domains and specifications. The Data Marketplace currently offers data cleaning and anonymization services. A Matching Data, or clustered search, feature that allows users to build their own corpora tuned to their domains and needs will be part of the next release of the Data Marketplace in June 2021.
“The Data Marketplace establishes a more level playing field for the thousands of companies in the world that want to optimize automatic translation and invest in language-based AI,” says Jaap van der Meer, Director of TAUS. “Today access to sufficient language data is limited to a few big-tech companies who can afford to invest. The Data Marketplace makes access to language data universal and affordable by connecting language data producers and consumers directly.”
Sellers on the Data Marketplace come from a range of backgrounds, including publishing companies, data companies, language service providers and buyers, and translators, all looking for ways to monetize the language data they have collected or generated over time.
“The Data Marketplace comes just at the right time to support the grand shifts in the translation and AI industries and to realize the best results out of the investments in translation automation,” says Marco Trombetti, CEO at Translated. “The Data Marketplace is a great opportunity for translators to monetize their resources as well as a step forward for MT players in support of language expansion and domain diversification.”
"The Data Marketplace is an unprecedented opportunity to foster the exchange of datasets and make them available to all the players of the translation and AI markets. The TAUS Data Marketplace gives the possibility to access clean and domain-specific resources that speed up the creation of high-quality translation models to quickly satisfy the needs of the customers," says Marco Turchi, Head of MT Unit at FBK.
The Data Marketplace launches with the largest collection of language data (more than 35 billion words in 600+ language pairs), and it can help users of MT and AI expand into new languages, new domains and new applications very quickly.
“It had never occurred to me that my hard-earned solutions to thorny translation issues could one day be made available to the public, thus allowing me to share useful knowledge with colleagues, and also offering me some extra reward for my past efforts,” says Nicoletta Aresca, Translator and seller on the Data Marketplace.
The Data Marketplace is built on a strong legal framework that complies with privacy policies in Europe and North America. The legal review committee therefore places great emphasis on transparency about the origin and usage of language data.
TAUS was founded in 2005 as a think tank with a mission to automate and innovate translation. Ideas transformed into actions. TAUS became the language data network offering the largest industry-shared repository of data, deep know-how in language engineering and a network of Human Language Project workers around the globe. Our mission today is to empower global enterprises and their service and technology providers with data solutions that help them to communicate in all languages, faster, better and more efficiently.
Translated is a translation company that pioneered the use of artificial intelligence to help professional translators. It was founded in 1999 by computer scientist Marco Trombetti and linguist Isabelle Andrieu and has always focused on a powerful combination of human creativity and machine intelligence to craft consistent quality translations at speed. Today, it is one of the most successful online translation companies in the world, with 180,000 clients, offering translation in 177 languages in 40 areas of expertise. Translated has been recognized on several occasions, including at the TAUS Innovation Contest. In 2015, the European Commission recognized Translated’s CAT tool Matecat as one of the best AI research projects of the previous 7 years. In 2017, the Financial Times included Translated as one of Europe’s fastest-growing companies.
Fondazione Bruno Kessler (FBK) contributes to the Data Marketplace project through its Machine Translation research unit, which pursues research in the field of machine and speech translation and develops technology to enable multilingual communication and access to large-scale spoken and written content. The MT unit includes about 10 people, counting staff researchers, postdocs and PhD students. In recent years the unit has been involved in several EU-funded initiatives covering its different research areas (e.g. speech translation and machine translation). Besides its long record of scientific results, the MT unit is also known for developing NLP software and language resources, and for organizing international evaluation campaigns and conferences.
Jaap van der Meer founded TAUS in 2004. He is a language industry pioneer and visionary who started his first translation company, INK, in the Netherlands in 1980. Jaap is a regular speaker at conferences and the author of many articles about technologies, translation and globalization trends.