Datafication in China: A TAUS Webinar

17 October, 2017, 04:00 - 05:00 pm CEST

Overview

The datafication of translation started with the article ‘The Unreasonable Effectiveness of Data’, written by the Google scientists Fernando Pereira, Peter Norvig and Alon Halevy in 2009, or perhaps even earlier, when the TAUS Data Cloud was launched in 2008. Translation learns from data. In those early days there was indeed no better data than ‘more data’. The English-French Google machine translation engine was trained on a corpus of 100 billion words. Now, with the new generation of Neural MT, very large quantities of data belong to the past. The pursuit of high-quality in-domain translation data will challenge the protectionists and create opportunities for pirates.

Data has become an obsession, either way, in the translation industry. And it does not stop with translation memory data. We need speech data too. And we want to have the edits and annotations on human as well as machine translations, plus the attributes for content types, industry sectors, translators’ locations, the process applied and the technology used. And why not correlate it with the weather reports, the social graphs of the translators and their eye movement tracking? There is always something we can learn from new data.

The internet giants had a competitive edge in translation data, but they spoiled it by polluting their own fishing grounds with machine translations. Now the hunt is open for new data marketplaces. The European Commission is investing in the Connecting Europe Facility. But watch out also for the greenfield translation data ventures in China, or perhaps closer to home: the TAUS Data Cloud.

Agenda

  1. The story of the translation industry in 2022: Datafication of Translation. Quantity or Quality? By Jaap van der Meer, director of TAUS
  2. Presentation. By Henry Wang (UTH International)
  3. TM marketplace in China. By Jing Zhang (Tmxmall)
    First, the current situation of TM production, management and usage in Chinese translation companies will be introduced. Then I will share some information about how TM trading is done in China, both online and offline. Finally, the Tmxmall TM marketplace, including its TM P2P trading platform, will be introduced.
  4. Q&A with panelists
  5. Next steps: TAUS reports and User Groups
  6. Questions & answers

 

Event Properties

Event Date 17-10-2017 4:00 pm
Event End Date 17-10-2017 5:00 pm
Capacity Unlimited
Individual Price Free
Created By Anne-Maj van der Meer
Registration link https://attendee.gotowebinar.com/register/4379701543378946562