On March 22-24, 2017, fifty people came together in a former clandestine church in Amsterdam to rack their brains over the question of how the translation industry will have changed by 2022. The story that came out can be read as an ordinary battle between man and machine, with a victory for the latter. But at a deeper layer, there is a fascinating intrigue with many threads about game-changing technologies and trends, and an outcome that is perplexing even for those of us who think that we are behind the wheel today. Be careful what you wish for.
The translation companies of today will not be the same in 2022. We’ll see a split between translation tech and creative networks, data factories and storytelling, platforms and boutiques, perhaps sometimes still operating under the same umbrella, but clearly separated in function. Does this story sound familiar? Perhaps you are thinking about the paradigm shift in the advertising and marketing industry. Once thought to be so creative, it had its own unique place in an environment of factory and office automation. But now, after a few decades of data storms, the business of the prestigious advertising agencies has changed fundamentally.
Marketing is automated and driven by data and clicks. The incredibly rapid rise of online ads, razor-sharp marketing, and pay-per-view through companies like Google and Facebook has turned the landscape upside down. Legendary names like Saatchi & Saatchi, McCann Erickson and J. Walter Thompson give us sweet memories of the days of Mad Men, but the creative directors now all report to giant holding companies acting under dull names like Omnicom, WPP, Interpublic and Publicis.
Similar mergers and acquisitions are likely to happen in the relatively small translation industry in the coming five years, and a convergence with that other creative sector that has fallen victim to data storms - the advertising and marketing industry - would make a lot of sense.
But before we get there, let’s look at the story that developed in Amsterdam just a few weeks ago. The story is broken down into ten chapters, all interconnected, as in every good novel.
Machines will be better at almost everything humans do. As in many other sectors, we expect the robots to come into the translation field to enhance the work we do, expand it and ultimately replace us. It started very simply, with counting the words of the source document to be translated. Now it has expanded into matching jobs with translators, identifying new terminology, optimizing the leverage from translation memories, profiling content, sampling for review, deciding on the type of quality evaluation and the error types to be checked, status tracking and reporting, invoicing and delivery.
Soon, the robots will check quality and productivity and even predict the quality of jobs yet to be performed. They will track the ROI on the translation of each individual message or segment based on how many users viewed the translation. Not to mention, of course, producing the translations themselves, speech translation, pay-as-you-go and all the other innovations driven by algorithms. They are covered in the other mini-chapters of this story. We have heard many start-ups in our sector already refer to themselves as the Ubers of translation. Well, here you go: self-driving translations will be the norm in ‘22.
The datafication of translation started with the article “The Unreasonable Effectiveness of Data”, written by the Google scientists Fernando Pereira, Peter Norvig and Alon Halevy in 2009, or perhaps even earlier, when the TAUS Data Cloud was launched in 2008. Translation learns from data. In those early days there was indeed no better data than ‘more data’. The English-French Google machine translation engine was trained on a corpus of 100 billion words. Now, with the new generation of Neural MT, the need for very large quantities of data belongs to the past. The pursuit of high-quality in-domain translation data will challenge the protectionists and create opportunities for pirates.
Data becomes an obsession, either way, in the translation industry. And it does not stop with translation memory data. We need speech data too. And we want to have the edits and annotations on human as well as machine translations, plus the attributes for content types, industry sectors, translators’ locations, the process applied and the technology used. And why not correlate it with the weather reports, the social graphs of the translators and their eye movement tracking? There is always something we can learn from new data.
The internet giants had a competitive edge in translation data, but they spoiled it by polluting their own fishing grounds with machine translations. Now, the hunt is on for new data marketplaces. The European Commission is investing in the Connecting Europe Facility. But watch out also for the greenfield translation data ventures in China, or perhaps closer to home: the TAUS Data Cloud.
Let’s not be blinded by data and technology. The ubiquitous availability of dumb translation will only drive the demand for stories and messages that trigger the user’s imagination, that build the customer’s brand and engage global and diversified communities. Here is where the synergy between the translation and the advertising and marketing sectors makes the most sense. In an open globalized economy, stories need to be recreated. There is enough bad content out there already that gets generated and translated automatically.
Data is a great help in gaining insights into markets and customers: what they like and don’t like, where they click and don’t click. That allows us to make informed decisions about where to invest and not to invest when it comes to content creation and transcreation. The way we see it, translators become writers, journalists, storytellers, cultural consultants and global brand promoters.
Convergence is the confluence of technologies, business models or markets. It is at these crossroads that the biggest innovations are happening. The best example is still the success of Google Translate: a huge wake-up call for the translation industry. Google had no intention of disrupting the translation industry, and was even uneasy about it. But the enormous popularity of the automatic translate button combined with Google Search opened perspectives that some pioneers had dreamt of as early as the 1970s (see the story about Jean Gachot offering Systran Translate on Minitel in Paris).
In the next five years, convergence is the thing that will drive the most fascinating and imaginative innovations. The Megaphonyaku and the Wearable Translator in Japan are good examples of how the convergence of machine translation and speech technology can help tourists and travelers find their way in a foreign-language environment. The Tokyo Olympics of 2020 will likely bring us more surprising translation innovations. It only takes a bit of imagination to think of similar innovations in business and industrial environments. What would you think of machine translation, text-to-speech and video conferencing all coming together in the Microsoft HoloLens to help the John Deere field engineer in Vietnam interact directly with the factory in Germany when he encounters a defect in the tractor? And in a similar way, would the laboratory assistant in China working with a blood testing machine from Roche Diagnostics not be better off if the device spoke to her in Chinese, with instructions for use or an answer to her question?
The power of such convergence will be particularly strong in all those cases where today we produce translations that nobody reads because they are not there when and where they are needed.
As forecast in the recent TAUS Speech-to-Speech Translation Technology report, we can expect to see well-working solutions for speech-to-speech translation coming onto the market in the next five years. The Skype Translator and the Japanese wearable devices mentioned above are just a few early examples. People are lazy and generally prefer listening and speaking over reading and writing. The technology is there and it is working.
What is needed now are the voices in the required languages and, not to forget, the data to train and automate the speech-to-speech translation systems. This is another new area for language service providers to expand into and develop new services, finding and hiring the talent. Spoken translation will pop up in many apps, on wristbands, in glasses and integrated into products from software companies, manufacturers of automobiles and medical devices, and online services.
To scale up speech-to-speech translation to levels similar to text translation, the services and technology companies in our sector will have to develop new processes, workflows, data collections, and tools. Opportunities arise for new specializations.
A demarcation line will become clearer in the next five years, between bite-sized translations and long-read translations, between product localization and transcreation. Product localization has its own characteristics. Gone are the days of once-a-year big product releases. We already live in an age of continuous delivery. Tasks and jobs become smaller and smaller, down to the segment and phrase level. The concept of a project is blurring. Processes become more agile and integrated with product development.
The quantum leap in machine translation comes to the rescue. But whether or not machine translation is used, translation is always on and virtually real-time. From a service and technology provider perspective, platform integration seems to be inevitable. Several innovative cloud-based platforms with drag-and-drop translation features at various quality levels have appeared on the market in recent years. They will challenge the status quo and force many vendors and buyers to follow.
But what if more than a human touch is needed, when the story needs to be recreated or transcreated in other languages and for other communities? Translators become writers, cultural consultants and brand promoters. They become crucial to the success of a product in a new market. The cascaded supply chains come under pressure and disintermediation will become a theme, again. New ventures like Translate and Create will challenge established translation service houses. Convergence with publishing and with advertising and marketing services seems a natural way to go. It will be interesting to see how the market evolves around this widening gap between automatic and creative translations.
In 2022 we will look back at a five-year sprint of Neural MT that has brought an unprecedented improvement in machine translation quality. If it could be expressed as a percentage, the experts will say that five years of NMT is equal to the twenty years of SMT that preceded it. It is spooky in the sense that even the researchers often don’t know what sparked these rapid improvements. The machines have become self-learning and make a thousand decisions to arrive at better results.
What’s more: the Deep Learning technology takes any data - monolingual, bilingual, audio and even videos - to build versatile engines that can do lip reading (and translation) and even translate between languages for which no direct bilingual data were available. The machine translations are more fluent and natural than what we are used to today, hiding potential inaccuracies from our blind eyes. Neural MT has lent itself very well to the emergence of more speech-to-speech translation apps, thanks to the fluency features and the lower burden on storage capacity.
All of this leads to a reality in which all published content is translated into at least fifty languages by default, or at least virtually, meaning that it is available on demand, in real time and probably free. This is a giant step from today’s reality, even though today (2017) the total volume of machine translation output is already 500 times larger than the total translation production of all human translators together.
Rumour has it that the success of Neural MT is followed by a new breakthrough in MT technology that some already refer to as Quantum MT. The Quantum MT generation may be able to add the precision to machine translation that is needed to bridge the accuracy gap in the current Neural MT systems.
In 2022, the translation industry will be much less constrained when it comes to the spread of languages. Ongoing globalization will continue to open markets. Populist trends in politics will hardly stop the pursuit of more customers around the world by businesses both from Western and Asian countries. In fact, what we will see is that the ease of e-commerce brings customers closer, also to small and medium-sized companies, stimulating further growth in global trade.
China’s One Belt, One Road program, for instance, focuses primarily on the countries along the historic Silk Road, adding quite a few languages to the mix covered by the translation industry. If today a global enterprise covers on average 25 languages, we speculate that this will double in the next five years. The ubiquitous availability of machine translation can fill the gap to a large extent.
The new generation of Neural MT systems not only improves performance but also adds new languages faster and more easily because it is less dependent on very large quantities of translation pairs. The so-called zero-shot approach to building machine translation engines lets developers build engines out of unpaired data from different but related languages. Ironically enough, rather than putting translators out of a job, we think that the availability of bulk machine translation in new languages will stimulate the demand for creative long-story translations. The interest in new cultures and languages will grow as a result of the technology.
The ubiquitous availability of translation as a utility, often free but not always adequate or good enough, triggers new business and pricing models. Why not charge only when the user clicks and consumes the translation? Why not differentiate the pricing depending on content profiles or the number of clicks? Or perhaps introduce a metric that helps us determine when a high-quality human translation is needed?
The language service companies have become used to reinventing their business and refreshing their service portfolios and will continue to do so in the next five years, only more rapidly and radically. The fixed price-per-word model will go away. Translation tech firms will charge based on the use of their platforms. The creative translation firms will instead price on an hourly basis, depending on the talent that is needed.
Particularly in the translation industry, we have lost sight too often of why we translate, whom we translate for, and how the translation is used. Too often translations are produced as an obligatory item in old-fashioned push or publishing models without much care for usability and findability. This must change if we are to follow the trends towards more democratic and user-centric business models. Service providers in the translation sector are likely to play a more important role in consulting on usability and cultural diversity. In fact, there can be tremendous merit in engaging users in defining features, labeling, translation and terminology of services and products in new markets. Users become the new talents.
In 2022, when we look back on the story of the translation industry in the past five years, we will not see such a smooth journey through the ten chapters highlighted in this story. Of course not. There will be hiccups and fall-outs, trials and errors, and severe competition. The providers in the translation industry continue to spend an inordinate percentage of revenue on old-fashioned sales. The industry will suffer from the Bodo Dilemma*: an abundance of tools, technology, data and innovative solutions combined with a painful shortage of talent.
Data becomes increasingly important, while the technology edge is diminishing as a result of more open source solutions and the sharing of the latest advancements through academic papers. The overriding trends towards data, data-sharing, machine learning and the cloud will, on the one hand, lead to fascinating innovations, growth, and maturity but, on the other hand, fire up concerns over privacy and security. We may see translation blocking as a phenomenon, similar to ad-blocking, to protect companies and supply chains from unwanted and unsolicited free translations.
Some providers will prosper by offering secure and closed translation services and technologies on-premise. But in the long run, in the next lustrum of the industry perhaps, the cloud may prove to be irresistible even for the most paranoid, especially if and when the new quantum computer technology delivers the ultimate in security. Who knows?
*The Bodo Dilemma is named after Bodo Vahldieck, Quality Manager at VMware, who expressed his frustration at the TAUS Industry Summit about not being able to find young talent willing and able to come and work with the fantastic localization technology suites at his company.
If you want to know more, subscribe to the TAUS newsletters. Or, if you want to help shape the future of the translation industry, come to a TAUS event this year in Girona (Spain) or San Jose, CA (USA) or join one of the TAUS User Groups.