Massively Multilingual (M2)
Massively Multilingual (M2) represents a new conceptual space and the new TAUS Events program where language technology and data converge to support a myriad of applications in all fields of commerce, government, society and science.

Bit by bit, we are cracking the human language code. It’s happening fast and it marks a new beginning in both market expansion and the broader setting of human evolution. Language digitization in general is boosting economies, generating a spate of innovations, and opening up opportunities to push human civilization to higher levels of knowledge and mutual understanding.

What is M2?

We define Massively Multilingual (M2) as a new conceptual space (and hence a program) where language technology and data converge to support a myriad of applications in all fields of commerce, government, society and science. We believe this M2 concept needs to be recognized as a catalytic step forward in a world where AI is advancing rapidly and 2030s communication designs are already on the drawing board.

This space can be characterized as an effort to scale up the promise of language technology exponentially from being a mere productivity enhancer in a relatively small translation industry into a radical but necessary solution to very large-scale communication opportunities and challenges. That’s the narrative driving this new M2 program.

This document outlines the impacts of a successful M2 program in helping reinvent the translation industry and augment the potential for human communication.

New TAUS Program

M2 focuses exclusively on the future of world language activities from a collective business, technology and government perspective.

The launch of this program makes a clean break from fifteen years of TAUS events in which we have been constrained by compromises between old and new technologies, between a cost-per-word and a free-machines economy. Inevitably, we concentrated on incremental gains in efficiency and scale. As such, our programs seemed limited by the size of the existing localization industry. But the future beckons.

This M2 program sets its sights beyond existing constraints and trade-offs. We seek to realize the full promise of language technology successes and future breakthroughs for our companies, our innovation scenarios, and society at large. Our ambition is not limited to a ten percent gain in efficiency or to adding a few more languages to the palette of locales we already manage. We want to radically raise the stakes and become World-Ready. We prefer to calculate in multiples of hundreds, and take the scale of the Human Genome Project as our inspiration.

Impacts of M2 on the Translation Industry

In this new program, we take an exponential view, by looking at the impacts of language and translation technology beyond the local revolution it will cause in our traditional industry. Here’s why.

Historically, language and translation technology has been limited to providing incremental productivity gains in professional activities such as translation, localization, and interpretation. This resulted in an inward-looking industry over-focused on its workflows. The broader implications of a world without language barriers were just for dreamers.

Even today, while witnessing the quantum leaps in neural machine translation (NMT), entrepreneurs in the translation industry tend to seek compromises rather than take a more radical view of the possible impacts on their customers’ and their own activities.

What if businesses and governments really can become massively multilingual in all their communications in the coming decade? And what if there are free machines that generate an abundance of translation? What then could be the broader impact of this M2 scenario on society as a whole, in an ever more connected world of bigger cities and a growing population (especially in Africa and Asia) that is generally wealthier (for many of us at least) and more globalized, yet also fragmented?

The global translation industry finds itself now in a 'mixed economy' state: on one side a vertical cascaded supply chain and on the other the new, flat, free machines model. The speed with which the machines are improving when fed with the right quality and volumes of data makes translation a near-zero marginal cost type of business (in the spirit of Jeremy Rifkin). This means that once the right infrastructure is in place, the production of a new translation costs nearly nothing and capacity becomes infinite.

In translation 1.0, every job is sent down a supply chain of project managers, linguists, reviewers, and others, who all add further cost every time a new translation is needed. It therefore comes as no surprise that the output from the free machines is already more than a thousand times greater than the capacity of the 'old' industry. Google alone translated 300 trillion words last year, compared to an estimated total output of 200 billion words from the professional translation industry, a ratio of roughly 1,500 to 1.

Is today’s hybrid economic model even sustainable? How realistic is it to think that we can just add more capacity and skills into our existing economic model to make a genuine impact on a potentially huge global market?

World-Readiness, in the true sense of the word, may only be feasible if we can shift to something like the 'social commons' of real-time data. This could be achieved by applying an economic model that is peer-directed and scales laterally rather than vertically. The free machines economy equates translation with information: by removing the condition of scarcity, it creates abundance. Currently, translation as a social good, not owned by anyone and free to the user, exists alongside translations that are owned and paid for by governments and corporations. How much longer will these parallel channels need to exist?

A translation service industry that embraces the free machines economic model and focuses on adding value through other, more innovative and probably technology-driven approaches will form a crucial catalyst for positive outcomes from the megatrends likely to remodel the world in the coming decade.

M2 as a Catalyst in a Changing World

From the Global Trends to 2030 report, published by the European Strategy and Policy Analysis System (ESPAS), we highlight four megatrends for which M2 will serve as a catalyst.

M2 in an ever more connected world

The global increase in connectivity seems unstoppable. By 2030, 125 billion devices will be connected to the internet, up from 27 billion in 2017. In this hyper-connected world, communications will take on new forms, speeds and scale. Nearly all will be instantaneous, often processed at the edge, and most human-to-human interfaces will have a supporting machine or bot somewhere in the mix.

Language digitization technology makes any connectivity more intelligent. M2 will therefore be indispensable in this superconductive world, providing vital features such as question-answering, clustering, sentiment analysis, and sorting human input for further processing. It is this kind of automation that scales up human access to both other humans and any other form of content across a huge continuum of spoken and written tongues.

M2 and global economic growth

A second megatrend is continuous economic growth. According to the ESPAS report, by 2030 the majority of the world’s population will earn somewhere between 67% and 200% of the median income in a given country, a rise of 2 billion people compared to today. Most of these people with more disposable income will be located in emerging economies.

By streamlining, personalizing and globalizing e-commerce, customer support, marketing, online learning, and all enterprise functions through world-ready capabilities such as automated knowledge management and speech and text translation in all active languages, M2 will give some 5.3 billion customers a richer voice and more intelligent responses.

M2 and urbanization

By 2030, two-thirds of the world population will be living in cities. A megatrend that accompanies urbanization is the increasing language multiplicity of cities. New York, with a population of 8.6 million, now counts over 300 languages. London and Amsterdam are not far behind, and most cities of over a million or so inhabitants will become more aware that they host numerous languages and will need to address the implications. Whether due to climate-induced migration or the search for a better life, this trend will gradually stimulate new community practices in education, social services, policing, legal affairs and healthcare. The ESPAS report describes how cities in many respects become the focal point of governance, more so than nation states. Cities are much closer to the daily lives and grievances of citizens.

M2-centric experience and skills will therefore be crucial for governments, and city authorities in particular, to establish effective, inclusive communications in all their citizens’ languages and so ensure better lifestyles, facilities, security, crisis management, knowledge sharing, political inclusion, and collective well-being.

M2 and global demographics

The world population as a whole will continue to grow, but more slowly in the developed world. The real peaks will show up in Sub-Saharan Africa and South Asia (Nigeria, Tanzania, Ethiopia, India, and Pakistan, for instance). Europe and North America will hardly change in size.

From a communication perspective, this means that the scope is shifting rapidly to a vast number of new languages. For many companies, reaching the next billion users means expanding the number of languages their localization capacity covers to 150 or more. India alone counts 22 official languages (and more than a hundred other languages that are spoken in the country). Africa counts 140 languages with 11 million or more speakers. All these languages are classified as ‘low-resource’ languages, which means that they are hardly represented on the internet and that the language data needed to train applications such as machine translation are very scarce.

M2 is the key to developing applications for new languages, as we see already demonstrated in experiments with Massively Multilingual MT engines. Large-scale collaboration is needed to collect and aggregate the data that will empower the technologies that will open up the world and break down the language divides as completely as possible.



Jaap van der Meer founded TAUS in 2004. He is a language industry pioneer and visionary, who started his first translation company, INK, in The Netherlands in 1980. Jaap is a regular speaker at conferences and author of many articles about technologies, translation and globalization trends.
