Jack Welde: Unlocking Content Value for Every Last Person on Earth


This month we asked Jack Welde, the CEO of the US translation platform Smartling, about billion-dollar valuations, technology futures and the meaning of data. We also got his take on blockchain.

Does a fragmented market such as translation services pose a special problem for corporate growth?

Companies grow organically and inorganically, starting with core competencies and then buying in more. That has largely been the business model for the companies in the industry with the highest growth rates. So a fragmented market actually means more opportunities to grow in different ways, both internally and through M&A.

A fragmented market means many market segments. And different market segments have different needs, like to buy in different ways, must be reached in different ways, and require different solutions. For example, Smartling has had a lot of success in the fast-growing, venture-backed startup segment, as well as the segment of large enterprises with complex global content requirements. We have invested heavily in these two segments and focused on providing solutions for them. Now we are expanding into additional segments, with slightly modified solutions for each.

What are the key tech changes that are pushing you forward today?

There is clearly a greater end-user expectation for more transactional-type interfaces to get stuff done – people want to click, tap, or swipe to hail a car, order a coffee, or get a date. But what hasn't changed is the constant business focus on increasing engagement and reaching more people. We have now reached an interesting point in history: companies can literally reach the last person on earth, if they have an appropriate product and it’s properly localized. So we work with companies who are working hard to try to embrace the next billion people. With the right offering, in the right languages, companies can deliver their products or services to that last person.

A second big change in our industry is the rise of neural MT, which is gradually eliminating the demarcation between human and machine translation. It is creating more opportunities for a productive closed loop between machine and human, including machine-enabled tools that make the human more productive, and human inputs that make the machine more accurate going forward.

There has also been an explosion of data. Our individual clients are literally making billions of decisions a year on which content to translate, into which languages, with machine versus human, with which workflows, etc. This raises a slew of questions: Which engine do I use? Which workflow? What is the most optimized aspect of the whole process? Can I pre-publish before internal reviews are complete? How do I remove bottlenecks such as approving things for quotes?

Doing this over millions of words, on a frequent basis, into many languages, with many integrations requires software solutions. So more and more companies want - and need - to embrace software-type solutions that let them configure their platforms to automate every aspect of the content production process, while humans manage the exceptions.

This is a fairly obvious development, and Smartling joined an industry that was ripe for introducing software solutions to better manage global content production.

Do you see opportunities opening up for new types of language services?

When we started out nine years ago, we saw that more activities were going digital (for example, how many people still read a pulped-tree newspaper?).

How does this impact companies wanting to create content? The answer was that adding any sort of useful contextual value to translation content was going to become important, from transcribing videos to indexing terms.

So when people say that IoT or voice is going to be the new big thing, my feeling is that maybe one or two companies get to work with the leaders, but this doesn’t make voice into a big market. The real issue is identifying where most content is being translated, as this will be the fatter part of the market.

Regulated industries such as legal, pharma, insurance and government will continue to have larger margins due to the nature of their content. More of that content will continue to be used to engage more people, and that in turn means in more languages - the language people speak is the most basic vector of personalization.

We are all data-driven today. What does the “D” word cover for you?

One of the first decisions we made was to store away every piece of data we could imagine and take advantage of it on behalf of and for the benefit of our customers. So we have segmented all data by customer and built a number of ways of using that data-driven approach to help them. We have close to 1,000 customers using this method.

This means we don't and can’t share that operational data, but we can create a number of ways of using technology to produce better outcomes. For example, organizing content and profiling it for translation, leveraging information for translation, optimizing workflows, helping customers estimate the quality of NMT engines, and benchmarking performance and outcomes for customers.

So I'm super bullish on data and very interested in many of the widely available open-source machine learning engines and platforms such as TensorFlow and PyTorch. These enable better pattern matching than humans can attain, and therefore offer real advantages. So we’re big fans of datapoints that help us make decisions.

This means that for us the concept of a data market is not so useful. Would it mean that people could take credit for each segment they produced? There are no good examples of that. And that turns into a conversation about the "magic of blockchain".

I’ll make a prediction that in the next five years no company (excluding an initial coin offering (ICO) driven by a public token backed by a private coin to raise money) in our industry will be using blockchain as a sort of public ledger to credit people with translations they’ve done that couldn't be done more effectively or efficiently with a standard SQL database! Blockchain is unnecessary, inefficient overkill.
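To make the "standard SQL database" alternative concrete, here is a minimal sketch of a translation credit ledger using Python's built-in sqlite3 module. It is an illustration only, not anything Smartling has described: the table name, the columns, and the helper functions are all assumptions made for the example.

```python
import sqlite3


def build_ledger(conn: sqlite3.Connection) -> None:
    """Create a hypothetical credit table: one row per translated segment and language."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS translation_credits (
               segment_id  TEXT NOT NULL,
               target_lang TEXT NOT NULL,
               translator  TEXT NOT NULL,
               word_count  INTEGER NOT NULL,
               PRIMARY KEY (segment_id, target_lang)
           )"""
    )


def credit(conn, segment_id, target_lang, translator, word_count):
    """Record who translated a segment; the primary key rejects double credit."""
    conn.execute(
        "INSERT INTO translation_credits VALUES (?, ?, ?, ?)",
        (segment_id, target_lang, translator, word_count),
    )


def words_by_translator(conn):
    """Total credited words per translator, as a plain dict."""
    cursor = conn.execute(
        "SELECT translator, SUM(word_count) FROM translation_credits "
        "GROUP BY translator ORDER BY translator"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_ledger(conn)
    credit(conn, "seg-1", "de", "ana", 12)
    credit(conn, "seg-2", "de", "ben", 7)
    print(words_by_translator(conn))
```

The primary key on segment and target language is one simple way to ensure a segment is credited only once per language; a real system would presumably add timestamps, revisions, and access control.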

How do you evaluate staff for your translation process?

We’re always looking for really high-quality, disciplined translators who understand the verticals they operate in, have the right experience, and do high-quality work. We have a very rigorous testing and ongoing evaluation process, and fewer than 5% of the candidates pass the test for working with us and our Global 500 companies.

What we think is important is for translators to work hard to develop their domain expertise and really hone their craft as expert translators. As MT begins to move more upmarket, this will drive out less experienced, less capable translators, and enable the best translators to specialize in what they can do well -- for a higher premium.

One interesting shift is that the current generation is very comfortable with working digitally, so I don't have to mess with such things as project creation, TMs, updating glossaries, etc. Translation practice is a more transactional operation, but we are seeing that the older generation still wants to use traditional CAT tools, while the newer generation has embraced the cloud and centralized tools for data sharing.

Why not use in-house MT?

That is not something we want to invest in right now. We integrate with all MT systems but I think it is very challenging as a message to say you have invested in only one MT engine. We have an “MT router” that allows us to put in content and in near real-time assess which MT is best for this content, use case and job at hand.
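The idea of an "MT router" can be sketched in a few lines. This is a hedged illustration under stated assumptions, not Smartling's implementation: the engines are hypothetical callables that return a translation, and the quality estimator is a hypothetical function returning a score between 0 and 1. The toy engines and toy estimator in the demo exist only to make the sketch runnable.

```python
from typing import Callable, Dict, Tuple

# Hypothetical engine: takes source text, returns a translation.
Engine = Callable[[str], str]
# Hypothetical quality estimator: (source, translation) -> score in [0, 1].
Estimator = Callable[[str, str], float]


def route(
    source: str, engines: Dict[str, Engine], estimate: Estimator
) -> Tuple[str, str, float]:
    """Run every engine on the source and return (name, translation, score)
    for the candidate with the highest estimated quality."""
    if not engines:
        raise ValueError("at least one engine is required")
    best = None
    for name, translate in engines.items():
        candidate = translate(source)
        score = estimate(source, candidate)
        if best is None or score > best[2]:
            best = (name, candidate, score)
    return best


if __name__ == "__main__":
    # Toy stand-ins: a real router would call MT services and a trained estimator.
    toy_engines = {
        "engine_a": lambda s: s.upper(),
        "engine_b": lambda s: s[:3],
    }

    def toy_estimate(src: str, out: str) -> float:
        # Length ratio as a placeholder for a real quality-estimation model.
        return min(len(src), len(out)) / max(len(src), len(out), 1)

    print(route("hello world", toy_engines, toy_estimate))
```

Calling every engine for every request is the simplest design but also the most expensive; a production router might instead predict the best engine from features of the content, language pair, and use case before translating.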

Moreover, I don't think our clients want to know about the sausage-making that goes into the process. There is a lot of discussion about workflows and optimizing processes and where you use MT. But at the end of the day, the machine can do all that by using smart AI. Optimization means that humans should not be making a decision at every single step. The machine can take care of most things, and humans should manage by exception.
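"Manage by exception" can be illustrated with a simple triage step: segments whose estimated quality clears a threshold flow straight through, and only the rest are queued for a human. The sketch below is an assumption-laden example, not a description of any real product: the Segment fields, the confidence score, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Segment:
    segment_id: str
    translation: str
    confidence: float  # assumed machine quality estimate in [0, 1]


def triage(
    segments: Iterable[Segment], threshold: float = 0.8
) -> Tuple[List[Segment], List[Segment]]:
    """Split segments into (auto_approved, needs_human_review)."""
    auto_approved: List[Segment] = []
    needs_review: List[Segment] = []
    for seg in segments:
        if seg.confidence >= threshold:
            auto_approved.append(seg)
        else:
            needs_review.append(seg)
    return auto_approved, needs_review


if __name__ == "__main__":
    batch = [
        Segment("s1", "Guten Morgen", 0.95),
        Segment("s2", "Bitte hier klicken", 0.62),
        Segment("s3", "Einstellungen", 0.80),
    ]
    approved, review = triage(batch)
    print([s.segment_id for s in approved], [s.segment_id for s in review])
```

The threshold is the business decision: raising it sends more segments to human reviewers, lowering it lets more content publish automatically.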

You sound fairly optimistic about 2019

I am. Companies want to - and can - reach out to the entire world. It is an unbelievably cool opportunity to be able to unlock value so that the best of everything can be shared globally rather than be locked up in one language or country. This brings the world closer, and our industry is helping to enable that smaller, more connected world. I may be pessimistic about blockchain, but I’m certainly not pessimistic about reaching countries and markets around the world. What we are doing is trying to make the translation part easier, and that’s always the challenge!


Long-time European language technology journalist, consultant, analyst and adviser.
