TAUS Blog

in Machine Translation

10 points about MT. Fact or Fiction?


Let me start out by saying that I love Machine Translation. There is just something magical about inputting text which is totally foreign to you on a computer screen, clicking a button and then reading it in your own language. If you are lucky enough to own your own MT server or software, like I do, and can adjust dictionary terms and create language rules, it becomes even more magical.

I am somewhat of an authority on MT, having spent four years of my life developing software and workflows around MT. We developed a product, the GTS Wordpress Plugin, which translates Wordpress websites and blogs into over 30 languages. Our software has been installed in over 1,000 websites. More about that here.

So when Jaap asked me to write a blog post on the TAUS blog, I thought that perhaps it would be a good idea to express my own convictions about MT. I’ll state my case, but it will be up to you to decide if the points I make are fact or fiction. Please feel free to add your comments to this post and contribute to the discussion.

Ready? Here goes:

1. MT is as mature as it will ever be. Let’s face it. Ever since SMT became the standard, MT reached its maturity stage. For years people said that MT is in the embryonic stage and predicted that technological breakthroughs will make it better, near human quality. I say that the time for that has passed and that MT is as good as it will ever be. For several years now people have been saying that Google has been making deals with content owners to align zillions of megatons of corpora for MT training. But the result, as anyone who uses Google Translate knows, is often laughable.

2. MT will never be as good as professional, human translation. The logical succession to my previous point is that MT will never, ever be as good as human translation.

3. The VCs have rendered their decision: MT is out, human translation is in. In the last 2-3 years a number of venture capital companies have poured millions into companies that develop human translation automation platforms. Smartling, One Hour Translation and Gengo are some examples. The smart VC money is going into human translation and little or no money is going into MT. What does that indicate about the financial viability of MT as a business?

4. Post-editing MT will never go mainstream with translators. We need to face reality. Professional translators will always disdain MT, mistrust it. MT will never be a hit with translators. LSPs that want to sell edited MT will either have to cajole freelancers to take work that they don’t want, hire inferior translators, or train in-house staff.

5. Post-edited MT is not as good as from-scratch. Everyone has heard the ‘you get 2 out of 3’ saying. When you deliver post-edited translations, they will be cheap and fast, but will not be (as) good. LSPs will need to have two SLAs, one for a pure human process and one for PE-MT. I heard this stated by Wayne Bourland, the localization chief at Dell, at the AMTA 2012 conference in San Diego. I think that this should become an industry standard.

6. Universal translator? Only on Star Trek. Now and then we read about a new speech-to-speech Universal Translator, where one person speaks e.g. in Chinese on one end and the other person hears e.g. English on the other end. Google, Microsoft and others have developed prototypes. I say that these products will never reach a point where they can be used in day-to-day situations, at least not in our lifetime.

7. Training MT engines is a game for big boys. Training an MT engine for a specific domain, customer etc. is a very costly and knowledge-intensive exercise. You need to have heavy computing resources, computational linguists, software engineers and system admins. There are not that many companies that have the resources to train MT, and this activity is best left to the experts. One example that comes to mind is Alon Lavie at Safaba, who offers custom MT development.

8. Training in a cloud is a pipe dream. Recently, some companies have come out with cloud solutions which allow you to upload a training corpus and create and access a custom MT engine. Sounds like a great idea, but I say that it is about as feasible as the Universal Translator.

9. 99% of the world’s population will remain ignorant about what MT is really about. When I tell people that I sell translation services for a living, I invariably hear someone ask: “Isn’t that all done by Google nowadays?” I used to get irked by these responses and respond with long speeches. Now I just smile and say “have a nice day.”

10. The demand for MT will grow. Everyone who is connected to this business sees how the demand for translation services is constantly growing. As more and more content needs to get translated, organizations will turn to MT as a viable option for material which is not mission-critical.

David Grunwald is the founder and owner of GTS Global Translations, a technical translation company. He can be reached at davidg@gts-translation.com. You can also find him on LinkedIn and on Twitter (@davegrun).

People in this conversation

  • Guest - Tom Hoar

    Your eagle eye caught me! Does your interpretation of “fair use” leave room for my whimsical parody? I shared a very personal experience. In 1992 I was an accomplished amateur photographer (silver of course) with my work hanging in the Smithsonian. In 1993, I was the one who said, “Digital imaging will never surpass my darkroom skills.”

    Declaring any ingenuity-driven technology “is as mature as it will ever be” is akin to assigning an RIP headstone on humanity itself. Thankfully, I sensed a bit of satirical devilʼs advocacy in your piece. I have no idea what the future will bring. Keep up the good work keeping us thinking!

  • Guest - David Grunwald

    Hello Tom Hoar, interesting that the blog post from 1993 (even though blogs did not exist back then) uses the exact same words that I did. Will they sue me for copyright?

  • Guest - Tom Hoar

    AMAZING! I found this 1993 BLOG. It was posted soon after Adobe released Photoshop v2.5 for Windows. That was only 2½ years after releasing Photoshop v1.01. It's uncanny!

    1. Digital graphics is as mature as it will ever be. Let’s face it. Ever since Photoshop became the standard, digital graphics reached its maturity stage. For years people said that digital graphics is in the embryonic stage and predicted that technological breakthroughs will make it better, near human quality. I say that the time for that has passed and that digital graphics is as good as it will ever be.

    2. Digital graphics will never be as good as professional, human artwork. The logical succession to my previous point is that digital graphics will never, ever be as good as human artwork.

    3. The VCs have rendered their decision: digital graphics are out, human artwork is in. In the last 2-3 years a number of venture capital companies have poured millions into companies that develop human artwork. The smart VC money is going into human artwork and little or no money is going into digital graphics. What does that indicate about the financial viability of digital graphics as a business?

    4. Touched-up digital graphics will never go mainstream with artists. We need to face reality. Professional artists will always disdain digital graphics, mistrust it. Digital graphics will never be a hit with artists. LSPs that want to sell edited digital graphics will either have to cajole freelancers to take work that they don’t want, hire inferior artists, or train in-house staff.

    5. Touched-up digital graphics is not as good as from-scratch. Everyone has heard the ‘you get 2 out of 3’ saying. When you deliver touched-up artwork, it will be cheap and fast, but will not be (as) good.

  • Guest - Dion Wiggins

    Alon, I agree 100% with your answers. Spot on.

  • Guest - Steven Marzuola

    I don't work in MT, but I do have a degree in computer science and I have worked as a professional translator for the past 20 years. Although most members of my profession share your deep skepticism about how much MT can be further improved, I see a few areas where it could be better. These remarks are aimed at Google Translate, and may not apply to a commercial or other version.

    One would be to incorporate meta-information, such as: country or region of the source text, register, whether it is formal or informal, spoken or written, field of knowledge. Most of the errors that I see in Google Translate could be addressed by taking that type of information into account.

    There are also large steps forward to be made in the field of user interfaces, to allow a user and a machine to work together more effectively. For instance, a user could specify a glossary of preferred translations: "In this translation I need you to always translate XXX as YYY." That would also correct many errors or inconsistencies.

  • Guest - Alon Lavie

    David makes some strong statements and predictions regarding the current state of MT and where the technology is heading. I agree with some, strongly disagree with others, and for yet a few more, I find it impossible to say "True" or "False" since the issue is not quite "Black and White". David frames this as a "True or False" engagement exercise, so I'll play along, and provide my answers, with some brief justifications:

      MT is as mature as it will ever be: FALSE
      I strongly disagree on this one. As an expert who is deeply immersed in MT research, I can confidently state that MT technology is far from mature, and much progress is yet to come. MT today is a complex technology application based on Machine Learning, and very much continues to evolve. Expect evolution, however, not revolution. MT will continue to improve - especially content-specific highly-targeted and adapted MT (as opposed to generic MT engines that try to cover everything).

      MT will never be as good as professional, human translation.: IT DEPENDS...
      This simple-sounding statement isn't simple at all. Yes, MT will likely never translate some types of content at a human level. And MT quality will probably continue to vary quite a bit across different types of content, and even from one sentence to the next. But for some technical types of content, dedicated highly-adapted MT is already reaching levels of 40% or more of segments being perfect or close to perfect (requiring no human correction). And at such levels of quality, MT and post-edited MT are already extremely valuable and useful in many scenarios. In fact, with the right MT, post-edited MT can actually be higher in quality than unassisted human translation.

      The VCs have rendered their decision: MT is out, human translation is in.: NOT QUITE TRUE
      Yes, significant financial investment in recent years has gone into technology-based companies that have come up with disruptive solutions for improving the productivity of human translation processes. But look more carefully - these have not so much changed the human translation process itself but have focused on improving the management of human translation at a massive scale. I think that's primarily due to this being the "low hanging fruit" in transforming the language translation industry. MT is still at a relatively early adoption stage and is not an easy technology to get working right. But adoption appears to be accelerating and I think the financing will follow in the next few years.

      Post-editing MT will never go mainstream with translators.: MAYBE. MAYBE NOT
      It is certainly true that there has been a lot of resistance and suspicion. Translation Memories were also not very popular in their early days. It will take time for translators to become familiar with and knowledgeable about MT, and for fair and effective compensation models to emerge. Some translators may never want to work with MT, and that's their prerogative. I think MT will become as mainstream as TMs in most CAT tools in just a few years' time.

      Post-edited MT is not as good as from-scratch.: IT DEPENDS.
      This too sounds like a simple statement but is anything but. Bad MT for the wrong content-type, and/or an ineffective process for post-editing MT, can certainly result in a lower quality final translation than translation from-scratch. And unfortunately that's a very likely outcome with many MT implementations in the market. But we have actual data and case studies that also show the opposite: that with the right MT for the right content-type, the consistency offered by MT results in post-edited MT of quality that is in fact higher than translation from-scratch.

      Universal translator? Only on Star Trek.: PARTLY TRUE.
      Speech-to-speech translation is another application on which I have worked extensively in previous years. It has progressed quite significantly in recent years, but here too expect slow evolution and not a revolution.

      Training MT engines is a game for big boys.: PARTLY TRUE.
      Thanks David for specifically mentioning me and my company Safaba as one of "the big boys". David is correct to point out that developing top-quality highly-optimized enterprise MT engines requires deep expertise and some amount of investment, and for global enterprises with high volumes of consistent domain content, this type of MT has tremendous value. For others, such as LSPs, it's really a matter of ROI. There is a basic trade-off here. DIY solutions will get you lower quality MT than our expert-managed "Do it for you" (DIFY) solution, at a significantly lower cost. In some cases, that's a better ROI, in others it isn't.

      Training in a cloud is a pipe dream.: FALSE.
      I'm puzzled by this statement. Cloud-based platforms have been breaking new ground over the last couple of years. MT in such cloud-based platforms is already a reality, and goes particularly well with some of the recent DIY solutions. There are and should be some concerns regarding data privacy and security with such solutions that use public cloud-based computing platforms, but they are definitely a reality.

      99% of the world’s population will remain ignorant about what MT is really about.: TRUE.
      No disagreement here. But that's true of a lot of complex technologies and services - where most of the population has no idea how they work.

      The demand for MT will grow.: VERY TRUE.
      I'm glad to end this list with something we fully agree about! I would be in the wrong business if I didn't believe this to be true. I'm pretty sure I'm in the right business. I certainly really enjoy it, despite all the difficulties and challenges along the way!

  • Guest - Manuel Herranz

    I agree with some things you say, David, but completely disagree with your statement on customization. Pangea is a platform created specifically to let users take control of their own customization. It works, it is successful, results have been reported, and it has lowered the entry level by empowering users.
