Translating with Cerebral Machines: Where Can It Take Us?

Neural Machine Translation (NMT) systems have achieved impressive results in many Machine Translation (MT) tasks in the past couple of years. This is mainly because neural networks can model non-linear functions, which makes NMT well suited to approximating the linguistic patterns that human translators follow.

However, building machines that can adequately replicate the complicated and nuanced translation style of a professional translator remains a distant goal for translation technology. At KantanLabs, we are researching advanced Hybrid MT systems that incorporate the best of both worlds: the tried and tested traditional Statistical Machine Translation (SMT) systems on one hand, and cutting-edge Neural Networks on the other.

What do we mean by Hybrid MT systems?

‘Hybrid MT’ is a general term for the combination of two or more MT technologies that jointly target one goal. For example, rule-based MT and phrase-based SMT are often married in a single end-to-end MT solution. In fact, KantanMT already implements a hybrid MT technology that combines rules, phrase-based SMT, translation memories (TM) and terminology. From this point forward, the term ‘Hybrid MT’ will refer to a system that combines KantanMT and Neural MT technology.

What are Artificial Neural Networks (ANNs) and Neural Machine Translation (NMT)?

Before discussing how NMT research will benefit the language and localization industry, let us briefly look at what Artificial Neural Networks (ANNs) are and how they contribute to MT quality. First proposed in the 1940s and developed further in the 1950s, ANNs are a paradigm for information processing inspired by the way biological nervous systems (like the brain) deal with information.

ANNs are networks of computational units, i.e. artificial neurons, that are loosely modelled on biological neurons. Each neuron (biological or artificial) stores basic information. When fed a signal, a neuron reacts in one of two possible ways, much like a pressure switch: if the threshold is surpassed, the light goes on; otherwise there is no output.
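The switch-like behaviour described above can be sketched in a few lines of Python. This is only an illustration: the function name, the weights and the threshold value are assumptions chosen for the example, not part of any KantanMT system.

```python
def threshold_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > threshold else 0


# Illustrative values only: two inputs, equal weights, threshold of 1.0.
weights = [0.6, 0.6]
threshold = 1.0

both_on = threshold_neuron([1, 1], weights, threshold)  # weighted sum 1.2, above threshold
one_on = threshold_neuron([1, 0], weights, threshold)   # weighted sum 0.6, below threshold
print(both_on, one_on)
```

With these particular values the neuron fires only when both inputs are active, which is the on/off behaviour of the switch analogy.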

When a neuron produces an output, that output is in turn used as an input by the next neuron in the network. This allows a large neural network (both artificial and biological) to ‘remember’. In order to learn, however, the network needs to be presented with examples. Each example pairs an input with its correct output. Feeding the network a large number of different training examples forces each neuron to adapt its threshold (and the weights on its inputs) in such a way that the network, as a whole, learns the input-output pairs.

Moreover, a well-trained network will often give correct outputs for inputs it has never seen. Consider, for example, a child that learns to respond to its name because its parents pronounce it over and over again, that is, by repetition. At some point the child learns to respond to its name not only when its parents pronounce it, but also when it is spoken by people the child has never met before.
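Learning from examples can be illustrated with the classic perceptron learning rule applied to a single neuron. The sketch below is a toy under stated assumptions: the function names, the logical-AND training set and the learning rate are all chosen for illustration, and the bias term plays the role of a negative threshold. It shows only the training loop, not generalisation to unseen inputs.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum plus bias is positive, else stay silent (0)."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Adjust weights and bias a little after every wrongly answered example."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0  # the bias is the negative of the neuron's threshold
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training set: each example pairs an input with its correct output.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_EXAMPLES)
for inputs, target in AND_EXAMPLES:
    print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

Real networks use many neurons and gradient-based training rather than this rule, but the principle is the same: repeated exposure to input-output pairs nudges the parameters until the outputs match.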

Because ANNs learn from examples in a way loosely similar to the human brain, they have shown strong results for tasks such as text classification, image recognition, clustering, and prediction.

In the basic type of ANN, the signal flows in one direction: a neuron at a given level will not give input to a neuron on a preceding level, nor to itself. A more advanced ANN model is the Recurrent Neural Network (RNN), where a neuron can transfer the signal to neurons on preceding levels or even to itself. This type of neural network is particularly suitable for MT tasks, where the length of the input is not known in advance. NMT attempts to use the RNN paradigm to improve translation quality.
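To see why feeding a neuron's output back into itself helps with inputs of unknown length, consider the following minimal sketch of a single recurrent unit. The function name and the parameter values are assumptions made for illustration; a real NMT system uses vectors of many units and learned parameters.

```python
import math


def run_rnn(sequence, w_input=0.5, w_recurrent=0.8, bias=0.0):
    """Process a sequence of any length with one recurrent unit.

    At every step the unit combines the current input with its own
    previous output (the hidden state) and squashes the sum with tanh.
    """
    hidden = 0.0  # initial state before any input has been seen
    for x in sequence:
        hidden = math.tanh(w_input * x + w_recurrent * hidden + bias)
    return hidden


# The same three parameters handle a short and a long input alike.
short_state = run_rnn([1.0, 0.0])
long_state = run_rnn([1.0, 0.0, 2.0, 1.0, 3.0])
print(short_state, long_state)
```

Because the hidden state carries information forward from earlier steps, the order of the inputs matters, which is one reason this family of models is a natural fit for sentences.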

I have spoken elsewhere about using tools such as Caffe, Theano and TensorFlow to build RNNs for MT. However, like SMT, NMT requires a large amount of bilingual data to train a neural network. At KantanLabs, we will be working on solutions that produce high-quality translations even with limited training data. As long as the data is of high quality and terminologically consistent, we expect to generate high-quality NMT output.

Who will benefit from using NMT and how?

At KantanLabs, we aim to create hybrid MT systems that will substantially reduce the load on human post-editors. The novel automated post-editing (APE) environment will focus on the mistakes that MT engines are likely to make, and will learn incrementally from corrections so that the same mistakes are not repeated.

The NMT platform will produce significantly more fluent translations than SMT by utilising contextual information. This in turn means that manual translation tasks will consume less time and clients’ translation costs will be reduced.

An even more exciting benefit of our Hybrid MT system is that human translators will have more time and opportunities to focus their efforts on more appealing projects, leaving the tedious, repetitive translation jobs to our hybrid MT systems.

It has been noted in research that MT is better suited to translation tasks that involve repetitive text, style and terminology than to tasks that involve creativity, sensitivity and cultural understanding. The latter are better suited to human translation. In our experience, translators show more interest in challenging, new tasks than in repetitive tasks that lend themselves to automation.

Finally, the advancements in NMT research are just the tip of the iceberg. There is a fair amount of research and development road still to traverse, but I believe hybrid MT systems that can adequately mimic the nuanced translations of human translators will soon become the future of automated translation technology.
