Machine Translation: The Complete Guide

Did you know that machine translation came into existence in the 1950s? The past 10-15 years, however, have been called the golden age for this technology. MT has been steadily improving and has become a widely used, handy tool for translators in their day-to-day work.

What is Machine Translation?

There are many definitions of machine translation. Put simply, machine translation (MT), also called automated translation, is a process in which computer software translates text from one language to another without human involvement. MT engines work with large amounts of source- and target-language text, which they compare and match against each other.

A Brief History of Machine Translation

The earliest mention of a system comparable to modern machine translation dates back to the 9th century in the Abbasid Caliphate (present-day Iraq), where the Arabic cryptographer Al-Kindi developed a method for systematic language translation. These techniques are still used today by machine translation engines.

There were various attempts to create a system that produces translations automatically. For example, in 1933 “the machine for the selection and printing of words when translating from one language to another” was presented by Russian scientist Petr Petrovich Troyanskii. The machine was quite simple, as mentioned in Free Code Camp’s article: it consisted of a film camera, word cards, and a typewriter. Even so, it already took morphology and grammatical rules (gender, number, etc.) into account when producing the translation output.

At the beginning of the Cold War (in the 1950s), IBM started experimenting with machine translation, and the first automated translation came to life on one of its computers: 49 Russian sentences were translated into English. Although this was regarded as a huge success, the sentences had been carefully selected to eliminate any possibility of ambiguity.

Nevertheless, this breakthrough prompted a race among nations which ultimately led to the birth of what is known as machine translation today.

After some years of stagnation, further advances followed over the next few decades. In the US, two systems (Logos and SYSTRAN) were used for military purposes. Canada developed a system called METEO, which translated weather forecasts between English and French in Québec.

The next significant surge in the development of MT came in the 1980s when Japan also joined the race. Since then, the world has seen the advancement of machine translation, and more and more engines have emerged.

Let’s take a look at the main types of machine translation that have come onto the scene.


Types of Machine Translation

Statistical Machine Translation (SMT)

Statistical machine translation relies on bilingual corpora to produce the translation output. It uses these corpora (i.e., the source and the target of the same text) to come up with statistical analyses and to estimate the probability of each translation that might be correct for a given string. It then uses that probability to choose the string that was determined to be the most likely to be accurate. It is important to note that this type of MT does not take context into account. It is merely based on statistics.
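To make the idea concrete, here is a minimal sketch of the "pick the most probable translation" step. The tiny corpus, the function names, and the phrase-level alignment are all hypothetical, chosen only for illustration; real SMT systems learn alignments from much larger corpora and combine several statistical models.

```python
from collections import Counter, defaultdict

# Hypothetical toy bilingual corpus: already-aligned (English, French) phrase pairs.
aligned_pairs = [
    ("bank", "banque"),
    ("bank", "banque"),
    ("bank", "rive"),
    ("house", "maison"),
]


def build_phrase_table(pairs):
    """Estimate P(target | source) from relative frequencies in the corpus."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1

    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {target: n / total for target, n in targets.items()}
    return table


def most_likely_translation(table, source):
    """Return the target phrase with the highest estimated probability."""
    candidates = table[source]
    return max(candidates, key=candidates.get)


phrase_table = build_phrase_table(aligned_pairs)
print(phrase_table["bank"])
print(most_likely_translation(phrase_table, "bank"))
```

Note that this sketch always picks "banque" for "bank", even in a sentence about a river: the choice is driven by frequency alone, which is the context problem described above.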

Rule-Based Machine Translation (RBMT)

In contrast to SMT, rule-based machine translation relies on linguistic information such as data taken from various dictionaries and grammar. This type of MT relies on grammatical rules when it produces the translation output, considering the morphology, syntax, and also the semantics of the strings. It analyzes the grammatical structures of both the source and the target language to create the translation output.
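As a rough illustration, the sketch below translates short English noun phrases into French with a hand-written dictionary, one word-order rule, and one agreement rule. The lexicon, the rules, and the function name are hypothetical and far smaller than anything a real rule-based engine would use.

```python
# Hypothetical mini-lexicon: part of speech, grammatical gender, and French forms.
LEXICON = {
    "the": {"pos": "DET", "fr": {"m": "le", "f": "la"}},
    "red": {"pos": "ADJ", "fr": {"m": "rouge", "f": "rouge"}},
    "green": {"pos": "ADJ", "fr": {"m": "vert", "f": "verte"}},
    "car": {"pos": "NOUN", "gender": "f", "fr": "voiture"},
    "book": {"pos": "NOUN", "gender": "m", "fr": "livre"},
}


def translate_noun_phrase(english):
    """Translate a simple 'determiner adjective noun' phrase into French."""
    entries = [LEXICON[word] for word in english.lower().split()]

    # Morphology: the noun's gender decides the form of the other words.
    noun = next(entry for entry in entries if entry["pos"] == "NOUN")
    gender = noun["gender"]

    # Syntax rule (assumed for this sketch): adjectives follow the noun in French.
    determiners = [entry for entry in entries if entry["pos"] == "DET"]
    adjectives = [entry for entry in entries if entry["pos"] == "ADJ"]
    ordered = determiners + [noun] + adjectives

    output = []
    for entry in ordered:
        if entry["pos"] == "NOUN":
            output.append(entry["fr"])
        else:
            output.append(entry["fr"][gender])  # agreement with the noun
    return " ".join(output)


print(translate_noun_phrase("the green car"))
print(translate_noun_phrase("the green book"))
```

Every new word or exception needs another hand-written entry or rule, which gives a sense of how much manual upkeep this approach requires.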

It is also important to note that, since languages constantly change (they can be regarded as living organisms), the rules and dictionaries behind rule-based MT have to be updated by hand to keep up. As a result, rule-based MT models are no longer frequently used.

Hybrid Machine Translation

As the name suggests, hybrid machine translation uses a combination of the statistical and the rule-based types of MT. Although you get the best of both worlds regarding quality, hybrid machine translation still tends to require a large amount of human editing.

Neural Machine Translation (NMT)

Neural machine translation teaches itself how to translate by using a large neural network. This method is becoming more and more popular because it tends to produce better results than the older approaches for many language pairs.

As TranslateFX pointed out, “the parameters of this neural network are created and refined via training the network with millions of sentence pairs. Each sentence pair modifies the neural network slightly as it runs through each sentence pair using an algorithm called back-propagation.” This ensures that this type of machine translation keeps getting better at producing translations as it is trained.

NMT is, thus far, the most advanced form of machine translation. It has its drawbacks, however: the target text can look very natural while still containing mistranslations, which makes issues harder to spot during post-editing.

Machine Translation Post-editing

As the name suggests, machine translation post-editing (MTPE) takes place after a linguist has had an MT engine translate a document: the linguist then edits the machine-translated output. Translators and LSPs all over the world are taking advantage of this technique since it combines the speed of MT with the knowledge of the human mind.

When machine translation is discussed in the professional translator arena, it typically refers to machine translation post-editing. Despite how advanced machine translation has become over the years, it is still rare that the translation output does not need to be edited and proofread by a human translator.

Of course, there are texts where plain machine translation suffices. You can read about this in more detail in another article in this series where we discussed the differences and the various use cases concerning MT and MTPE.

When to Use Machine Translation

As previously mentioned, there are a variety of aspects to consider when deciding whether to use human translation, machine translation, or a combination of both in a translation project.

Target audience

One important factor to take into account is where and for what the translated text will be used. For example, if you are translating internal documents or material that will only be accessed by a small group of people within a company where the only goal is mutual understanding and not accuracy, you do not need MTPE, and plain MT can be enough.

However, when it comes to higher-value content, accessed by a large group of stakeholders (such as a website, a game, or important documents intended for a larger audience), you can probably use machine translation, but make sure you apply the right amount of editing to make the content accessible and easy to read.

Text type

Keep in mind that not all texts are suitable for post-editing machine translation. Whether you should use MTPE in your workflow heavily depends on the type of text as well as the intended use of the translation output.

Machine translation post-editing can be a reasonable way to go about the translation of the following document types:

Blog posts or press releases
Technical texts
Informal documents
News articles
Manuals, instructions

However, you must be careful and/or refrain from this method if your source is:

UX/UI copy: these usually consist of small segments and are heavily dependent on the context
Texts that require extensive experience and knowledge in a specific field (such as life science, engineering, or law)
Marketing/advertising copy, where wordplay and humor are important factors

And, of course, there are texts where it is not advisable to use machine translation at all. In literary translation, for example, you not only want to carry the message over into the target language; the language itself is also used to convey emotion to the reader. Another example is creative concepts, where the material is heavily based on context and messaging.

Pros and Cons of Using (Post-edited) Machine Translation

Machine translation is a topic that divides even the translation industry. Some think it should never be used, while others see the future in MTPE. One thing is for sure: post-edited MT will never be as accurate and as “human” as a text produced by an actual human translator. We can, however, get close. In many cases, that is enough if you consider the time it takes to translate large amounts of text. MTPE can be a useful method to include in your translation workflow.


How You Can Benefit from Using MT

It can be a time-saver

A human translator is usually said to produce around 2,000 words per day; with post-edited machine translation, you can reach a volume of up to 7,000 words per day. Of course, you need to consider how appropriate the source text is for machine translation. For example, the more ambiguous the source text, the more post-editing is needed once the output has been produced.
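To see what these daily volumes mean for a schedule, here is a small worked example. The 70,000-word project size and the function name are made up for illustration, and 7,000 words per day is the upper bound quoted above, so the MTPE figure is a best case.

```python
import math


def working_days(word_count, words_per_day):
    """Number of full working days needed, rounded up."""
    return math.ceil(word_count / words_per_day)


project_words = 70_000  # hypothetical project size

human_days = working_days(project_words, 2_000)  # human translation from scratch
mtpe_days = working_days(project_words, 7_000)   # best-case post-editing throughput

print(f"Human translation: {human_days} days")
print(f"MTPE (best case):  {mtpe_days} days")
```

With these assumptions the project takes 35 working days for one human translator and 10 with post-editing at the maximum quoted rate.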

Pro-tip by memoQ: We advise that you don’t try to save time by eliminating the first step of machine translation post-editing (i.e., the evaluation of the source text and testing different MT engines). Skipping it might seem like an easy way to reduce the work, but in the long run it is worth keeping this step.

It usually costs less

Of course, once you reduce the time for translation, the overall costs associated with the given project also decrease. You have to be careful, however, and correctly evaluate the so-called edit distance (how much of the MT output has to be changed) to ensure that post-editing takes less time than full human translation would. It usually takes a lot of evaluation and testing to reach the point where post-editing produces similar quality in less time (and at lower cost) than human translation from scratch.
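One common way to quantify the editing effort is a word-level Levenshtein distance between the raw MT output and its post-edited version. The sketch below is a minimal illustration under that assumption; the function names and sample sentences are hypothetical, and commercial tools may calculate the figure differently.

```python
def edit_distance(source_tokens, target_tokens):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(target_tokens) + 1))
    for i, source_token in enumerate(source_tokens, start=1):
        current = [i]
        for j, target_token in enumerate(target_tokens, start=1):
            substitution_cost = 0 if source_token == target_token else 1
            current.append(min(
                previous[j] + 1,                      # delete a source token
                current[j - 1] + 1,                   # insert a target token
                previous[j - 1] + substitution_cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(mt_output, post_edited):
    """Share of words changed: 0.0 means no edits, 1.0 means a full rewrite."""
    mt_tokens = mt_output.split()
    pe_tokens = post_edited.split()
    longest = max(len(mt_tokens), len(pe_tokens))
    if longest == 0:
        return 0.0
    return edit_distance(mt_tokens, pe_tokens) / longest


mt = "the cat sit on mat"
edited = "the cat sits on the mat"
print(edit_distance(mt.split(), edited.split()))
print(round(normalized_edit_distance(mt, edited), 2))
```

Here two word-level edits were needed (one substitution and one insertion), so about a third of the segment was changed. Tracking this value across a sample of segments gives a rough indication of whether post-editing is likely to be faster than translating from scratch.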

Works with large bodies of text

When you want to translate a large body of text as a human translator, it can be very time-consuming if you don’t have the necessary resources to find matches, such as translation memories or bilingual corpora. In this case, you can benefit from pre-translating your text (or at least parts of the source documents) with an MT engine to save time and reduce the amount of text to be translated.

You can reach high quality

If you evaluate your options properly and choose the MT engine that best suits your source text, you can achieve a high-quality translation that closely matches the quality of a human-translated text, in less time and for less money.


The Disadvantages of Machine Translation

There are, of course, use cases where machine translation (even with post-editing) just doesn’t work. Let’s see where the pitfalls of MT lie.

Not all texts are suitable

One big disadvantage of using MT instead of human translation is that MT cannot take context into account. There are types of texts which are more fit for machine translation, and other types where it is more beneficial to go with a human translator only.

Accuracy

Context cannot always be predicted by a machine translation engine. This means that it also cannot take into account the style of the source text or the cultural references contained in the original copy.

Machine Translation in Your Localization Workflow

If you want to include machine translation post-editing in your localization projects, it is crucial to follow the necessary steps correctly. To take maximum advantage of MT and make sure you really do save time and costs, you first have to invest in the initial steps before jumping in and pre-translating and editing your segments.

The four steps that make up the ideal post-editing machine translation workflow (according to Nimdzi) are:

Preparation of the source text
First, the source text needs to be prepared for post-editing. This is also when you should decide which MT engine to use.
Testing and QA
Once you have your machine translation engine candidates, you can take a small portion of your source text and test the performance of different engines.
Post-editing
This is what most people think of as PEMT: the step where a translator evaluates and edits the target text.
QA
The final evaluation of the target text by translators and reviewers.
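The testing step can be sketched as a small comparison script: translate a sample of the source text with each candidate engine, compare the output with a trusted reference translation, and rank the engines. The engine names, sample sentences, and scoring method (word-level similarity from Python's standard difflib module) are assumptions made for illustration only; they do not describe how any particular tool evaluates engines.

```python
from difflib import SequenceMatcher
from statistics import mean

# Hypothetical reference translations for a small sample of the source text.
references = [
    "Press the power button.",
    "Close the cover before use.",
]

# Hypothetical output of two candidate MT engines for the same sample.
candidate_outputs = {
    "engine_a": ["Press the power button.", "Close the lid before use."],
    "engine_b": ["Push on the button of power.", "Shut cover before the usage."],
}


def similarity(candidate, reference):
    """Word-level similarity between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, candidate.split(), reference.split()).ratio()


def rank_engines(outputs, reference_texts):
    """Return (engine, average similarity) pairs, best engine first."""
    scores = {
        engine: mean(similarity(c, r) for c, r in zip(segments, reference_texts))
        for engine, segments in outputs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_engines(candidate_outputs, references)
best_engine = ranking[0][0]
print(ranking)
print(best_engine)
```

An automatic score like this is only a first filter; a human reviewer should still look at the sample output before an engine is chosen.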

At memoQ, we advise that once you start investing in MT within your company and your projects, you should spend a considerable amount of time testing and evaluating to see which engine and which method fit your text types, your quality expectations, and your industry.

Machine Translation in memoQ

If you work with machine-translated text and you understand how much post-editing is needed for your project, you can get straight to work in memoQ. Our TMS includes a myriad of different MT engines to choose from. To learn which ones are available and how these integrations work with memoQ, visit our Machine Translation page. By combining term bases, translation memories, and machine translation, memoQ can be a perfect solution for machine translation post-editing.

Get started with memoQ today!

Zsófia Lelner

Linguist turned content marketer, telling the story of memoQ.
