Common errors in machine translation – how to fix them with full post-editing

Written by Meinrad Reiterer | 13 October 2020

Machine translation divides opinion in the translation industry like no other subject. But are the critics justified? Machine translation is certainly not (yet) 100% reliable – so we thought we’d explain the common sources of errors in machine translations and how they can be fixed with full post-editing.

“Man AND machine”, not “Man OR machine”

In a previous blog post on machine translation we explained how “Man OR Machine?” is the wrong question to ask, because the answer is “Man AND Machine”! Machine translation can’t yet match the quality of human translations, so to produce that level of quality the machine translation output needs to be edited: this process is called full post-editing. We describe exactly what full post-editing involves here.

Three sources of errors in machine translations

There are still aspects of translation which machine translation engines struggle with. So we really can’t emphasize this enough: pay attention when post-editing! Otherwise, you’re likely to miss common errors which can affect the quality of the translation. The following examples illustrate the aspects where machine translation isn’t yet at the same level as human translation, and why post-editing is so important. In the screenshots, the source text is always on the left and the target text is always on the right.

Terminology

Carefully managed terminology is one of the building blocks for high-quality translations. It involves using databases (known as term bases) to store key terms that have been reviewed and approved by individual clients, providing a vital tool for translators and helping to ensure all translations are clear and consistent – a key element in optimizing the translation process.
But machine translation engines can’t (yet) incorporate these term bases into their output, resulting in inconsistent vocabulary and unclear, sometimes even incorrect translations. Here’s an example:

The problem: The English text uses the terms “user’s manual” and “instruction manual”, and they’ve been translated with the same word in German (“Bedienungsanleitung”). It may be that the two English terms refer to different sets of instructions, but the machine translation engine hasn’t recognized this. So at the very least it has made the text harder to understand for German readers, and it could be downright misleading.
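A post-editor can partly automate this kind of check. The following is a minimal sketch, not a description of any particular translation tool: it assumes a term base stored as a simple mapping from English terms to approved German terms, and flags every segment whose source contains a term but whose target lacks the approved equivalent. The function name, the term base entries (including “Benutzerhandbuch” as the approved term for “user’s manual”) and the sample segments are all hypothetical.

```python
from typing import Dict, List, Tuple


def find_term_violations(
    segments: List[Tuple[str, str]],
    term_base: Dict[str, str],
) -> List[Tuple[int, str, str]]:
    """Return (segment number, source term, approved target term) for every
    segment whose source contains a term but whose target lacks the approved
    translation. Matching is a naive case-insensitive substring test, so
    inflected or hyphenated German forms may be missed."""
    violations = []
    for number, (source, target) in enumerate(segments, start=1):
        for source_term, approved_target in term_base.items():
            in_source = source_term.lower() in source.lower()
            in_target = approved_target.lower() in target.lower()
            if in_source and not in_target:
                violations.append((number, source_term, approved_target))
    return violations


# Hypothetical term base: the client wants the two manuals kept apart.
TERM_BASE = {
    "user's manual": "Benutzerhandbuch",
    "instruction manual": "Bedienungsanleitung",
}

# Hypothetical machine translation output (source, target).
SEGMENTS = [
    ("Read the user's manual first.",
     "Lesen Sie zuerst die Bedienungsanleitung."),
    ("Keep the instruction manual nearby.",
     "Bewahren Sie die Bedienungsanleitung griffbereit auf."),
]

print(find_term_violations(SEGMENTS, TERM_BASE))
# [(1, "user's manual", 'Benutzerhandbuch')]
```

A check like this only points the post-editor to suspicious segments. Deciding whether the two English terms really refer to different documents still takes a human.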

Context

Using various bits of information to form coherent ideas is a natural human thought process, but it’s beyond machine translation engines. Rather than interpreting text based on context, as humans do, engines translate sentence by sentence and as a result don’t detect how certain parts relate to other parts. This means the output they produce often doesn’t fit together – another potential source of misunderstandings. Here’s an illustration of what we mean:

The problem: The “it” in the source text in segment 9 refers to “the file” mentioned in segment 8. This “it” has been translated as “es” in German, which is incorrect here. The German word for “file” is “Datei”, which is feminine – so the pronoun should be “sie” rather than “es”. So the reference to the file has been lost in the German translation, and the German readers don’t know what will be displayed on the screen. This has happened because the machine translation engine treats each sentence separately, and doesn’t recognize the links between them the way we do.

Text components

Certain text components are another sign that machine translation engines aren’t yet a match for the human brain. When translating from one language into another, human translators will naturally use the right text components, such as quotation marks, for the target language. Machine translation engines, by contrast, use the same quotation marks whatever language they’re translating into.

Other similar errors include:

Inconsistent phrasing
Inconsistent use of formal/informal language
Inconsistent use of imperatives

The machine translation output looks like this:

Incorrect quotation marks

The problem: The quotation marks have simply been copied from the source text, even though they’re wrong in the target language. German quotation marks should look like „this“ (opening mark at the bottom, closing mark at the top), rather than like “this”.
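Quotation marks are one of the few errors that can be found and fixed mechanically. As a rough sketch, assuming the target text is German and only English-style curly pairs need converting (straight quotes and nested quotations are ignored), a regular expression can swap the opening mark U+201C for U+201E and the closing mark U+201D for U+201C. The function names are our own, and the characters are written as Unicode escapes so they are easy to tell apart.

```python
import re

# An English-style pair: opening U+201C, some text, closing U+201D.
ENGLISH_PAIR = re.compile("\u201c([^\u201c\u201d]*)\u201d")


def has_english_quotes(text: str) -> bool:
    """True if the text contains at least one English-style curly pair."""
    return ENGLISH_PAIR.search(text) is not None


def to_german_quotes(text: str) -> str:
    """Replace each English-style pair with the German pair:
    opening U+201E (low) and closing U+201C (high)."""
    return ENGLISH_PAIR.sub("\u201e\\1\u201c", text)


# Hypothetical target segment with the marks copied from the English source.
target = "Klicken Sie auf \u201cSpeichern\u201d."

if has_english_quotes(target):
    print(to_german_quotes(target))
```

The post-editor should still skim the result, because quoted English product names or code snippets in a German text may be meant to keep their original marks.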

Inconsistent use of formal/informal language

The problem: Here, the German translation has used the formal “Sie” in one segment and the informal “du” in the next. English doesn’t distinguish between the formal and informal “you”, of course, and this inconsistency has arisen because the machine translation engine hasn’t treated the segments as one collective whole.
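A simple script can flag places where the form of address changes from one segment to the next. This is only a heuristic sketch under clear assumptions: it treats capitalized “Sie”, “Ihnen”, “Ihr” and their forms as formal, and “du”, “dich”, “dir”, “dein” and their forms as informal. A capitalized “Sie” at the start of a sentence can also mean “she” or “they”, so every hit needs a human look. The word lists and function names are hypothetical.

```python
import re
from typing import List, Tuple

# Case-sensitive: lowercase "sie" / "ihr" are not forms of formal address.
FORMAL = {"Sie", "Ihnen", "Ihr", "Ihre", "Ihren", "Ihrem", "Ihrer"}
# Compared in lowercase, so "Du" at the start of a sentence also counts.
INFORMAL = {"du", "dich", "dir", "dein", "deine", "deinen", "deinem", "deiner"}


def detect_register(segment: str) -> str:
    """Classify a German segment as 'formal', 'informal', 'mixed' or 'none'."""
    words = re.findall(r"[A-Za-zÄÖÜäöüß]+", segment)
    has_formal = any(word in FORMAL for word in words)
    has_informal = any(word.lower() in INFORMAL for word in words)
    if has_formal and has_informal:
        return "mixed"
    if has_formal:
        return "formal"
    if has_informal:
        return "informal"
    return "none"


def find_register_shifts(segments: List[str]) -> List[Tuple[int, int]]:
    """Return pairs of segment numbers between which the register changes.
    Segments without any form of address are skipped."""
    shifts = []
    previous = None  # (segment number, register) of the last addressed segment
    for number, segment in enumerate(segments, start=1):
        register = detect_register(segment)
        if register == "none":
            continue
        if previous is not None and register != previous[1]:
            shifts.append((previous[0], number))
        previous = (number, register)
    return shifts


# Hypothetical machine translation output.
SAMPLE = [
    "Wählen Sie eine Datei aus.",
    "Dann kannst du sie öffnen.",
    "Schließen Sie das Fenster.",
]

print(find_register_shifts(SAMPLE))
# [(1, 2), (2, 3)]
```

Which form of address is correct depends on the client and the audience, so the script only reports where the text switches. It does not decide which segment to change.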

Inconsistent use of imperatives

The problem: The engine’s inability to “think” beyond each segment also leads to inconsistent use of imperatives. There’s more than one way to phrase imperatives in German, and in segment 5 the engine has used one way (putting the imperative “Wählen Sie” at the start) and in segment 6 it has used another (putting the imperative “öffnen” at the end). This results in an inconsistent text, and it can be particularly problematic if there are lots of imperatives one after the other.

Post-editing: vital for producing clear, high-quality translations and avoiding extra costs

Even though at times it may feel like you’re splitting hairs, failure to fix incorrect terminology, contextual issues and unsuitable text components can have serious consequences. Not only do these errors reduce the quality of the translation, they can also create misunderstandings in technical documentation – and in the worst-case scenario lead to improper use of machinery, resulting in accidents, injuries and potential claims for damages against the manufacturer. Inconsistency costs money, too: every sentence or word that is written differently adds to what you pay for your translations. If you phrase sentences the same way every time and use the same words for the same components, you’ll only pay to have them translated once. So full post-editing needs to be an essential part of the machine translation process if the results are to compare with texts produced by human translators.

This post has discussed a few examples of errors in machine translation, but there are lots more aspects that need to be taken into account with full post-editing. So we’ve put together a simple checklist that you can use to ensure your machine translation projects are delivered successfully.
