MEINRAD's Blog

How to make your texts suitable for machine translation

Written by Meinrad Reiterer | 25 April 2024

As a technical editor or content creator, your destiny – or more accurately, the destiny of your texts – is more or less in your own hands. You can make your texts suitable for machine translation and post-editing by bearing these processes in mind when you begin producing your content. We’d like to give you some top tips.

The quality of a translation depends to a large degree on the quality of the source text, and this is particularly true if you’re going to use machine translation. That’s because the MT engine can’t read between the lines and pick up on implied meanings as a human being can. A human translator would easily notice many errors and ambiguities in the source text, and would then correct and clarify them in the translation. For artificial intelligence, the same errors and ambiguities can prove to be stumbling blocks. The result is nonsensical or incorrect translations, which in turn make life harder (and sometimes too hard) for the post-editor.

Where machine translation reaches its limits

If you don’t bear in mind the limitations and “quirks” of MT engines, there’s a good chance you won’t get satisfactory results. There are some types of texts where artificial intelligence struggles, especially texts which feature cultural references or emotive and stylistic elements. Advertising and marketing texts are typical examples, but even technical texts aren’t necessarily always suitable for machine translation. If they have minimal context, ambiguous phrases, long and convoluted sentences and/or lots of specialist terminology, the MT engine is almost certain to produce poor translations of these sections at least. And any syntax errors or spelling mistakes will also lead to a worse MT output.

This then gives the post-editor a lot more to do when working on the text, sometimes to the extent that there is no longer any justification for using the “machine translation with full post-editing” workflow. The good news is that whether or not your texts are suitable for machine translation is (largely) in your hands.

Think ahead and write with MT in mind

The better and clearer a text is (and the fewer mistakes it contains), the more suitable it will be for machine translation, and the more you’ll benefit from machine translation and post-editing. There’s a growing trend among technical editors to write in simple language. This doesn’t just benefit users in the source language. It also makes life easier for translators, whether they’re using the human translation workflow or the “machine translation with full post-editing” workflow.

“Not like that” – top tips to make texts suitable for MT

If you want to make the most of machine translation, you need to be aware of how the MT algorithms work when writing your texts. As well as avoiding typos and other mistakes, here are some specific things to bear in mind when creating your content:

Top tip 1: Keep sentences short, but make sure phrases and words have enough context

MT engines struggle with long, convoluted sentences. They often fail to recognize which parts of a sentence relate to each other, so they sometimes produce inaccurate translations and can become more of a hindrance than a help. Ideally, you should use simple sentences (no longer than 25-30 words) and keep parentheses and subordinate clauses to a minimum. If your sentence is getting too long, simply put a full stop and divide it into two or more short sentences. A clear word order helps too.

On the other hand, there is such a thing as too short. Lists without any context (such as GUI texts) pose real problems for machine translation tools. For example, the German word “Bank” can mean either a bank or a bench – so having context is crucial.
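If you’d like to check the sentence-length guideline automatically, a few lines of Python will do as a first pass. This is only a minimal sketch: the function names and the 25-word threshold are our own assumptions, and the sentence splitting is deliberately naive (it will stumble over abbreviations such as “e.g.”).

```python
import re

MAX_WORDS = 25  # assumed threshold: lower end of the 25-30 word guideline


def split_sentences(text):
    """Naively split text at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def long_sentences(text, max_words=MAX_WORDS):
    """Return (word_count, sentence) pairs for sentences over max_words."""
    flagged = []
    for sentence in split_sentences(text):
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged
```

Calling `long_sentences(your_text)` gives you a list of the sentences worth splitting. It cannot tell you whether a short sentence has enough context, so that part still needs a human eye.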

Top tip 2: Keep things clear and rein in your creativity

The bank/bench issue leads to our next top tip: avoid words with multiple meanings. If you “feed” the MT engine a word that can have multiple meanings, it will “choose” one of them, often the wrong one, even if the word appears in a text with full sentences and plenty of context.

Using a wide variety of synonyms in your source texts carries a similar risk. Forget what you were taught at school about not repeating words in your essays or stories, and make your texts suitable for MT by using the same words for the same things.

From start to finish, your text also needs to be clear and unambiguous in terms of who needs to do what. So avoid sentences without any verbs.
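Consistent wording is something a simple script can help you check. The sketch below assumes you maintain your own list of terms that mean the same thing; the `TERM_GROUPS` entries and the function name are purely hypothetical examples, not part of any tool.

```python
import re

# Hypothetical term groups: each set lists variants that mean the same thing.
TERM_GROUPS = [
    {"switch off", "turn off", "power down"},
    {"device", "unit", "appliance"},
]


def mixed_terms(text, term_groups=TERM_GROUPS):
    """Return the variants found for every group where more than one is used."""
    lowered = text.lower()
    findings = []
    for group in term_groups:
        used = sorted(
            term
            for term in group
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)
        )
        if len(used) > 1:
            findings.append(used)
    return findings
```

For the text “Switch off the device. Then turn off the unit.”, this reports both groups, because two variants from each are in use. It matches exact word forms only, so inflected forms such as “switched off” would need extra entries.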

Top tip 3: Avoid pronouns, the passive voice, negatives and abbreviations

Machine translation tools also struggle with pronouns, the passive voice, negatives and abbreviations.

If you use pronouns, the MT engine won’t “know” what they refer back to, and this is a particular problem when translating from English into languages with grammatical gender. The English pronoun “it”, for example, needs to be translated as “er”, “sie” or “es” in German depending on whether the noun in question is masculine, feminine or neuter. So if you simply write “It must be switched off”, the MT output is likely to have the wrong German pronoun. The passive voice causes a similar problem, because it often leaves out who needs to perform the action. You should also avoid abbreviations as far as possible, especially if they’re not widely known, as the MT engine will probably translate them incorrectly too. The same applies to words written entirely in capital letters.
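Pronouns and words in capital letters are easy to search for. Here is a rough sketch, assuming a short pronoun list of our own choosing (`PRONOUNS` and `flag_risky_words` are illustrative names); it does not attempt to detect the passive voice or negatives, which need proper linguistic analysis.

```python
import re

# Illustrative list only; extend it to match your own style guide.
PRONOUNS = {"it", "its", "they", "them", "their"}


def flag_risky_words(sentence):
    """Return the pronouns and all-caps words (2+ letters) in a sentence."""
    words = re.findall(r"[A-Za-z]+", sentence)
    pronouns = [word for word in words if word.lower() in PRONOUNS]
    all_caps = [word for word in words if len(word) >= 2 and word.isupper()]
    return {"pronouns": pronouns, "all_caps": all_caps}
```

Running it on “It must be switched off.” flags “It”, which is your cue to name the object instead, for example “The pump must be switched off.”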

Top tip 4: Don’t put bullet point lists in the middle of a sentence

Sentences interrupted by a list of bullet points are hard for readers to understand and for MT engines to translate correctly. Readers will probably need to look at these sentences several times to understand them, but the MT engine will just translate them – usually incorrectly.
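In plain-text content, one way to spot such sentences is to look for a line that starts with a lower-case letter directly after a bullet point. The following is a heuristic sketch under that assumption; the bullet markers in `BULLET` and the function name are our own choices, and a wrapped continuation line of a bullet item would be flagged as well.

```python
import re

# Assumed bullet markers: "-", "*", "•", or numbers such as "1." and "1)".
BULLET = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def lists_inside_sentences(text):
    """Return 1-based line numbers where a sentence resumes after a list."""
    lines = text.splitlines()
    hits = []
    for index in range(1, len(lines)):
        stripped = lines[index].strip()
        follows_bullet = bool(BULLET.match(lines[index - 1]))
        is_bullet = bool(BULLET.match(lines[index]))
        if follows_bullet and stripped and not is_bullet and stripped[0].islower():
            hits.append(index + 1)
    return hits
```

Each reported line is a candidate for rewriting: finish the sentence before the list starts, or turn the trailing part into a sentence of its own.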

Top tip 5: Think of the layout

Producing texts suitable for machine translation also means structuring the source text so that the MT engine you’re using can tell which parts of the text go together. Formatting with tabs and spaces isn’t a good idea, and neither is using line breaks in the middle of a sentence. You should tidy up the layout before sending your document to the translation agency, especially if it’s a converted PDF file.
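Part of this tidying up can be automated for plain text, for example text extracted from a converted PDF. The sketch below is a simple heuristic with names of our own choosing: it collapses tabs and runs of spaces, and joins a line to the previous one if the previous line does not end with sentence punctuation. Headings without a final full stop would be joined to the following line too, so review the result.

```python
import re

# Characters we assume mark the end of a complete line of text.
SENTENCE_END = (".", "!", "?", ":")


def tidy_layout(text):
    """Collapse tabs and space runs, and join lines broken mid-sentence."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    result = []
    for line in lines:
        previous = result[-1] if result else ""
        if line and previous and not previous.endswith(SENTENCE_END):
            # The previous line stopped mid-sentence, so continue it.
            result[-1] = previous + " " + line
        else:
            # Blank lines and lines after a finished sentence stay separate.
            result.append(line)
    return "\n".join(result)
```

Blank lines are kept as they are, so paragraph breaks survive the clean-up.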

Top tip 6: Check the text for typos, spelling mistakes and grammatical errors

Finally, make sure that the punctuation in your document is consistent and correct, and check that upper and lower case letters have been used appropriately. The MT engine won’t be able to identify and correct these kinds of mistakes in the way that humans can – often automatically and subconsciously.

Summary

The machine translation engine will only deliver good, usable results if you give it well-produced texts. This is the only way to keep the post-editor’s workload to a reasonable level and to actually save the time and money you were hoping to save when you decided to use machine translation. So the best approach is to bear MT in mind when you start writing your texts. But even if you haven’t considered what machine translation can and can’t do, there’s always the opportunity to revise, correct and adapt the text you’ve produced before you get it translated. This process is called pre-editing, and it can really pay off!

Main image © Adobe Stock