Don’t worry, we’ve all done it! We’ve copied a text into Google Translate, chosen the language we want and seen what it spits out. It costs absolutely nothing and takes just a few seconds – sounds great, right?
Machine translation engines like Google Translate used to be a bit of a joke, given how many terrible translations they came up with, but the output from machine translation engines has improved considerably in recent years. Find out more about the development of machine translation engines here.
It’s no surprise that tools like Google Translate have become so popular, as they make it easier and quicker than ever before to communicate and work with people around the world. It’s also no surprise that various machine translation engines have now established themselves on the market. But what are the major differences between them, for instance in terms of the quality of their output and the data security they offer? Let’s take a closer look at two of the best-known machine translation engines.
You might remember the first time you tried out Google Translate – it could have been as far back as 2006, as that was when it was launched. There’s a good chance you’ll have got some amusing results. Take the German word “Kernseife”, for example: “Kernenergie” means nuclear energy, and “Seife” means soap, so what did Google Translate do? You guessed it: it came up with “nuclear soap”.
As funny as it is to imagine what that might be (we reckon it sounds like something Homer Simpson accidentally made at work), what’s more interesting is why these mistakes happen. The technology behind Google Translate originally used statistical methods to translate texts, which means it translated word for word based on patterns of language usage and didn’t take the context into account. That missing context made its translations incorrect or completely nonsensical.
But that was then, and mistakes like these are now rare when using Google Translate to translate into the 103 languages it supports. Google has updated how it works, and for certain languages it uses neural networks, a form of artificial intelligence that is loosely modelled on the nervous system in our brain. A huge amount of existing online data is used to train the neural network, which stores a variety of contextual and linguistic information. This means neural systems are better at learning than statistical systems, and it also makes them more flexible in the translation process. With certain language pairs, the results are amazingly good.
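To make the word-for-word problem described above a little more concrete, here is a toy sketch in Python. It is not how Google Translate was actually implemented: the two glossaries and both function names are made up for illustration, and “curd soap” as the English for “Kernseife” is our own addition. The sketch only shows how translating the parts of a word without looking at the whole can end up with “nuclear soap”.

```python
# Hypothetical glossaries, invented for this illustration only.
PART_GLOSSARY = {"kern": "nuclear", "energie": "energy", "seife": "soap"}
WHOLE_WORD_GLOSSARY = {"kernseife": "curd soap"}


def translate_piecewise(word):
    """Translate a compound part by part, ignoring the word as a whole."""
    rest = word.lower()
    parts = []
    while rest:
        # Try the longest known prefix first, then shorter ones.
        for length in range(len(rest), 0, -1):
            prefix = rest[:length]
            if prefix in PART_GLOSSARY:
                parts.append(PART_GLOSSARY[prefix])
                rest = rest[length:]
                break
        else:
            return None  # no known part matches, so give up
    return " ".join(parts)


def translate_with_whole_word(word):
    """Check the whole word first and only then fall back to the parts."""
    known = WHOLE_WORD_GLOSSARY.get(word.lower())
    return known if known is not None else translate_piecewise(word)


print(translate_piecewise("Kernseife"))        # nuclear soap
print(translate_with_whole_word("Kernseife"))  # curd soap
```

The second function stands in, very roughly, for a system that has seen the complete word in context. Real neural engines learn this from huge amounts of data rather than from a hand-written lookup table.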
But although Google Translate often produces very good results with shorter texts, it’s a long way from being 100% reliable. And an even bigger concern is data security. Just look at Google’s Terms of Service:
“When you upload, submit, store, send or receive content to or through our Services, you give Google (and those we work with) a worldwide licence to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes that we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content.”
In short, that means the data is not secure. So if you need to translate sensitive commercial information, steer well clear of Google Translate.
The DeepL machine translation engine has made a name for itself ever since it was launched in 2017 by a start-up based in Cologne, which claimed it could beat the big players like Google and Bing. And they were right: multiple tests have shown that DeepL produces better translations than Google Translate.
But what makes it stand out?
There are two main versions of the machine translation engine: a free version limited to 5,000 characters, and a Pro version costing 20 euros a month which allows users to translate up to a million characters into German, English, French, Spanish, Portuguese, Italian, Dutch, Polish and Russian. What really caught the attention of translation agencies everywhere was that the Pro version features APIs and a software plug-in for CAT tools.
When it comes to using Google Translate and DeepL, there doesn’t seem to be much difference at first: they both have two input fields, and once the text to be translated has been copied or typed into one field, the translation appears in the field next to it. Although at the moment Google Translate can translate into over 100 languages compared with DeepL’s nine, the latter allows users to highlight individual words to see alternative translations.
The real difference between these two engines is in the technology they use. Both use neural networks, but Google Translate (like most other machine translation engines) uses what are known as recurrent neural networks. By contrast, DeepL uses convolutional neural networks (CNNs), which produce better all-round results for longer, continuous sequences of words. Although CNNs aren’t perfect, and so far haven’t been used by other machine translation providers, they process texts better in parallel and produce better translations as a result.
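The point about parallel processing can be illustrated with a deliberately simplified sketch in Python. Neither function reflects the actual models used by Google or DeepL; the function names, the weight and the kernel values are assumptions chosen purely for illustration. The recurrent version has to work through the sequence step by step, because each result depends on the one before it. The convolutional version looks at a small fixed window around each position, so every position could in principle be calculated at the same time.

```python
def recurrent_pass(values, weight=0.5):
    """Each state depends on the previous state, so steps must run in order."""
    state = 0.0
    states = []
    for value in values:
        state = weight * state + value  # needs the result of the previous step
        states.append(state)
    return states


def convolutional_pass(values, kernel=(0.25, 0.5, 0.25)):
    """Each output depends only on a fixed window of neighbouring inputs."""
    half = len(kernel) // 2
    # Pad with zeros so the first and last positions also get a full window.
    padded = [0.0] * half + list(values) + [0.0] * half
    # Every position is independent of the others and could run in parallel.
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(values))
    ]


print(recurrent_pass([1.0, 2.0, 4.0]))      # [1.0, 2.5, 5.25]
print(convolutional_pass([4.0, 0.0, 4.0]))  # [2.0, 2.0, 2.0]
```

In a real translation engine the numbers would be word representations and the weights would be learned during training, but the difference in how the sequence is processed is the same.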
The CNNs used by DeepL have been trained using the database from its own online dictionary Linguee. Linguee searches the net for translations, adds them to its database, and uses algorithms and user feedback to evaluate them. The company behind DeepL has not let on exactly how their machine translation engine compensates for the weaknesses of CNNs.
We’ve already mentioned how the Terms of Service for Google Translate make clear that data is not secure, but what about DeepL? Our experience is that the free version is absolutely fine for personal use. But as with Google Translate, the free version is not suitable for commercial use. Although it meets EU data protection regulations (after all, DeepL is based in Germany), only users who pay for the Pro version get end-to-end encryption when transferring data and the option to delete their source texts after translation.
The Pro version of DeepL is, in essence, suitable for commercial translations. However, its output isn’t yet as good as translations produced by humans. Its shortcomings are particularly obvious in texts where the style matters, such as creative marketing texts where a literal translation almost certainly won’t work. And texts that feature important safety or legal information (technical documentation, hazard information etc.) should be left to the experts: when it comes to keeping users of machinery safe and avoiding personal injury and claims for damages, errors in translations can’t be allowed to happen. So machine translation should always be used in tandem with manual correction of machine translation outputs, known as post-editing.
Good translation agencies will be familiar with the advantages and disadvantages of machine translation and can use it to benefit you. So when it really matters, speak to the professionals at your translation agency!
Main image: © MEINRAD