Proud and Doubt. What makes translation human and what makes human translations superior by design.





Proud

When I’m translating (or writing, or coding) and delivering or publishing, I want to be proud of what I did. It’s part of human nature, I guess. My name is on the stamp, so I want that to be meaningful. Machines don’t have that kind of healthy ego. They have other qualities of course: they are fast and they don’t need to sleep once in a while… The fact that many professionals want to be proud of their work is a quality driver. Machines don’t have this kind of intrinsic motivation.

Doubt

Machines also don’t doubt what they do. They don’t feel the need to look up how something was formulated a couple of paragraphs back. They don’t mind being boring, and they can’t see the difference between “boring” and “consistent”. Sometimes we want to be consistent, sometimes we prefer to use a synonym. We can feel this, and we can discuss this with others and with ourselves. We use a lot of language in our head to deliver what we consider to be the best.

Just like machine translation engines that rely on a language model, we have our own language model: the sentences we write down are the best we can come up with. The current generation of machine translation engines has a disconnect between the translation model (producing hundreds of translations for every sentence) and the language model (picking the most fluent one of those). We humans don’t have that disconnect: when we pick the best sentence, we also evaluate the translation. Is the transfer of meaning optimal, and is it optimal for the future reader of this document? All this makes our human translations better by design. Future generations of engines using neural networks may do it more like us, but I doubt they’ll be able to have doubts like we do.
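To make that disconnect concrete, here is a minimal sketch of the second half of such a pipeline: a language model that picks the most fluent sentence from a list of candidates. Everything in it is an assumption for illustration. The tiny bigram model, the function names and the sample sentences are mine and do not come from any real engine. Note what the scoring function never sees: the source sentence.

```python
import math
from collections import Counter


def train_bigram_counts(corpus):
    """Count unigrams and bigrams in a list of target-language sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def fluency(sentence, unigrams, bigrams):
    """Average add-one-smoothed bigram log-probability per token.

    This only measures how natural the target sentence looks.
    It knows nothing about the source sentence or its meaning.
    """
    tokens = ["<s>"] + sentence.lower().split()
    vocab_size = len(unigrams) + 1  # +1 leaves room for unseen words
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total / max(len(tokens) - 1, 1)


def pick_most_fluent(candidates, unigrams, bigrams):
    """Return the candidate with the highest fluency score."""
    return max(candidates, key=lambda c: fluency(c, unigrams, bigrams))


# Hypothetical toy data: a two-sentence "corpus" and two candidate translations.
corpus = ["the cat sits on the mat", "the dog sits on the floor"]
unigrams, bigrams = train_bigram_counts(corpus)
candidates = ["mat the on sits cat the", "the cat sits on the mat"]
print(pick_most_fluent(candidates, unigrams, bigrams))
```

Because the score depends on the target side only, a fluent sentence with the wrong meaning can still win. That is exactly the check a human translator does not skip.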

Human Language

Our language was created by us so we can be clear when we communicate. Others validate us all the time, from the cradle to the grave. We are trained to do our very best. I was once managing a science-driven project in which it was key that translators only fixed what was really wrong (the meaning was wrong, for example, or the word order made the sentence unreadable). We asked human translators NOT to create perfect quality and we gave them a list of errors that we were OK with: conjugation errors, no big deal; getting a plural wrong, no big deal; using a strange combination of words in the context, no problem… We scored the quality level of what they delivered, and for most translators it took up to 1500 sentences (all delivered to us in small batches) to get it right. ALL translators started by translating as they always do: they fixed the small errors as well. In the end, every translator was capable of delivering what we needed, but it clearly was not natural to them. In case you wonder why the scientists were not keen on fixing the small errors: they wanted to see if a machine was capable of learning to avoid big errors. When small errors get fixed, it may confuse the machine. If the training material also contained fixes of small errors, they could not measure the impact of the training data on the existing model.

Machines

I’m happy to see a popular generic translation engine make fewer mistakes than a few years ago. I can finally read what my Romanian and Japanese friends post on Facebook. When I don’t get it, I use another generic engine to see if it makes more sense to me. But despite my gratitude, I know that when sentences are too complex, or when their meaning is influenced by context, machine translation has a high probability of not delivering a sentence that makes sense.

I’m amazed that we can train our own engine in a very narrow domain and translate documents from that narrow domain in such a way that the biggest challenge in editing the translated documents is finding the error in the haystack. This is also why I advise translators to open their job and check not only the fuzzy match statistics, but also which engine was used to produce the translation. Just keep in mind that even a well-trained engine can produce a crappy job: when it translates a document that does not belong to the domain the engine was trained on. That human mistake has already upset a lot of translators, who then blame the machine 😉

Risk Management

When machine translation has been used to pre-translate a job, we don’t know how good it was. Based on our experience, we may have our doubts, but the truth is this: sometimes it will be OK, sometimes it won’t. And we don’t know. If we use machine translation, we add risk to our job. We speculate on our own work.

We’re not even close to enriching machines with “proud & doubt”, but at least the next step of machine translation has been validated: we can use machine learning technology to compare the output of machine translation engines, and pick the best one. We can use that technology to estimate if an MT output is worth our time, or not. This type of capability will not replace Google’s or Microsoft’s MT engines, but it may help us when we have doubts about taking on a new translation job.
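As a rough sketch of that idea, assuming we already have some function that estimates the quality of a translation without a reference: score every engine’s output, keep the best one, and flag whether it clears a threshold that makes it worth editing. The names `pick_best_output`, `toy_estimator` and the threshold of 0.6 are hypothetical. The toy estimator only compares sentence lengths; it is a placeholder for a trained model, not a real quality measure.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Pick:
    engine: str          # name of the engine whose output scored best
    translation: str     # that engine's output
    score: float         # estimated quality, between 0.0 and 1.0
    worth_editing: bool  # True when the score reaches the threshold


def pick_best_output(
    source: str,
    outputs: Dict[str, str],
    estimate_quality: Callable[[str, str], float],
    threshold: float = 0.6,  # assumed cut-off, to be tuned per project
) -> Pick:
    """Score each engine's output and return the best one with a verdict."""
    if not outputs:
        raise ValueError("no MT outputs to compare")
    scores = {engine: estimate_quality(source, text) for engine, text in outputs.items()}
    best = max(scores, key=scores.get)
    return Pick(best, outputs[best], scores[best], scores[best] >= threshold)


def toy_estimator(source: str, translation: str) -> float:
    """Placeholder estimator: ratio of the shorter to the longer token count."""
    src_len, tgt_len = len(source.split()), len(translation.split())
    if src_len == 0 or tgt_len == 0:
        return 0.0
    return min(src_len, tgt_len) / max(src_len, tgt_len)


# Hypothetical outputs from two unnamed engines.
outputs = {
    "engine_a": "Le chat est assis sur le tapis",
    "engine_b": "Chat",
}
pick = pick_best_output("The cat sits on the mat", outputs, toy_estimator)
print(pick.engine, pick.worth_editing)
```

The useful part is the verdict, not the toy score: the risk of a pre-translated job becomes a number we can look at before we accept it.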

This new type of machine translation and machine translation quality assessment will help us make the risk visible. So we know how much risk we’re taking. It will help us have less doubt about machines. We can focus our doubt on our own work. So we can always be proud of it.

Gert Van Assche

About Gert Van Assche

At Datamundi we're paying a fair price to linguists and translators who evaluate (label, score, tag) human translations and machine translations for large-scale NLP research projects.

3 thoughts on “Proud and Doubt. What makes translation human and what makes human translations superior by design.”

  1. I believe that humanoid translators in the future, 30-100 years from now, will be as good as human translators, and there will be a war between humanoid and human translators.
    Will human translators win?
    One key is the arbitrary nature of human language. As long as human beings naturally keep this nature, human translators will have an opportunity to win the war. But if formalist linguists successfully engineer human language into strictly predictable and orderly languages, then humanoid translators will win.

    1. Dani, you may be right about this. With what I know about humans and machines today, I don’t know if humans will ever be able to trust the machines blindly, especially not when it comes down to written language and documents that really matter. If by then we’re travelling in cars without driving them ourselves, maybe… I cannot look that far in the future. The point is: we humans appreciate certainty and machines, at least in the near future, can only provide us with probability. That makes it even more important that we humans can raise doubt. No certainty without doubt.



The Open Mic

Where translators share their stories and where clients find professional translators.
