This article drew some attention, and people criticized the testing method and the test set… Well, I honestly don’t know about that.
Over the past 3 years I’ve been doing a lot of comparative tests on human and MT translations, using new tools and relying on skilled human evaluators, all linguists of some kind. The MT output we evaluate has been produced by various competing engines, so our observations are not only about Microsoft’s Chinese > English language pair.
We’ve seen three things worth mentioning:
- MT is improving fast, but… sometimes the fluency is better than the accuracy. Depending on the type of text you are translating, this can work in your favor… or not.
- MT engines learn much more from your work, so if you are using an MT system with a feedback loop (like LILT), you benefit much more than before. Of course, this only has value if you always translate similar texts, or if you work for the same customer(s).
- Editing MT output is often less of a hassle these days: just replacing, moving or adding a word or word group. You would believe (and our customers may have heard) it is “less work”, but that is not always true: in the past it was much easier to spot mistakes. Today the fluency may be misleading… Our tools register the time people spend evaluating translation quality, and we see that better MT output does not mean humans can judge that quality faster. Spotting mistakes is still brain-work.
Finally: not all neural MT language pairs have improved in the same way, and as ever, everything depends on the document you need translated. Some documents are just not fit for machines.
As professionals, we need to be informed, and to know there is some truth in what may be an overly positive message. But the trend is clear.