Why speech recognition technology is unfairly distributed and what you can do about it





    As an Irish citizen, I have mixed emotions when it comes to British rule in Ireland. Politically it was disastrous, but it did leave us with one advantage. The language most commonly spoken in Ireland is English, which is also spoken in many other former British colonies. Because of the economics of R&D, this historical accident means that language technologies for English are more advanced than those for Irish Gaelic.

    This is particularly true for Automatic Speech Recognition software in the form of Dragon NaturallySpeaking (DNS), a sophisticated offline speech recognition application. Many translators claim the software doubles their productivity, and hence their earnings. But it is not available for most of the world’s languages. A survey carried out a few years ago by Dragos Ciobanu at the University of Leeds found that speech recognition is most commonly used by English, French, German and Spanish translators. This is probably because these languages are well supported in DNS. Though the application also supports Italian, Dutch and Japanese, it seems it has not taken off in these language communities. It is not clear if this is due to underlying accuracy problems or just a lack of word-of-mouth propagation. There are several indicators that use of DNS is on the rise. A recent survey by the CIOL and ITI found that 15% of mainly UK-based respondents use speech recognition and 84% reported that DNS was their preferred tool. Also, groups like “Translators who use Speech Recognition” on Facebook provide support and encouragement to translators who are new to the dictation game.

    This is all very well for translators who work into English and the other well-supported languages, but not for everyone else. Translators working into Nordic or Slavic languages, for example, are relegated to off-the-shelf services from Google, Microsoft, Apple or Nuance that do not adapt to the voices of their users or the texts they translate. Nor can users improve these services by adding new terminology, as they can with the vocabulary editor found in DNS. This is unfortunate, as translators mostly use specific terminology when they work.

    But the benefits of speech recognition technology are not just in terms of faster translation. The software can also have health benefits. Excessive keyboard use has been shown to cause Repetitive Strain Injuries, like carpal tunnel syndrome. In fact, many translators found their way to speech recognition software due to sore hands, sore wrists or tennis elbow. The ergonomic advantages do not stop there. Working with a microphone also allows more freedom of body position, so neck and back problems can be reduced too.

    This is why I am particularly excited by some work coming out of my former research lab in the ADAPT Centre in Trinity College Dublin. They are currently running a survey to see whether translators working in languages not supported by Dragon NaturallySpeaking are interested in speech recognition software integrated into CAT tools. The idea is to use data from manual transcription to develop speech recognition for new languages, and to harness the power of MT to improve the recognition: the source segment provides additional context, which helps the system guess what the translator is saying.
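
    One way to picture the MT idea is as a re-ranking step: the recogniser proposes several candidate transcriptions, and the candidate that best agrees with a machine translation of the source segment is preferred. The sketch below is only an illustration under that assumption and is not the ADAPT team's method. The function names, the word-overlap measure, the scores and the example sentences are all hypothetical.

```python
from collections import Counter


def token_overlap(hypothesis: str, mt_suggestion: str) -> float:
    """Fraction of hypothesis words that also appear in the MT suggestion."""
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    mt_counts = Counter(mt_suggestion.lower().split())
    matched = 0
    for token in hyp_tokens:
        # Each MT word can be matched at most as often as it occurs.
        if mt_counts[token] > 0:
            matched += 1
            mt_counts[token] -= 1
    return matched / len(hyp_tokens)


def rescore(nbest, mt_suggestion, weight=0.5):
    """Pick the candidate with the best blend of recogniser score and MT overlap.

    nbest is a list of (text, asr_score) pairs, with asr_score assumed to be
    in the range 0..1. weight controls how much the MT context counts.
    """
    def combined(item):
        text, asr_score = item
        overlap = token_overlap(text, mt_suggestion)
        return (1 - weight) * asr_score + weight * overlap

    return max(nbest, key=combined)[0]


# Hypothetical recogniser output for one dictated target sentence.
nbest = [
    ("the patient has a fee for", 0.55),
    ("the patient has a fever", 0.45),
]
# Hypothetical MT output for the source segment being translated.
mt_suggestion = "the patient has fever"

best = rescore(nbest, mt_suggestion)
```

    With these made-up numbers, the recogniser on its own slightly prefers the first candidate, but the overlap with the MT suggestion tips the choice towards the second. A real system would more likely combine the two signals inside the recogniser's language model than use simple word overlap.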

    If your target language does not have good speech recognition and you would like to see that changed, you might want to take 10 minutes to fill out the survey on the website. I am reliably informed that they plan to develop languages based on the perceived needs amongst respondents, so you could also tip the balance of power in favour of your target language by sharing the survey in any online groups you belong to that are focused on translators in your target language.

    The link to the SightCAT survey website is HERE.

    One thought on “Why speech recognition technology is unfairly distributed and what you can do about it”

    1. John, have you tried using your telephone to dictate? That’s what I do, since the phone has a lot more languages than Dragon. A Brazilian colleague developed a hack to use TimeViewer + cellphone to remotely operate his CAT tool and do his translations through dictation. I haven’t gone all the way yet, so I dictate into an MS Word document on my phone, save it to the cloud, then retrieve it on my computer. It works.

      Nice read. Thanks!


    The Open Mic

    Where translators share their stories and where clients find professional translators.
