Recent research in the field of interpreting has shown that speech translation systems handle person names poorly. This shortcoming not only leads to errors that can seriously distort the meaning of the input but also hinders the adoption of such systems in application scenarios where the translation of named entities, like person names, is crucial.
Automatic speech recognition (ASR) and speech translation (ST) systems both transcribe or translate person names correctly only about 40% of the time.
Accuracy this low, in an area that demands precision, points to serious limitations in current systems and could have harsh consequences in fields such as diplomacy.
Today we will discuss the main factors that influence name recognition in live interpretation and in automated speech recognition tools. We will also touch on the measures you can take to improve person name translation and avoid large-scale misinterpretation.
Factors Influencing Name Recognition
Translating person names is difficult for both human interpreters and ASR systems. Humans still struggle with foreign names during live interpretation, so interpreters are interested in computer-assisted translation tools that can help with this complicated task.
Here are 3 main factors that influence the ability of a system to translate a person’s name:
- The token frequency in the target transcripts and translations. Person names that occur with similar variations in at least 3 different languages are likely to be translated correctly. The rarer the name is, the higher the chance that it is rendered incorrectly.
- The nationality of the referent. Names that come from a language other than the interpreter's native one are often anglicized during live interpretation (e.g. Youngseen instead of Jensen, Alex instead of Oleksii). ASR and ST systems trained on English speech learn the English phoneme-to-grapheme mapping, which is inappropriate for non-English names. UK referents also occur more than twice as often as non-UK referents, which makes non-UK names even harder to translate.
Who Are We Talking About? Handling Person Names in Speech
- The nationality of the speaker. The speaker's accent can affect how well person names are understood in both live interpretation and ASR. Research at the University of Chicago shows that people not only find it harder to understand a speaker with an accent but are also less likely to believe what that speaker says.
Generalize and Disambiguate
When the researchers added the referent names to the sample, they concluded that the more frequently a name appears, the more likely ASR and live translation are to transcribe it correctly.
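The link between name frequency and accuracy can be checked on your own data with a short script. The following is a minimal sketch in Python, not the researchers' evaluation code. It assumes that a name counts as correct when it appears, ignoring case, anywhere in the system output. The bucket thresholds, the function names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def frequency_bucket(count):
    """Map a training-set count to a coarse bucket (thresholds are assumptions)."""
    if count == 0:
        return "unseen"
    if count < 10:
        return "rare"
    return "frequent"


def name_accuracy_by_frequency(samples, train_counts):
    """Return the share of correctly rendered names per frequency bucket.

    samples: list of (reference_name, system_output) pairs.
    train_counts: dict mapping a lowercased name to its training-set count.
    A name is counted as correct if it occurs, case-insensitively, as a
    substring of the system output. This is a simplification.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for name, output in samples:
        bucket = frequency_bucket(train_counts.get(name.lower(), 0))
        totals[bucket] += 1
        if name.lower() in output.lower():
            hits[bucket] += 1
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}


# Hypothetical data for illustration only.
train_counts = {"smith": 50, "jensen": 2}
samples = [
    ("Smith", "Thank you, Mr Smith."),
    ("Smith", "I agree with Mrs Smith."),
    ("Jensen", "I give the floor to Mr Youngseen."),
    ("Jensen", "Mr Jensen has the floor."),
    ("Oleksii", "Alex will speak next."),
]

for bucket, accuracy in name_accuracy_by_frequency(samples, train_counts).items():
    print(f"{bucket}: {accuracy:.2f}")
```

Substring matching is crude: it would accept "Smithson" for "Smith", so a real evaluation would likely match on word boundaries or aligned tokens.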
Here are 2 main solutions to improve person name translation:
- Increasing robustness to non-UK referents. A multilingual system, trained to recognize and translate speech in different languages, is more robust and performs better on non-English names in both ASR and ST.
- Closing the gap between automated speech recognition (ASR) and speech translation (ST). Both ASR and ST have to recognize names in fluent speech, for example during live translation, and reproduce them in the output. An ST model can close the performance gap with ASR by conditioning the target prediction not only on the input audio but also on the generated transcript.
The table compares the multilingual models with triangle ST models trained on the same data.
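To make the idea of conditioning on both the audio and the transcript more concrete, here is a toy sketch in plain Python. It is not the architecture from the research: it only shows one translation step building its context from two sources with dot-product attention. Summing the two contexts is an assumption made for illustration, and the function names and vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Dot-product attention: a weighted average of `states` for one query."""
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * state[i] for w, state in zip(weights, states)) for i in range(dim)]


def triangle_context(query, audio_states, transcript_states):
    """Context for one translation step, drawn from both sources.

    audio_states: encoder vectors computed from the input audio.
    transcript_states: decoder vectors of the generated transcript.
    The two attention outputs are summed (an assumption of this sketch).
    """
    audio_context = attend(query, audio_states)
    transcript_context = attend(query, transcript_states)
    return [a + t for a, t in zip(audio_context, transcript_context)]


# Hypothetical 2-dimensional states for illustration only.
audio_states = [[1.0, 0.0], [0.0, 1.0]]
transcript_states = [[2.0, 2.0]]
query = [0.0, 0.0]  # a zero query gives uniform attention weights

print(triangle_context(query, audio_states, transcript_states))
```

Because the translation step can look at the transcript states, a name that the transcription step spelled correctly is available to be carried over instead of being rebuilt from the audio alone.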
Conclusion
Humans and machines have different strengths and weaknesses. But when it comes to person names in speech, both struggle to handle names from languages they have not worked with before.
Three factors can interfere with precise live interpretation: the frequency of the name in the target translation, the nationality of the referent, and the speaker's accent.
To improve foreign name recognition, we should try to increase robustness to non-UK referents and work on closing the gap between ASR and ST systems.
Lastly, misinterpretation can be averted by using live translation, whether during meetings or video conferences. Find the best way to break down language barriers and increase information accuracy.
Follow TechStrange for more Technology, Business, and Digital Marketing News.