AI-Powered Speech Translation

Microsoft Translator’s speech translation technology combines two artificial intelligence (AI)
systems. When audio first enters the system, it goes through a neural network that transcribes
the audio into text. However, when we speak, we often repeat ourselves or use filler words
such as “um” or “like” in English without noticing. The speech translation system accounts for this
and removes these filler words to produce a translation that makes sense. To do so, the system uses a natural language processing
technology called TrueText, which normalizes the text and finds the most appropriate words
based on the context of the full sentence. Once the text is normalized, it goes through
a second AI system: neural machine translation. This second neural network translates
the text into the target language, for example from English into Japanese. If the application also
offers audio output, as the Microsoft Translator live feature does, a speech synthesizer
(text to speech) then reads the translated text aloud so that the user can hear the translation
in their own language.
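
The flow described above (speech recognition, normalization, translation, optional speech synthesis) can be sketched as a chain of stages. The following Python is only an illustration under stated assumptions: the names normalize, translate_speech, and FILLERS are hypothetical, the recognizer, translator, and synthesizer are passed in as plain functions rather than real Microsoft Translator calls, and the normalization step is a toy rule that drops a few filler words and immediate word repetitions. It is not how TrueText works internally.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical filler list, for illustration only. "like" is deliberately
# left out: deciding whether "like" is a filler needs sentence context,
# which this toy rule does not have.
FILLERS = {"um", "uh", "er"}


def normalize(transcript: str) -> str:
    """Toy stand-in for the normalization step.

    Lowercases the transcript, strips punctuation, and drops filler
    words and immediate word repetitions.
    """
    words = re.findall(r"[\w']+", transcript.lower())
    cleaned = []
    for word in words:
        if word in FILLERS:
            continue  # skip filler words
        if cleaned and cleaned[-1] == word:
            continue  # skip an immediate repetition such as "I I"
        cleaned.append(word)
    return " ".join(cleaned)


def translate_speech(
    audio: bytes,
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Optional[Callable[[str], bytes]] = None,
) -> Tuple[str, Optional[bytes]]:
    """Run the stages in order; speech output is produced only if requested."""
    transcript = recognize(audio)      # first AI system: audio to text
    text = normalize(transcript)       # clean up the raw transcript
    translated = translate(text)       # second AI system: text to text
    spoken = synthesize(translated) if synthesize else None
    return translated, spoken


if __name__ == "__main__":
    # Stub stages so the sketch runs without any external service.
    result = translate_speech(
        b"",
        recognize=lambda audio: "Um, I I want to, uh, go go home",
        translate=lambda text: f"[ja] {text}",
    )
    print(result)
```

Passing the stages in as functions keeps the sketch independent of any particular service: a real application would supply its own recognizer, translator, and synthesizer in place of the stubs.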

