Student’s research helps prevent message from getting lost in (machine) translation
By Suraaj Aulakh
In today’s world, it is much easier to communicate with someone who doesn’t speak the same language, thanks to convenient translation applications such as Google Translate and Skype Translator. However, if you’ve ever used such tools, you may have noticed that phrases don’t always translate correctly. That’s because of the central challenge in designing these systems: human language is complex.
“Languages don’t perfectly align with each other,” says Anahita Mansouri Bigvand, a PhD student in the School of Computing Science who studies natural language processing and machine translation.
“One word in English could align with one or more words in another language, or it could align to no words in the other language,” she adds.
Working with professor Anoop Sarkar in SFU’s Natural Language Laboratory, Mansouri Bigvand is developing a better word alignment model for statistical machine translation (SMT).
Word alignment involves more than simply finding the corresponding word in another language. Some words have different meanings, and without context an SMT model could incorrectly translate the text.
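The idea can be pictured with a small sketch. This is not the researchers’ code, just an assumed toy representation of word alignment as a mapping from each English word to zero, one, or more words in the other language:

```python
# Toy sentence pair: "I do not speak French" / "Je ne parle pas français".
english = ["I", "do", "not", "speak", "French"]
french = ["Je", "ne", "parle", "pas", "français"]

# Hypothetical alignment: English word index -> list of French word indices.
# "do" is a function word with no French counterpart (aligns to nothing),
# while "not" aligns to the two-word French negation "ne ... pas".
alignment = {
    0: [0],      # I      -> Je
    1: [],       # do     -> (unaligned)
    2: [1, 3],   # not    -> ne, pas
    3: [2],      # speak  -> parle
    4: [4],      # French -> français
}

for en_idx, fr_idxs in alignment.items():
    targets = ", ".join(french[j] for j in fr_idxs) or "(none)"
    print(f"{english[en_idx]:>7} -> {targets}")
```

Even in this five-word pair, the words do not line up one to one, which is exactly the mismatch an alignment model has to learn.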
That’s why Mansouri Bigvand has developed a new model that learns and improves by considering both word alignments and alignment types, such as semantic, function, grammatically inferred semantic, and grammatically inferred function.
These linguistic tags provide additional information that, for example, can make a distinction between aligned function words in both languages versus aligned content words.
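To make the idea concrete, here is a minimal sketch of what typed alignment links could look like. The tag names (`SEM`, `FUNC`) are illustrative assumptions, not the paper’s exact tag set:

```python
# Each alignment link carries a hypothetical linguistic type tag:
# "SEM" for content-word (semantic) links, "FUNC" for function-word links.
typed_alignment = [
    # (english word, french word, assumed alignment type)
    ("I",      "Je",       "SEM"),
    ("not",    "ne",       "FUNC"),  # function-word negation link
    ("not",    "pas",      "FUNC"),  # function-word negation link
    ("speak",  "parle",    "SEM"),
    ("French", "français", "SEM"),
]

# With the tags available, a model can, for example, check that function
# words in one language align to function words in the other.
function_links = [link for link in typed_alignment if link[2] == "FUNC"]
print(f"{len(function_links)} function-word links "
      f"out of {len(typed_alignment)}")
```

A model that predicts the tag jointly with the link gets an extra signal: aligning a function word in one language to a content word in the other becomes visibly inconsistent.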
“By building a model using linguistically motivated alignment types, we can design an algorithm to improve word alignment, and hopefully improve translation quality,” she says.
And when Mansouri Bigvand tested out her new SMT model, which incorporates such algorithms, she saw a significant improvement in word alignment and translation quality.
“In our experimental results, the generative models we introduce that use alignment types significantly outperform the models without alignment types.”
Building an advanced word alignment model has many applications. In addition to text or voice translation systems, there are also voice-powered systems and virtual assistants, such as Amazon’s Alexa and Apple’s Siri, that rely on accurate machine translation. Aside from machine translation, word alignment also plays an important role in several natural language processing tasks that are central to human-computer interaction and artificial intelligence.
Last year Google Translate released a major update that significantly improved the translation quality across multiple languages, including those that were not previously supported by the application. This is because Google switched from an SMT model to a neural machine translation (NMT) model, which translates full sentences rather than individual words or sentence fragments.
So how does Mansouri Bigvand’s new SMT model compare to Google Translate’s NMT model?
“We’re not competing with Google,” she says. “Rather, we’ve got the same end goal but are using different models to get there.”
Mansouri Bigvand’s research will soon be published in the journal Transactions of the Association for Computational Linguistics, and she will be presenting her model at the Empirical Methods in Natural Language Processing conference in September.