How Big Data Can Combat Fake News

Humans have long been fascinated with exchanging news, driven by our eagerness to share ideas and connect with one another through discussion and debate.

In the Discourse Processing Lab atop Burnaby Mountain, Maite Taboada, professor of Linguistics at Simon Fraser University (SFU), is harnessing the power of big data to make social media and online discussion platforms better, more reliable places for communication. Her research sits at the intersection of linguistics, computational linguistics and data science.

From the first messengers who shouted out in cobblestone squares, to the rise of newspapers made possible by the printing press, to today's 24-hour news cycle, our obsession with exchanging news still burns strong.

As news has shifted online, social media has become the digital messenger, amplifying the reach of breaking news stories. Social media has changed the way people interact with each other, whether sharing information, personal messages, or opinions about current events and personal experiences. Companies born in the digital age, like Facebook, were first celebrated as a way for people to interact more easily and quickly across communities and locations around the world. Many of these same companies have now acknowledged being used as a tool for spreading misinformation, or fake news, that reaches hundreds of thousands, even millions, of people.

Objectivity suddenly became difficult to pinpoint as information spread online at a rapid, uncontrollable pace. Sharing news has become so frictionless that it is easy to forgo a critical lens when we scroll through our social feeds. The comments section doesn't help either: it is often filled with abuse, trolling, harassment, racism and misogyny, effectively discouraging constructive discussion. Some sites have responded by shutting comments down entirely; in 2015, the CBC disabled comments on stories about Indigenous people, citing a lack of moderation resources, a policy that continues today.

As companies and their teams of coders, engineers and data scientists scramble to fix these problems, a solution may exist in an unlikely place.

Leveraging AI to tackle misinformation

Contrary to the common notion that big data belongs exclusively to the sciences, Taboada is leveraging it in an unexpected domain: linguistics. Researchers across all disciplines are starting to understand how big data approaches can advance their work. Simon Fraser University empowers these researchers to unlock the potential of big data by offering powerful infrastructure, hands-on training, and expertise to deliver new research breakthroughs and innovations.

Traditionally, Natural Language Processing excels at tasks such as classifying text into pre-defined categories and at forms of semantic analysis like text summarization. But Taboada is taking an innovative big data approach to Natural Language Processing, creating breakthroughs in identifying fake news and toxic comments, solutions needed now more than ever. Using machine learning and deep learning neural networks, she can create programs that not only understand and classify words, but also exploit contextual information that helps machines grasp the nuances of language. Taboada and SFU postdoctoral fellow Fatemeh Torabi Asr believe there is a language of fake news: a language for wrapping false information around facts. They found that fake news is shared more often than real news, making their research vital to stemming the spread of misinformation. Outside of social media, this approach is also used in spam detection, product review analysis, coding medical patient records and a variety of other problems involving data on online platforms.
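To give a flavour of the kind of supervised text classification described above, here is a minimal, illustrative sketch: a tiny multinomial Naive Bayes classifier over word counts. The training examples, labels and headlines are invented for illustration only; this is not Taboada's system, which uses far larger corpora and richer, context-aware models.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Simplistic whitespace tokenizer; real systems handle punctuation, etc.
    return text.lower().split()

def train(examples):
    """examples: list of (text, label) pairs. Returns the model components."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def predict(text, label_counts, word_counts, vocab):
    total = sum(label_counts.values())
    best, best_score = None, -math.inf
    for label in label_counts:
        # Log prior plus log likelihoods with add-one (Laplace) smoothing
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

# Invented toy training data, for illustration only
examples = [
    ("miracle cure doctors hate this secret trick", "fake"),
    ("shocking secret they do not want you to know", "fake"),
    ("city council approves new transit budget", "real"),
    ("researchers publish peer reviewed climate study", "real"),
]
model = train(examples)
label = predict("shocking miracle trick they hate", *model)
```

Even this toy model picks up the clickbait-style wording of the "fake" examples; production systems replace the hand-rolled counts with learned neural representations.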

“Most of the big data revolution in social media analysis has examined words in isolation, a ‘bag-of-words’ approach,” Taboada explains. “We believe it is possible to investigate big data—and social media data in general—by exploiting contextual information. This is important when detecting whether a comment is sarcastic—and therefore toxic—or harmless.”
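The limitation Taboada describes is easy to demonstrate: a bag-of-words representation discards word order, so two sentences with opposite meanings can look identical to the model. The example sentences below are our own, chosen to illustrate the point.

```python
from collections import Counter

def bag_of_words(text):
    # Lowercase, strip commas, split on whitespace; order is thrown away
    return Counter(text.lower().replace(",", "").split())

a = bag_of_words("the movie was not bad, it was good")
b = bag_of_words("the movie was not good, it was bad")
result = (a == b)  # True: same word counts, opposite meanings
```

Because the two Counters are equal, any classifier built purely on these counts cannot distinguish the sentences; recovering the difference requires the contextual information Taboada's approach exploits.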

With fake news on the rise and harassment endemic in comment sections, Taboada and her team are hard at work creating a fast and reliable way to identify bias and misinformation in news articles, potentially changing the landscape of how news is shared and engaged with online. If news shared on social media can be checked for accuracy, and toxic comments are automatically filtered to encourage thoughtful discussion, the perception of social media, and the way we engage with news through it, just might fundamentally change.

No one knows what the future of social media holds, now that the public has started to understand its pitfalls and shortcomings. But one problem desperately needs solving, and companies like Facebook and Google are expending immense resources on it: fake news and online harassment. Taboada is not just building solutions these companies urgently want; she is creating hope that one day we will have platforms where people can share, engage, critique and foster conversation about the world around them, in an environment that encourages respect for the human beings on the other side of the screen. All thanks to harnessing the power of big data.

Partner with SFU

Interested in working with researchers like Maite Taboada? Email to start the conversation.