How to fight misinformation with the support of machine learning tools?
How good are we at telling fact from fiction on
the Internet? Admittedly, it can be difficult at times – there’s a lot of
misinformation floating out there. Some sites and blogs routinely present
opinions as facts to score quick political points, others use misleading
headlines to trick us into clicking and sharing content, and yet others will
flat out lie to us, suggesting that goji berries, green coffee beans or some
other “weird trick” will magically burn off 50 pounds of belly fat without us
needing to exercise.
The rise of social media has created a seemingly
unstoppable force of misinformation. Propaganda pushed through state-sponsored
channels is disinformation, but the content in our social media feeds shared by
friends is misinformation. Misinformation can be described as information that
is unintentionally false, i.e. the person who is disseminating it believes that
it is true. While new technologies accelerate our ability to communicate with
each other, they also accelerate the spread of misinformation.
Contemporary social media platforms offer a rich
ground for the spread of misinformation. Combatting its spread is difficult for
two reasons: the profusion of information sources, and the formation of
"echo chambers." The profusion of information sources makes the
reader's task of weighing the reliability of information more challenging, a
difficulty heightened by the untrustworthy social signals that accompany such
information. The inclination of people to follow or support like-minded
individuals leads to the formation of echo chambers and filter bubbles. With no
differing information to counter the untruths or the general agreement within
isolated social clusters, the outcome is a dearth or, worse, a complete absence
of collective reality, some writers argue.
As the world gets ready to tackle fake news,
technology has set the trend by showing us how to identify and tackle it. Here
are some ways to mitigate the spread of misinformation with the power of
machine learning.
1. News Quality Scoring
The powers of machine learning could be
leveraged in combating misinformation by building a quality tag system capable
of determining the trustworthiness of websites. To achieve this, a publisher
presents its stories to the news quality scoring platform, which then assesses
the content to come up with a global score for quality. This process would be
done automatically and at scale, using machine learning algorithms. A crucial
part of the quality tag system is labeling the dataset, i.e., thousands of news
articles. The labeling process would be automated and would also rely on
collaborative filtering.
The news quality scoring platform would rely on a
combination of two models to carry out its task. The first model involves two
sets of “signals” to assess the quality of journalistic work: Quantifiable
Signals and Subjective Signals. Quantifiable Signals are collected
automatically. These signals include the structure and patterns of the HTML
page, advertising density, use of visual elements, bylines, word count,
readability of the text, and information density (number of quotes and named
entities). Subjective Signals are based on criteria used by editors (and
intuitively by readers) to assess the quality of a story: writing style,
thoroughness, balance and fairness, timeliness, etc. This set would be used
only in the building phase of the model.
The second model is based on deep learning techniques such as text embedding,
in which texts from large volumes of data (millions of articles) are converted
into numerical vectors to be fed into a neural network. The neural network
returns a probability-based quality score, and this score could serve as an
indicator of a site’s factual accuracy.
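As a rough illustration of the Quantifiable Signals idea, the sketch below
pulls a few measurable signals out of an article's plain text and blends them
into a single score. The function names, the choice of signals, and the weights
are assumptions made for this example only; a real scoring platform would learn
its weights from labeled articles and would use many more signals (HTML
structure, advertising density, named entities) than are shown here.

```python
import re
from typing import Optional


def quantifiable_signals(text: str, byline: Optional[str] = None) -> dict:
    """Collect a few automatically measurable signals from article text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    quotes = re.findall(r'"[^"]+"', text)  # passages in straight double quotes
    word_count = len(words)
    return {
        "word_count": word_count,
        "avg_sentence_length": word_count / len(sentences) if sentences else 0.0,
        "quote_count": len(quotes),
        "has_byline": bool(byline),
    }


def quality_score(signals: dict) -> float:
    """Blend the signals into a 0..1 score.

    The weights and cut-offs are hand-picked for illustration (an assumption),
    not values taken from any real news quality scoring system.
    """
    if signals["word_count"] == 0:
        return 0.0
    length_part = min(signals["word_count"] / 800, 1.0)
    quote_part = min(signals["quote_count"] / 3, 1.0)
    byline_part = 1.0 if signals["has_byline"] else 0.0
    # Crude readability proxy: very long sentences lower the score.
    asl = signals["avg_sentence_length"]
    readability_part = 1.0 if asl <= 25 else max(0.0, 1.0 - (asl - 25) / 25)
    score = (0.3 * length_part + 0.3 * quote_part
             + 0.2 * byline_part + 0.2 * readability_part)
    return round(score, 3)
```

Calling quality_score(quantifiable_signals(article_text, byline="Jane Doe"))
returns a number between 0 and 1. In this toy weighting, an article with a
byline scores higher than the same text without one.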
2. Automated Fact Checking
To fight misinformation, it is imperative to weigh the facts
that a given news item purports to share. Automated fact-checking
initiatives generally focus on one or more of three overlapping objectives: to
spot false or questionable claims circulating online and in other media; to
authoritatively verify claims or stories that are in doubt, or to facilitate
their verification by journalists and members of the public; and to deliver
corrections instantaneously, across different media, to audiences exposed to misinformation.
Using artificial intelligence and machine learning, all three elements (identification,
verification, and correction) can be addressed.
Real-world automated fact-checking efforts begin with
systems to monitor various forms of public discourse – speeches, debates,
commentary, news reports, and so on – online and in traditional media. Once
monitoring is in place, the central research and design challenge revolves
around the closely linked problems of identifying and verifying factual claims.
A promising approach is to rely on a combination of natural
language processing and machine learning to identify and prioritize claims to
be checked. The natural language processing algorithm would go through the
subject of a story, its headline, its main body text and its geo-location.
Further, artificial intelligence would find out whether other sites are
reporting the same facts. In this way, facts are weighed against reputable
media sources. Using machine learning, the system would be able to compare a
news story against a database of information, facts or past events and give a
probabilistic signal indicating whether the published news/content needs to be
double-checked or not.
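The identify-then-compare flow described above can be sketched in a few lines.
This is a deliberately simple stand-in, not a real fact checker: it treats any
sentence containing a number as a checkable claim and compares claims with a
small list of reference statements by word overlap (Jaccard similarity). The
function names, the heuristic, and the 0.5 threshold are all assumptions for
illustration. Word overlap only shows that a claim resembles something already
on record; it does not establish that the claim is true.

```python
import re


def tokens(text: str) -> set:
    """Lowercased word and number tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_checkable_claims(text: str) -> list:
    """Rough heuristic (an assumption): sentences with a number are claims."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if re.search(r"\d", s)]


def jaccard(a: set, b: set) -> float:
    """Share of tokens the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def needs_double_check(claim: str, reference_facts: list, threshold: float = 0.5):
    """Return (flag, best_similarity); flag is True when nothing on record
    resembles the claim closely enough."""
    best = max((jaccard(tokens(claim), tokens(f)) for f in reference_facts),
               default=0.0)
    return best < threshold, best
```

A claim that closely matches a reference statement is left alone, while a claim
with no close match is flagged for a human to review.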
3. Predict Reputation
Even before a news item reaches readers, knowing the reputation of the source
sharing it would go a long way toward nipping the fake news problem in the bud.
A reference to the Wall Street Journal, for example, raises little doubt about
the reputation of the source, and the contrast becomes sharper when it is
compared with a source that is unknown. By creating a machine learning
model, it is possible to estimate the authenticity of a website and predict its
reputation, considering features like domain name and Google/Alexa
web rank.
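To make the feature idea concrete, here is a minimal sketch that turns a URL
and a web rank into a few numeric features and passes them through a logistic
function. Everything specific in it is an assumption: the feature set, the
function names, and above all the weights, which are hand-picked placeholders.
A real model would learn its weights from sites with known reputations, and
the web_rank argument stands in for whatever ranking source is available.

```python
import math
from urllib.parse import urlparse


def domain_features(url: str, web_rank: int) -> dict:
    """Derive simple numeric features from a URL and a web rank (1 = top site)."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return {
        "length": len(host),
        "hyphens": host.count("-"),
        "digits": sum(c.isdigit() for c in host),
        "log_rank": math.log10(max(web_rank, 1)),
    }


def reputation_probability(features: dict) -> float:
    """Logistic score in (0, 1); higher suggests a more reputable source.

    The coefficients are illustrative placeholders (an assumption), not
    values fitted to real data.
    """
    z = (4.0
         - 0.05 * features["length"]
         - 0.8 * features["hyphens"]
         - 0.5 * features["digits"]
         - 0.7 * features["log_rank"])
    return 1.0 / (1.0 + math.exp(-z))
```

With these placeholder weights, a short domain with a strong rank scores close
to 1, while a long, hyphenated, digit-heavy domain with a weak rank scores
close to 0.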
4. Discover Sensational Words
When it comes to news items, the headline is the
key to capturing the attention of the audience. It is for this reason that
sensational headlines are a handy tool for capturing readers’ interest. When
sensational words are used to spread fake news, they become a lure to attract
more eyeballs and spread the news faster and wider. By using keyword analytics,
machine learning can be instrumental in discovering and flagging fake news
headlines.
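A minimal keyword-analytics sketch along these lines is shown below. The term
list, the scoring rule, and the threshold are assumptions chosen for the
example; a production system would more likely learn which words and patterns
matter from labeled headlines. A high score only marks a headline as
sensational. It does not show that the story behind it is false.

```python
import re

# Illustrative term list (an assumption), not a vetted lexicon.
SENSATIONAL_TERMS = {
    "shocking", "miracle", "unbelievable", "secret", "weird trick", "magically",
}


def sensational_score(headline: str):
    """Return (score, matched_terms) for a headline.

    The score counts matched terms, exclamation marks, and all-caps words.
    """
    text = headline.lower()
    hits = sorted(
        t for t in SENSATIONAL_TERMS
        if re.search(r"\b" + re.escape(t) + r"\b", text)
    )
    exclamations = headline.count("!")
    caps_words = [w for w in re.findall(r"[A-Za-z]{3,}", headline) if w.isupper()]
    return len(hits) + exclamations + len(caps_words), hits


def flag_headline(headline: str, threshold: int = 2) -> bool:
    """Flag a headline for review when its score reaches the threshold."""
    score, _ = sensational_score(headline)
    return score >= threshold
```

A headline that stacks several trigger words, capital letters and exclamation
marks is flagged, while a plain descriptive headline is not.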
Misinformation can have devastating outcomes, and the most unfortunate fact is
that it tends to spread more quickly and widely than accurate information
because it is more engaging or appealing to viewers. This is because in the
online world, content choices are saturated, and users have a limited attention
span. Misinformation has thus become so prolific that it is now nearly
impossible for humanity to dig itself out of the quagmire unaided. The last
resort is to devise machines to pull us out. Machine learning techniques can
separate the good from the bad through pattern recognition, learning from past
occurrences. Algorithms can be built around these patterns to help separate
the false from the true. Thus, machine learning tools such as those listed
above can be used to fight the spread of misinformation.