Meta Is Building an AI to Fact-Check Wikipedia - All 6.5 Million Articles – Singularity Hub

Most people older than 30 probably remember doing research with good old-fashioned encyclopedias. You'd pull a heavy volume from the shelf, check the index for your topic of interest, then flip to the appropriate page and start reading. It wasn't as easy as typing a few words into the Google search bar, but on the plus side, you knew that the information you found in the pages of the Britannica or the World Book was accurate and true.

Not so with internet research today. The overwhelming multitude of sources is confusing enough, but add the proliferation of misinformation and it's a wonder any of us believe a word we read online.

Wikipedia is a case in point. As of early 2020, the site's English version was averaging about 255 million page views per day, making it the eighth-most-visited website on the internet. As of last month, it had moved up to spot number seven, and the English version currently has over 6.5 million articles.

But as high-traffic as this go-to information source may be, its accuracy leaves something to be desired; the page about the site's own reliability states, "The online encyclopedia does not consider itself to be reliable as a source and discourages readers from using it in academic or research settings."

Meta (formerly Facebook) wants to change this. In a blog post published last month, the company's employees describe how AI could help make Wikipedia more accurate.

Though tens of thousands of people participate in editing the site, the facts they add aren't necessarily correct; even when citations are present, they're not always accurate or even relevant.

Meta is developing a machine learning model that scans these citations and cross-references their content against Wikipedia articles to verify not only that the topics line up, but that the specific figures cited are accurate.

This isn't just a matter of picking out numbers and making sure they match; Meta's AI will need to understand the content of cited sources (though "understand" is a misnomer, as complexity theory researcher Melanie Mitchell would tell you, because AI is still in the "narrow" phase, meaning it's a tool for highly sophisticated pattern recognition, while "understanding" is a word used for human cognition, which is still a very different thing).

Meta's model will "understand" content not by comparing text strings and making sure they contain the same words, but by comparing mathematical representations of blocks of text, which it arrives at using natural language understanding (NLU) techniques.

"What we have done is to build an index of all these web pages by chunking them into passages and providing an accurate representation for each passage," Fabio Petroni, Meta's Fundamental AI Research tech lead manager, told Digital Trends. "That is not representing word-by-word the passage, but the meaning of the passage. That means that two chunks of text with similar meanings will be represented in a very close position in the resulting n-dimensional space where all these passages are stored."
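The index-and-retrieve idea Petroni describes can be sketched in a few lines. The snippet below is a toy illustration, not Meta's system: it swaps the learned NLU embedding for a simple bag-of-words vector, so "closeness" here is just word overlap rather than meaning, but the shape of the logic is the same: every source page is chunked into passages, each passage is stored with its vector, and a claim is checked by finding its nearest passage in that vector space.

```python
import math
from collections import Counter

def embed(passage: str) -> Counter:
    # Toy stand-in for a learned passage embedding: a sparse
    # bag-of-words vector. The real system uses a neural NLU model.
    return Counter(passage.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity: vectors pointing the same way score near 1.
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Build the "index": each passage stored alongside its vector.
passages = [
    "the eiffel tower was completed in 1889 for the world fair",
    "paris is the capital and most populous city of france",
    "the great wall of china is thousands of kilometers long",
]
index = [(p, embed(p)) for p in passages]

def nearest_passage(claim: str) -> str:
    # Embed the claim the same way, then return the closest
    # passage in the vector space.
    q = embed(claim)
    return max(index, key=lambda item: cosine(q, item[1]))[0]

print(nearest_passage("when was the eiffel tower built"))
# prints the Eiffel Tower passage, the closest match by overlap
```

With a real embedding model, "two chunks of text with similar meanings" land close together even when they share no words at all, which is exactly what this word-overlap toy cannot do.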

The AI is being trained on a set of four million Wikipedia citations, and besides picking out faulty citations on the site, its creators would like it to eventually be able to suggest accurate sources to take their place, pulling from a massive index of data that's continuously updated.

One big issue left to work out is a grading system for sources' reliability. A paper from a scientific journal, for example, would receive a higher grade than a blog post. The amount of content online is so vast and varied that you can find sources to support just about any claim, but parsing the misinformation from the disinformation (the former means incorrect, while the latter means deliberately deceptive), and the peer-reviewed from the non-peer-reviewed, the fact-checked from the hastily slapped together, is no small task; it's also a very important one when it comes to trust.
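To make the grading idea concrete, here is a minimal sketch of what ranking candidate citations by source reliability might look like. The tiers, weights, and examples are purely illustrative assumptions, not anything Meta has published; the real problem, as the article notes, is far harder than a lookup table.

```python
# Hypothetical reliability tiers; the categories and weights
# are illustrative assumptions, not Meta's actual scheme.
RELIABILITY_TIERS = {
    "peer_reviewed_journal": 1.0,
    "government_report": 0.8,
    "major_news_outlet": 0.7,
    "personal_blog": 0.2,
}

def grade_source(source_type: str) -> float:
    # Unknown source types default to the lowest grade rather
    # than being trusted by accident.
    return RELIABILITY_TIERS.get(source_type, 0.1)

def rank_candidates(candidates):
    # Sort candidate citations so the most reliable source
    # would be suggested first as a replacement.
    return sorted(candidates, key=lambda c: grade_source(c[1]), reverse=True)

candidates = [
    ("someblog.example/post", "personal_blog"),
    ("journal.example/article", "peer_reviewed_journal"),
]
print(rank_candidates(candidates)[0][0])
# prints the journal article first
```

In practice the hard part is the part this sketch skips entirely: deciding which tier a given URL actually belongs to.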

Meta has open-sourced its model, and those who are curious can see a demo of the verification tool. Meta's blog post noted that the company isn't partnering with Wikimedia on this project, and that it's still in the research phase and not currently being used to update content on Wikipedia.

If you imagine a not-too-distant future where everything you read on Wikipedia is accurate and reliable, wouldn't that make doing any sort of research a bit too easy? There's something valuable about checking and comparing various sources ourselves, is there not? It was a big leap to go from paging through heavy books to typing a few words into a search engine and hitting Enter; do we really want Wikipedia to move from a research jumping-off point to a gets-the-last-word source?

In any case, Meta's AI research team will continue working toward a tool to improve the online encyclopedia. "I think we were driven by curiosity at the end of the day," Petroni said. "We wanted to see what was the limit of this technology. We were absolutely not sure if [this AI] could do anything meaningful in this context. No one had ever tried to do something similar."

Image Credit: Gerd Altmann from Pixabay

