This could lead to the next big breakthrough in common sense AI – MIT Technology Review
AI models that can parse both language and visual input also have very practical uses. If we want to build robotic assistants, for example, they need computer vision to navigate the world and language to communicate about it to humans.
But combining both types of AI is easier said than done. It isn't as simple as stapling together an existing language model with an existing object recognition system. It requires training a new model from scratch with a data set that includes text and images, otherwise known as a visual-language data set.
The most common approach for curating such a data set is to compile a collection of images with descriptive captions. A picture like the one below, for example, would be captioned "An orange cat sits in the suitcase ready to be packed." This differs from typical image data sets, which would label the same picture with only one noun, like "cat." A visual-language data set can therefore teach an AI model not just how to recognize objects but how they relate to and act on one another, using verbs and prepositions.
But you can see why this data curation process would take forever. This is why the visual-language data sets that exist are so puny. A popular text-only data set like English Wikipedia (which indeed includes nearly all the English-language Wikipedia entries) might contain nearly 3 billion words. A visual-language data set like Microsoft Common Objects in Context, or MS COCO, contains only 7 million. It's simply not enough data to train an AI model for anything useful.
"Vokenization" gets around this problem, using unsupervised learning methods to scale the tiny amount of data in MS COCO to the size of English Wikipedia. The resultant visual-language model outperforms state-of-the-art models in some of the hardest tests used to evaluate AI language comprehension today.
"You don't beat state of the art on these tests by just trying a little bit," says Thomas Wolf, the cofounder and chief science officer of the natural-language processing startup Hugging Face, who was not part of the research. "This is not a toy test. This is why this is super exciting."
Let's first sort out some terminology. What on earth is a "voken"?
In AI speak, the words that are used to train language models are known as tokens. So the UNC researchers decided to call the image associated with each token in their visual-language model a voken. "Vokenizer" is what they call the algorithm that finds vokens for each token, and "vokenization" is what they call the whole process.
The point of this isn't just to show how much AI researchers love making up words. (They really do.) It also helps break down the basic idea behind vokenization. Instead of starting with an image data set and manually writing sentences to serve as captions (a very slow process), the UNC researchers started with a language data set and used unsupervised learning to match each word with a relevant image (more on this later). This is a highly scalable process.
The unsupervised learning technique, here, is ultimately the contribution of the paper. How do you actually find a relevant image for each word?
Let's go back for a moment to GPT-3. GPT-3 is part of a family of language models known as transformers, which represented a major breakthrough in applying unsupervised learning to natural-language processing when the first one was introduced in 2017. Transformers learn the patterns of human language by observing how words are used in context and then creating a mathematical representation of each word, known as a "word embedding," based on that context. The embedding for the word "cat" might show, for example, that it is frequently used around the words "meow" and "orange" but less often around the words "bark" or "blue."
This is how transformers approximate the meanings of words, and how GPT-3 can write such human-like sentences. It relies in part on these embeddings to tell it how to assemble words into sentences, and sentences into paragraphs.
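To make the idea of a context-based embedding concrete, here is a minimal sketch. It is not how a transformer works: it swaps in plain co-occurrence counting, a far simpler technique, purely to show how "the words used around a word" can become a vector. The corpus, the function names, and the window size are all assumptions made up for this illustration.

```python
import math
from collections import Counter


def cooccurrence_embeddings(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it.

    This is a toy, count-based stand-in for a learned embedding.
    """
    vectors = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            # Neighbors to the left and right of position i, excluding the word itself.
            context = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical three-sentence corpus, invented for this example.
corpus = [
    "the orange cat says meow",
    "the orange kitten says meow",
    "the blue dog says bark",
]

vecs = cooccurrence_embeddings(corpus)
cat_vs_kitten = cosine(vecs["cat"], vecs["kitten"])
cat_vs_dog = cosine(vecs["cat"], vecs["dog"])
print("cat vs kitten:", cat_vs_kitten)
print("cat vs dog:   ", cat_vs_dog)
```

In this toy corpus, "cat" and "kitten" share all of their neighbors, while "cat" and "dog" share only some, so the first pair scores higher. Real models learn dense vectors from billions of words rather than counting neighbors, but the intuition that context shapes the representation is the same.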
There's a parallel technique that can also be used for images. Instead of scanning text for word usage patterns, it scans images for visual patterns. It tabulates how often a cat, say, appears on a bed versus on a tree, and creates a "cat" embedding with this contextual information.
The insight of the UNC researchers was that they should use both embedding techniques on MS COCO. They converted the images into visual embeddings and the captions into word embeddings. What's really neat about these embeddings is that they can then be graphed in a three-dimensional space, and you can literally see how they are related to one another. Visual embeddings that are closely related to word embeddings will appear closer in the graph. In other words, the visual cat embedding should (in theory) overlap with the text-based cat embedding. Pretty cool.
You can see where this is going. Once the embeddings are all graphed and compared and related to one another, it's easy to start matching images (vokens) with words (tokens). And remember, because the images and words are matched based on their embeddings, they're also matched based on context. This is useful when one word can have totally different meanings. The technique successfully handles that by finding different vokens for each instance of the word.
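The matching step can be pictured as a nearest-neighbor lookup in the shared space. The sketch below is only an illustration of that idea, not the researchers' vokenizer: the three-dimensional vectors are hand-picked rather than learned, and the image file names, the word "bank," and the function `nearest_voken` are all hypothetical choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical image embeddings in a shared 3-D space (hand-picked, not learned).
image_embeddings = {
    "river_bank.jpg": (0.9, 0.1, 0.0),
    "bank_building.jpg": (0.1, 0.9, 0.0),
    "orange_cat.jpg": (0.0, 0.1, 0.9),
}


def nearest_voken(token_embedding, images):
    """Return the name of the image whose embedding is most similar to the token's."""
    return max(images, key=lambda name: cosine(token_embedding, images[name]))


# Hypothetical contextual embeddings: the same word gets a different
# vector depending on the sentence it appears in.
contextual_tokens = [
    ("we fished from the bank", (0.8, 0.2, 0.1)),
    ("she deposited cash at the bank", (0.2, 0.8, 0.1)),
]

matches = {
    sentence: nearest_voken(vector, image_embeddings)
    for sentence, vector in contextual_tokens
}
for sentence, image in matches.items():
    print(f"'bank' in \"{sentence}\" -> {image}")
```

Because each occurrence of the word carries its own context-dependent vector, the two uses of "bank" land near different images, which is the behavior the paragraph above describes for words with more than one meaning.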