This could lead to the next big breakthrough in common sense AI – MIT Technology Review
AI models that can parse both language and visual input also have very practical uses. If we want to build robotic assistants, for example, they need computer vision to navigate the world and language to communicate about it to humans.
But combining both types of AI is easier said than done. It isn't as simple as stapling together an existing language model with an existing object recognition system. It requires training a new model from scratch with a data set that includes text and images, otherwise known as a visual-language data set.
The most common approach for curating such a data set is to compile a collection of images with descriptive captions. A picture like the one below, for example, would be captioned "An orange cat sits in the suitcase ready to be packed." This differs from typical image data sets, which would label the same picture with only one noun, like "cat." A visual-language data set can therefore teach an AI model not just how to recognize objects but how they relate to and act on one another, using verbs and prepositions.
But you can see why this data curation process would take forever. This is why the visual-language data sets that exist are so puny. A popular text-only data set like English Wikipedia (which indeed includes nearly all the English-language Wikipedia entries) might contain nearly 3 billion words. A visual-language data set like Microsoft Common Objects in Context, or MS COCO, contains only 7 million. It's simply not enough data to train an AI model for anything useful.
Vokenization gets around this problem, using unsupervised learning methods to scale the tiny amount of data in MS COCO to the size of English Wikipedia. The resultant visual-language model outperforms state-of-the-art models in some of the hardest tests used to evaluate AI language comprehension today.
"You don't beat state of the art on these tests by just trying a little bit," says Thomas Wolf, the cofounder and chief science officer of the natural-language processing startup Hugging Face, who was not part of the research. "This is not a toy test. This is why this is super exciting."
Let's first sort out some terminology. What on earth is a "voken"?
In AI speak, the words that are used to train language models are known as tokens. So the UNC researchers decided to call the image associated with each token in their visual-language model a voken. Vokenizer is what they call the algorithm that finds vokens for each token, and vokenization is what they call the whole process.
The point of this isn't just to show how much AI researchers love making up words. (They really do.) It also helps break down the basic idea behind vokenization. Instead of starting with an image data set and manually writing sentences to serve as captions (a very slow process), the UNC researchers started with a language data set and used unsupervised learning to match each word with a relevant image (more on this later). This is a highly scalable process.
The unsupervised learning technique, here, is ultimately the contribution of the paper. How do you actually find a relevant image for each word?
Let's go back for a moment to GPT-3. GPT-3 is part of a family of language models known as transformers, which represented a major breakthrough in applying unsupervised learning to natural-language processing when the first one was introduced in 2017. Transformers learn the patterns of human language by observing how words are used in context and then creating a mathematical representation of each word, known as a "word embedding," based on that context. The embedding for the word "cat" might show, for example, that it is frequently used around the words "meow" and "orange" but less often around the words "bark" or "blue."
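To make the idea of "how words are used in context" concrete, here is a minimal sketch that simply counts which words appear near each other. This is a deliberately simplified stand-in: transformers learn their embeddings through training rather than by counting, and the toy corpus, the `cooccurrence_vectors` helper, and the window size are all invented for illustration.

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=3):
    """For every word, count how often each other word appears
    within `window` positions of it (a crude 'context' vector)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[word][words[j]] += 1
    return counts

# Toy corpus, made up for this example (not from MS COCO or Wikipedia).
corpus = [
    "the orange cat says meow",
    "the cat likes to meow",
    "the dog likes to bark",
    "the blue bird sings",
]

vectors = cooccurrence_vectors(corpus)
cat = vectors["cat"]

# In this toy corpus "cat" shows up near "meow" and "orange",
# and never near "bark" or "blue".
print(cat["meow"], cat["orange"], cat["bark"], cat["blue"])
```

Even in this tiny example, the counts for "cat" lean toward "meow" and "orange" and away from "bark" and "blue," which is the kind of contextual signal an embedding is meant to capture.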
This is how transformers approximate the meanings of words, and how GPT-3 can write such human-like sentences. It relies in part on these embeddings to tell it how to assemble words into sentences, and sentences into paragraphs.
There's a parallel technique that can also be used for images. Instead of scanning text for word usage patterns, it scans images for visual patterns. It tabulates how often a cat, say, appears on a bed versus on a tree, and creates a "cat" embedding with this contextual information.
The insight of the UNC researchers was that they should use both embedding techniques on MS COCO. They converted the images into visual embeddings and the captions into word embeddings. What's really neat about these embeddings is that they can then be graphed in a three-dimensional space, and you can literally see how they are related to one another. Visual embeddings that are closely related to word embeddings will appear closer in the graph. In other words, the visual cat embedding should (in theory) overlap with the text-based cat embedding. Pretty cool.
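One common way to measure how "close" two embeddings are is cosine similarity. The sketch below assumes that measure; the article does not say how the UNC researchers actually score closeness. The three-dimensional vectors, the file names, and the `nearest_image` helper are all made up for illustration, and real embeddings typically have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_image(word_vec, image_vecs):
    """Return the name of the image whose embedding is closest to word_vec."""
    return max(
        image_vecs,
        key=lambda name: cosine_similarity(word_vec, image_vecs[name]),
    )

# Hypothetical embeddings, hand-picked so related items point the same way.
word_embeddings = {
    "cat": [0.9, 0.1, 0.2],
    "suitcase": [0.1, 0.8, 0.3],
}
image_embeddings = {
    "cat_photo.jpg": [0.8, 0.2, 0.1],
    "suitcase_photo.jpg": [0.2, 0.9, 0.2],
    "tree_photo.jpg": [0.1, 0.2, 0.9],
}

matches = {
    word: nearest_image(vec, image_embeddings)
    for word, vec in word_embeddings.items()
}
for word, image in matches.items():
    print(word, "->", image)
```

With these hand-picked numbers, the word "cat" lands nearest the cat photo and "suitcase" nearest the suitcase photo, which is the overlap the paragraph above describes.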
You can see where this is going. Once the embeddings are all graphed and compared and related to one another, it's easy to start matching images (vokens) with words (tokens). And remember, because the images and words are matched based on their embeddings, they're also matched based on context. This is useful when one word can have totally different meanings. The technique successfully handles that by finding different vokens for each instance of the word.
For example: