HuggingGPT: The Secret Weapon to Solve Complex AI Tasks – KDnuggets
Have you heard of the term Artificial General Intelligence (AGI)? If not, let me clarify. AGI can be thought of as an AI system that can understand, process, and respond the intellectual tasks just like humans do. It's a challenging task that requires an in-depth understanding of how the human brain works so we can replicate it. However, the advent of ChatGPT has drawn immense interest from the research community to develop such systems. Microsoft has released one such key AI-powered system called HuggingGPT (Microsoft Jarvis). It is one of the most mind-blowing things that I have come across.
Before I dive into the details of what is new in HuggingGPT and how it works, let us first understand the issue with ChatGPT and why it struggles to solve complex AI tasks. Large Language models like ChatGPT excel at interpreting textual data and handling general tasks. However, they often struggle with specific tasks and may generate absurd responses. You might have encountered bogus replies from ChatGPT while solving complex mathematical problems. On the other side, we have expert AI models like Stable Diffusion, and DALL-E that have a deeper understanding of their subject area but struggle with the broader tasks. We cannot fully harness the potential of LLMs to solve challenging AI tasks unless we develop a connection between them and the Specialized AI models. This is what HuggingGPT did. It combined the strengths of both to create more efficient, accurate, and versatile AI systems.
According to a recent paper published by Microsoft, HuggingGPT leverages the power of LLMs by using it as a controller to connect them to various AI models in Machine Learning communities (HuggingFace). Rather than training the ChatGPT for various tasks, we enable it to use external tools for greater efficiency. HuggingFace is a website that provides numerous tools and resources for developers and researchers. It also has a wide variety of specialized and high-accuracy models. HuggingGPT uses these models for sophisticated AI tasks in different domains and modalities thereby achieving impressive results. It has similar multimodal capabilities to OPenAI GPT-4 when it comes to text and images. But, it also connected you to the Internet and you can provide an external web link to ask questions about it.
Suppose you want the model to generate an audio reading of the text written on an image. HuggingGPT will perform this task serially using the best-suited models. Firstly, it will generate the image from text and use its result for audio generation. You can check the response details in the image below. Simply Amazing!
HuggingGPT is a collaborative system that uses LLMs as an interface to send user requests to expert models. The complete process starting from the user prompt to the model till receiving the response can be broken down into the following discrete steps:
In this stage, HuggingGPT makes use of ChatGPT to understand the user prompt and then breaks down the query into small actionable tasks. It also determines the dependencies of these tasks and defines their execution sequence. HuggingGPT has four slots for task parsing i.e. task type, task ID, task dependencies, and task arguments. Chat logs between the HuggingGPT and the user are recorded and displayed on the screen that shows the history of the resources.
Based on the user context and the available models, HuggingGPT uses an in-context task-model assignment mechanism to select the most appropriate model for a particular task. According to this mechanism, the selection of a model is considered a single-choice problem and it initially filters out the model based on the type of the task. After that, the models are ranked based on the number of downloads as it is considered a reliable measure that reflects the quality of the model. Top-K models are selected based on this ranking. Here K is just a constant that reflects the number of models, for example, if it is set to 3 then it will select 3 models with the highest number of downloads.
Here the task is assigned to a specific model, it performs the inference on it and returns the result. To enhance the efficiency of this process, HuggingGPT can run different models at the same time as long as they dont need the same resources. For example, if I give a prompt to generate pictures of cats and dogs then separate models can run in parallel to execute this task. However, sometimes models may need the same resources which is why HuggingGPT maintains an
The final step involves generating the response to the user. Firstly, it integrates all the information from the previous stages and the inference results. The information is presented in a structured format. For example, if the prompt was to detect the number of lions in an image, it will draw the appropriate bounding boxes with detection probabilities. The LLM (ChatGPT) then uses this format and presents it in human-friendly language.
HuggingGPT is built on top of Hugging Face's state-of-the-art GPT-3.5 architecture, which is a deep neural network model that can generate natural language text. Here is how you can set it up on your local computer:
The default configuration requires Ubuntu 16.04 LTS, VRAM of at least 24GB, RAM of at least 12GB (minimal), 16GB (standard), or 80GB (full), and disk space of at least 284 GB. Additionally, you'll need 42GB of space for damo-vilab/text-to-video-ms-1.7b, 126GB for ControlNet, 66GB for stable-diffusion-v1-5, and 50GB for other resources. For the "lite" configuration, you'll only need Ubuntu 16.04 LTS.
First, replace the OpenAI Key and the Hugging Face Token in the server/configs/config.default.yaml file with your keys. Alternatively, you can put them in the environment variables OPENAI_API_KEY and HUGGINGFACE_ACCESS_TOKEN, respectively
Run the following commands:
For Server:
Now you can access Jarvis' services by sending HTTP requests to the Web API endpoints. Send a request to :
The requests should be in JSON format and should include a list of messages that represent the user's inputs.
For Web:
For CLI:
Setting up Jarvis using CLI is quite simple. Just run the command mentioned below:
For Gradio:
Gradio demo is also being hosted on Hugging Face Space. You can experiment with it after entering the OPENAI_API_KEY and HUGGINGFACE_ACCESS_TOKEN.
To run it locally:
Note: In case of any issue please refer to the official Github Repo.
HuggingGPT also has certain limitations that I want to highlight here. For instance, the efficiency of the system is a major bottleneck and during all the stages mentioned earlier, HuggingGPT requires multiple interactions with LLMs. These interactions can lead to degraded user experience and increased latency. Similarly, the maximum context length is also limited by the number of allowed tokens. Another problem is the System's reliability, as the LLMs may misinterpret the prompt and generate a wrong sequence of tasks which in turn affects the whole process. Nonetheless, it has significant potential to solve complex AI tasks and is an excellent advancement toward AGI. Let's see in which direction this research leads us too. Thats a wrap, feel free to express your views in the comment section below.Kanwal Mehreen is an aspiring software developer with a keen interest in data science and applications of AI in medicine. Kanwal was selected as the Google Generation Scholar 2022 for the APAC region. Kanwal loves to share technical knowledge by writing articles on trending topics, and is passionate about improving the representation of women in tech industry.
Continued here:
HuggingGPT: The Secret Weapon to Solve Complex AI Tasks - KDnuggets
- How Do You Get to Artificial General Intelligence? Think Lighter - WIRED - November 28th, 2024 [November 28th, 2024]
- How much time do we have before Artificial General Intelligence (AGI) to turns into Artificial Self-preserving - The Times of India - November 5th, 2024 [November 5th, 2024]
- Simuli to Leap Forward in the Trek to Artificial General Intelligence through 2027 Hyperdimensional AI Ecosystem - USA TODAY - November 5th, 2024 [November 5th, 2024]
- Implications of Artificial General Intelligence on National and International Security - Yoshua Bengio - - October 31st, 2024 [October 31st, 2024]
- James Cameron says the reality of artificial general intelligence is 'scarier' than the fiction of it - Business Insider - October 31st, 2024 [October 31st, 2024]
- James Cameron says the reality of artificial general intelligence is 'scarier' than the fiction of it - MSN - October 31st, 2024 [October 31st, 2024]
- Bot fresh hell is this?: Inside the rise of Artificial General Intelligence or AGI - MSN - October 31st, 2024 [October 31st, 2024]
- Artificial General Intelligence (AGI) Market to Reach $26.9 Billion by 2031 As Revealed In New Report - WhaTech - September 26th, 2024 [September 26th, 2024]
- 19 jobs artificial general intelligence (AGI) may replace and 10 jobs it could create - MSN - September 26th, 2024 [September 26th, 2024]
- Paige Appoints New Leadership to Further Drive Innovation, Bring Artificial General Intelligence to Pathology, and Expand Access to AI Applications -... - August 16th, 2024 [August 16th, 2024]
- Artificial General Intelligence, If Attained, Will Be the Greatest Invention of All Time - JD Supra - August 11th, 2024 [August 11th, 2024]
- OpenAI Touts New AI Safety Research. Critics Say Its a Good Step, but Not Enough - WIRED - July 22nd, 2024 [July 22nd, 2024]
- OpenAIs Project Strawberry Said to Be Building AI That Reasons and Does Deep Research - Singularity Hub - July 22nd, 2024 [July 22nd, 2024]
- One of the Best Ways to Invest in AI Is Dont - InvestorPlace - July 22nd, 2024 [July 22nd, 2024]
- OpenAI is plagued by safety concerns - The Verge - July 17th, 2024 [July 17th, 2024]
- OpenAI reportedly nears breakthrough with reasoning AI, reveals progress framework - Ars Technica - July 17th, 2024 [July 17th, 2024]
- ChatGPT maker OpenAI now has a scale to rank its AI - ReadWrite - July 17th, 2024 [July 17th, 2024]
- Heres how OpenAI will determine how powerful its AI systems are - The Verge - July 17th, 2024 [July 17th, 2024]
- OpenAI may be working on AI that can perform research without human help which should go fine - TechRadar - July 17th, 2024 [July 17th, 2024]
- OpenAI has a new scale for measuring how smart their AI models are becoming which is not as comforting as it should be - TechRadar - July 17th, 2024 [July 17th, 2024]
- OpenAI says there are 5 'levels' for AI to reach human intelligence it's already almost at level 2 - Quartz - July 17th, 2024 [July 17th, 2024]
- AIs Bizarro World, were marching towards AGI while carbon emissions soar - Fortune - July 17th, 2024 [July 17th, 2024]
- AI News Today July 15, 2024 - The Dales Report - July 17th, 2024 [July 17th, 2024]
- The Evolution Of Artificial Intelligence: From Basic AI To ASI - Welcome2TheBronx - July 17th, 2024 [July 17th, 2024]
- What Elon Musk and Ilya Sutskever Feared About OpenAI Is Becoming Reality - Observer - July 17th, 2024 [July 17th, 2024]
- Companies are losing faith in AI, and AI is losing money - Android Headlines - July 17th, 2024 [July 17th, 2024]
- AGI isn't here (yet): How to make informed, strategic decisions in the meantime - VentureBeat - June 16th, 2024 [June 16th, 2024]
- Apple's AI Privacy Measures, Elon Musk's Robot Prediction, And More: This Week In Artificial Intelligence - Alphabet ... - Benzinga - June 16th, 2024 [June 16th, 2024]
- AGI and jumping to the New Inference Market S-Curve - CMSWire - June 16th, 2024 [June 16th, 2024]
- Apple's big AI announcements were all about AI 'for the rest of us'Google, Meta, Amazon and, yes, OpenAI should ... - Fortune - June 16th, 2024 [June 16th, 2024]
- Elon Musk Withdraws His Lawsuit Against OpenAI and Sam Altman - The New York Times - June 16th, 2024 [June 16th, 2024]
- Staying Ahead of the AI Train - ATD - June 16th, 2024 [June 16th, 2024]
- OpenAI disbands its AI risk mitigation team - - May 20th, 2024 [May 20th, 2024]
- BEYOND LOCAL: 'Noise' in the machine: Human differences in judgment lead to problems for AI - The Longmont Leader - May 20th, 2024 [May 20th, 2024]
- Machine Learning Researcher Links OpenAI to Drug-Fueled Sex Parties - Futurism - May 20th, 2024 [May 20th, 2024]
- What Is AI? How Artificial Intelligence Works (2024) - Shopify - May 20th, 2024 [May 20th, 2024]
- Vitalik Buterin says OpenAI's GPT-4 has passed the Turing test - Cointelegraph - May 20th, 2024 [May 20th, 2024]
- "I lost trust": Why the OpenAI team in charge of safeguarding humanity imploded - Vox.com - May 18th, 2024 [May 18th, 2024]
- 63% of surveyed Americans want government legislation to prevent super intelligent AI from ever being achieved - PC Gamer - May 18th, 2024 [May 18th, 2024]
- Top OpenAI researcher resigns, saying company prioritized 'shiny products' over AI safety - Fortune - May 18th, 2024 [May 18th, 2024]
- The revolution in artificial intelligence and artificial general intelligence - Washington Times - May 18th, 2024 [May 18th, 2024]
- OpenAI disbands team devoted to artificial intelligence risks - Yahoo! Voices - May 18th, 2024 [May 18th, 2024]
- OpenAI disbands safety team focused on risk of artificial intelligence causing 'human extinction' - New York Post - May 18th, 2024 [May 18th, 2024]
- OpenAI disbands team devoted to artificial intelligence risks - Port Lavaca Wave - May 18th, 2024 [May 18th, 2024]
- OpenAI disbands team devoted to artificial intelligence risks - Moore County News Press - May 18th, 2024 [May 18th, 2024]
- Generative AI Is Totally Shameless. I Want to Be It - WIRED - May 18th, 2024 [May 18th, 2024]
- OpenAI researcher resigns, claiming safety has taken a backseat to shiny products - The Verge - May 18th, 2024 [May 18th, 2024]
- Most of Surveyed Americans Do Not Want Super Intelligent AI - 80.lv - May 18th, 2024 [May 18th, 2024]
- A former OpenAI leader says safety has 'taken a backseat to shiny products' at the AI company - Winnipeg Free Press - May 18th, 2024 [May 18th, 2024]
- DeepMind CEO says Google to spend more than $100B on AGI despite hype - Cointelegraph - April 20th, 2024 [April 20th, 2024]
- Congressional panel outlines five guardrails for AI use in House - FedScoop - April 20th, 2024 [April 20th, 2024]
- The Potential and Perils of Advanced Artificial General Intelligence - elblog.pl - April 20th, 2024 [April 20th, 2024]
- DeepMind Head: Google AI Spending Could Exceed $100 Billion - PYMNTS.com - April 20th, 2024 [April 20th, 2024]
- Say hi to Tong Tong, world's first AGI child-image figure - ecns - April 20th, 2024 [April 20th, 2024]
- Silicon Scholars: AI and The Muslim Ummah - IslamiCity - April 20th, 2024 [April 20th, 2024]
- AI stocks aren't like the dot-com bubble. Here's why - Quartz - April 20th, 2024 [April 20th, 2024]
- AI vs. AGI: The Race for Performance, Battling the Cost? for NASDAQ:GOOG by Moshkelgosha - TradingView - April 20th, 2024 [April 20th, 2024]
- We've Been Here Before: AI Promised Humanlike Machines In 1958 - The Good Men Project - April 20th, 2024 [April 20th, 2024]
- Google will spend more than $100 billion on AI, exec says - Quartz - April 20th, 2024 [April 20th, 2024]
- Tech companies want to build artificial general intelligence. But who decides when AGI is attained? - The Bakersfield Californian - April 8th, 2024 [April 8th, 2024]
- Tech companies want to build artificial general intelligence. But who decides when AGI is attained? - The Caledonian-Record - April 8th, 2024 [April 8th, 2024]
- What is AGI and how is it different from AI? - ReadWrite - April 8th, 2024 [April 8th, 2024]
- Artificial intelligence in healthcare: defining the most common terms - HealthITAnalytics.com - April 8th, 2024 [April 8th, 2024]
- We're Focusing on the Wrong Kind of AI Apocalypse - TIME - April 8th, 2024 [April 8th, 2024]
- Xi Jinping's vision in supporting the artificial intelligence at home and abroad - Modern Diplomacy - April 8th, 2024 [April 8th, 2024]
- As 'The Matrix' turns 25, the chilling artificial intelligence (AI) projection at its core isn't as outlandish as it once seemed - TechRadar - April 8th, 2024 [April 8th, 2024]
- AI & robotics briefing: Why superintelligent AI won't sneak up on us - Nature.com - January 10th, 2024 [January 10th, 2024]
- Get Ready for the Great AI Disappointment - WIRED - January 10th, 2024 [January 10th, 2024]
- Part 3 Capitalism in the Age of Artificial General Intelligence (AGI) - Medium - January 10th, 2024 [January 10th, 2024]
- Artificial General Intelligence (AGI): what it is and why its discovery can change the world - Medium - January 10th, 2024 [January 10th, 2024]
- Exploring the Path to Artificial General Intelligence - Medriva - January 10th, 2024 [January 10th, 2024]
- The Acceleration Towards Artificial General Intelligence (AGI) and Its Implications - Medriva - January 10th, 2024 [January 10th, 2024]
- OpenAI Warns: "AGI Is Coming" - Do we have a reason to worry? - Medium - January 10th, 2024 [January 10th, 2024]
- The fight over ethics intensifies as artificial intelligence quickly changes the world - 9 & 10 News - January 10th, 2024 [January 10th, 2024]
- AI as the Third Window into Humanity: Understanding Human Behavior and Emotions - Medriva - January 10th, 2024 [January 10th, 2024]
- Artificial General Intelligence (AGI) in Radiation Oncology: Transformative Technology - Medriva - January 10th, 2024 [January 10th, 2024]
- Exploring the Potential of AGI: Opportunities and Challenges - Medium - January 10th, 2024 [January 10th, 2024]
- Full-Spectrum Cognitive Development Incorporating AI for Evolution and Collective Intelligence - Medriva - January 10th, 2024 [January 10th, 2024]
- Artificial Superintelligence - Understanding a Future Tech that Will Change the World! - MobileAppDaily - January 10th, 2024 [January 10th, 2024]
- Title: AI Unveiled: Exploring the Realm of Artificial Intelligence - Medium - January 10th, 2024 [January 10th, 2024]