AGI and jumping to the New Inference Market S-Curve – CMSWire
The Gist
Artificial general intelligence (AGI) has been the Holy Grail of AI for many decades. AGI is a form of strong AI, defined as AI that can perform as well as or better than humans on a wide range of cognitive tasks. There is much debate over when artificial general intelligence may be fully realized, especially given the current evolution of large language models (LLMs). For many people, AGI is something out of a science fiction movie that remains mostly theoretical. Others believe we have already reached AGI with the latest releases of ChatGPT (GPT-4o) and Gemini Advanced.
Historically, the Turing test has served as the yardstick for determining whether a system has reached artificial general intelligence. Proposed by Alan Turing in 1950 and originally called the Imitation Game, the test involves three participants: an interrogator who asks questions of both the machine and the human, the machine or system under test, and the human who answers the questions alongside the machine for comparison.
The criticism of the test is that it doesn't measure intelligence or any other human qualities. The foundational assumption that an interrogator can determine whether a machine is thinking by comparing its behavior with human behavior is highly subjective and does not necessarily yield a definitive answer.
There is also a lack of consensus on whether modern LLMs have actually achieved AGI. In June 2022, reports claimed that Google's LaMDA had passed the test, but critics quickly dismissed this as an advance in fooling people into believing a system is intelligent rather than an advance toward AGI. The reality is that the test has outlived its usefulness.
Ray Kurzweil, a technology futurist, has spent much of his career making predictions on when we will reach AGI. In his recent talk at SXSW, he said he is sticking to his original 1999 prediction that AI will match or surpass human intelligence by 2029.
But how will we know?
Horizontal AI products like ChatGPT, Gemini, Midjourney and DALL-E have given millions of users exposure to the power of AI. To many, these AI platforms seem very smart, as they can generate answers, compose songs and write code in seconds.
However, there is a big difference between AI and AGI. These current AI platforms are essentially highly efficient prediction machines because they have been trained on a large corpus of data. However, that does not enable creativity, logical reasoning and sensory perception.
As we move closer to artificial general intelligence, we need an accepted definition of AGI and a framework that truly measures these critical aspects of intelligence such as reasoning, creativity and sentience.
One approach is to consider artificial general intelligence as an end-to-end intelligence supply chain encompassing all the capabilities needed to achieve AGI.
We can group the critical components needed for AGI into four major categories as follows:
Today's AI systems mostly excel at 1 and 2. For artificial general intelligence to be attained, we will need systems that can accomplish 3 and 4.
Achieving AGI will require advances in algorithms, computing and data beyond what powers the models of today. Mimicking complex human behavior such as creativity, perception, learning and memory will require embodied cognition, or learning from a multitude of senses or inputs. We also need systems and infrastructure that go beyond training.
Human intelligence is heavily based on logical reasoning. We understand cause and effect, deduce information from existing knowledge and make inferences. Reasoning algorithms let a system traverse knowledge representations, drawing conclusions and finding solutions. This goes beyond basic pattern matching, enabling a more humanlike problem-solving ability. Replicating similar processes is fundamental for an AI to achieve AGI.
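To make the idea of traversing a knowledge representation concrete, here is a minimal sketch of forward chaining, one classic reasoning technique. The rules and fact names are invented for illustration and do not describe any particular AI system.

```python
def forward_chain(facts, rules):
    """Derive every fact reachable from `facts` using if-then `rules`.

    Each rule is a (premises, conclusion) pair: once all premises are
    known, the conclusion is added to the set of known facts.
    """
    known = set(facts)
    changed = True
    while changed:  # keep sweeping until a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy knowledge base, invented for this example.
RULES = [
    (("it_rained",), "ground_is_wet"),
    (("ground_is_wet", "temperature_below_zero"), "ground_is_icy"),
    (("ground_is_icy",), "driving_is_risky"),
]

derived = forward_chain({"it_rained", "temperature_below_zero"}, RULES)
print(sorted(derived))
```

Starting from two observations, the loop chains cause and effect until it concludes that driving is risky, a conclusion that appears in neither the input facts nor any single rule.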
The timing of artificial general intelligence remains uncertain, but when it does arrive, it's going to impact our lives, businesses and society significantly.
The real power of AI technology is still ahead of us.
One of the prerequisites for achieving artificial general intelligence is the capability for AI inference, which is when an AI model produces accurate predictions or conclusions. Much of the computing power today is focused on model training. Model training is the stage when data is fed into a learning algorithm to produce a model. Training enables AI models to make accurate predictions when prompted.
AI can be divided into two major market segments: training and inference. Today, many companies are focused on creating high-performance hardware for data center providers to conduct massive AI model training. For instance, Nvidia controls more than 95% of the specialized AI chip market. It sells to major tech companies like Amazon, Meta and Microsoft, which are believed to make up roughly 40% of its revenue.
However, the market will soon shift its focus to building inferencing infrastructure for generative AI applications. The inferencing market will quickly grow as Fortune 500 companies that are currently testing generative AI applications move into production deployment. New applications will also emerge that will require scale to support workloads across centralized cloud, edge computing and IoT (Internet of Things) devices.
Model training is a very computationally intensive process that takes a lot of time to complete. Inference is usually faster and much less resource-intensive. Inferencing boils down to running AI applications or workloads after models have been trained.
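The difference between the two stages can be shown with a toy model. The sketch below is an illustration only, with made-up data and a one-variable linear model rather than anything resembling an LLM: training loops over the data thousands of times to find the weights, while inference is a single multiply-and-add with those weights held fixed.

```python
def train(xs, ys, steps=5000, lr=0.01):
    """Training: repeatedly nudge the weights to reduce prediction error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(model, x):
    """Inference: apply the already-trained weights to a new input."""
    w, b = model
    return w * x + b


# Hypothetical training data following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

model = train(xs, ys)       # expensive: 5,000 passes over the data
print(infer(model, 10.0))   # cheap: one pass through the model
```

Training pays its cost once, whereas inference pays a small cost on every request, which is why total inference spending grows with usage.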
Inference is going to be 100 times bigger than training. Nvidia is really good at training but is not ideal for inference.
A pivot from training to inference may not be easy.
Nvidia was founded in 1993, long before the AI craze we see today. It was not initially focused on supplying AI hardware and software solutions and instead focused on creating graphics cards. As the PC market expanded and new applications such as Windows and gaming became prevalent, it became necessary to have dedicated hardware to handle the complicated tasks of 3D graphics processing. The opportunity to create high-performance processing units to support intensive computational operations in the PC and gaming market was not something that comes along very often.
It turns out Nvidia struck gold with its GPU architectures. GPUs are well suited for AI for three primary reasons: they employ parallel processing; the systems scale up through high-performance interconnections, creating supercomputing capabilities; and the software for managing and tuning the stack for AI is broad and deep.
The idea of having separate graphics hardware existed before Nvidia came onto the scene. For instance, the first Atari video game consoles, shipped in the 1970s, had graphics chips inside. And IBM had released the Professional Graphics Controller (PGA), which used an onboard Intel 8088 microprocessor to do video tasks. Silicon Graphics Inc., or SGI, also emerged as a dominant graphics player in the market in the late 1980s.
Things changed rapidly in 1993 with the release of a 3D game called Doom by game developer id Software. Doom was the first mature, action-packed first-person shooter game on the market. Quake followed and offered brand-new technical breakthroughs such as full real-time 3D rendering and online multiplayer. This paved the way for the dedicated graphics card market.
Nvidia didn't immediately rise to fame. Its first product, the NV1, came in May 1995 and was a multimedia PCI card with graphics, sound and gamepad support. However, the product flopped because the NV1 was not compatible with the leading graphics APIs of the time, such as OpenGL and 3Dfx's Glide. It wasn't until the Riva 128 launched in 1997 that the company saw success. At the time of launch, Nvidia had less than six weeks of cash left in the bank!
By the early 2000s, the graphics card market had drastically consolidated from over 30 vendors to just three: Nvidia, ATI, and Intel, which took up the low end. Nvidia coined the phrase graphics processing unit, or GPU, and set its sights on the broader compute market.
The opportunity to create new businesses in adjacent markets, outside your core business, is not something you see frequently. A shining example is Amazon, an online commerce company that created a cloud computing platform, Amazon Web Services (AWS), from the technology components it built to run a massively scalable commerce platform. Uber, a ride-sharing company, leveraged its backend infrastructure to launch a food delivery service called Uber Eats.
In a similar fashion, Nvidia realized that the graphics processing units (GPUs) that powered many of the graphics hardware boards in PCs and gaming consoles had another use in accelerating mathematical operations. By investing in making GPUs programmable, the company opened up their parallel processing capabilities to a wider variety of applications. This enabled high-performance computing to be more readily accessible and to run on commodity hardware.
Nvidia's first venture into the high-performance computing (HPC) space came with its CUDA parallel computing architecture, which enabled GPUs to be used for general-purpose computing tasks. This capability helped spark early breakthroughs in modern AI. AlexNet, a convolutional neural network (CNN) used to classify images, was unveiled in 2012. It was trained using just two of Nvidia's programmable GPUs.
The big discovery was that GPUs could massively accelerate neural network processing, or model training. As this began to spread among computer and data scientists, demand for Nvidia's GPUs soared. In some ways, the AI revolution found Nvidia.
But that was just the beginning. Nvidia's relentless pursuit of innovation led to a series of breakthrough architectures, starting with the Turing architecture in 2018, which fused real-time ray tracing, AI, simulation and rasterization to fundamentally change the way graphics processing worked. Turing featured new tensor cores, processors that accelerate deep learning training and inference, providing up to 500 trillion tensor operations per second. Tensor cores are essential building blocks of the Nvidia solution, which incorporates hardware, networking, software, libraries and optimized AI models. Tensor cores deliver significantly faster AI training times than traditional CUDA cores alone, which are primarily designed for general-purpose processing tasks and excel in parallel computing.
Nvidia's rapid rate of innovation is evident across its Volta, Ampere, Ada Lovelace, Hopper and now Blackwell architectures. The H100 Tensor Core GPU was the first based on the Hopper architecture, with over 80 billion transistors, a built-in transformer engine, advanced NVLink inter-GPU communications and second-generation multi-instance GPU (MIG) technology.
The growth of computational power used to be governed by Moore's Law, which predicted a doubling roughly every two years. Nvidia's new Blackwell GPU has shattered expectations, increasing computational speed by over a thousand times in just eight years.
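A quick back-of-the-envelope calculation shows how far apart those two growth rates are. The sketch below uses only the figures quoted above (a doubling every two years versus a thousandfold gain in eight years); the function names are illustrative.

```python
import math


def growth_factor(years, doubling_period_years):
    """Total growth after `years` when capacity doubles every period."""
    return 2 ** (years / doubling_period_years)


def implied_doubling_period(years, total_factor):
    """Doubling period needed to reach `total_factor` growth in `years`."""
    return years / math.log2(total_factor)


moore = growth_factor(8, 2)                  # 2**4 = 16
implied = implied_doubling_period(8, 1000)   # 8 / log2(1000)

print(f"Doubling every two years for eight years: {moore:.0f}x")
print(f"1,000x in eight years means doubling every {implied:.2f} years")
```

A two-year doubling cadence yields only a 16-fold gain over eight years, so a thousandfold gain corresponds to doubling roughly every ten months.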
What's good for training may not be good for inference.
There are still a limited number of AI applications in production today. Outside of a few large tech companies, very few corporations have advanced to running large-scale AI models in production. So most of the hardware focus has been on optimizing the hardware platform for training.
As the number of AI applications increases, the amount of compute a company uses for running models to respond to end-user requests will increase significantly. That cost will exceed what they're spending on training today. The focus will then shift to optimizing hardware to reduce inference costs.
GPUs are well suited for the computational complexity of training. Training workloads are split across a small number of tightly interconnected GPUs that must constantly exchange data, which makes distributing the work across many low-end CPUs unrealistic.
However, this is not true for inference. The model weights are fixed and can easily be duplicated across many machines, so no communication is needed. This makes an army of commodity PCs and CPUs very appealing for applications relying on inference.
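The replication argument can be sketched in a few lines. This is a simplified illustration that assumes the whole model fits on one machine; threads stand in for separate machines, and the weights and class names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical trained parameters (w, b); fixed once training is done.
WEIGHTS = (2.0, 1.0)


class Replica:
    """One inference server holding its own private copy of the weights."""

    def __init__(self, weights):
        self.weights = tuple(weights)  # read-only copy, never updated

    def predict(self, x):
        w, b = self.weights
        return w * x + b


def serve(requests, n_replicas=4):
    """Spread requests round-robin across independent replicas."""
    replicas = [Replica(WEIGHTS) for _ in range(n_replicas)]
    with ThreadPoolExecutor(max_workers=n_replicas) as pool:
        futures = [
            pool.submit(replicas[i % n_replicas].predict, x)
            for i, x in enumerate(requests)
        ]
        return [f.result() for f in futures]


print(serve([0.0, 1.0, 2.0, 3.0]))
```

Because no replica ever writes to the weights, replicas never need to talk to each other, and capacity grows by simply adding more copies.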
New companies like Groq are emerging that have the potential to be serious competitors in the AI chip market. This could pose a threat to Nvidia's dominance in the AI world.
Today, all the AI giants rely heavily on Nvidia to supply them with computing cards, mostly for AI training, with smaller demands on inference. The latest product, the H100, is still in high demand, remains costly (about $35,000 each) and only achieves inference speeds of 30-40 tokens per second. Compared to inference, training requires more stringent computing card specifications, especially in terms of memory size, which is growing close to 300 GB per card.
Groq's approach to neural network acceleration is radically different from Nvidia's. The architecture opts for a single large processor with hundreds of functional units, which significantly reduces instruction decoding overhead. This architecture allows superior performance and reduced latencies, ideal for cloud services requiring real-time inferences.
Groq's secret sauce is its Language Processing Unit (LPU) inference engines, which are specifically engineered to address the two major bottlenecks faced by large language models (LLMs): compute capacity and memory bandwidth. The LPU systems boast comparable, if not superior, compute power to GPUs and have eliminated external memory bandwidth bottlenecks, enabling faster generation of text sequences.
The realization that computational power was a bottleneck for AI's potential led to the inception of Groq and the creation of the LPU. Jonathan Ross, who initially began what became the TPU project at Google, started Groq in 2016.
Nvidia remains well entrenched and will likely not be easy to dethrone. However, Groq has demonstrated that its vision of an innovative processor architecture can compete with industry giants.
There are tools emerging for machine learning that enable more efficient inferencing. Developed by Georgi Gerganov (the GG in GGML), GGML has emerged as a powerful and versatile tensor library, empowering developers to build and deploy high-performance machine learning applications across a wide spectrum of devices. It is designed to bring large-scale machine-learning models to commodity devices.
GGML is a lightweight engine that runs neural networks in C/C++. This is significant because it's fast, has no dependencies (pure C/C++), is multi-platform and can be easily ported to devices such as mobile phones. It defines a binary format for distributing large language models (LLMs) using quantization, a technique that allows LLMs to run on consumer hardware with effective CPU inferencing. It enables these big models to run on the CPU as fast as possible.
The benefit of GGML is that it requires fewer resources to run, typically 4x less RAM and 4x less RAM bandwidth, and thus delivers faster inference on the CPU.
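Where a saving of roughly 4x comes from can be seen in a simplified sketch of block quantization. This is not GGML's actual file format or kernel code; it assumes a generic scheme in which each block of weights is stored as 4-bit integer codes plus one shared 16-bit scale, compared against a 16-bit baseline.

```python
def quantize_block(values, bits=4):
    """Map floats to small signed integers plus one shared scale factor."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit codes
    largest = max(abs(v) for v in values)
    scale = largest / qmax if largest else 1.0
    codes = [round(v / scale) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from the integer codes."""
    return [scale * c for c in codes]


def bits_needed(n_weights, bits_per_weight, block_size=32, scale_bits=16):
    """Storage for the codes plus one scale per block (assumed layout)."""
    n_blocks = -(-n_weights // block_size)     # ceiling division
    return n_weights * bits_per_weight + n_blocks * scale_bits


# Hypothetical weights, invented for this example.
weights = [0.12, -0.70, 0.33, 0.04, -0.41, 0.66, -0.02, 0.27]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

fp16_bits = 1_000_000 * 16
q4_bits = bits_needed(1_000_000, 4)
ratio = fp16_bits / q4_bits

print(codes)
print(f"max rounding error: {max_error:.3f}")
print(f"memory reduction vs 16-bit: {ratio:.2f}x")
```

Under these assumptions the reduction lands a little under 4x because each block also stores its scale, and the price is a small rounding error in every weight.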
Traditionally, inference is done on centralized servers in the cloud. However, tools like GGML are making it possible to do model inference on commodity devices at the network's edge. That is critical for low-latency use cases such as self-driving cars.
GGML is empowering AI developers to harness the full potential of machine learning on everyday hardware. It provides an impressive array of features, is an open standard and has been optimized for Apple Silicon. GGML is poised to play a pivotal role in shaping the future of edge computing.
The future of AI is undoubtedly headed toward inference-centric workloads. While the training of LLMs and other complex AI models gets a lot of current attention, inference makes up the vast majority of actual AI workloads.
Enterprises should begin to understand how inference works and how it will help enable better use of AI to improve their products and services.