An introduction to generative AI with Swami Sivasubramanian – All Things Distributed
In the last few months, we've seen an explosion of interest in generative AI and the underlying technologies that make it possible. It has pervaded the collective consciousness for many, spurring discussions from board rooms to parent-teacher meetings. Consumers are using it, and businesses are trying to figure out how to harness its potential. But it didn't come out of nowhere: machine learning research goes back decades. In fact, machine learning is something that we've done well at Amazon for a very long time. It's used for personalization on the Amazon retail site, it's used to control robotics in our fulfillment centers, and it's used by Alexa to improve intent recognition and speech synthesis. Machine learning is in Amazon's DNA.
To get to where we are, it's taken a few key advances. First was the cloud. This is the keystone that provided the massive amounts of compute and data that are necessary for deep learning. Next were neural nets that could understand and learn from patterns. This unlocked complex algorithms, like the ones used for image recognition. Finally, the introduction of transformers. Unlike RNNs, which process inputs sequentially, transformers can process multiple sequences in parallel, which drastically speeds up training times and allows for the creation of larger, more accurate models that can understand human knowledge and do things like write poems, even debug code.
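To make that parallelism concrete, here is a minimal PyTorch sketch (the layer sizes are arbitrary, chosen just for illustration): an RNN has to walk a sequence one token at a time, while self-attention scores all tokens against each other in a single batched operation.

```python
import torch
import torch.nn as nn

seq = torch.randn(16, 128, 512)  # 16 sequences, 128 tokens each, 512-dim embeddings

# An RNN consumes the sequence one step at a time: each hidden state
# depends on the previous one, so the 128 time steps cannot be parallelized.
rnn = nn.RNN(input_size=512, hidden_size=512, batch_first=True)
rnn_out, _ = rnn(seq)

# Self-attention scores every token against every other token in one
# matrix multiplication, so all 128 positions are processed in parallel.
attn = nn.MultiheadAttention(embed_dim=512, num_heads=8, batch_first=True)
attn_out, _ = attn(seq, seq, seq)
```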
I recently sat down with an old friend of mine, Swami Sivasubramanian, who leads database, analytics and machine learning services at AWS. He played a major role in building the original Dynamo and later bringing that NoSQL technology to the world through Amazon DynamoDB. During our conversation I learned a lot about the broad landscape of generative AI, what we're doing at Amazon to make large language and foundation models more accessible, and last, but not least, how custom silicon can help to bring down costs, speed up training, and increase energy efficiency.
We are still in the early days, but as Swami says, large language and foundation models are going to become a core part of every application in the coming years. I'm excited to see how builders use this technology to innovate and solve hard problems.
To think, it was more than 17 years ago, on his first day, that I gave Swami two simple tasks: 1/ help build a database that meets the scale and needs of Amazon; 2/ re-examine the data strategy for the company. He says it was an ambitious first meeting. But I think he's done a wonderful job.
If you'd like to read more about what Swami's teams have built, you can do so here. The entire transcript of our conversation is available below. Now, as always, go build!
This transcript has been lightly edited for flow and readability.
***
Werner Vogels: Swami, we go back a long time. Do you remember your first day at Amazon?
Swami Sivasubramanian: I still remember… it wasn't very common for PhD students to join Amazon at that time, because we were known as a retailer or an ecommerce site.
WV: We were building things and that's quite a departure for an academic. Definitely for a PhD student. To go from thinking, to actually, how do I build?
So you brought DynamoDB to the world, and quite a few other databases since then. But now, under your purview there's also AI and machine learning. So tell me, what does your world of AI look like?
SS: After building a bunch of these databases and analytic services, I got fascinated by AI because literally, AI and machine learning puts data to work.
If you look at machine learning technology itself, broadly, it's not necessarily new. In fact, some of the first papers on deep learning were written like 30 years ago. But even in those papers, they explicitly called out that for it to get large-scale adoption, it required a massive amount of compute and a massive amount of data to actually succeed. And that's what cloud got us to: to actually unlock the power of deep learning technologies. Which led me, this is like 6 or 7 years ago, to start the machine learning organization, because we wanted to take machine learning, especially deep learning style technologies, from the hands of scientists to everyday developers.
WV: If you think about the early days of Amazon (the retailer), with similarities and recommendations and things like that, were they the same algorithms that we're seeing used today? That's a long time ago, almost 20 years.
SS: Machine learning has really gone through huge growth in the complexity of the algorithms and the applicability of use cases. Early on, the algorithms were a lot simpler, like linear algorithms or gradient boosting.
The last decade, it was all around deep learning, which was essentially a step up in the ability for neural nets to actually understand and learn from the patterns, which is effectively what all the image-based or image processing algorithms come from. And then also, personalization with different kinds of neural nets and so forth. And that's what led to the invention of Alexa, which has a remarkable accuracy compared to others. Neural nets and deep learning have really been a step up. And the next big step up is what is happening today in machine learning.
WV: So a lot of the talk these days is around generative AI, large language models, foundation models. Tell me, why is that different from, let's say, the more task-based, like vision algorithms and things like that?
SS: If you take a step back and look at all these foundation models, large language models… these are big models, which are trained with hundreds of millions of parameters, if not billions. A parameter, just to give context, is like an internal variable that the ML algorithm must learn from its data set. Now to give a sense… what is this big thing that has suddenly happened?
A few things. One, transformers have been a big change. A transformer is a kind of neural net technology that is remarkably more scalable than previous versions like RNNs or various others. So what does this mean? Why did this suddenly lead to all this transformation? Because it is actually scalable and you can train them a lot faster, and now you can throw a lot of hardware and a lot of data [at them]. Now that means, I can actually crawl the entire world wide web and actually feed it into these kinds of algorithms and start building models that can actually understand human knowledge.
WV: So the task-based models that we had before, and that we were already really good at, could you build them based on these foundation models? Task-specific models, do we still need them?
SS: The way to think about it is that the need for task-specific models is not going away. But what is changing is how we go about building them. You still need a model to translate from one language to another or to generate code and so forth. But how easily you can now build them is essentially a big change, because with foundation models, which are trained on the entire corpus of knowledge… that's a huge amount of data. Now, it is simply a matter of actually building on top of this and fine-tuning with specific examples.
Think about if you're running a recruiting firm, as an example, and you want to ingest all your resumes and store them in a format that is standard for you to search and index on. Instead of building a custom NLP model to do all that, you can now use foundation models and give them a few examples: here is an input resume in this format, and here is the output resume. You can even fine-tune these models by just giving a few specific examples. And then you essentially are good to go.
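As a rough illustration of what those "few specific examples" can look like, here is a hypothetical fine-tuning set for the recruiting scenario, written as prompt/completion pairs. The field names and file layout are illustrative, not any particular service's API.

```python
import json

# A hypothetical fine-tuning set: a handful of prompt/completion pairs
# showing the model how to map a free-form resume onto the firm's
# standard format. All field names here are made up for illustration.
examples = [
    {
        "prompt": "Resume: Jane Doe. 5 yrs Java at Initech. BSc CS, 2015.",
        "completion": json.dumps({
            "name": "Jane Doe",
            "skills": ["Java"],
            "experience_years": 5,
            "education": "BSc Computer Science (2015)",
        }),
    },
    # ... a few more input/output pairs in the same shape
]

# Many fine-tuning workflows accept training data as JSON Lines.
with open("resume_finetune.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```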
WV: So in the past, most of the work went into probably labeling the data. I mean, that was also the hardest part because that drives the accuracy.
SS: Exactly.
WV: So in this particular case, with these foundation models, labeling is no longer needed?
SS: Essentially. I mean, yes and no. As always with these things there is a nuance. But a majority of what makes these large-scale models remarkable is that they actually can be trained on a lot of unlabeled data. You actually go through what I call a pre-training phase, which is essentially you collect data sets from, let's say, the world wide web, like common crawl data or code data and various other data sets, Wikipedia, whatnot. And then you don't even label them, you kind of feed them as they are. But you have to, of course, go through a sanitization step in terms of making sure you cleanse the data of PII, and actually all other stuff like negative things or hate speech and whatnot. Then you actually start training on a large number of hardware clusters, because these models can take tens of millions of dollars to actually go through that training. Finally, you get a notion of a model, and then you go through the next step of what is called inference.
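As a toy sketch of that sanitization step, the snippet below scrubs two obvious PII patterns before a document would enter a training corpus. Production pipelines rely on far more sophisticated PII detectors and toxicity classifiers; this just shows the shape of the pass.

```python
import re

# Toy pre-training sanitization pass: redact obvious email addresses and
# US-style phone numbers. Real pipelines use much stronger detection.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b")

def sanitize(document: str) -> str:
    document = EMAIL.sub("[EMAIL]", document)
    document = PHONE.sub("[PHONE]", document)
    return document

print(sanitize("Contact jane@example.com or 555-123-4567."))
# -> Contact [EMAIL] or [PHONE].
```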
WV: Let's take object detection in video. That would be a smaller model than what we see now with the foundation models. What's the cost of running a model like that? Because now, these models with hundreds of billions of parameters are very large.
SS: Yeah, that's a great question, because there is so much talk already happening around training these models, but very little talk on the cost of running these models to make predictions, which is inference. It's a signal that very few people are actually deploying it at runtime for actual production. But once they actually deploy in production, they will realize, oh no, these models are very, very expensive to run. And that is where a few important techniques actually really come into play. So once you build these large models, to run them in production, you need to do a few things to make them affordable to run at scale, and run in an economical fashion. I'll hit some of them. One is what we call quantization. The other one is what I call distillation, which is that you have these large teacher models, and even though they are trained on hundreds of billions of parameters, they are distilled to a smaller, fine-grained model. I am speaking in super abstract terms, but that is the essence of these techniques.
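Here is a compact PyTorch sketch of both ideas: dynamic quantization converts a toy model's weights from fp32 to int8, and a standard distillation loss (temperature-softened KL divergence) trains a small student to mimic a large teacher's output distribution. This illustrates the general techniques, not any specific AWS implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Quantization: store and compute weights at lower precision. Here the
# Linear layers of a toy model are converted from fp32 to int8.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Distillation: train a small student to match the large teacher's softened
# output distribution instead of relying on hard labels alone.
def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2
```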
WV: So we do build… we do have custom hardware to help out with this. Normally this is all GPU-based, which are expensive, energy-hungry beasts. Tell us what we can do with custom silicon that sort of makes it so much cheaper, both in terms of cost as well as, let's say, your carbon footprint.
SS: When it comes to custom silicon, as mentioned, the cost is becoming a big issue in these foundation models, because they are very, very expensive to train and very expensive, also, to run at scale. You can actually build a playground and test your chatbot at low scale and it may not be that big a deal. But once you start deploying at scale as part of your core business operation, these things add up.
In AWS, we did invest in our custom silicon: Trainium for training and Inferentia for inference. And all these things are ways for us to actually understand the essence of which operators are making, or are involved in making, these prediction decisions, and optimizing them at the core silicon level and software stack level.
WV: If cost is also a reflection of energy used, because in essence that's what you're paying for, you can also see that they are, from a sustainability point of view, much more important than running it on general-purpose GPUs.
WV: So there's a lot of public interest in this recently. And it feels like hype. Is this something where we can see that this is a real foundation for future application development?
SS: First of all, we are living in very exciting times with machine learning. I have probably said this now every year, but this year it is even more special, because these large language models and foundation models truly can enable so many use cases where people don't have to staff separate teams to go build task-specific models. The speed of ML model development will really actually increase. But you won't get to that end state that you want in the coming years unless we actually make these models more accessible to everybody. This is what we did with SageMaker early on with machine learning, and that's what we need to do with Bedrock and all its applications as well.
But we do think that while the hype cycle will subside, like with any technology, these are going to become a core part of every application in the coming years. And they will be done in a grounded way, and in a responsible fashion too, because there is a lot more stuff that people need to think through in a generative AI context. What kind of data did it learn from? What response does it generate? How truthful is it as well? This is the stuff we are excited to actually help our customers [with].
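For a sense of the accessibility Swami describes, here is a minimal sketch of invoking a foundation model through Amazon Bedrock with boto3. The model ID and request body shape vary by model provider, so treat the ones below as illustrative and check the Bedrock documentation for the model you use.

```python
import json
import boto3

# Minimal sketch: call a hosted foundation model through Bedrock.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="amazon.titan-text-express-v1",  # illustrative model ID
    body=json.dumps({"inputText": "Summarize this resume in one sentence: ..."}),
)
print(json.loads(response["body"].read()))
```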
WV: So when you say that this is the most exciting time in machine learning, what are you going to say next year?