Archive for the ‘Machine Learning’ Category

Biological research and self-driving labs in deep space supported by artificial intelligence – Nature.com


Read the original:
Biological research and self-driving labs in deep space supported by artificial intelligence - Nature.com

What Is OpenAI Gym and How Can You Use It? – MUO – MakeUseOf

If you can't build a machine learning model from scratch, or you lack the infrastructure to train one, connecting your app to an existing, working model can bridge the gap.

Artificial intelligence is here for everyone to use, one way or another. OpenAI Gym, for its part, offers many explorable training grounds for your reinforcement learning agents.

What is OpenAI Gym, how does it work, and what can you build using it?

OpenAI Gym is a Pythonic API that provides simulated training environments in which reinforcement learning agents act based on environmental observations. Each action yields a positive or negative reward, which accrues at every time step: the agent aims to maximize its cumulative reward and is penalized for undesired decisions.

A time step is a discrete tick at which the environment transitions into another state; steps accumulate as the agent's actions change the environment's state.

OpenAI Gym environments are based on the Markov decision process (MDP), a dynamic decision-making model used in reinforcement learning. Rewards therefore arrive only when the environment changes state, and what happens in the next state depends only on the present state, since an MDP does not account for past events.
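To make that concrete, here is a toy sketch of an MDP; the two-state environment, its transition probabilities, and its rewards are hypothetical examples, not something taken from OpenAI Gym:

import random

# (state, action) -> list of (next_state, transition_probability, reward)
TRANSITIONS = {
    ("safe", "advance"): [("safe", 0.7, 1.0), ("crashed", 0.3, -1.0)],
    ("safe", "wait"): [("safe", 1.0, 0.0)],
    ("crashed", "wait"): [("crashed", 1.0, 0.0)],
}

def step(state, action):
    # The outcome depends only on the current state and action (Markov property),
    # never on earlier history.
    outcomes = TRANSITIONS[(state, action)]
    weights = [p for _, p, _ in outcomes]
    next_state, _, reward = random.choices(outcomes, weights=weights)[0]
    return next_state, reward  # the reward arrives with the state transition

state, total_reward = "safe", 0.0
for t in range(10):  # ten discrete time steps
    action = "advance" if state == "safe" else "wait"
    state, reward = step(state, action)
    total_reward += reward
print(state, total_reward)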

Before moving on, let's dive into an example for a quick understanding of OpenAI Gym's application in reinforcement learning.

Suppose you intend to train a car in a racing game: you can spin up a racetrack environment in OpenAI Gym. In reinforcement learning, if the vehicle turns right instead of left, it might receive a negative reward of -1. The racetrack changes at each time step and may grow more complicated in subsequent states.

Negative rewards, or penalties, aren't inherently bad for a reinforcement learning agent; in some cases they push it to achieve its goal more quickly. Thus the car learns the track over time and masters navigation by chasing reward streaks.

For instance, we initialized the FrozenLake-v1 environment, in which an episode ends with no reward when the agent falls into an ice hole, while reaching the gift box earns a reward.

Our first run generated a few penalties and no rewards.

By the third iteration, the environment had grown more complex, but the agent collected a few rewards.

The outcome above doesn't imply that the agent will improve in the next iteration. While it may successfully avoid more holes the next time, it may get no reward. But modifying a few parameters might improve its learning speed.

The OpenAI Gym API revolves around a few core components: the environment (Env) itself; the observation and action spaces, which define what the agent can perceive and do; the reset method, which starts a new episode and returns the first observation; and the step method, which applies an action and returns the next observation, the reward, and whether the episode has ended.
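Here is a minimal sketch of those components, assuming the gymnasium fork that the setup section below installs:

import gymnasium as gym

env = gym.make("FrozenLake-v1")  # Env: the simulated world
print(env.observation_space)     # Discrete(16): one of 16 grid cells
print(env.action_space)          # Discrete(4): left, down, right, up
observation, info = env.reset(seed=42)  # start a new episode
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()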

Since OpenAI Gym allows you to spin up custom learning environments, here are some ways to use it in a real-life scenario.

You can leverage OpenAI Gym's gaming environments to reward desired behaviors, create gaming rewards, and increase complexity per game level.

Where there's a limited amount of data, resources, and time, OpenAI Gym can be handy for developing an image recognition system. On a deeper level, you can scale it to build a face recognition system, which rewards an agent for identifying faces correctly.

OpenAI Gym also offers intuitive environment models for 3D and 2D simulations, where you can implement desired behaviors into robots. Roboschool is an example of scaled robot simulation software built using OpenAI Gym.

You can also build marketing solutions like ad servers, stock trading bots, sales prediction bots, product recommender systems, and many more using the OpenAI Gym. For instance, you can build a custom OpenAI Gym model that penalizes ads based on impression and click rate.

Some ways to apply OpenAI Gym to natural language processing include answering multiple-choice sentence-completion questions and building a spam classifier. For example, you can train an agent to learn sentence variations to avoid bias when grading participants.

OpenAI Gym supports Python 3.7 and later. To set up an OpenAI Gym environment, install gymnasium, the actively maintained fork of the gym package:
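pip install gymnasium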

Next, spin up an environment. You can create a custom environment, but start by playing with an existing one to master the OpenAI Gym concepts.

The code below spins up FrozenLake-v1; the env.reset method returns the initial observation:

import gymnasium as gym

env = gym.make("FrozenLake-v1")
observation, info = env.reset()

Some environments require extra libraries to work. If another library is needed, the Python exception message will tell you which one to install.

For example, you'll install an additional library (gymnasium[toy-text]) to run the FrozenLake-v1 environment.
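With that installed, a minimal interaction loop looks like the sketch below, assuming the current gymnasium API; the random action sampling here stands in for a real learned policy:

import gymnasium as gym

env = gym.make("FrozenLake-v1")
observation, info = env.reset(seed=42)

for _ in range(100):
    action = env.action_space.sample()  # a random policy, purely for illustration
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:  # fell in a hole, reached the gift box, or timed out
        observation, info = env.reset()

env.close()

Replacing the random sampling with a learning algorithm such as Q-learning is where actual reinforcement learning begins.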

One of the setbacks in AI and machine learning development is the shortage of infrastructure and training datasets. But if you're looking to integrate machine learning models into your apps or devices, ready-made AI models across the internet now make it much easier. While some of these tools are low-cost, others, including OpenAI Gym, are free and open source.

The rest is here:
What Is OpenAI Gym and How Can You Use It? - MUO - MakeUseOf

Machine Learning Programs Predict Risk of Death Based on Results From Routine Hospital Tests – Neuroscience News

Summary: Using ECG data, a new machine learning algorithm was able to predict death within 5 years of a patient being admitted to hospital with 87% accuracy. The AI was able to sort patients into 5 categories ranging from low to high risk of death.

Source: University of Alberta

If you've ever been admitted to hospital or visited an emergency department, you've likely had an electrocardiogram, or ECG, a standard test involving tiny electrodes taped to your chest that checks your heart's rhythm and electrical activity.

Hospital ECGs are usually read by a doctor or nurse at your bedside, but now researchers are using artificial intelligence to glean even more information from those results to improve your care and the health-care system all at once.

In recently published findings, the research team built and trained machine learning programs based on 1.6 million ECGs done on 244,077 patients in northern Alberta between 2007 and 2020.

The algorithm predicted each patient's risk of death from all causes within one month, one year and five years of that point with an 85 per cent accuracy rate, sorting patients into five categories from lowest to highest risk.

The predictions were even more accurate when demographic information (age and sex) and six standard laboratory blood test results were included.

The study is a proof-of-concept for using routinely collected data to improve individual care and allow the health-care system to learn as it goes, according to principal investigator Padma Kaul, professor of medicine and co-director of the Canadian VIGOUR Centre.

"We wanted to know whether we could use new methods like artificial intelligence and machine learning to analyze the data and identify patients who are at higher risk for mortality," Kaul explains.

"These findings illustrate how machine learning models can be employed to convert data collected routinely in clinical practice to knowledge that can be used to augment decision-making at the point of care as part of a learning health-care system."

A clinician will order an electrocardiogram if you have high blood pressure or symptoms of heart disease, such as chest pain, shortness of breath or an irregular heartbeat. The first phase of the study examined ECG results in all patients, but Kaul and her team hope to refine these models for particular subgroups of patients.

They also plan to focus the predictions beyond all-cause mortality to look specifically at heart-related causes of death.

"We want to take data generated by the health-care system, convert it into knowledge and feed it back into the system so that we can improve care and outcomes. That's the definition of a learning health-care system."

Author: Ross Neitz
Source: University of Alberta
Contact: Ross Neitz, University of Alberta
Image: The image is in the public domain

Original Research: Open access. "Towards artificial intelligence-based learning health system for population-level mortality prediction using electrocardiograms" by Padma Kaul et al. npj Digital Medicine

Abstract

Towards artificial intelligence-based learning health system for population-level mortality prediction using electrocardiograms

The feasibility and value of linking electrocardiogram (ECG) data to longitudinal population-level administrative health data to facilitate the development of a learning healthcare system has not been fully explored. We developed ECG-based machine learning models to predict risk of mortality among patients presenting to an emergency department or hospital for any reason.

Using the 12-lead ECG traces and measurements from 1,605,268 ECGs from 748,773 healthcare episodes of 244,077 patients (2007–2020) in Alberta, Canada, we developed and validated ResNet-based Deep Learning (DL) and gradient boosting-based XGBoost (XGB) models to predict 30-day, 1-year, and 5-year mortality. The models for 30-day, 1-year, and 5-year mortality were trained on 146,173, 141,072, and 111,020 patients and evaluated on 97,144, 89,379, and 55,650 patients, respectively. In the evaluation cohort, 7.6%, 17.3%, and 32.9% patients died by 30-days, 1-year, and 5-years, respectively.

ResNet models based on ECG traces alone had good-to-excellent performance with area under receiver operating characteristic curve (AUROC) of 0.843 (95% CI: 0.838–0.848), 0.812 (0.808–0.816), and 0.798 (0.792–0.803) for 30-day, 1-year and 5-year prediction, respectively; and were superior to XGB models based on ECG measurements with AUROC of 0.782 (0.776–0.789), 0.784 (0.780–0.788), and 0.746 (0.740–0.751).

This study demonstrates the validity of ECG-based DL mortality prediction models at the population-level that can be leveraged for prognostication at point of care.
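To illustrate the XGB-style baseline described in the abstract, gradient boosting over tabular ECG measurements scored by AUROC, here is a minimal sketch; the synthetic features and labels are stand-ins for the study's data, and this is not the authors' code:

import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))     # stand-ins for tabular ECG measurements
y = rng.integers(0, 2, size=1000)  # 1 = died within the horizon, 0 = survived

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
model.fit(X_train, y_train)

# Score with AUROC, the metric the abstract reports (~0.75-0.78 for its XGB models).
auroc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUROC: {auroc:.3f}")

A real pipeline would replace the synthetic arrays with the cohort's ECG measurements and report confidence intervals, as in the abstract above.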

Here is the original post:
Machine Learning Programs Predict Risk of Death Based on Results From Routine Hospital Tests - Neuroscience News

AWS and NVIDIA Collaborate on Next-Generation Infrastructure for Training Large Machine Learning Models and … – NVIDIA Blog

New Amazon EC2 P5 Instances Deployed in EC2 UltraClusters Are Fully Optimized to Harness NVIDIA Hopper GPUs for Accelerating Generative AI Training and Inference at Massive Scale

GTC – Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced a multi-part collaboration focused on building out the world's most scalable, on-demand artificial intelligence (AI) infrastructure optimized for training increasingly complex large language models (LLMs) and developing generative AI applications.

The joint work features next-generation Amazon Elastic Compute Cloud (Amazon EC2) P5 instances powered by NVIDIA H100 Tensor Core GPUs and AWS's state-of-the-art networking and scalability, which will deliver up to 20 exaFLOPS of compute performance for building and training the largest deep learning models. P5 instances will be the first GPU-based instances to take advantage of AWS's second-generation Elastic Fabric Adapter (EFA) networking, which provides 3,200 Gbps of low-latency, high-bandwidth networking throughput, enabling customers to scale up to 20,000 H100 GPUs in EC2 UltraClusters for on-demand access to supercomputer-class performance for AI.

"AWS and NVIDIA have collaborated for more than 12 years to deliver large-scale, cost-effective GPU-based solutions on demand for various applications such as AI/ML, graphics, gaming, and HPC," said Adam Selipsky, CEO at AWS. "AWS has unmatched experience delivering GPU-based instances that have pushed the scalability envelope with each successive generation, with many customers scaling machine learning training workloads to more than 10,000 GPUs today. With second-generation EFA, customers will be able to scale their P5 instances to over 20,000 NVIDIA H100 GPUs, bringing supercomputer capabilities on demand to customers ranging from startups to large enterprises."

"Accelerated computing and AI have arrived, and just in time. Accelerated computing provides step-function speed-ups while driving down cost and power as enterprises strive to do more with less. Generative AI has awakened companies to reimagine their products and business models and to be the disruptor and not the disrupted," said Jensen Huang, founder and CEO of NVIDIA. "AWS is a long-time partner and was the first cloud service provider to offer NVIDIA GPUs. We are thrilled to combine our expertise, scale, and reach to help customers harness accelerated computing and generative AI to engage the enormous opportunities ahead."

New Supercomputing Clusters

New P5 instances are built on more than a decade of collaboration between AWS and NVIDIA delivering AI and HPC infrastructure, and build on four previous collaborations across P2, P3, P3dn, and P4d(e) instances. P5 instances are the fifth generation of AWS offerings powered by NVIDIA GPUs and come almost 13 years after AWS's initial deployment of NVIDIA GPUs, beginning with CG1 instances.

P5 instances are ideal for training and running inference for increasingly complex LLMs and computer vision models behind the most-demanding and compute-intensive generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more.

Specifically built for both enterprises and startups racing to bring AI-fueled innovation to market in a scalable and secure way, P5 instances feature eight NVIDIA H100 GPUs capable of 16 petaFLOPs of mixed-precision performance, 640 GB of high-bandwidth memory, and 3,200 Gbps networking connectivity (8x more than the previous generation) in a single EC2 instance. The increased performance of P5 instances accelerates the time-to-train machine learning (ML) models by up to 6x (reducing training time from days to hours), and the additional GPU memory helps customers train larger, more complex models. P5 instances are expected to lower the cost to train ML models by up to 40% over the previous generation, providing customers greater efficiency over less flexible cloud offerings or expensive on-premises systems.

Amazon EC2 P5 instances are deployed in hyperscale clusters called EC2 UltraClusters that comprise the highest-performance compute, networking, and storage in the cloud. Each EC2 UltraCluster is one of the most powerful supercomputers in the world, enabling customers to run their most complex multi-node ML training and distributed HPC workloads. They feature petabit-scale non-blocking networking powered by AWS EFA, a network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale on AWS. EFA's custom-built operating system (OS) bypass hardware interface and integration with NVIDIA GPUDirect RDMA enhance the performance of inter-instance communications by lowering latency and increasing bandwidth utilization, which is critical to scaling training of deep learning models across hundreds of P5 nodes. With P5 instances and EFA, ML applications can use the NVIDIA Collective Communications Library (NCCL) to scale up to 20,000 H100 GPUs. As a result, customers get the application performance of on-premises HPC clusters with the on-demand elasticity and flexibility of AWS.

On top of these cutting-edge computing capabilities, customers can use the industry's broadest and deepest portfolio of services such as Amazon S3 for object storage, Amazon FSx for high-performance file systems, and Amazon SageMaker for building, training, and deploying deep learning applications. P5 instances will be available in the coming weeks in limited preview. To request access, visit https://pages.awscloud.com/EC2-P5-Interest.html.
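From the application side, that stack is typically driven through a framework. Here is a minimal PyTorch sketch, not AWS or NVIDIA code, of data-parallel training whose gradient all-reduce runs over NCCL, assuming NCCL has been configured to use the EFA transport on these instances:

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Launched with torchrun, one process per GPU; NCCL handles the collectives.
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = DDP(torch.nn.Linear(4096, 4096).cuda(local_rank), device_ids=[local_rank])
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(32, 4096, device=local_rank)
loss = model(x).square().mean()
loss.backward()   # gradients are all-reduced across GPUs via NCCL here
optimizer.step()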

With the new EC2 P5 instances, customers like Anthropic, Cohere, Hugging Face, Pinterest, and Stability AI will be able to build and train the largest ML models at scale. The collaboration through additional generations of EC2 instances will help startups, enterprises, and researchers seamlessly scale to meet their ML needs.

Anthropic builds reliable, interpretable, and steerable AI systems that will have many opportunities to create value commercially and for public benefit. "At Anthropic, we are working to build reliable, interpretable, and steerable AI systems. While the large, general AI systems of today can have significant benefits, they can also be unpredictable, unreliable, and opaque. Our goal is to make progress on these issues and deploy systems that people find useful," said Tom Brown, co-founder of Anthropic. "Our organization is one of the few in the world that is building foundational models in deep learning research. These models are highly complex, and to develop and train these cutting-edge models, we need to distribute them efficiently across large clusters of GPUs. We are using Amazon EC2 P4 instances extensively today, and we are excited about the upcoming launch of P5 instances. We expect them to deliver substantial price-performance benefits over P4d instances, and they'll be available at the massive scale required for building next-generation large language models and related products."

Cohere, a leading pioneer in language AI, empowers every developer and enterprise to build incredible products with world-leading natural language processing (NLP) technology while keeping their data private and secure. "Cohere leads the charge in helping every enterprise harness the power of language AI to explore, generate, search for, and act upon information in a natural and intuitive manner, deploying across multiple cloud platforms in the data environment that works best for each customer," said Aidan Gomez, CEO at Cohere. "NVIDIA H100-powered Amazon EC2 P5 instances will unleash the ability of businesses to create, grow, and scale faster with its computing power combined with Cohere's state-of-the-art LLM and generative AI capabilities."

"Hugging Face is on a mission to democratize good machine learning. As the fastest growing open source community for machine learning, we now provide over 150,000 pre-trained models and 25,000 datasets on our platform for NLP, computer vision, biology, reinforcement learning, and more," said Julien Chaumond, CTO and co-founder at Hugging Face. "With significant advances in large language models and generative AI, we're working with AWS to build and contribute the open source models of tomorrow. We're looking forward to using Amazon EC2 P5 instances via Amazon SageMaker at scale in UltraClusters with EFA to accelerate the delivery of new foundation AI models for everyone."

"Today, more than 450 million people around the world use Pinterest as a visual inspiration platform to shop for products personalized to their taste, find ideas to do offline, and discover the most inspiring creators. We use deep learning extensively across our platform for use-cases such as labeling and categorizing billions of photos that are uploaded to our platform, and visual search that provides our users the ability to go from inspiration to action," said David Chaiken, Chief Architect at Pinterest. "We have built and deployed these use-cases by leveraging AWS GPU instances such as P3 and the latest P4d instances. We are looking forward to using Amazon EC2 P5 instances featuring H100 GPUs, EFA and UltraClusters to accelerate our product development and bring new Empathetic AI-based experiences to our customers."

As the leader in multimodal, open-source AI model development and deployment, Stability AI collaborates with public- and private-sector partners to bring this next-generation infrastructure to a global audience. "At Stability AI, our goal is to maximize the accessibility of modern AI to inspire global creativity and innovation," said Emad Mostaque, CEO of Stability AI. "We initially partnered with AWS in 2021 to build Stable Diffusion, a latent text-to-image diffusion model, using Amazon EC2 P4d instances that we employed at scale to accelerate model training time from months to weeks. As we work on our next generation of open-source generative AI models and expand into new modalities, we are excited to use Amazon EC2 P5 instances in second-generation EC2 UltraClusters. We expect P5 instances will further improve our model training time by up to 4x, enabling us to deliver breakthrough AI more quickly and at a lower cost."

New Server Designs for Scalable, Efficient AI

Leading up to the release of the H100, NVIDIA and AWS engineering teams with expertise in thermal, electrical, and mechanical fields have collaborated to design servers to harness GPUs to deliver AI at scale, with a focus on energy efficiency in AWS infrastructure. GPUs are typically 20x more energy efficient than CPUs for certain AI workloads, with the H100 up to 300x more efficient for LLMs than CPUs.

The joint work has included developing a system thermal design, integrated security and system management, security with the AWS Nitro hardware accelerated hypervisor, and NVIDIA GPUDirect optimizations for AWS custom-EFA network fabric.

Building on AWS and NVIDIA's work focused on server optimization, the companies have begun collaborating on future server designs to increase the scaling efficiency with subsequent-generation system designs, cooling technologies, and network scalability.

Read more:
AWS and NVIDIA Collaborate on Next-Generation Infrastructure for Training Large Machine Learning Models and ... - NVIDIA Blog

Podcast: Machine Learning and Education – The Badger Herald

Jeff Deiss 0:00
Greetings, this is Jeff, director of the Badger Herald podcast. Today we have a very exciting episode: we're talking with Professor Kangwook Lee of the Electrical and Computer Engineering Department at the University of Wisconsin-Madison. We're going to talk about his research on deep learning and recent developments in machine learning, and also a little bit about his influence on a popular test prep service called Riiid.

So, I originally saw your name in a New York Times article about Riiid, a test prep service started by YJ Jang that uses deep learning to better guide students toward more effective test prep and overall academic success. But we can get into that a little later. So first, would you like to introduce yourself and give a little background on your life?

Lee 1:18
Alright, hi, I'm Kangwook Lee. Again, I'm an assistant professor in the ECE department here. I came here in fall 2019, so it's been about three and a half years since I joined. I've been enjoying it a lot, except for COVID, but everything is great. In terms of research areas, I mostly work on information theory, machine learning, and deep learning. Before that, I did my master's and PhD at Berkeley, and before that I did my undergrad studies in Korea; I grew up in Korea. So it's been a while since I came to the United States. I did go back to Korea for three years for my military service after my PhD, but yeah. So, happy to meet you guys and talk about my research.

Deiss 2:09
Of course, and that's the first question I have. With any topic related to machine learning or information theory, even for someone who studied it at a fairly basic level in school, it can be hard to wrap your head around some of these concepts. So maybe, in layman's terms, can you describe some of your recent research to give our listeners a better sense of what you do here at UW-Madison?

Lee 2:32
Since I joined Madison, I have worked on three different research topics. The first one was: how much data do we need for reliable machine learning? There, I particularly studied the problem of recommendation, where we have data from clients or customers who provide ratings on different types of items. From that kind of partially observed data, if you want to make recommendations for a future service, we need to figure out how much data is required. So recommendation systems and algorithms were the first topic I worked on. The second topic is called trustworthy machine learning. By trustworthy machine learning, I mean that machine learning algorithms are, in most cases, not fair, or not robust, or not private: they can leak the private data that was used as training data. There are many issues like this, and people have started looking at how to solve them and make more robust, more fair, and more private algorithms. Those are the research topics I really liked working on in the last few years, and I still work on them. Recently, I have started working on another research topic called large models. Large models, I guess you must have heard about them, are models like GPT, diffusion models, CLIP. Those models are becoming more and more popular, but we are lacking theory on how they work. So that's what I've been studying recently.

Deiss 4:18
Yeah, so I just wanted to ask: I often hear, not necessarily in academic papers but in the media, about how some of these large models, especially convoluted, complicated neural networks or deep learning algorithms, are described as a black box, where the actual mechanics of what's going on inside, what the algorithm is doing with the data, is a little unclear from the outside. Whereas if you have a simple regression model, it's actually pretty easy to work out the math of what the algorithm is doing with the data. With a large model, is that the case? Can you describe a little bit of that black box problem that researchers have to deal with?

Lee 4:57
The black box aspect actually applies to a more general class, let's say deep learning as a whole; you can say they are kind of black boxes. I think that's half correct, half incorrect. Half incorrect in the sense that when we design those models, we have a particular goal: we want them to behave in a certain way. For instance, even if we call GPT mostly or largely black-box-ish, we still design the systems and algorithms such that they are good at predicting the next word. That's not something that just came out of the box; we designed it such that it predicts the next word well, and that's what we are seeing in ChatGPT and other GPT models. So in terms of the operation, or the final objective, they are doing what the people who designed them wanted them to do. It's less of a black box in that sense. However, how it actually works that well, I think that's the mysterious part: we couldn't expect how well it would work, but somehow it worked much better than what people expected. So explaining why that's the case is an interesting research question, and that's what makes it a little black-box-ish. What's also very interesting to me, when it comes to GPT and really large language models, is that there are more mysterious things happening. Going back to the first aspect, there are in fact some interesting behaviors that people didn't intend to design, things like in-context learning or few-shot learning. That's basically when you use GPT and provide a few examples to the model, and the model tries to learn some patterns from the examples provided, which is a little bit beyond what people used to expect from the model. So the model has some new properties or behaviors that we didn't design.

Deiss 7:00
Yes, and I want to get back to ChatGPT from another perspective in a little bit. But one thing I saw you were recently researching, and that came up in interviews, is the straggler problem in machine learning. As far as I know, it's where a certain part of the machine learning system, I don't know if node is the correct term, is so deficient that it brings down the performance of the whole system. Can you describe a little bit about what the straggler problem is and the research you're doing on it?

Lee 7:29
Yeah. So the straggler problem is a term for when you have a large cluster and the entire cluster is working on a particular task jointly. If one of the nodes or machines within the cluster starts performing badly, producing wrong outputs, or behaving more slowly than the others, then the entire system either gets wrong answers or becomes very slow. So the straggler problem basically means that you have a big system consisting of many workers, and when one or a few workers become very slow or erroneous, the entire system becomes bad. That's the phenomenon, or the problem. This problem was first observed in large data centers like Google or Facebook about a decade ago; they were reporting that a few stragglers made their entire data centers really slow and really bad in terms of performance. So we started working on how to fix these problems using more principled approaches like information and coding theory, which are very relevant to large-scale machine learning systems, because large-scale machine learning systems require cluster training, distributed training, that kind of stuff. So that's how it's connected to distributed machine learning.

Deiss 8:57
Very interesting stuff. I want to pivot away from your research for a little bit and talk about how I originally heard your name. Like I said in the beginning, I saw a New York Times article about a test prep service, and YJ Jang, who started Riiid, said he was inspired by you to use deep learning in his startup, in whatever software he was originally creating. What is your relationship with him, and how did you influence him to utilize deep learning?

Lee 9:25
Sure. He's a friend of mine. He texted me the link to the article, and I was really interested to see it. I met him about 10 years ago, when I was a student at Berkeley. He was also a student at Berkeley, but we didn't know each other. We both participated in a startup competition over a weekend, so we drove down to San Jose, where the startup competition was happening. I didn't know him, so I was finding some other folks there, and we created a demo and gave a pitch. We won second place; he won first place.

Deiss 10:09
Wow.

Lee 10:10
So I was talking to him: "Hey, where are you from?" And he said he was from Berkeley. "I'm from Berkeley too." So I got to know him from there. I knew he was a really good businessman back then. Then we came back to Berkeley, we started talking more and more, and we had some idea of starting a startup. We spent about six months developing business ideas and building some demos. It was also related to education, so it was slightly different from what they are working on now. But eventually we found that the business was really difficult to run, so we gave up. After that, he started his own business, and he started asking me, "Hey, I have this interesting problem, and I think machine learning could play a big role here." So he started sharing his business idea. That was the time when I was working on machine learning; in particular, I was working on recommendation systems. And I was able to find the connection between recommendation systems and the problem they are working on: students are spending so much time on test prep, and they waste so much time working on things they already know. Efficient test prep is no different from not wasting time watching something you don't like on Netflix. So that's the point where I started sharing this kind of idea with him. And in fact, deep learning was already being used for recommendation systems. So all these ideas I shared with him, and he made a great business out of it.

Deiss 11:54
Yes, definitely. Obviously, test prep services like this are one way in which machine learning and deep learning models could actually help educators. But in the media, it's all about ChatGPT; every day I hear some new news about ChatGPT. And I think there was actually a panel here at UW-Madison recently about students potentially using it to cheat on things that they didn't think you could cheat on before, like having it write your essay for you. As an educator, or someone connected to the education system here, do you think that these chatbots pose a threat to traditional methods of teaching?

Lee 12:32
In my opinion, I would say no. I don't see much difference from the moment we started having access to, say, calculators, or MATLAB, or Python. Those are things we still exercise in elementary school: we are supposed to do 12 plus 13 or 10 minus 5, and you're still doing it. Of course, kids can go home and use a calculator and cheat, but we don't care, because at some point, unless you're going to rely on all those machines and devices to do your entire work, you have to do it on your own sometimes. You also have to understand the principles behind those tasks. Essay writing, for instance, is the biggest issue right now with ChatGPT. You can always use ChatGPT without knowing anything about essay writing, and I think it's going to get way better this year. However, if you decide not to learn how to write essays, you end up not knowing something that's really important in your life. So eventually people will choose to learn it anyway, and not cheat. How to grade them fairly, that's the problem. Yeah, I think grading is the issue across education right now.

Deiss 14:01
Yes, that's kind of the thing. In my opinion, I thought a similar thing: if a student is really good and wants to improve, and wants that good grade on the final exam, whatever it is, they're going to learn what they need to learn. But when it comes to grading individual assignments, I feel like if something can write your essay for you, it throws the whole book out the window: how do I know how to grade things if I can't tell whether someone wrote this by themselves over three days or put it into a chatbot? Regardless of ChatGPT kind of taking over the media and the public discourse around machine learning, I often joke with my friends: if we think ChatGPT is cool, I don't know what Google has been cooking up in the back for 10 years. Who knows what's going to be here over the next decade? So in your opinion, are there other interesting developments in machine learning right now that people can expect to see, and if so, what do you think they are?

Lee 14:56
Yeah, but before we move on: I think Google also has a lot of interesting techniques and models; they are just slower in releasing and adopting them. So we'll see. I think the recent announcement of Bard is super interesting, and we'll get to see more and more coming like that. Anyway, talking about other interesting developments: other than large models, what also interests me is diffusion models. I guess most people have heard about something like DALL-E 2, where you provide a text prompt and it draws something for you. That was more or less a fun activity, because you couldn't do much with a text-to-image model. But the fundamental technique has been applied to many different domains, and now it's being used not just for images but for audio, music, and other things like 3D assets, and it's going wider and wider. We will probably see a moment where these things become really powerful and are used everywhere, basically. I don't think we'll need to draw any diagrams by hand. When you create a PowerPoint, you'll just need to type whatever you think it should look like, and it should be able to draw everything for you. And for any design problem, say web design or product design, things are going to be very different. Yeah.

Deiss 16:35
Yes. I guess just to wrap it up: people like to fearmonger about a lot of this stuff, saying it's going to destroy the job market and everyone's going to be automated away; that's just one thing I hear. But people do have concerns about the prevalence of machine learning that's emerging in our lives. Do you have any concerns about what's going on right now in the world of machine learning, or do you think people might be a little too pessimistic?

Lee 17:03
There are certainly, I will say, some jobs that are going to be less useful than now. That's clearly a concern. However, most jobs out there can benefit from these models and tools: their productivity will become better, and people can probably make more money if they know how to use these tools well. However, take concept artists or designers, for instance, talking about these diffusion models. At some point, these kinds of automated models could become really good, doing the job almost as well as what those people are doing right now. And that's the point where it's really tricky, because we're going to see two different markets. Right now, if you go to a pottery market, there are handmade potteries and factory-made potteries, and no one can distinguish them, to be honest. Yeah, handmade pottery is even more unique: it has slightly different coloring, and it actually has a little bit of defects, which makes handmade potteries look even more unique and beautiful than the factory-made ones. But back in the day, we used to appreciate factory-made pottery: no defects, completely symmetric, what humans couldn't make. I think we are going that way, because now models are going to be better at making perfect, flawless architectures and designs. And probably what we will do as human designers and artists is have a little bit of, I wouldn't call it flaws or defects, but we won't look like what machines can make. So maybe those two markets will emerge, and maybe those two markets will survive forever, like the pottery market. So I don't know, I can't predict what will happen, but I'm still optimistic.

Deiss 19:05
Awesome. I think that's a good way to end it off on a high note. Thank you for coming to talk with me today on the Badger Herald podcast; I'm excited to see what you do next in your research.

Lee 19:14
All right. Thank you. It was great talking to you.

Deiss 19:15
Thank you so much.

Follow this link:
Podcast: Machine Learning and Education The Badger Herald - The Badger Herald