Archive for the ‘Machine Learning’ Category

This AI Research Addresses the Problem of ‘Loss of Plasticity’ in … – MarkTechPost

Modern deep-learning algorithms are designed for problem settings where training occurs just once on a sizable data collection and never again. All of the early triumphs of deep learning in speech recognition and image classification employed such train-once settings. Replay buffers and batching were later added when deep learning was applied to reinforcement learning, making it extremely close to a train-once setting. Recent deep learning systems like GPT-3 and DALL-E were also trained on a large batch of data. The most popular approach in these situations has been to gather data continuously and then occasionally train a new network from scratch in a train-once configuration. Of course, in many applications, the data distribution varies over time, and training must continue in some manner. Modern deep-learning techniques, however, were developed with the train-once setting in mind.

In contrast, the continual learning problem setting focuses on continuously learning from fresh data. Continual learning is ideal for problems where the learning system must deal with a dynamic data stream. For instance, think of a robot that has to find its way around a house. In the train-once setting, the robot would have to be retrained from scratch every time the house's layout changed, or run the danger of being rendered useless. If the layout changed regularly, it would have to be retrained from scratch again and again. In the continual learning setting, on the other hand, the robot could simply learn from the new information and continuously adjust to the changes in the house. The importance of lifelong learning has grown in recent years, and more specialized conferences are being held to address it, such as the Conference on Lifelong Learning Agents (CoLLAs).
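The contrast with the train-once setting can be illustrated with a toy online learner. This is a minimal sketch, not the researchers' setup: the stream, the learning rate, and the names `online_lms` and `drifting_stream` are assumptions made for illustration. A single weight is updated one sample at a time while the target relationship changes halfway through the stream, so the learner has to adapt without being retrained from scratch.

```python
def online_lms(stream, lr=0.1):
    """Track a scalar weight w so that w * x approximates y, one sample at a time."""
    w = 0.0
    errors = []
    for x, y in stream:
        err = y - w * x          # prediction error on the fresh sample
        errors.append(abs(err))
        w += lr * err * x        # incremental update, no retraining from scratch
    return w, errors


def drifting_stream(n_per_phase=200):
    """Hypothetical data stream: the true weight jumps from 2.0 to -1.0 halfway."""
    xs = [0.5, 1.0, 1.5]
    for true_w in (2.0, -1.0):
        for i in range(n_per_phase):
            x = xs[i % 3]
            yield x, true_w * x


final_w, errors = online_lms(drifting_stream())
print(f"error right after the change: {errors[200]:.3f}")
print(f"error at the end of the stream: {errors[-1]:.6f}")
```

In this linear toy problem the learner recovers quickly after the change. The concern raised in the article is that deep networks trained with backpropagation may gradually lose this ability to adapt.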

The researchers focus on the continual learning setting in their paper. When exposed to fresh data, deep learning systems frequently lose most of what they have previously learned, a condition known as catastrophic forgetting. In other words, deep learning techniques do not retain stability in continual learning problems. Early neural networks were the first to demonstrate this behavior, in the late 1900s. Catastrophic forgetting has recently attracted fresh interest with the rise of deep learning, and several papers have been written about preserving stability in deep continual learning.

The capacity to continue learning from fresh material is distinct from catastrophic forgetting and perhaps more essential to continual learning. They call this capacity plasticity. Continual learning systems must maintain plasticity because it enables them to adjust to changes in their data streams. If their data stream changes, continual learning systems that lose plasticity may become worthless. The researchers focus on the problem of loss of plasticity in their paper. Earlier studies employed a configuration in which the network was first shown a collection of examples for a predetermined number of epochs, after which the training set was enlarged with new examples and the training cycle repeated for an additional number of epochs. After accounting for the number of epochs, they discovered that the error for the examples in the first training set was lower than for the later-added examples. These publications offered evidence that loss of plasticity is a common occurrence in deep learning and the backpropagation algorithm upon which it is based.

In one such configuration, new outputs, known as heads, were added to the network when a new task was presented, and the number of outputs increased as more tasks were encountered. Thus, the effects of interference from old heads were confounded with the consequences of plasticity loss. According to Chaudhry et al., the loss of plasticity was modest when old heads were removed at the beginning of a new task, indicating that the major cause of the loss of plasticity they saw was interference from old heads. Moreover, because earlier researchers employed only ten tasks, they could not measure the loss of plasticity that occurs when deep learning techniques are presented with a long sequence of tasks.

Although the findings in these publications suggest that deep learning systems lose some of their essential adaptability, no one has yet conclusively demonstrated loss of plasticity in continual learning. More evidence for the loss of plasticity in contemporary deep learning comes from the reinforcement learning field, where recent works have demonstrated a significant loss of plasticity. By demonstrating that early learning in reinforcement learning problems can have a negative impact on later learning, Nikishin et al. coined the term primacy bias.

Given that reinforcement learning is fundamentally continual as a consequence of changes in the policy, this result may be attributable to deep learning networks losing their plasticity in circumstances where learning is ongoing. Additionally, Lyle et al. demonstrated that some deep reinforcement learning agents may eventually lose their capacity to learn new skills. These are significant data points, but because of the intricacy of contemporary deep reinforcement learning, it isn't easy to draw any firm conclusions. Taken together, these studies, from the psychology literature around the turn of the century to more contemporary work in machine learning and reinforcement learning, show that deep learning systems lose plasticity but fall short of providing a complete explanation of the phenomenon. In this study, researchers from the Department of Computing Science, University of Alberta, and CIFAR AI Chair, Alberta Machine Intelligence Institute provide a more conclusive answer about plasticity loss in contemporary deep learning.

They demonstrate that deep learning approaches lose plasticity in continual supervised learning problems and that this plasticity loss can be severe. In a continual supervised learning problem using the ImageNet dataset and including hundreds of learning tasks, they first show that deep learning suffers from loss of plasticity. Using supervised learning tasks removes the complexity and related confounds that always arise in reinforcement learning. The hundreds of tasks also make it possible to determine the full extent of the loss of plasticity. They next establish the generality of deep learning's loss of plasticity over a wide variety of hyperparameters, optimizers, network sizes, and activation functions using two computationally less expensive problems (a variation of MNIST and the slowly changing regression problem). Having demonstrated the severity and generality of loss of plasticity in deep learning, they then seek a deeper understanding of its origins.


Aneesh Tickoo is a consulting intern at MarkTechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence from the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.


What Is Kernel In Machine Learning And How To Use It? – Dataconomy

The concept of a kernel in machine learning might initially sound perplexing, but it's a fundamental idea that underlies many powerful algorithms. Mathematical theorems support the working principle of the automated systems that make up a large part of our daily lives.

Kernels in machine learning serve as a bridge between linear and nonlinear transformations. They enable algorithms to work with data that doesn't exhibit linear separability in its original form. Think of kernels as mathematical functions that take in data points and output their relationships in a higher-dimensional space. This allows algorithms to uncover intricate patterns that would otherwise be overlooked.

So how can you use a kernel in machine learning for your own algorithm? Which type should you prefer? What do these choices change in your machine learning algorithm? Let's take a closer look.

At its core, a kernel is a function that computes the similarity between two data points. It quantifies how closely related these points are in the feature space. By applying a kernel function, we implicitly transform the data into a higher-dimensional space where it might become linearly separable, even if it wasn't in the original space.

There are several types of kernels, each tailored to specific scenarios:

The linear kernel is the simplest form of kernel in machine learning. It operates by calculating the dot product between two data points. In essence, it measures how aligned these points are in the feature space. This might sound straightforward, but its implications are powerful.

Imagine you have data points in a two-dimensional space. The linear kernel calculates the dot product of the feature values of these points. If the result is large and positive, the two feature vectors point in similar directions, which suggests the points are similar and may belong to the same class. If the result is small or negative, it suggests dissimilarity between the points.
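A minimal sketch of the linear kernel in plain Python, assuming points are given as equal-length sequences of numbers (the function name `linear_kernel` and the sample points are chosen here for illustration):

```python
def linear_kernel(x, z):
    """Linear kernel: the dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


# Two points with similar feature values give a large positive value.
aligned = linear_kernel([1.0, 2.0], [2.0, 3.0])     # 1*2 + 2*3 = 8.0

# Two points pointing in roughly opposite directions give a negative value.
opposed = linear_kernel([1.0, 2.0], [-2.0, -1.0])   # 1*(-2) + 2*(-1) = -4.0

print(aligned, opposed)
```

Note that the dot product also grows with the magnitude of the vectors, so it is a measure of alignment rather than a normalized similarity score.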

The linear kernel's magic lies in its ability to establish a linear decision boundary in the original feature space. It's effective when your data can be separated by a straight line. However, when data isn't linearly separable, that's where other kernels come into play.

The polynomial kernel in machine learning introduces a layer of complexity by applying polynomial transformations to the data points. It's designed to handle situations where a simple linear separation isn't sufficient.

Imagine you have a scatter plot of data points that can't be separated by a straight line. Applying a polynomial kernel might transform these points into a higher-dimensional space, introducing curvature. This transformation can create intricate decision boundaries that fit the data better.

For example, in a two-dimensional space, a polynomial kernel of degree 2 would generate new features like x^2, y^2, and xy. These new features can capture relationships that weren't evident in the original space. As a result, the algorithm can find a curved boundary that separates classes effectively.
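As a worked check of the degree-2 example, the sketch below assumes the homogeneous form of the polynomial kernel, k(x, z) = (x . z)^2, which is one common variant. It compares the kernel value against an explicit feature map containing the squared terms and the cross term. The function names are illustrative, and the sqrt(2) factor on the cross term is what makes the two computations agree exactly.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: (x . z) squared."""
    dot = sum(a * b for a, b in zip(x, z))
    return dot ** 2


def poly2_features(point):
    """Explicit degree-2 feature map for a 2-D point (x, y)."""
    x, y = point
    return [x * x, math.sqrt(2) * x * y, y * y]


p, q = (1.0, 2.0), (3.0, 4.0)

implicit = poly2_kernel(p, q)   # (1*3 + 2*4)^2 = 121
explicit = sum(a * b for a, b in zip(poly2_features(p), poly2_features(q)))

print(implicit, explicit)
```

Both routes give the same number, but the kernel never builds the new features explicitly. This is the sense in which the transformation is implicit.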

The Radial Basis Function (RBF) kernel in machine learning is one of the most widely used kernels in the training of algorithms. It capitalizes on the concept of similarity by creating a measure based on Gaussian distributions.

Imagine data points scattered in space. The RBF kernel computes the similarity between two points by treating them as centers of Gaussian distributions. If two points are close, their Gaussian distributions will overlap significantly, indicating high similarity. If they are far apart, the overlap will be minimal.
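A minimal sketch of the RBF kernel, assuming the usual form k(x, z) = exp(-gamma * ||x - z||^2). The width parameter `gamma` and its default value here are illustrative choices, not values from the article.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """RBF (Gaussian) kernel: similarity decays with squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


origin = (0.0, 0.0)
same = rbf_kernel(origin, origin)       # identical points: similarity is exactly 1
near = rbf_kernel(origin, (0.1, 0.0))   # close points: similarity just below 1
far = rbf_kernel(origin, (3.0, 0.0))    # distant points: similarity near 0

print(same, near, far)
```

A larger `gamma` makes the similarity fall off faster with distance, so each point influences a smaller neighborhood.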

This notion of similarity is powerful in capturing complex patterns in data. In cases where data points are related but not linearly separable, the RBF kernel can transform them into a space where they become more distinguishable.

The sigmoid kernel in machine learning serves a unique purpose: it's used for transforming data into a space where linear separation becomes feasible. This is particularly handy when you're dealing with data that can't be separated by a straight line in its original form.

Imagine data points that can't be divided into classes using a linear boundary. The sigmoid kernel comes to the rescue by mapping these points into a higher-dimensional space using a sigmoid function. In this transformed space, a linear boundary might be sufficient to separate the classes effectively.

The sigmoid kernel's transformation can be thought of as bending and shaping the data in a way that simplifies classification. However, it's important to note that while the sigmoid kernel can be useful, it is not as commonly employed as the linear, polynomial, or RBF kernels.
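A minimal sketch of the sigmoid kernel, assuming the common hyperbolic-tangent form k(x, z) = tanh(gamma * (x . z) + coef0). The parameter names `gamma` and `coef0` and their defaults are illustrative assumptions.

```python
import math


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    """Sigmoid kernel: a scaled and shifted dot product squashed by tanh."""
    dot = sum(a * b for a, b in zip(x, z))
    return math.tanh(gamma * dot + coef0)


aligned = sigmoid_kernel((1.0, 0.0), (1.0, 0.0))    # tanh(1), about 0.76
opposed = sigmoid_kernel((1.0, 0.0), (-1.0, 0.0))   # tanh(-1), about -0.76

print(aligned, opposed)
```

One reason this kernel tends to be used less often is that, unlike the linear and RBF kernels, it is not a valid positive semidefinite kernel for every choice of parameters.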

Kernels are the heart of many machine learning algorithms, allowing them to work with nonlinear and complex data. The linear kernel suits cases where a straight line can separate classes. The polynomial kernel adds complexity by introducing polynomial transformations. The RBF kernel measures similarity based on Gaussian distributions, excelling in capturing intricate patterns. Lastly, the sigmoid kernel transforms data to enable linear separation when it wasn't feasible before. By understanding these kernels, data scientists can choose the right tool to unlock patterns hidden within data, enhancing the accuracy and performance of their models.

Kernels, the unsung heroes of AI and machine learning, wield their transformative magic through algorithms like Support Vector Machines (SVM). This article takes you on a journey through the intricate dance of kernels and SVMs, revealing how they collaboratively tackle the conundrum of nonlinear data separation.

Support Vector Machines, a category of supervised learning algorithms, have garnered immense popularity for their prowess in classification and regression tasks. At their core, SVMs aim to find the optimal decision boundary that maximizes the margin between different classes in the data.

Traditionally, SVMs are employed in a linear setting, where a straight line can cleanly separate the data points into distinct classes. However, the real world isn't always so obliging, and data often exhibits complexities that defy a simple linear separation.

This is where kernels come into play, ushering SVMs into the realm of nonlinear data. Kernels provide SVMs with the ability to project the data into a higher-dimensional space where nonlinear relationships become more evident.

The transformation accomplished by kernels extends SVMs' capabilities beyond linear boundaries, allowing them to navigate complex data landscapes.

Let's walk through the process of using kernels with SVMs to harness their full potential.

Imagine you're working with data points on a two-dimensional plane. In a linearly separable scenario, a straight line can effectively divide the data into different classes. Here, a standard linear SVM suffices, and no kernel is needed.

However, not all data is amenable to linear separation. Consider a scenario where the data points are intertwined, making a linear boundary inadequate. This is where kernels step in to save the day.

You have a variety of kernels at your disposal, each suited for specific situations. Let's take the Radial Basis Function (RBF) kernel as an example. This kernel calculates the similarity between data points based on Gaussian distributions.

By applying the RBF kernel, you transform the data into a higher-dimensional space where previously hidden relationships are revealed.

In this higher-dimensional space, SVMs can now establish a linear decision boundary that effectively separates the classes. What's remarkable is that this linear boundary in the transformed space corresponds to a nonlinear boundary in the original data space. It's like bending and molding reality to fit your needs.
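The idea of a linear rule in the transformed space acting as a nonlinear rule in the original space can be sketched without any library. Note the substitution: the example below uses a kernel perceptron, a simpler relative of the SVM that does not maximize the margin, so it illustrates the kernel trick rather than SVM training itself. The XOR-style data, the `gamma` value, and the function names are assumptions made for illustration.

```python
import math


def rbf_kernel(x, z, gamma=2.0):
    """RBF kernel used as the similarity measure."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_score(x, X, y, alpha, kernel):
    """Weighted vote of the training points, measured through the kernel."""
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y))


def train_kernel_perceptron(X, y, kernel, max_epochs=100):
    """Kernel perceptron: count the mistakes made on each training point."""
    alpha = [0] * len(X)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_score(xi, X, y, alpha, kernel) <= 0:
                alpha[i] += 1    # a misclassified point gets more weight
                mistakes += 1
        if mistakes == 0:        # every training point is classified correctly
            break
    return alpha


# XOR-style data: no straight line separates the two classes.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [-1, -1, 1, 1]

alpha = train_kernel_perceptron(X, y, rbf_kernel)
predictions = [1 if decision_score(x, X, y, alpha, rbf_kernel) > 0 else -1 for x in X]
print(predictions)
```

With a library such as scikit-learn, the analogous step would be fitting an SVM classifier configured with an RBF kernel. The sketch above only aims to show why replacing dot products with kernel evaluations yields a curved boundary in the original space.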

Kernels bring more than just visual elegance to the table. They enhance SVMs in several crucial ways:

Handling complexity: Kernels enable SVMs to handle data that defies linear separation. This is invaluable in real-world scenarios where data rarely conforms to simplistic structures.

Unleashing insights: By projecting data into higher-dimensional spaces, kernels can unveil intricate relationships and patterns that were previously hidden. This leads to more accurate and robust models.

Flexible decision boundaries: Kernels grant the flexibility to create complex decision boundaries, accommodating the nuances of the data distribution. This flexibility allows for capturing even the most intricate class divisions.

Kernels in machine learning are like hidden gems. They unveil the latent potential of data by revealing intricate relationships that may not be apparent in its original form. By enabling algorithms to perform nonlinear transformations effortlessly, kernels elevate the capabilities of machine learning models.

Understanding kernels empowers data scientists to tackle complex problems across domains, driving innovation and progress in the field. As we journey further into machine learning, let's remember that kernels are the key to unlocking hidden patterns and unraveling the mysteries within data.


Reddit Expands Machine Learning Tools To Help Advertisers Find … – B&T

Reddit has introduced Keyword Suggestions, a tool for advertisers that applies machine learning to help expand their keyword lists, recommending relevant and targetable keywords while filtering out keywords that aren't brand suitable.

The new system is available via the Reddit Ads Manager and ranks each suggestion by monthly views, and opens up an expanded list of relevant targeting possibilities to increase the reach and efficiency of campaigns.

The tool is powered by advanced machine learning and natural language processing to find the most relevant terms.

This technology takes the original context of each keyword into consideration so that only those existing in a brand-safe and suitable environment are served to advertisers.

In practice, this means machine learning is doing the heavy lifting, pulling from the Reddit posts and conversations that best match each advertiser's specific needs. Most importantly, this allows advertisers to show the most relevant ads to the Reddit users who will be most interested in them.

"The promise and potential of artificial intelligence, while exciting, has also elevated the value of real, human interactions and interests for both consumers and marketers. As we enter a new chapter in our industry and evolve beyond traditional signals, interest-based, contextually relevant targeting will be the most effective way to reach people where they're most engaged," said Jim Squires, Reddit's EVP of business marketing and growth.

"Powered by Reddit's vast community of communities, which are segmented by interest and populated with highly engaged discussions, Keyword Suggestions leverages the richness of conversation on Reddit and provides advertisers with recommendations to easily and effectively target relevant audiences on our platform."

The platform has also boosted its interest-based targeting tools with twice the number of categories available for targeting.

"Reddit's continued focus on enhancing their targeting products via machine learning will certainly help advertisers reach more of their target audience and discover new audiences on the platform. Additionally, implementing negative keyword targeting strategies overall increases relevancy and improves performance," said GroupM vice president and global head of social, Amanda Grant.

"Given the rich nature of conversations on the Reddit platform, we expect improved business outcomes as we tap into these tools to refine our focus on the right audience."


Seattle startup that helps companies protect their AI and machine learning code raises $35M – GeekWire

From left: Protect AI CEO Ian Swanson, CTO Badar Ahmed, and president Daryan Dehghanpisheh. (Protect AI Photo)

Seattle cybersecurity startup Protect AI landed $35 million to boost the rollout of its platform that helps enterprises shore up their machine learning code.

Protect AI sells software that allows companies to monitor the various layers and components of machine learning systems, detecting potential violations and logging information on those attacks. It primarily sells to large enterprises in regulated industries including finance, healthcare, life sciences, energy, government, and tech.

The fresh funding comes as AI has become a focal point for many enterprise-level executives, who are mandated to deploy the tech alongside their product suites, CEO Ian Swanson told GeekWire. This rapid adoption comes with elevated risks, he said.

"[AI] is flying down the highway right now," he said. "For a lot of organizations, that can't be stopped. So we need to make sure that we can maintain and understand it."

A KPMG survey found that only 6% of organizations have a dedicated team in place for evaluating risk and implementing risk mitigation strategies as part of their overall generative AI strategy.

At the same time, companies of all sizes are facing an increasing number of cyber threats, pressuring execs to invest heavily in their security systems. McKinsey and Co. predicts businesses will spend more than $100 billion on related services by 2025.

Protect AI's flagship product, AI Radar, creates a machine learning bill of materials to track a company's software supply chain components: operations tools, platforms, models, data, services, and cloud infrastructure. Swanson compares it to regular automotive maintenance and inspection, where tires and brakes need constant checks, along with ensuring the right fuel is used.

"We really have to understand the ingredients and the recipe of all this," he said.

A hacker gaining access to a company's machine learning system can steal intellectual property or inject malicious code, Swanson said. For instance, Protect AI found a vulnerability in MLflow, a popular machine learning lifecycle platform used by Walmart, Time Warner, Prudential, and other large companies.

The startup presented its findings in March, pressuring MLflow to update its platform within a few weeks. The flaw, left unpatched, would have allowed unauthenticated hackers to read any file accessible on a user's MLflow server and potentially inject code.

Protect AI's first product was NB Defense, an open-source app that works to address vulnerabilities in the development platform Jupyter Notebooks. Protect AI's tools work in Google Cloud, Oracle Cloud, Microsoft Azure and Amazon Web Services.

In the AI cybersecurity space, there are several well-funded startups.

Swanson said Protect AI tracks the entire machine learning supply chain, from the original inputted training sets to the ongoing use of the model.

This is Swanson's third startup. His first company was Sometrics, a virtual currency platform and in-game payments provider. It was acquired by American Express in 2011. After that, he founded DataScience.com, a cloud workspace platform that was acquired by Oracle in 2018. Swanson also held AI leadership roles at AWS and Oracle.

Swanson is joined by Badar Ahmed, a former engineering leader at Oracle and DataScience, and Daryan Dehghanpisheh, a former leader at AWS. The company has 25 employees, up from 15 when the company raised its $13.5 million seed round in December.

The Series A round was led by Evolution Equity Partners, with participation from Salesforce Ventures and existing investors Acrew Capital, Boldstart Ventures, Knollwood Capital, and Pelion Ventures. The startup has raised a total of $48.5 million to date.


Making machine learning accessible to all @theU – @theU

"Many call this the age of information," said Rajive Ganguli, the Malcolm McKinnon Professor of Mining Engineering at the University of Utah. "It is perhaps more accurate to call it the age of data since not everyone has the ability to truly gain from all the data they collect. Many are either lost in the data or misled by it. Yet, the promise of being informed by data remains."

Ganguli, who is also the College of Mines and Earth Sciences associate dean, is launching UteAnalytics, a free analytics software which makes artificial intelligence (AI) or machine learning (ML) accessible to all.

Founder of the ai.sys group at the U, Ganguli said that as long as a client knows their data, they can use UteAnalytics to better understand the problems they are trying to solve. The research group's mission is to seek insight from data, model systems and develop computational tools for education and research.

At various points in time, Ganguli has developed ML tools that his students could use in class. Years ago, it occurred to him that more people could benefit from ML if only his workflow and tools were more user-friendly. Graduate student Lewis Oduro brought his vision to fruition by leveraging the numerous public domain ML tools available to programmers and converting them into Windows-based software.

"The tool is problem agnostic," Ganguli said. "Hence it can have a broad group of users. I have used it for a variety of projects I am involved in, including mining, atmospheric sciences/air quality and COVID/hospital admissions."


Lewis Oduro (right) and Rajive Ganguli (left).

He reports that tens of subject matter experts (SMEs) who are non-coders have already subscribed to receive the software in advance of its formal release. Many are professionals across a broad spectrum of fields from social science to business, along with scientists and engineers.

Designed to empower the domain expert, UteAnalytics allows a client to clean their data and conduct exploratory data analysis in various ways. The software also allows users to estimate the effect of each input on the output, as well as develop models in advance of predicting on a new dataset.

Daniel Mendoza, who holds faculty appointments in the Department of Atmospheric Sciences and elsewhere at the U, is an early adopter of the software. Through his work with air quality monitors on UTA trains and electric buses in Salt Lake Valley, he and his team have successfully collected more than 8 years of data for particulate matter and ozone levels, and recently, for nitrogen oxides.

"When we look at neighborhood-specific data we can drill in and really see some social justice impacts," Mendoza reported last year. Today, he is using UteAnalytics to "quickly and efficiently analyze the temperature data that we'll be collecting in real-time from our mobile and stationary sensors. UA gives researchers the power to look at data in a very streamlined way without endless hours of coding. The included tools facilitate a thorough interpretation of data and save time without compromising reliability."

The difference that data, assisted by UteAnalytics tools, make in Mendoza's work on air quality is most recently seen in the Urban Heat Watch campaign, involving citizen scientists who are helping collect data along the streets of Salt Lake Valley. As one of the top three urban heat islands in the nation, the Salt Lake City metropolitan area features a groundbreaking monitoring program: nowhere else in the world does an initiative exist at the density and scale found in Utah's capital city and environs.

UteAnalytics is just the latest deliverable for Ganguli, who has led approximately $13 million in projects as principal investigator. He is currently involved in several projects in five different countries (U.S., Denmark/Greenland, Mongolia, Saudi Arabia and Mexico) on topics ranging from ML to training.

Meanwhile, graduate student Lewis Oduro, who defended his thesis this past spring, has since taken a job near Phoenix, Arizona as a mining engineer at Freeport-McMoRan, a leading international mining company. A native of Ghana, Oduro said of his mentor, "He gave me the chance to work under him and provided me with the kind of relationship only evident between a father and a son."

Under Ganguli's tutelage and support, Oduro was the principal player in building UteAnalytics as desktop software used for data analytics and building predictive ML models.

"I will forever be indebted to him and to the entire faculty at the University of Utah's Mining Engineering Department," the young scientist said on his LinkedIn page.
