Archive for the ‘Machine Learning’ Category

Investigation of the effectiveness of a classification method based on improved DAE feature extraction for hepatitis C … – Nature.com

In this subsection, we evaluate the feature extraction effect of the IDAE by conducting experiments on the hepatitis C dataset with different configurations to test its generalization ability. We would like to investigate the following three questions:

How effective is the IDAE in classifying hepatitis C?

As the depth of the neural network increases, can the IDAE mitigate exploding or vanishing gradients while still improving hepatitis C classification?

At the same depth, does the IDAE converge more easily than other autoencoders on the hepatitis C dataset?

Firstly, the study matters for public health: hepatitis C (HCV) is a global problem because chronic infection can lead to serious outcomes such as cirrhosis and liver cancer, and because the disease is highly insidious, leaving a large number of cases undiagnosed. It is worth noting that although traditional machine learning and deep learning algorithms are widely applied in healthcare, especially in research on conditions such as cancer, chronic infectious diseases such as hepatitis C still lack in-depth exploration. In addition, the complex biological attributes of the hepatitis C virus and the significant differences among patients together give rise to multilevel nonlinear correlations among features. Applying deep learning methods to the hepatitis C dataset is therefore both an important way to validate the efficacy of such algorithms and an urgent research direction for filling the existing gap.

The Helmholtz Center for Infection Research, the Institute of Clinical Chemistry at the Medical University of Hannover, and other research organizations provided the hepatitis C data used in this article. The collection includes demographic attributes, such as age, together with laboratory test results for blood donors and hepatitis C patients. Examining the dataset shows that the primary features are quantities of different blood components and liver-function measurements, and that gender is the only categorical feature. Table 1 gives the precise definition of these fields.

This paper investigates the classification problem. Table 2 lists the description and sample size of the five main classification labels. In the subsequent training, to address the effect of sample imbalance on classification, the data are first resampled with SMOTE32 and the model is then trained on the resampled data, with a sample size of 400 for each class.
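The excerpt does not show the resampling code, but the balancing step described maps naturally onto the imbalanced-learn library. The sketch below is a hedged reconstruction, not the authors' code: the synthetic data stand in for the real HCV table, and the two-stage resampling (SMOTE up, then trim down) is one assumed way to reach exactly 400 samples per class.

```python
from collections import Counter

import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

# Stand-in for the HCV feature table (12 numeric features, 5 imbalanced
# classes); replace with the real dataset.
X, y = make_classification(n_samples=600, n_features=12, n_informative=8,
                           n_classes=5, weights=[0.7, 0.1, 0.08, 0.07, 0.05],
                           random_state=0)

# Oversample every minority class up to the majority size, then trim each
# class down to the 400 samples per label reported in the paper.
X_over, y_over = SMOTE(random_state=0).fit_resample(X, y)
X_bal, y_bal = RandomUnderSampler(
    sampling_strategy={label: 400 for label in np.unique(y_over)},
    random_state=0).fit_resample(X_over, y_over)

print(Counter(y_bal))  # 400 samples in each of the five classes
```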

The aim of this paper is to investigate whether the IDAE can extract more representative and robust features. We chose baseline models spanning both traditional machine learning algorithms and several types of autoencoders, described in more detail below:

SVM: support vector machines classify data by constructing maximum-margin hyperplanes, using kernel functions to handle nonlinear problems; the goal is a decision boundary that maximizes the margin on the training data.

KNN: the K-nearest-neighbors algorithm assigns a class (or predicted value) to a new sample from its K nearest neighbors, found by computing its distance to every sample in the training set.

RF: random forests use random feature selection and bootstrap sampling to construct multiple decision trees and combine their predictions, handling classification and regression problems effectively.

AE: an autoencoder is a neural network consisting of an encoder and a decoder that learns a compact, low-dimensional feature representation by reconstructing its training data; it is mainly used for dimensionality reduction, feature extraction, and generative learning tasks.

DAE: a denoising autoencoder is an autoencoder variant that excels at extracting features from noisy inputs. By reconstructing inputs to which noise has been added, it reveals the underlying structure of the data, learns higher-level features, and improves network robustness; these robust features benefit downstream tasks and help generalization (a minimal sketch of this idea appears after this list).

SDAE: a stacked denoising autoencoder is a multilayer network of denoising autoencoder layers connected in series; each layer corrupts its input with noise during training and learns to reconstruct the undisturbed original features, extracting progressively more abstract and robust representations layer by layer.

DIUDA: the defining feature of the dual-input unsupervised denoising autoencoder is that it receives two different types of input data simultaneously; fusing the two inputs for joint feature learning further enhances generalization and the model's understanding of the data's intrinsic structure.
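The excerpt gives no code for these baselines, but the DAE idea is compact enough to sketch. The following is a minimal PyTorch illustration, not the authors' implementation: the layer widths (10, 8, 5 and 5, 8, 10) are borrowed from the experimental setup below, while the Gaussian corruption and the input dimension of 12 are assumptions.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Minimal DAE: corrupt the input, then reconstruct the clean version."""

    def __init__(self, n_features: int, noise_std: float = 0.1):
        super().__init__()
        self.noise_std = noise_std
        # Layer widths follow the experimental setup quoted below.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 10), nn.ReLU(),
            nn.Linear(10, 8), nn.ReLU(),
            nn.Linear(8, 5),
        )
        self.decoder = nn.Sequential(
            nn.Linear(5, 8), nn.ReLU(),
            nn.Linear(8, 10), nn.ReLU(),
            nn.Linear(10, n_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Add noise only during training; reconstruct from the corrupted input.
        noisy = x + self.noise_std * torch.randn_like(x) if self.training else x
        return self.decoder(self.encoder(noisy))

model = DenoisingAutoencoder(n_features=12)
x = torch.randn(32, 12)                     # dummy batch of 12 feature columns
loss = nn.functional.mse_loss(model(x), x)  # target is the *clean* input
```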

In this paper, 80% of the hepatitis C dataset is used for model training and the remaining 20% for testing. Since the samples are unbalanced, negative samples are replicated to balance them. For all autoencoder methods, the learning rate is initialized to 0.001; the encoder and decoder each have 3 layers, with 10, 8, and 5 neurons in the encoder and 5, 8, and 10 in the decoder; and the MLP is initialized with 3 layers of 10, 8, and 5 neurons respectively. All models are trained until convergence, with a maximum of 200 training epochs. The machine learning methods all use the sklearn library, with the default hyperparameters of the corresponding sklearn algorithms.
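A minimal sketch of the machine learning side of this protocol, continuing from the resampling example above: an 80/20 split and sklearn defaults, as the text specifies. The stratified split and fixed seed are assumptions added for reproducibility.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# 80/20 split as described; X_bal, y_bal are the balanced arrays from above.
X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.2, stratify=y_bal, random_state=0)

# Machine learning baselines with sklearn's default hyperparameters.
baselines = {
    "SVM": SVC(),                      # RBF kernel by default
    "KNN": KNeighborsClassifier(),     # k = 5 by default
    "RF":  RandomForestClassifier(),   # 100 trees by default
}
for name, clf in baselines.items():
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))
```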

To answer the first question, we classified the hepatitis C data after feature extraction with the improved denoising autoencoder, comparing against traditional machine learning algorithms (SVM, KNN, and random forest) and against AE, DAE, SDAE, and DIUDA as baseline models. Each experiment was conducted 3 times to mitigate randomness. The average results for each metric are shown in Table 3. From the table, we can make the following observations.

The left figure shows the 3D visualisation of t-SNE with features extracted by DAE, and the right figure shows the 3D visualisation of t-SNE with features extracted by IDAE.

Firstly, the IDAE shows significant improvement on the hepatitis C classification task compared to the machine learning algorithms, and outperforms almost all baseline models on every evaluation metric. These results validate the effectiveness of our proposed improved denoising autoencoder on the hepatitis C dataset. Secondly, compared to the traditional autoencoders AE, DAE, SDAE, and DIUDA, the IDAE improves accuracy by 0.011, 0.013, 0.010, and 0.007 respectively; AUC-ROC by 0.11, 0.10, 0.06, and 0.04; and F1 score by 0.13, 0.11, 0.042, and 0.032. From Fig. 5 it can be seen that the IDAE yields better clustering and clearer class boundaries in the 3D feature space. Both the experimental results and the visual analysis verify the advantages of the improved model in classification performance.

Finally, SVM and RF outperform KNN on the hepatitis C dataset because SVM can handle complex nonlinear relationships through radial basis function (RBF) kernels, and ensemble methods such as RF combine multiple weak learners to achieve nonlinear classification indirectly. KNN, by contrast, builds decision boundaries from linear measures such as Euclidean distance, which cannot effectively capture the essential patterns of complex nonlinear data distributions, leading to poorer classification results.

In summary, these results demonstrate the superiority of the improved denoising autoencoder for feature extraction on hepatitis C data. The machine learning results also indirectly confirm that the features of hepatitis C data do involve complex nonlinear relationships.

To answer the second question, we analyze in this subsection how the different autoencoder algorithms perform at different depths. To run the experiments in a constrained setting, we used a fixed learning rate of 0.001, kept the number of neurons per layer constant, and set the number of encoder and decoder layers to {1, 2, 3, 4, 5, 6}. Each experiment was performed 3 times; the averaged results are shown in Fig. 6, from which we make the following observations:

Effects of various types of autoencoders at different depths.

Under the different layer configurations, the IDAE proposed in this study shows significant advantages over the traditional AE, DAE, SDAE, and DIUDA in both feature extraction and classification performance. The experimental data show that the deeper the network, the greater the performance gain: when the encoder reaches 6 layers, the IDAE improves accuracy by 0.112, 0.103, 0.041, and 0.021 over AE, DAE, SDAE, and DIUDA respectively; AUC-ROC by 0.062, 0.042, 0.034, and 0.034; and F1 by 0.054, 0.051, 0.034, and 0.028.

It is worth noting that conventional autoencoders often encounter overfitting and vanishing gradients as the network deepens, so their performance on the hepatitis C classification task plateaus or even declines slightly; this is largely attributable to the excessive complexity and vanishing-gradient problems of an over-deep architecture, which keep the model from finding the optimal solution. The improved DAE introduces residual connections, which optimise the information flow between layers: directly connected paths address the vanishing-gradient problem of deep learning, and the depth and width of the network can be expanded flexibly to balance model complexity against generalisation ability. The experimental results show that, with an appropriate increase in network depth, the improved DAE further improves classification performance, mitigates the risk of overfitting at the same depth, and outperforms the other autoencoders on every metric.
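The excerpt names residual connections as the key modification but does not show the IDAE architecture, so the following is only an illustrative PyTorch sketch of a residual linear layer of the kind described; the projection shortcut for mismatched widths is an assumption, not a detail from the paper.

```python
import torch
import torch.nn as nn

class ResidualDenseLayer(nn.Module):
    """One encoder layer with a skip path, the kind of directly connected
    route the improved DAE is described as using to keep gradients flowing."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.transform = nn.Linear(in_dim, out_dim)
        # Project the input when widths differ so it can be added back.
        self.shortcut = (nn.Identity() if in_dim == out_dim
                         else nn.Linear(in_dim, out_dim, bias=False))
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The identity path lets the gradient bypass the nonlinearity.
        return self.act(self.transform(x) + self.shortcut(x))

layer = ResidualDenseLayer(10, 8)
out = layer(torch.randn(32, 10))  # gradients can flow through the shortcut
```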

To answer the third question, in this subsection we analyse the convergence speed of the different autoencoder algorithms. The experiments set the number of encoder and decoder layers to {3, 6}, with the same number of neurons in each layer; each experiment was run three times, and the averaged results are shown in Fig. 7, from which we draw the following conclusions. The IDAE again converges faster than the other autoencoders at both depths, and the contrast is more pronounced in the deeper setting. In a plain deep autoencoder, the chain rule leads to vanishing gradients and overfitting, so convergence slows as depth grows; the IDAE instead adds direct paths between layers through techniques such as residual connections, letting the signal bypass the nonlinear transforms of some layers and propagate directly to later layers. This design effectively mitigates vanishing gradients as network depth increases, allowing the network to maintain a healthy gradient flow during training and to converge quickly even when deep. In summary, when dealing with complex, high-dimensional data such as hepatitis C-related data, the IDAE can learn and extract features better as depth increases, improving training efficiency and overall performance.

Comparison of model convergence speed for different layers of autoencoders.

Excerpt from:
Investigation of the effectiveness of a classification method based on improved DAE feature extraction for hepatitis C ... - Nature.com

Machine Learning Uncovers New Ways to Kill Bacteria With Non-Antibiotic Drugs – ScienceAlert

Human history was forever changed with the discovery of antibiotics in 1928. Infectious diseases such as pneumonia, tuberculosis and sepsis were widespread and lethal until penicillin made them treatable.

Surgical procedures that once came with a high risk of infection became safer and more routine. Antibiotics marked a triumphant moment in science that transformed medical practice and saved countless lives.

But antibiotics have an inherent caveat: When overused, bacteria can evolve resistance to these drugs. The World Health Organization estimated that these superbugs caused 1.27 million deaths around the world in 2019 and will likely become an increasing threat to global public health in the coming years.

New discoveries are helping scientists face this challenge in innovative ways. Studies have found that nearly a quarter of drugs that aren't normally prescribed as antibiotics, such as medications used to treat cancer, diabetes and depression, can kill bacteria at doses typically prescribed for people.

Understanding the mechanisms underlying how certain drugs are toxic to bacteria may have far-reaching implications for medicine. If nonantibiotic drugs target bacteria in different ways from standard antibiotics, they could serve as leads in developing new antibiotics.

But if nonantibiotics kill bacteria in similar ways to known antibiotics, their prolonged use, such as in the treatment of chronic disease, might inadvertently promote antibiotic resistance.

In our recently published research, my colleagues and I developed a new machine learning method that not only identified how nonantibiotics kill bacteria but can also help find new bacterial targets for antibiotics.

Numerous scientists and physicians around the world are tackling the problem of drug resistance, including me and my colleagues in the Mitchell Lab at UMass Chan Medical School. We use the genetics of bacteria to study which mutations make bacteria more resistant or more sensitive to drugs.

When my team and I learned about the widespread antibacterial activity of nonantibiotics, we were consumed by the challenge it posed: figuring out how these drugs kill bacteria.

To answer this question, I used a genetic screening technique my colleagues recently developed to study how anticancer drugs target bacteria. This method identifies which specific genes and cellular processes change when bacteria mutate. Monitoring how these changes influence the survival of bacteria allows researchers to infer the mechanisms these drugs use to kill bacteria.

I collected and analyzed almost 2 million instances of toxicity between 200 drugs and thousands of mutant bacteria. Using a machine learning algorithm I developed to deduce similarities between different drugs, I grouped the drugs together in a network based on how they affected the mutant bacteria.
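The article does not publish the algorithm itself, so the following is only a speculative sketch of the general approach it describes: represent each drug by its toxicity profile across mutant strains, measure pairwise similarity, and group similar drugs together. The matrix shape, the cosine metric, and hierarchical clustering are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical toxicity matrix: one row per drug, one column per mutant
# strain; entries score how strongly the drug affected that mutant.
rng = np.random.default_rng(0)
toxicity = rng.normal(size=(200, 5000))

# Cosine distance between drug profiles: drugs that hit the same mutants
# in the same way land close together, mirroring the network hubs the
# article describes.
distances = pdist(toxicity, metric="cosine")
tree = linkage(distances, method="average")
groups = fcluster(tree, t=20, criterion="maxclust")  # e.g. 20 candidate hubs

print(groups[:10])  # cluster label for the first ten drugs
```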

My maps clearly showed that known antibiotics were tightly grouped together by their known classes of killing mechanisms. For example, all antibiotics that target the cell wall (the thick protective layer surrounding bacterial cells) were grouped together and well separated from antibiotics that interfere with bacteria's DNA replication.

Intriguingly, when I added nonantibiotic drugs to my analysis, they formed separate hubs from antibiotics. This indicates that nonantibiotic and antibiotic drugs have different ways of killing bacterial cells. While these groupings don't reveal how each drug specifically kills bacteria, they show that those clustered together likely work in similar ways.

The last piece of the puzzle (whether we could find new drug targets in bacteria to kill them) came from the research of my colleague Carmen Li.

She grew hundreds of generations of bacteria that were exposed to different nonantibiotic drugs normally prescribed to treat anxiety, parasite infections and cancer.

Sequencing the genomes of bacteria that evolved and adapted to the presence of these drugs allowed us to pinpoint the specific bacterial protein that triclabendazole (a drug used to treat parasite infections) targets to kill the bacteria. Importantly, current antibiotics don't typically target this protein.

Additionally, we found that two other nonantibiotics that used a similar mechanism as triclabendazole also target the same protein. This demonstrated the power of my drug similarity maps to identify drugs with similar killing mechanisms, even when that mechanism was yet unknown.

Our findings open multiple opportunities for researchers to study how nonantibiotic drugs work differently from standard antibiotics. Our method of mapping and testing drugs also has the potential to address a critical bottleneck in developing antibiotics.

Searching for new antibiotics typically involves sinking considerable resources into screening thousands of chemicals that kill bacteria and figuring out how they work. Most of these chemicals are found to work similarly to existing antibiotics and are discarded.

Our work shows that combining genetic screening with machine learning can help uncover the chemical needle in the haystack that can kill bacteria in ways researchers haven't used before.

There are different ways to kill bacteria we haven't exploited yet, and there are still roads we can take to fight the threat of bacterial infections and antibiotic resistance.

Mariana Noto Guillen, Ph.D. Candidate in Systems Biology, UMass Chan Medical School

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Read the original post:
Machine Learning Uncovers New Ways to Kill Bacteria With Non-Antibiotic Drugs - ScienceAlert

Imbalanced Learn: the Python library for rebuilding ML datasets – DataScientest

As mentioned earlier, one of the great advantages of Imbalanced Learn is its native integration with scikit-learn, a Python library commonly used for machine learning.

This integration makes it very easy for users to incorporate Imbalanced Learn's functionality into their learning pipelines, combining resampling techniques with scikit-learn estimators to build robust and balanced models.
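For example, imbalanced-learn's own Pipeline lets a resampler sit ahead of a scikit-learn estimator, so oversampling is applied only when the model is fit (and only to the training folds during cross-validation, never the test fold). A minimal sketch with synthetic data:

```python
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# A small imbalanced binary problem standing in for real data.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# Resampling happens inside the pipeline, so SMOTE never leaks into the
# evaluation folds.
model = Pipeline(steps=[
    ("smote", SMOTE(random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
print(cross_val_score(model, X, y, scoring="f1", cv=5).mean())
```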

In addition, it can be integrated with other machine learning frameworks and tools, such as TensorFlow and PyTorch, among other popular libraries.

This broad compatibility allows users to exploit its advanced resampling functionality in a variety of environments and architectures, offering greatly increased flexibility and adaptability.

Machine Learning researchers and engineers are able to apply advanced resampling techniques in areas such as computer vision, natural language processing, and other applications requiring deep neural network architectures.

With the move towards distributed architectures and edge computing environments, the integration of Imbalanced Learn into cloud and edge solutions has also become essential.

Compatible libraries and Kubernetes orchestration tools can facilitate the deployment and management of balanced models, enabling efficient scaling and real-time execution in diverse and dynamic environments.

More:
Imbalanced Learn: the Python library for rebuilding ML datasets - DataScientest

A secure approach to generative AI with AWS | Amazon Web Services – AWS Blog

Generative artificial intelligence (AI) is transforming the customer experience in industries across the globe. Customers are building generative AI applications using large language models (LLMs) and other foundation models (FMs), which enhance customer experiences, transform operations, improve employee productivity, and create new revenue channels.

FMs and the applications built around them represent extremely valuable investments for our customers. They're often used with highly sensitive business data, like personal data, compliance data, operational data, and financial information, to optimize the models' output. The biggest concern we hear from customers as they explore the advantages of generative AI is how to protect their highly sensitive data and investments. Because their data and model weights are incredibly valuable, customers require them to stay protected, secure, and private, whether that's from their own administrators' accounts, their customers, vulnerabilities in software running in their own environments, or even their cloud service provider having access.

At AWS, our top priority is safeguarding the security and confidentiality of our customers' workloads. We think about security across the three layers of our generative AI stack: the infrastructure for training LLMs and other FMs, the tools for building with LLMs and other FMs, and the applications that use FMs.

Each layer is important to making generative AI pervasive and transformative.

With the AWS Nitro System, we delivered a first-of-its-kind innovation on behalf of our customers. The Nitro System is an unparalleled computing backbone for AWS, with security and performance at its core. Its specialized hardware and associated firmware are designed to enforce restrictions so that nobody, including anyone in AWS, can access your workloads or data running on your Amazon Elastic Compute Cloud (Amazon EC2) instances. Customers have benefited from this confidentiality and isolation from AWS operators on all Nitro-based EC2 instances since 2017.

By design, there is no mechanism for any Amazon employee to access a Nitro EC2 instance that customers use to run their workloads, or to access data that customers send to a machine learning (ML) accelerator or GPU. This protection applies to all Nitro-based instances, including instances with ML accelerators like AWS Inferentia and AWS Trainium, and instances with GPUs like P4, P5, G5, and G6.

The Nitro System enables Elastic Fabric Adapter (EFA), which uses the AWS-built AWS Scalable Reliable Datagram (SRD) communication protocol for cloud-scale elastic and large-scale distributed training, enabling the only always-encrypted Remote Direct Memory Access (RDMA) capable network. All communication through EFA is encrypted with VPC encryption without incurring any performance penalty.

The design of the Nitro System has been validated by the NCC Group, an independent cybersecurity firm. AWS delivers a high level of protection for customer workloads, and we believe this is the level of security and confidentiality that customers should expect from their cloud provider. This level of protection is so critical that we've added it to our AWS Service Terms to provide an additional assurance to all of our customers.

From day one, AWS AI infrastructure and services have had built-in security and privacy features to give you control over your data. As customers move quickly to implement generative AI in their organizations, you need to know that your data is being handled securely across the AI lifecycle, including data preparation, training, and inferencing. The security of model weights (the parameters that a model learns during training that are critical for its ability to make predictions) is paramount to protecting your data and maintaining model integrity.

This is why it is critical for AWS to continue to innovate on behalf of our customers to raise the bar on security across each layer of the generative AI stack. To do this, we believe that you must have security and confidentiality built in across each layer of the generative AI stack. You need to be able to secure the infrastructure to train LLMs and other FMs, build securely with tools to run LLMs and other FMs, and run applications that use FMs with built-in security and privacy that you can trust.

At AWS, securing AI infrastructure refers to zero access to sensitive AI data, such as AI model weights and data processed with those models, by any unauthorized person, whether at the infrastructure operator or at the customer. It comprises three key principles: complete isolation of the AI data from the infrastructure operator, the ability for customers to isolate AI data from themselves, and protected communications between devices.

The Nitro System fulfills the first principle of Secure AI Infrastructure by isolating your AI data from AWS operators. The second principle provides you with a way to remove administrative access of your own users and software to your AI data. AWS not only offers you a way to achieve that, but we also made it straightforward and practical by investing in building an integrated solution between AWS Nitro Enclaves and AWS Key Management Service (AWS KMS). With Nitro Enclaves and AWS KMS, you can encrypt your sensitive AI data using keys that you own and control, store that data in a location of your choice, and securely transfer the encrypted data to an isolated compute environment for inferencing. Throughout this entire process, the sensitive AI data is encrypted and isolated from your own users and software on your EC2 instance, and AWS operators cannot access this data. Use cases that have benefited from this flow include running LLM inferencing in an enclave. Until today, Nitro Enclaves operate only in the CPU, limiting the potential for larger generative AI models and more complex processing.
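The AWS services named here are real, but the post shows no code; the sketch below is an assumed illustration of the envelope-encryption flow it describes, using boto3 and the cryptography package. The key alias, file name, and the enclave-side decryption step are placeholders, not details from the post.

```python
import os

import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")

# Ask KMS for a fresh data key under a key you own and control.
# "alias/my-ai-data-key" is a placeholder for your own KMS key.
key = kms.generate_data_key(KeyId="alias/my-ai-data-key", KeySpec="AES_256")

# Encrypt the sensitive payload locally with the plaintext data key.
model_weights = open("weights.bin", "rb").read()  # placeholder file
nonce = os.urandom(12)
ciphertext = AESGCM(key["Plaintext"]).encrypt(nonce, model_weights, None)

# Discard the plaintext key and store only the ciphertext plus the
# KMS-wrapped key; inside an enclave, kms.decrypt() (gated on the enclave's
# attestation) would recover the data key for inferencing.
wrapped_key = key["CiphertextBlob"]
```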

We announced our plans to extend this Nitro end-to-end encrypted flow to include first-class integration with ML accelerators and GPUs, fulfilling the third principle. You will be able to decrypt and load sensitive AI data into an ML accelerator for processing while providing isolation from your own operators and verified authenticity of the application used for processing the AI data. Through the Nitro System, you can cryptographically validate your applications to AWS KMS and decrypt data only when the necessary checks pass. This enhancement allows AWS to offer end-to-end encryption for your data as it flows through generative AI workloads.

We plan to offer this end-to-end encrypted flow in the upcoming AWS-designed Trainium2 as well as GPU instances based on NVIDIA's upcoming Blackwell architecture, which both offer secure communications between devices, the third principle of Secure AI Infrastructure. AWS and NVIDIA are collaborating closely to bring a joint solution to market, including the new NVIDIA Blackwell GPU platform, which couples NVIDIA's GB200 NVL72 solution with the Nitro System and EFA technologies to provide an industry-leading solution for securely building and deploying next-generation generative AI applications.

Today, tens of thousands of customers are using AWS to experiment and move transformative generative AI applications into production. Generative AI workloads contain highly valuable and sensitive data that needs this level of protection from your own operators as well as the cloud service provider. Customers using AWS Nitro-based EC2 instances have received this level of protection and isolation from AWS operators since 2017, when we launched our innovative Nitro System.

At AWS, we're continuing that innovation as we invest in building performant and accessible capabilities to make it practical for our customers to secure their generative AI workloads across the three layers of the generative AI stack, so that you can focus on what you do best: building and extending the uses of generative AI to more areas. Learn more here.

Anthony Liguori is an AWS VP and Distinguished Engineer for EC2

Colm MacCrthaigh is an AWS VP and Distinguished Engineer for EC2

Continued here:
A secure approach to generative AI with AWS | Amazon Web Services - AWS Blog

AI has a lot of terms. We’ve got a glossary for what you need to know – Quartz

Nvidia CEO Jensen Huang. Photo: Justin Sullivan (Getty Images)

Let's start with the basics for a refresher. Generative artificial intelligence is a category of AI that uses data to create original content. In contrast, classic AI could only offer predictions based on data inputs, not brand-new and unique answers. Generative AI achieves this with deep learning, a form of machine learning that uses artificial neural networks (software programs loosely resembling the human brain), so computers can perform human-like analysis.

Generative AI isn't grabbing answers out of thin air, though. It's generating answers based on the data it's trained on, which can include text, video, audio, and lines of code. Imagine, say, waking up from a coma, blindfolded, and all you can remember is 10 Wikipedia articles. All of your conversations with another person about what you know are based on those 10 Wikipedia articles. It's kind of like that, except generative AI uses millions of such articles and a whole lot more.

View post:
AI has a lot of terms. We've got a glossary for what you need to know - Quartz