Archive for the ‘Machine Learning’ Category

Learning grammars of molecules to build them in the lab – The Hindu

Researchers generate molecular structures using machine learning algorithms, trained on smaller datasets

We think of molecules as occurring in nature. Large macromolecules lead us to the basis of life. The twentieth century gave us new materials synthesised in the lab. We can now have designer molecules, where we formulate a wish list of properties for a material (say, a desired tensile strength as well as flexibility) and seek not merely to discover, but also to construct, molecules that exhibit such properties. Generating molecules computationally involves the use of Artificial Intelligence (AI) and machine learning algorithms that require large datasets to train on. Moreover, the molecules thus designed may be hard to synthesise. So, the challenge is to circumvent these shortfalls.

Now, researchers from the Massachusetts Institute of Technology (MIT) and International Business Machines (IBM) have together devised a method for generating molecules computationally that combines the power of machine learning with what are called graph grammars. This approach requires much smaller datasets (for example, about 100 training samples in place of 81,000, as the researchers mention) and builds up molecules from the bottom up. The group has demonstrated this method on the naphthalene diisocyanate molecule in a paper that has been reviewed and accepted for presentation at the International Conference on Learning Representations (ICLR 2022).

Artificial intelligence (AI) techniques, especially the use of machine learning algorithms, are in vogue today to find new molecular structures. These methods require tens of thousands of samples to train the neural networks. Also, the designed molecules may not be physically synthesisable. Ensuring synthesisability in these methods may need the incorporation of chemical knowledge, and extracting such knowledge from datasets is a significant challenge.

Chemical datasets with the required properties may contain very few samples. For instance, some researchers reported in 2019 that datasets for polyurethane property prediction have as few as 20 samples.

If we surmount all these challenges, there is a further problem with typical machine learning algorithms, which is that we cannot explain their results. That is, after discovering a molecule, we cannot figure out how we came up with it. The implication is that if we slightly change the desired properties, we may need to search all over again. Explainable AI is considered one of the grand challenges of contemporary AI research.

One alternative to such deep learning methods is the use of formal grammars. Grammar, in the context of languages, provides rules for how sentences can be constructed from words. We can design chemical grammars that specify rules for constructing molecules from atoms. In the last few years, several research teams have built such grammars. While this approach is promising, it calls for extensive expertise in chemistry, and once the grammar is built, it is hard to incorporate properties from datasets or to optimise.

Here, the researchers use mathematical objects called graph grammars for this purpose.

What mathematicians call graphs are networks or webs with nodes and edges between them. In this approach, a molecule is represented as a graph where the nodes are strings of atoms and edges are chemical bonds. A grammar for such structures tells us how to replace a string in a node with a whole molecular structure. Thus, parsing a structure means contracting some substructure; we keep doing this repeatedly until we get a single node.
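
To make the contraction step concrete, here is a minimal sketch of one parsing step on a toy graph, using the networkx library. This is not the researchers' implementation; the rule, labels and toy molecule are hypothetical.

```python
# A minimal sketch (not the paper's code) of parsing a molecular graph by
# contracting a substructure into a single labelled node.
import networkx as nx

def contract(graph: nx.Graph, nodes: set, new_label: str) -> nx.Graph:
    """Replace a connected substructure with a single labelled node."""
    g = graph.copy()
    # Re-attach every edge that leaves the substructure to the new node.
    neighbours = {m for n in nodes for m in g.neighbors(n) if m not in nodes}
    g.remove_nodes_from(nodes)
    g.add_node(new_label)
    g.add_edges_from((new_label, m) for m in neighbours)
    return g

# Toy molecule: a three-carbon ring bonded to an oxygen side group.
mol = nx.Graph([("C1", "C2"), ("C2", "C3"), ("C3", "C1"), ("C1", "O")])
mol = contract(mol, {"C1", "C2", "C3"}, "ring")  # one parsing step
print(list(mol.edges()))                         # [('ring', 'O')]
```

Repeating such contractions until one node remains is what the article calls parsing.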

The model uses machine learning techniques to learn graph grammars from datasets. The algorithm takes as input a set of molecular structures and a set of evaluation metrics (for example, synthesisability).

The grammar is constructed bottom-up, creating rules by contractions; the choice of which structures to contract is made by the learning component, a neural network that builds on the chemical information. The algorithm performs multiple randomised searches in parallel to obtain several candidate grammars, which it then evaluates using the input metrics.
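
A hedged sketch of that search loop follows, under stated assumptions: `propose_contractions` (the neural-network-guided candidate generator) and `score_grammar` (the metric-based evaluator) are hypothetical placeholders, and `contract` is the helper from the sketch above.

```python
# Run several randomised parsing runs over the input molecules, collect the
# contraction rules each run produces, and keep the best-scoring grammar.
import random

def learn_grammar(molecule_graphs, propose_contractions, score_grammar,
                  n_runs=10, seed=0):
    rng = random.Random(seed)
    best_rules, best_score = None, float("-inf")
    for _ in range(n_runs):                      # multiple randomised searches
        rules = []
        for graph in molecule_graphs:
            graph = graph.copy()
            while graph.number_of_nodes() > 1:   # parse down to a single node
                candidates = propose_contractions(graph)  # ranked by the neural net
                nodes, label = rng.choice(candidates)     # randomised choice
                rules.append((frozenset(nodes), label))
                graph = contract(graph, nodes, label)
        score = score_grammar(rules)             # e.g. a synthesisability metric
        if score > best_score:
            best_rules, best_score = rules, score
    return best_rules
```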

While the method has been demonstrated for building molecules, its applications could be far-reaching, extending beyond chemistry.

(The writer is a computer scientist, formerly with The Institute of Mathematical Sciences, Chennai, and currently visiting professor at Azim Premji University, Bengaluru.)

Read the original:
Learning grammars of molecules to build them in the lab - The Hindu

Does this artificial intelligence think like a human? – MIT News

In machine learning, understanding why a model makes certain decisions is often just as important as whether those decisions are correct. For instance, a machine-learning model might correctly predict that a skin lesion is cancerous, but it could have done so using an unrelated blip on a clinical photo.

While tools exist to help experts make sense of a model's reasoning, often these methods only provide insights on one decision at a time, and each must be manually evaluated. Models are commonly trained using millions of data inputs, making it almost impossible for a human to evaluate enough decisions to identify patterns.

Now, researchers at MIT and IBM Research have created a method that enables a user to aggregate, sort, and rank these individual explanations to rapidly analyze a machine-learning model's behavior. Their technique, called Shared Interest, incorporates quantifiable metrics that compare how well a model's reasoning matches that of a human.

Shared Interest could help a user easily uncover concerning trends in a model's decision-making; for example, perhaps the model often becomes confused by distracting, irrelevant features, like background objects in photos. Aggregating these insights could help the user quickly and quantitatively determine whether a model is trustworthy and ready to be deployed in a real-world situation.

"In developing Shared Interest, our goal is to be able to scale up this analysis process so that you could understand on a more global level what your model's behavior is," says lead author Angie Boggust, a graduate student in the Visualization Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL).

Boggust wrote the paper with her advisor, Arvind Satyanarayan, an assistant professor of computer science who leads the Visualization Group, as well as Benjamin Hoover and senior author Hendrik Strobelt, both of IBM Research. The paper will be presented at the Conference on Human Factors in Computing Systems.

Boggust began working on this project during a summer internship at IBM, under the mentorship of Strobelt. After returning to MIT, Boggust and Satyanarayan expanded on the project and continued the collaboration with Strobelt and Hoover, who helped deploy the case studies that show how the technique could be used in practice.

Human-AI alignment

Shared Interest leverages popular techniques that show how a machine-learning model made a specific decision, known as saliency methods. If the model is classifying images, saliency methods highlight areas of an image that are important to the model when it made its decision. These areas are visualized as a type of heatmap, called a saliency map, that is often overlaid on the original image. If the model classified the image as a dog, and the dog's head is highlighted, that means those pixels were important to the model when it decided the image contains a dog.
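
For readers unfamiliar with saliency methods, here is a minimal gradient-based saliency sketch in PyTorch; it is one common technique, not necessarily the one used in the paper, and `model` and the input tensor are placeholders.

```python
# Per-pixel importance via the gradient of the top-class score, a simple
# saliency method; the resulting H x W map can be overlaid on the image.
import torch

def saliency_map(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    model.eval()
    image = image.clone().requires_grad_(True)  # track gradients w.r.t. pixels
    logits = model(image.unsqueeze(0))          # add a batch dimension
    logits[0, logits.argmax()].backward()       # gradient of the top-class score
    return image.grad.abs().max(dim=0).values   # collapse channels to an H x W map
```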

Shared Interest works by comparing saliency methods to ground-truth data. In an image dataset, ground-truth data are typically human-generated annotations, such as a box drawn around the relevant parts of each image. In the previous example, the box would surround the entire dog in the photo. When evaluating an image classification model, Shared Interest compares the model-generated saliency data and the human-generated ground-truth data for the same image to see how well they align.

The technique uses several metrics to quantify that alignment (or misalignment) and then sorts a particular decision into one of eight categories. The categories run the gamut from perfectly human-aligned (the model makes a correct prediction and the highlighted area in the saliency map is identical to the human-generated box) to completely distracted (the model makes an incorrect prediction and does not use any image features found in the human-generated box).
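
The paper's exact metrics and category definitions are not reproduced here, but the following sketch shows the general idea: quantify the overlap between a binarised saliency mask and the human-drawn ground-truth mask, then bucket the decision. The thresholds and category names are illustrative assumptions.

```python
# Score the saliency/ground-truth overlap and sort a decision into a bucket.
import numpy as np

def alignment_category(saliency: np.ndarray, truth: np.ndarray,
                       prediction_correct: bool) -> str:
    s, t = saliency.astype(bool), truth.astype(bool)
    iou = (s & t).sum() / max((s | t).sum(), 1)       # overlap of the two regions
    truth_coverage = (s & t).sum() / max(t.sum(), 1)  # fraction of the box used
    if prediction_correct and iou > 0.9:
        return "human-aligned"   # right answer, essentially the same evidence
    if not prediction_correct and truth_coverage == 0:
        return "distracted"      # wrong answer, none of the human-marked evidence
    return "partially aligned"   # the intermediate cases
```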

"On one end of the spectrum, your model made the decision for the exact same reason a human did, and on the other end of the spectrum, your model and the human are making this decision for totally different reasons. By quantifying that for all the images in your dataset, you can use that quantification to sort through them," Boggust explains.

The technique works similarly with text-based data, where key words are highlighted instead of image regions.

Rapid analysis

The researchers used three case studies to show how Shared Interest could be useful to both nonexperts and machine-learning researchers.

In the first case study, they used Shared Interest to help a dermatologist determine if he should trust a machine-learning model designed to help diagnose cancer from photos of skin lesions. Shared Interest enabled the dermatologist to quickly see examples of the model's correct and incorrect predictions. Ultimately, the dermatologist decided he could not trust the model because it made too many predictions based on image artifacts, rather than actual lesions.

"The value here is that using Shared Interest, we are able to see these patterns emerge in our model's behavior. In about half an hour, the dermatologist was able to make a confident decision of whether or not to trust the model and whether or not to deploy it," Boggust says.

In the second case study, they worked with a machine-learning researcher to show how Shared Interest can evaluate a particular saliency method by revealing previously unknown pitfalls in the model. Their technique enabled the researcher to analyze thousands of correct and incorrect decisions in a fraction of the time required by typical manual methods.

In the third case study, they used Shared Interest to dive deeper into a specific image classification example. By manipulating the ground-truth area of the image, they were able to conduct a what-if analysis to see which image features were most important for particular predictions.

The researchers were impressed by how well Shared Interest performed in these case studies, but Boggust cautions that the technique is only as good as the saliency methods it is based upon. If those techniques contain bias or are inaccurate, then Shared Interest will inherit those limitations.

In the future, the researchers want to apply Shared Interest to different types of data, particularly tabular data, which is used in medical records. They also want to use Shared Interest to help improve current saliency techniques. Boggust hopes this research inspires more work that seeks to quantify machine-learning model behavior in ways that make sense to humans.

This work is funded, in part, by the MIT-IBM Watson AI Lab, the United States Air Force Research Laboratory, and the United States Air Force Artificial Intelligence Accelerator.

See more here:
Does this artificial intelligence think like a human? - MIT News

Stanford center uses AI and machine learning to expand data on women’s and children’s health, director says – The Stanford Daily

Stanford's Center for Artificial Intelligence in Medicine and Imaging (AIMI) is increasing engagement around the use of artificial intelligence (AI) and machine learning to build a better understanding of data on women's and children's health, according to AIMI Director and radiology professor Curt Langlotz.

Langlotz explained that, while AIMI initially focused on applying AI to medical imaging, it has since expanded its focus to applications of AI for other types of data, such as electronic health records.

"Specifically, the center conducts interdisciplinary machine learning research that optimizes how data of all forms are used to promote health," Langlotz said during a Monday event hosted by the Maternal and Child Health Research Institute (MCHRI). "And that interdisciplinary flavor is in our DNA."

The center now has over 140 affiliated faculty across 20 departments, primarily housed in the engineering department and the school of medicine at Stanford, according to Langlotz.

AIMI has four main pillars: building an infrastructure for data science research, facilitating interdisciplinary collaborations, engaging the community and providing funding.

The center provides funding predominantly through a series of grant programs. Langlotz noted that the center awarded seven $75,000 grants in 2019 to fund mostly imaging projects, but it has since diversified funding toward projects investigating other forms of data, such as electronic health records. AIMI also collaborated with the Institute for Human-Centered Artificial Intelligence (HAI) in 2021 to give out six $200,000 grants, he added.

Outside of funding, AIMI hosts a virtual symposium on technology and health annually and has a health-policy committee that informs policymakers on the intersection between AI and healthcare. Furthermore, the center pairs industry partners with laboratories to work on larger research projects of mutual interest as part of the only industry affiliate program for the school of medicine, Langlotz added.

"Industry often has expertise that we don't, so they may have expertise on bringing products to markets as they may know what customers are looking for," Langlotz said. "And if we're building these kinds of algorithms, we really would like them to ultimately reach patients."

Heike Daldrup-Link, a professor of radiology and pediatrics, and Alison Callahan, a research scientist at the Center for Biomedical Informatics, shared their research funded by the AIMI Center that rests at the intersection of computer science and medicine.

Daldrup-Link's research involves analyzing children's responses to lymphoma cancer therapy with a model that examines tumor sites using positron emission tomography (PET) scans. These scans reveal the metabolic processes occurring within tissues and organs, according to Daldrup-Link. The scans also serve as a good source for building algorithms because there are at least 270,000 scans per year from lymphoma patients, resulting in a large amount of available data.

Callahan is building AI models to extract information from electronic health records to learn more about pregnancy and postnatal health outcomes. She explained that much of the health data available from records is currently unstructured, meaning it does not conform to a database or simple model. Still, AI methods "can really shine in extracting valuable information from unstructured content like clinical texts or notes," she said.
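
As a toy illustration of what extracting structured information from unstructured text means (the Stanford models themselves are far more sophisticated), a single rule can already turn a free-text clinical phrase into a structured field; the note and pattern below are invented examples.

```python
# Pull a gestational-age field out of an unstructured clinical note.
import re

note = "G2P1, delivered at 39w2d via uncomplicated vaginal delivery."
match = re.search(r"(\d+)w(\d+)d", note)  # weeks/days pattern, e.g. "39w2d"
if match:
    weeks, days = int(match.group(1)), int(match.group(2))
    print({"gestational_age_days": weeks * 7 + days})  # {'gestational_age_days': 275}
```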

Callahan and Daldrup-Link are just two examples of researchers who use AI and machine learning methods to produce novel research on women's and children's health. Developing new methods such as these is important in solving complex problems related to the field of healthcare, according to Langlotz.

"If you're working on difficult and interesting applied problems that are clinically important, you're likely to encounter the need to develop new and interesting methods," Langlotz said. "And that's proven true for us."

Original post:
Stanford center uses AI and machine learning to expand data on women's and children's health, director says - The Stanford Daily

AI and machine learning are the future of retail: Survey – ITP.net

Artificial intelligence and machine learning are changing the way retail works by creating knowledge out of data that retailers can turn into action.

Sixty-five percent of decision makers at retail companies and organisations said AI and ML are mission-critical technologies, according to a survey sponsored by Rackspace Technology.

The technologies provide an opportunity to enhance customer experiences, improve revenue growth potential, undertake rapid innovation and create smart operations, all of which can help businesses stand out from the competition.

Fifty-eight percent of respondents in the retail space said AI and ML technologies are a high priority for their industry.

Sixty-nine percent of respondents reported that AI and ML had a positive impact on brand awareness, with similar effects reported for brand reputation (67 percent), revenue generation (72 percent) and expense reduction (72 percent).

Meanwhile, 75 percent of respondents in retail say they are employing AI and ML as part of their business strategy, IT strategy or both.

Some 68 percent of retail respondents are allocating between 6 percent and 10 percent of their budget to AI and ML projects.

The technology is being used by retailers in an increasingly wide variety of contexts, including improving the speed and efficiency of processes (47 percent), personalising content and understanding customers (43 percent), gaining a competitive edge (42 percent), understanding marketing effectiveness (42 percent), increasing revenue (41 percent) and predicting performance (32 percent).

In an indication of the increasing maturity of the technologies, 66 percent of retail respondents said their AI/ML projects have gone past the experimentation stage and are now either in the optimising/innovating or formalising states of implementation.

There are however challenges when it comes to AI and ML adoption. Thirty-four percent of retail respondents cite difficulties aligning AI and ML strategies to the business.

From a talent perspective, more than half (61 percent) of retail respondents said they have the necessary AI and ML skills within their organisation.

At the same time, more than half of all respondents say that bolstering internal skills, hiring talent and improving both internal and external training are on their agenda.

Comparing departments, 69 percent of retail respondents say IT staff grasp the benefits of AI and ML, compared with 46 percent for sales, 45 percent for R&D, 44 percent for senior management and boards, 41 percent for customer service and operations and only 34 percent for marketing departments.

Original post:
AI and machine learning are the future of retail: Survey - ITP.net

Privacy And Cybersecurity Risks In Transactions Impacts From Artificial Intelligence And Machine Learning, Addressing Security Incidents And Other…

Cyberattacks. Data breaches. Regulatory investigations. Emerging technology. Privacy rights. Data rights. Compliance challenges. The rapidly evolving privacy and cybersecurity landscape has created a plethora of new considerations and risks for almost every transaction. Companies that engage in corporate transactions and M&A counsel alike should ensure that they are aware of and appropriately manage the impact of privacy and cybersecurity risks on their transactions. To that point, in this article we provide an overview of privacy and cybersecurity diligence, discuss the global spread of privacy and cybersecurity requirements, provide insights related to the emerging issues of artificial intelligence and machine learning and discuss the impact of cybersecurity incidents on transactions before, during and after a transaction.

There is a common misunderstanding that privacy matters only for companies that are steeped in personal information and that cybersecurity matters only for companies with a business model grounded in tech or data. While privacy issues may not be the most critical issues facing a company, all companies must address privacy issues because all companies have, at the very least, personal information about employees. And as recent publicized cybersecurity incidents have demonstrated, no company, regardless of industry, is immune from cybersecurity risks.

Privacy and cybersecurity are a Venn diagram of legal concepts: each has its own considerations, and for certain topics they overlap. This construct translates into how privacy and cybersecurity need to be addressed in M&A: each stands alone, and they often intermingle. Accordingly, they must both be addressed and considered together.

Privacy requirements in the U.S. are a patchwork of federal and state laws, with several comprehensive privacy laws now in effect or soon to be in effect at the state level. Notably, while it doesn't presently apply in full to personnel and business-to-business personal data, the California Consumer Privacy Act covers all residents of the state of California, not just consumers (despite confusingly calling residents "consumers" in the law). Further, there are specific laws, such as the Illinois Biometric Information Privacy Act and the Telephone Consumer Protection Act, that add further, more specific privacy considerations for certain business activities. And while there is an assortment of laws with a wide variety of enforcement mechanisms from private rights of action to regulatory civil penalties or even disgorgement of IP, one consistent trend is the increasing potential for financial liability that can befall a non-compliant entity.

Laws in the U.S. related to cybersecurity compliance are not as common as laws related to responding to and notifying of a data breach. In recent years, specific laws and regulations have largely focused on the healthcare and financial services industries. However, legislative and regulatory activity is expanding in this space, requiring increasingly specific technological, administrative and governance safeguards for cybersecurity programs well beyond these two industries. Additionally, while breach response and notification where sensitive personal data is impacted has been a well-established legal requirement for several years now, increasingly complex cyber-attacks on private and public entities have expanded the focus of cybersecurity incident reporting requirements and enterprise cybersecurity risk considerations.

What Does This All Mean for Diligence?

For the buy side, identifying the specifics of what data, data uses and applicable laws are relevant to the target company is pivotal to appropriately understanding the array of risks that may be present in the transaction. Equally, at least basic technological cybersecurity diligence is important to understand the risks of the transaction and potential future integration. For the sell side, entities should be prepared to address their data, data uses and privacy and cybersecurity obligations in diligence requests.

Separately, privacy and cybersecurity diligence should not focus solely on the risks created by past business activity but also consider future intentions for the data, systems and company's business model. If an entity is looking to make an acquisition because it will be able to capitalize on the data that the acquired entity has, then diligence should ensure that those intended uses won't be legally or contractually problematic. This issue is best known earlier than later in the transaction, as it may impact the value of the target or even the desire to move ahead.

In the event that diligence uncovers concerns, some privacy and cybersecurity risks will warrant closing conditions and/or special indemnities to meet the risk tolerance of the acquiring entity. In intense situations, such as where a data breach happens or is identified during a transaction, there may even be a price renegotiation. Understanding the depth and presence of these risks should be front of mind for any entity considering a sale to allow for timely identification and remediation and in some instances to understand how persistent risks may impact the transaction if it moves ahead. For all of these situations, privacy and cybersecurity specialists are critical to the process.

The prevalence of global business, even for small entities that may have overseas vendors or IT support, creates additional layers of considerations for privacy and cybersecurity diligence.

Privacy and cybersecurity laws have existed in certain jurisdictions for years or even decades. In others, the expanded creation of, access to and use of digital data, along with exemplars like the European Union (EU) General Data Protection Regulation, have caused a profound uptick in comprehensive privacy and cybersecurity laws. Depending on how you count, there are close to or over 100 countries with such laws currently or soon to be in place. This proliferation and dispersion of legal requirements means a compounding of risk considerations for diligence.

Common themes in recently enacted and proposed global privacy and cybersecurity laws include data localization, appointed company representatives, restrictions on use and retention, enumerated rights for individuals and significant penalties. Moreover, aside from comprehensive laws that address privacy and cybersecurity, other laws are emerging that are topic-specific. For example, the EU has a rather complex proposed law related to the use of artificial intelligence. It is critical to ensure that the appropriate team is in place to diligence privacy and cybersecurity for global entities and to help companies take appropriate risk-based approaches to understanding the global compliance posture. It can be difficult to strike a balance in diligence priorities due to both the growing number of new global laws and the lack of many (or any) historical examples of enforcement for these jurisdictions. But robust fact-finding paired with continued discussions on risk tolerance and business objectives, and careful consideration of commercial terms, will help.

As mentioned, artificial intelligence is a hot topic for privacy and cybersecurity laws. One of the biggest diligence risks related to artificial intelligence and machine learning (AI/ML) is not identifying that it's being used. AI/ML is a technically advanced concept, but its use is far more prevalent than may be immediately understood when looking at the nature of an entity. Anything from assessing weather impacts on crop production to determining who is approved for certain medical benefits can involve AI/ML. The unlimited potential for AI/ML application creates a variety of diligence considerations.

Where AI/ML is trained or used on personal data, there can be significant legal risks. The origin of training data needs to be understood, and diligence should ensure that the legal support for using that data is sound. In fact, the legal ability to use all involved data should be assessed. Companies commonly treat all data as traditional proprietary information. But privacy laws complicate the traditional property-law concepts, and even if laws permit the use of data, contracts may prohibit it. Recent legal actions have shown the magnitude of penalties a company can face for wrongly using data when developing AI/ML. Notably, in 2021 the U.S. Federal Trade Commission determined that a company had wrongly used photos and videos for training facial recognition AI. As part of the settlement, the FTC ordered that all models and algorithms developed with the use of the photos and videos be deleted. If a company's primary offering is an AI/ML tool, such an order could have a material impact on the company.

Additionally, the use of AI/ML may not result in the intended output. Despite efforts to use properly sourced data and avoid negative outcomes, studies have shown that bias or other integrity issues can arise from AI/ML. This is not to say the technology cannot be accurate, but it does demonstrate that when performing diligence it is crucial to understand the risks that may be present for the purposes and uses of AI/ML.

Security incidents have been the topic of many a headline over the past few years. Some of these incidents are the result of the growing trend of ransomware or other cyber extortions, including data theft extortions or even denial-of-service extortion. The identification of a data security incident may well have a serious impact on a transaction. Moreover, transactions can be impacted by data security incidents occurring before, during and after a transaction. Below we outline some key considerations for each.

An Incident Happened BEFORE a Transaction Started

An Incident Happens DURING a Transaction

An Incident Happens AFTER a Transaction

While far from the totality of privacy and cybersecurity considerations for transactions, these topics should help establish a baseline understanding of what to look for and how to approach privacy and cybersecurity in the current legal environment.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.

View post:
Privacy And Cybersecurity Risks In Transactions Impacts From Artificial Intelligence And Machine Learning, Addressing Security Incidents And Other...