Archive for the ‘Machine Learning’ Category

Undergraduate Researchers Help Unlock Lessons of Machine Learning and AI – College of Natural Sciences

Brain-Machine Interface

AI also intersects with language in other research areas. Nihita Sarma, a computer science third-year student and member of Dean's Scholars and Turing Scholars, researches the intersection of neuroscience and machine learning to understand language in the brain, working with Michael Mauk, professor of neuroscience, and Alexander Huth, an assistant professor of computer science and neuroscience.

As research subjects listen to podcasts, they lie in an MRI machine and readings track their brain activity. These customized-to-the-subject readings are then used to train machine learning models called encoding models, and Sarma then passes them through decoding models.
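The article does not include code, but a minimal sketch of what a subject-specific encoding model might look like is a regularized linear regression from stimulus features (for example, embeddings of the podcast transcript) to voxel-wise fMRI responses. The shapes, feature choices, and variable names below are illustrative assumptions, not details of Sarma's actual pipeline.

```python
# Illustrative sketch only: a simple voxel-wise encoding model that maps
# stimulus features (e.g., embeddings of a podcast transcript) to fMRI
# responses. Array shapes and feature choices are assumptions for clarity.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

n_timepoints, n_features, n_voxels = 1000, 300, 500
stimulus_features = np.random.randn(n_timepoints, n_features)   # stand-in for transcript embeddings
brain_activity = np.random.randn(n_timepoints, n_voxels)        # stand-in for preprocessed BOLD signals

X_train, X_test, y_train, y_test = train_test_split(
    stimulus_features, brain_activity, test_size=0.2, shuffle=False)

# Fit one regularized linear map from stimulus features to all voxels at once.
encoding_model = Ridge(alpha=10.0)
encoding_model.fit(X_train, y_train)

# Held-out prediction accuracy per voxel (correlation between predicted and
# observed responses) is a common way to score such models.
predicted = encoding_model.predict(X_test)
per_voxel_r = [np.corrcoef(predicted[:, v], y_test[:, v])[0, 1]
               for v in range(n_voxels)]
print("median held-out correlation:", np.median(per_voxel_r))
```

Decoding then works in the opposite direction: observed brain activity is compared against the encoding model's predictions for candidate stimuli to infer what the subject was most likely hearing or thinking.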

"My research is taking those encodings and trying to backtrack and figure out, based on this neural representation, based on the brain activity that was going on at that moment, what could the person inside the MRI machine possibly have been thinking or listening to at that moment?" Sarma said.

Along with gaining a better understanding of how language is represented in the brain, Sarma said the research has possible applications for a noninvasive communication tactic for people unable to speak or sign.

"We would be able to decode what they're thinking or what they're trying to say, and allow them to communicate with the outside world," Sarma said.

Read more here:
Undergraduate Researchers Help Unlock Lessons of Machine Learning and AI - College of Natural Sciences

Machine Learning Accelerates the Simulation of Dynamical Fields – Eos

Editors' Highlights are summaries of recent papers by AGU's journal editors. Source: Journal of Advances in Modeling Earth Systems

Accurately simulating and appropriately representing the aerosol-cloud-precipitation system poses significant challenges in weather and climate models. These challenges are particularly daunting due to knowledge gaps in crucial processes that occur at scales smaller than typical large-eddy simulation model grid sizes (e.g., 100 meters). Particle-resolved direct numerical simulation (PR-DNS) models offer a solution by resolving small-scale turbulent eddies and tracking individual particles. However, they require extensive computational resources, limiting their use to small-domain simulations and a limited number of physical processes.

Zhang et al. [2024] develop PR-DNS surrogate models using the Fourier neural operator (FNO), which affords improved computational performance and accuracy. The new solver achieves a two-order-of-magnitude reduction in computational cost, especially for high-resolution simulations, and exhibits excellent generalization, allowing for different initial conditions and zero-shot super-resolution without retraining. These findings highlight the FNO method as a promising tool for simulating complex fluid dynamics problems with high accuracy, computational efficiency, and generalization capability, enhancing our ability to model the aerosol-cloud-precipitation system and develop digital twins for similarly high-resolution measurements.
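The paper itself is not reproduced here, but the core building block of a Fourier neural operator is a spectral convolution: transform the field to Fourier space, apply a learned linear map to the lowest-frequency modes, and transform back. The following is a minimal sketch of that idea; the grid size, channel count, retained mode count, and random weights are illustrative assumptions, not details from Zhang et al.

```python
# Minimal illustrative sketch of a single Fourier-neural-operator-style
# spectral layer (forward pass only). Shapes, channel counts, and the number
# of retained Fourier modes are illustrative assumptions.
import numpy as np

def spectral_layer(u, weights, n_modes):
    """Apply a learned linear map to the lowest n_modes Fourier modes of u.

    u:       real field sampled on a 1-D grid, shape (grid_points, channels)
    weights: complex array, shape (n_modes, channels, channels)
    """
    u_hat = np.fft.rfft(u, axis=0)                       # to Fourier space
    out_hat = np.zeros_like(u_hat)
    # Mix channels mode-by-mode for the retained low-frequency modes only.
    for k in range(min(n_modes, u_hat.shape[0])):
        out_hat[k] = u_hat[k] @ weights[k]
    return np.fft.irfft(out_hat, n=u.shape[0], axis=0)   # back to physical space

# Toy usage: a 2-channel field on a 256-point grid, keeping 16 modes.
rng = np.random.default_rng(0)
field = rng.standard_normal((256, 2))
w = rng.standard_normal((16, 2, 2)) + 1j * rng.standard_normal((16, 2, 2))
print(spectral_layer(field, w, n_modes=16).shape)        # (256, 2)
```

Because the learned weights act on Fourier modes rather than on grid points, the same layer can be evaluated on a finer grid than it was trained on, which is the property behind the zero-shot super-resolution the editors highlight.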

Citation: Zhang, T., Li, L., López-Marrero, V., Lin, M., Liu, Y., Yang, F., et al. (2024). Emulator of PR-DNS: Accelerating dynamical fields with neural operators in particle-resolved direct numerical simulation. Journal of Advances in Modeling Earth Systems, 16, e2023MS003898. https://doi.org/10.1029/2023MS003898

Jiwen Fan, Editor, JAMES

Read more:
Machine Learning Accelerates the Simulation of Dynamical Fields - Eos

Inter hospital external validation of interpretable machine learning based triage score for the emergency department … – Nature.com

Study design and setting

This retrospective validation study was conducted across three emergency departments (EDs) in Korea (A, B, and C). A, B, and C are tertiary hospitals located in a metropolitan city in Korea, with approximately 2,000, 1,000, and 1,000 inpatient beds, respectively. Approximately 80,000, 90,000, and 50,000 patients visit their EDs annually, and 16, 20, and 7 specialists work at each institution, respectively. All data were mapped to the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) for the multicenter study. This study was approved by the Samsung Medical Center Institutional Review Board (2023-02-036), and a waiver of informed consent was granted for EHR data collection and analysis because of the retrospective and de-identified nature of the data. All methods were performed in accordance with the relevant guidelines and regulations.

Initially, ED patients from 2016 to 2017 were included for each hospital. Patients older than 18 years who presented with disease were included. We excluded patients who left without being seen and patients with death on arrival or who underwent cardiopulmonary resuscitation. Data from each hospital were split into two cohorts: a development cohort (70%) for training the interpretable ML model and a test cohort (30%) for evaluation.

We extracted data from each hospital's electronic medical record system, in which all patient information was de-identified. Candidate input variables were drawn from features available at the stage of ED triage, including demographic characteristics such as age and gender, administrative variables such as time of ED visit, and clinical variables such as severity index, consciousness, and initial vital signs. Comorbidities were also obtained from hospital diagnosis records in the 5 years preceding the patient's emergency visit and compared across hospitals. They were extracted using the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10). The list and description of candidate predictors and comorbidities are given in Supplementary Tables 6 and 7.

Emergency patients with semi-acute conditions typically undergo a surgical procedure or are admitted to the intensive care unit (ICU) following emergency room treatment, given the imperative for patients to survive. Our primary outcome was 2-day mortality, which was the target feature used to build the interpretable ML model for each hospital.

For the multicenter study, we adopted the OMOP CDM from the Observational Health Data Sciences and Informatics (OHDSI) research network28 for its standardized structure and vocabularies, mapping emergency department data based on the Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT) and Logical Observation Identifiers Names and Codes (LOINC), as shown in the example in Supplementary Fig. 1. The Extract, Transform and Load (ETL) process was performed with structured query language. Each piece of ED care and diagnosis-related information was mapped into the appropriate CDM table, as shown in Fig. 2. For example, patient demographics and vital signs are mapped to the Person and Measurement tables, respectively. After transformation into the CDM format was complete, all hospitals shared the same structure and vocabularies, enabling execution of the same research query. All details of the transformation and code are accessible on GitHub29.

Table mapping for converting clinical to common data model tables. CDM: common data model; ED: Emergency department.
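The study's actual ETL was implemented in SQL and shared on GitHub. As a purely illustrative sketch of the kind of mapping described above, the Python/pandas snippet below moves hypothetical triage rows into simplified Person and Measurement tables; all column names and concept_id values are placeholders, not the study's mappings.

```python
# Illustrative sketch only: mapping local triage records into simplified
# OMOP CDM-style Person and Measurement tables with pandas. Column names
# and concept_id values are placeholders, not the study's actual mappings.
import pandas as pd

triage = pd.DataFrame({
    "patient_id": [101, 102],
    "birth_year": [1956, 1987],
    "gender": ["M", "F"],
    "visit_time": ["2016-03-01 08:15", "2016-03-01 09:40"],
    "heart_rate": [118, 84],          # example initial vital sign
})

# Demographics -> Person table
person = pd.DataFrame({
    "person_id": triage["patient_id"],
    "year_of_birth": triage["birth_year"],
    "gender_concept_id": triage["gender"].map({"M": 8507, "F": 8532}),  # placeholder concept ids
})

# Vital signs -> Measurement table (long format: one row per measurement)
measurement = pd.DataFrame({
    "person_id": triage["patient_id"],
    "measurement_concept_id": 3027018,   # placeholder id standing in for "heart rate"
    "value_as_number": triage["heart_rate"],
    "measurement_datetime": pd.to_datetime(triage["visit_time"]),
})

print(person)
print(measurement)
```

The point of the transformation is that once every hospital's data sit in the same tables with the same vocabularies, the identical analysis query can run at each site without local rework.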

The AutoScore framework is a machine learning-based clinical score generator, consisting of six modules, developed in Singapore12. Module 1 uses a random forest to rank variables according to their importance. Module 2 transforms variables by categorizing continuous variables using quantile information to improve interpretability. Module 3 assigns scores to each variable based on logistic regression coefficients. Module 4 selects which variables are included in the scoring model. In Module 5, clinical domain knowledge is incorporated into the score, and cutoff points can be defined when categorizing continuous variables. Module 6 evaluates the performance of the score on a separate test dataset. The AutoScore framework provides a systematic and automated approach to score development, combining the discriminative power of machine learning with the interpretability of logistic regression. For the overall score, we considered a weighted average of the scores across all institutions. For each institution $i$, a weight $w_i$ was formulated as

$$w_i = \frac{\sqrt{AUC_i} \times N_i^3}{\sum_{i=1}^{M} \sqrt{AUC_i} \times N_i^3} \times 100\%,$$

where $N_i$ was the sample size, $AUC_i$ was the AUC value obtained on the validation set, and $M$ was the total number of institutions. The overall score was calculated as the weighted score based on $w_i$.
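As a quick illustration of the institutional weighting defined above, the following sketch computes each institution's weight $w_i$ and a weighted overall score. The AUC values, sample sizes, and per-institution scores are made-up numbers, not results from the study.

```python
# Illustrative computation of the institutional weights w_i described above.
# AUC values, sample sizes, and per-institution scores are made-up numbers.
import math

auc = [0.88, 0.85, 0.90]        # validation AUC per institution
n = [80000, 90000, 50000]       # sample size per institution
scores = [7.0, 6.0, 8.0]        # hypothetical score from each institution's model

raw = [math.sqrt(a) * size ** 3 for a, size in zip(auc, n)]
weights = [r / sum(raw) for r in raw]              # w_i, summing to 1 (i.e., 100%)

overall = sum(w * s for w, s in zip(weights, scores))
for i, w in enumerate(weights, start=1):
    print(f"institution {i}: w = {w:.3f}")
print(f"weighted overall score: {overall:.2f}")
```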

We defined our novel framework, CDM AutoScore for ED, as a combination of the CDM-based standardized format and the AutoScore-based interpretable framework, as shown in Fig. 3. The analysis and preparation code using the CDM format was also shared on GitHub29.

Overall process of CDM AutoScore for ED. Each institution conducted an Extract, Transform and Load process to convert local data into the CDM format. Algorithms from each institution were derived using the interpretable machine learning framework and validated inter- and intra-institutionally. EMR: electronic medical records; ETL: Extract, Transform and Load; OMOP CDM: Observational Medical Outcomes Partnership Common Data Model.

Categorical features were expressed as frequencies and percentages, and continuous features were expressed as means and standard deviations. Comparison tests across hospitals were performed with analysis of variance and chi-square tests at the 5% significance level. The standardized mean difference (SMD) was also calculated to compare hospitals. Two types of validation were conducted in this study. First, we performed intra-institutional (internal) validation of each hospital's score. We also performed pairwise inter-institutional validation as the external validation. The area under the receiver operating characteristic curve (AUROC) and 95% confidence intervals (CIs) based on 1,000 bootstrap iterations were reported. Other metrics, including accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), were also reported. SMOTE was used to handle the class imbalance problem: the minority class was oversampled to twice its original size, and an equal number of majority-class samples was drawn, with a fixed random seed.
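As a rough illustration of the resampling protocol described above (minority class oversampled to twice its size with SMOTE, then an equal number of majority samples drawn with a fixed seed), here is a sketch using imbalanced-learn; the synthetic data, class prevalence, and parameter choices are assumptions for illustration, not the study's configuration.

```python
# Illustrative sketch of the resampling described above: the minority class
# (here, 2-day mortality = 1) is oversampled to twice its original size with
# SMOTE, and an equal number of majority-class samples is then drawn, using
# fixed seeds. Data, labels, and sizes are synthetic placeholders.
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

rng = np.random.default_rng(42)
X = rng.standard_normal((10_000, 8))              # triage features (synthetic)
y = (rng.random(10_000) < 0.02).astype(int)       # ~2% positive outcome (synthetic)

n_minority = int(y.sum())

# Step 1: SMOTE the minority class up to twice its original count.
smote = SMOTE(sampling_strategy={1: 2 * n_minority}, random_state=42)
X_os, y_os = smote.fit_resample(X, y)

# Step 2: sample the same number of majority-class rows.
under = RandomUnderSampler(sampling_strategy={0: 2 * n_minority}, random_state=42)
X_bal, y_bal = under.fit_resample(X_os, y_os)

print("class counts after resampling:", np.bincount(y_bal))  # equal counts
```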

Read more:
Inter hospital external validation of interpretable machine learning based triage score for the emergency department ... - Nature.com

HEAL: A framework for health equity assessment of machine learning performance – Google Research

Posted by Mike Schaekermann, Research Scientist, Google Research, and Ivor Horn, Chief Health Equity Officer & Director, Google Core

Health equity is a major societal concern worldwide with disparities having many causes. These sources include limitations in access to healthcare, differences in clinical treatment, and even fundamental differences in the diagnostic technology. In dermatology for example, skin cancer outcomes are worse for populations such as minorities, those with lower socioeconomic status, or individuals with limited healthcare access. While there is great promise in recent advances in machine learning (ML) and artificial intelligence (AI) to help improve healthcare, this transition from research to bedside must be accompanied by a careful understanding of whether and how they impact health equity.

Health equity is defined by public health organizations as fairness of opportunity for everyone to be as healthy as possible. Importantly, equity may be different from equality. For example, people with greater barriers to improving their health may require more or different effort to experience this fair opportunity. Similarly, equity is not fairness as defined in the AI for healthcare literature. Whereas AI fairness often strives for equal performance of the AI technology across different patient populations, this does not center the goal of prioritizing performance with respect to pre-existing health disparities.

In "Health Equity Assessment of machine Learning performance (HEAL): a framework and dermatology AI model case study", published in The Lancet eClinicalMedicine, we propose a methodology to quantitatively assess whether ML-based health technologies perform equitably. In other words, does the ML model perform well for those with the worst health outcomes for the condition(s) the model is meant to address? This goal anchors on the principle that health equity should prioritize and measure model performance with respect to disparate health outcomes, which may be due to a number of factors that include structural inequities (e.g., demographic, social, cultural, political, economic, environmental and geographic).

The HEAL framework proposes a 4-step process to estimate the likelihood that an ML-based health technology performs equitably:

1. Identify factors associated with health inequities and define tool performance metrics.
2. Identify and quantify pre-existing health disparities.
3. Measure the performance of the tool for each subpopulation.
4. Measure the likelihood that the tool prioritizes performance with respect to health disparities.

The final step's output is termed the HEAL metric, which quantifies how anticorrelated the ML model's performance is with health disparities. In other words, does the model perform better for populations that have worse health outcomes?

This 4-step process is designed to inform improvements for making ML model performance more equitable, and is meant to be iterative and re-evaluated on a regular basis. For example, the availability of health outcomes data in step (2) can inform the choice of demographic factors and brackets in step (1), and the framework can be applied again with new datasets, models and populations.

With this work, we take a step towards encouraging explicit assessment of the health equity considerations of AI technologies, and encourage prioritization of efforts during model development to reduce health inequities for subpopulations exposed to structural inequities that can precipitate disparate outcomes. We should note that the present framework does not model causal relationships and, therefore, cannot quantify the actual impact a new technology will have on reducing health outcome disparities. However, the HEAL metric may help identify opportunities for improvement, where the current performance is not prioritized with respect to pre-existing health disparities.

As an illustrative case study, we applied the framework to a dermatology model, which utilizes a convolutional neural network similar to that described in prior work. This example dermatology model was trained to classify 288 skin conditions using a development dataset of 29k cases. The input to the model consists of three photos of a skin concern along with demographic information and a brief structured medical history. The output consists of a ranked list of possible matching skin conditions.

Using the HEAL framework, we evaluated this model by assessing whether it prioritized performance with respect to pre-existing health outcomes. The model was designed to predict possible dermatologic conditions (from a list of hundreds) based on photos of a skin concern and patient metadata. Evaluation of the model is done using a top-3 agreement metric, which quantifies how often the top 3 output conditions match the most likely condition as suggested by a dermatologist panel. The HEAL metric is computed via the anticorrelation of this top-3 agreement with health outcome rankings.
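The post does not give a formula, but based on the description above, a minimal sketch of the computation might rank subgroups by a health-outcome burden measure and by model performance and then take the rank correlation between the two. The subgroup names, burden values, and top-3 agreement numbers below are invented for illustration and are not the paper's method or data.

```python
# Illustrative sketch of an "anticorrelation of performance with health
# disparities"-style calculation, following the description above. Subgroups,
# outcome burdens (e.g., DALYs), and top-3 agreement values are invented.
from scipy.stats import spearmanr

subgroups = ["group A", "group B", "group C", "group D"]
health_burden = [310.0, 150.0, 220.0, 90.0]   # higher = worse pre-existing outcomes (hypothetical)
top3_agreement = [0.82, 0.71, 0.78, 0.69]     # model performance per subgroup (hypothetical)

# A positive rank correlation here means performance is higher where the
# health burden is higher, i.e., the model prioritizes performance for
# subgroups with worse pre-existing outcomes.
rho, _ = spearmanr(health_burden, top3_agreement)
print(f"rank correlation of performance with health burden: {rho:.2f}")
```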

We used a dataset of 5,420 teledermatology cases, enriched for diversity in age, sex and race/ethnicity, to retrospectively evaluate the model's HEAL metric. The dataset consisted of store-and-forward cases from patients 20 years or older from primary care providers in the USA and skin cancer clinics in Australia. Based on a review of the literature, we decided to explore race/ethnicity, sex and age as potential factors of inequity, and used sampling techniques to ensure that our evaluation dataset had sufficient representation of all race/ethnicity, sex and age groups. To quantify pre-existing health outcomes for each subgroup, we relied on measurements from public databases endorsed by the World Health Organization, such as Years of Life Lost (YLLs) and Disability-Adjusted Life Years (DALYs; years of life lost plus years lived with disability).

While the model was likely to perform equitably across age groups for cancer conditions specifically, we discovered that it had room for improvement across age groups for non-cancer conditions. For example, those 70+ have the poorest health outcomes related to non-cancer skin conditions, yet the model didn't prioritize performance for this subgroup.

For holistic evaluation, the HEAL metric cannot be employed in isolation. Instead this metric should be contextualized alongside many other factors ranging from computational efficiency and data privacy to ethical values, and aspects that may influence the results (e.g., selection bias or differences in representativeness of the evaluation data across demographic groups).

As an adversarial example, the HEAL metric can be artificially improved by deliberately reducing model performance for the most advantaged subpopulation until performance for that subpopulation is worse than all others. For illustrative purposes, given subpopulations A and B where A has worse health outcomes than B, consider the choice between two models: Model 1 (M1) performs 5% better for subpopulation A than for subpopulation B. Model 2 (M2) performs 5% worse on subpopulation A than B. The HEAL metric would be higher for M1 because it prioritizes performance on a subpopulation with worse outcomes. However, M1 may have absolute performances of just 75% and 70% for subpopulations A and B respectively, while M2 has absolute performances of 75% and 80% for subpopulations A and B respectively. Choosing M1 over M2 would lead to worse overall performance, because some subpopulations would be worse off while no subpopulation would be better off.

Accordingly, the HEAL metric should be used alongside a Pareto condition (discussed further in the paper), which restricts model changes so that outcomes for each subpopulation are either unchanged or improved compared to the status quo, and performance does not worsen for any subpopulation.
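As a small illustration of that Pareto-style check (not code from the paper), the function below compares per-subgroup performance of a candidate model against the status quo and flags any subgroup that would be made worse off; the subgroup labels and numbers reuse the hypothetical M1/M2 example above.

```python
# Illustrative Pareto-style check, not code from the paper: a candidate model
# passes only if no subgroup's performance drops relative to the status quo.
def satisfies_pareto_condition(status_quo: dict[str, float],
                               candidate: dict[str, float]) -> bool:
    return all(candidate[group] >= status_quo[group] for group in status_quo)

# Reusing the hypothetical M1/M2 numbers from the example above,
# with M2 taken as the status quo.
m2 = {"A": 0.75, "B": 0.80}   # status quo
m1 = {"A": 0.75, "B": 0.70}   # candidate with the higher HEAL-style metric

print(satisfies_pareto_condition(m2, m1))   # False: subgroup B would be worse off
```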

The HEAL framework, in its current form, assesses the likelihood that an ML-based model prioritizes performance with respect to pre-existing health disparities for specific subpopulations. This differs from the goal of understanding whether ML will reduce disparities in outcomes across subpopulations in reality. Specifically, modeling improvements in outcomes requires a causal understanding of steps in the care journey that happen both before and after use of any given model. Future research is needed to address this gap.

The HEAL framework enables a quantitative assessment of the likelihood that health AI technologies prioritize performance with respect to health disparities. The case study demonstrates how to apply the framework in the dermatological domain, indicating a high likelihood that model performance is prioritized with respect to health disparities across sex and race/ethnicity, but also revealing the potential for improvements for non-cancer conditions across age. The case study also illustrates limitations in the ability to apply all recommended aspects of the framework (e.g., mapping societal context, availability of data), thus highlighting the complexity of health equity considerations of ML-based tools.

This work is a proposed approach to address a grand challenge for AI and health equity, and may provide a useful evaluation framework not only during model development, but during pre-implementation and real-world monitoring stages, e.g., in the form of health equity dashboards. We hold that the strength of the HEAL framework is in its future application to various AI tools and use cases and its refinement in the process. Finally, we acknowledge that a successful approach towards understanding the impact of AI technologies on health equity needs to be more than a set of metrics. It will require a set of goals agreed upon by a community that represents those who will be most impacted by a model.

The research described here is joint work across many teams at Google. We are grateful to all our co-authors: Terry Spitz, Malcolm Pyles, Heather Cole-Lewis, Ellery Wulczyn, Stephen R. Pfohl, Donald Martin, Jr., Ronnachai Jaroensri, Geoff Keeling, Yuan Liu, Stephanie Farquhar, Qinghan Xue, Jenna Lester, Can Hughes, Patricia Strachan, Fraser Tan, Peggy Bui, Craig H. Mermel, Lily H. Peng, Yossi Matias, Greg S. Corrado, Dale R. Webster, Sunny Virmani, Christopher Semturs, Yun Liu, and Po-Hsuan Cameron Chen. We also thank Lauren Winer, Sami Lachgar, Ting-An Lin, Aaron Loh, Morgan Du, Jenny Rizk, Renee Wong, Ashley Carrick, Preeti Singh, Annisah Um'rani, Jessica Schrouff, Alexander Brown, and Anna Iurchenko for their support of this project.

Go here to see the original:
HEAL: A framework for health equity assessment of machine learning performance - Google Research

Expert on how machine learning could lead to improved outcomes in urology – Urology Times

In this video, Glenn T. Werneburg, MD, PhD, shares the take-home message from the abstracts "Machine learning algorithms demonstrate accurate prediction of objective and patient-reported response to botulinum toxin for overactive bladder and outperform expert humans in an external cohort" and "Machine learning algorithms predict urine culture bacterial resistance to first line antibiotic therapy at the time of sample collection," which were presented at the Society of Urodynamics, Female Pelvic Medicine & Urogenital Reconstruction 2024 Winter Meeting in Fort Lauderdale, Florida. Werneburg is a urology resident at Glickman Urological & Kidney Institute at Cleveland Clinic, Cleveland, Ohio.

We're very much looking forward to being able to clinically implement these algorithms, both on the OAB side and the antibiotic resistance side. For the OAB, if we can identify who would best respond to sacral neuromodulation, and who would best respond to onabotulinumtoxinA injection, then we're helping patients achieve an acceptable outcome faster. We're improving their incontinence or their urgency in a more efficient way. So we're enthusiastic about this. Once we can implement this clinically, we believe it's going to help us in this way. It's the same for the antibiotic resistance algorithms. When we can get these into the hands of clinicians, we'll be able to have a good suggestion in terms of which is the best antibiotic to use for this patient at this time. And in doing so, we hope to be able to improve our antibiotic stewardship. Ideally, we would use an antibiotic with the narrowest spectrum that would still cover the infecting organism, and in doing so, it reduces the risk for resistance. So if that same patient requires an antibiotic later on in his or her lifetime, chances are (and we'd have to determine this with data and experiments) if we're implementing a narrower spectrum antibiotic to treat an infection, they're going to be less likely to be resistant to other antibiotics down the line.

This transcription was edited for clarity.

Excerpt from:
Expert on how machine learning could lead to improved outcomes in urology - Urology Times