Machine learning results: pay attention to what you don’t see – STAT
Even as machine learning and artificial intelligence are drawing substantial attention in health care, overzealousness for these technologies has created an environment in which other critical aspects of the research are often overlooked.
Theres no question that the increasing availability of large data sources and off-the-shelf machine learning tools offer tremendous resources to researchers. Yet a lack of understanding about the limitations of both the data and the algorithms can lead to erroneous or unsupported conclusions.
Given that machine learning in the health domain can have a direct impact on peoples lives, broad claims emerging from this kind of research should not be embraced without serious vetting. Whether conducting health care research or reading about it, make sure to consider what you dont see in the data and analyses.
advertisement
One key question to ask is: Whose information is in the data and what do these data reflect?
Common forms of electronic health data, such as billing claims and clinical records, contain information only on individuals who have encounters with the health care system. But many individuals who are sick dont or cant see a doctor or other health care provider and so are invisible in these databases. This may be true for individuals with lower incomes or those who live in rural communities with rising hospital closures. As University of Toronto machine learning professor Marzyeh Ghassemi said earlier this year:
Even among patients who do visit their doctors, health conditions are not consistently recorded. Health data also reflect structural racism, which has devastating consequences.
Data from randomized trials are not immune to these issues. As a ProPublica report demonstrated, black and Native American patients are drastically underrepresented in cancer clinical trials. This is important to underscore given that randomized trials are frequently highlighted as superior in discussions about machine learning work that leverages nonrandomized electronic health data.
In interpreting results from machine learning research, its important to be aware that the patients in a study often do not depict the population we wish to make conclusions about and that the information collected is far from complete.
It has become commonplace to evaluate machine learning algorithms based on overall measures like accuracy or area under the curve. However, one evaluation metric cannot capture the complexity of performance. Be wary of research that claims to be ready for translation into clinical practice but only presents a leader board of tools that are ranked based on a single metric.
As an extreme illustration, an algorithm designed to predict a rare condition found in only 1% of the population can be extremely accurate by labeling all individuals as not having the condition. This tool is 99% accurate, but completely useless. Yet, it may outperform other algorithms if accuracy is considered in isolation.
Whats more, algorithms are frequently not evaluated based on multiple hold-out samples in cross-validation. Using only a single hold-out sample, which is done in many published papers, often leads to higher variance and misleading metric performance.
Beyond examining multiple overall metrics of performance for machine learning, we should also assess how tools perform in subgroups as a step toward avoiding bias and discrimination. For example, artificial intelligence-based facial recognition software performed poorly when analyzing darker-skinned women. Many measures of algorithmic fairness center on performance in subgroups.
Bias in algorithms has largely not been a focus in health care research. That needs to change. A new study found substantial racial bias against black patients in a commercial algorithm used by many hospitals and other health care systems. Other work developed algorithms to improve fairness for subgroups in health care spending formulas.
Subjective decision-making pervades research. Who decides what the research question will be, which methods will be applied to answering it, and how the techniques will be assessed all matter. Diverse teams are needed not just because they yield better results. As Rediet Abebe, a junior fellow of Harvards Society of Fellows, has written, In both private enterprise and the public sector, research must be reflective of the society were serving.
The influx of so-called digital data thats available through search engines and social media may be one resource for understanding the health of individuals who do not have encounters with the health care system. There have, however, been notable failures with these data. But there are also promising advances using online search queries at scale where traditional approaches like conducting surveys would be infeasible.
Increasingly granular data are now becoming available thanks to wearable technologies such as Fitbit trackers and Apple Watches. Researchers are actively developing and applying techniques to summarize the information gleaned from these devices for prevention efforts.
Much of the published clinical machine learning research, however, focuses on predicting outcomes or discovering patterns. Although machine learning for causal questions in health and biomedicine is a rapidly growing area, we dont see a lot of this work yet because it is new. Recent examples of it include the comparative effectiveness of feeding interventions in a pediatric intensive care unit and the effectiveness of different types of drug-eluting coronary artery stents.
Understanding how the data were collected and using appropriate evaluation metrics will also be crucial for studies that incorporate novel data sources and those attempting to establish causality.
In our drive to improve health with (and without) machine learning, we must not forget to look for what is missing: What information do we not have about the underlying health care system? Why might an individual or a code be unobserved? What subgroups have not been prioritized? Who is on the research team?
Giving these questions a place at the table will be the only way to see the whole picture.
Sherri Rose, Ph.D., is associate professor of health care policy at Harvard Medical School and co-author of the first book on machine learning for causal inference, Targeted Learning (Springer, 2011).
See the article here:
Machine learning results: pay attention to what you don't see - STAT
- Machine learning-random forest model was used to construct gene signature associated with cuproptosis to predict the prognosis of gastric cancer -... - February 5th, 2025 [February 5th, 2025]
- Machine learning for predicting severe dengue in Puerto Rico - Infectious Diseases of Poverty - BioMed Central - February 5th, 2025 [February 5th, 2025]
- Panoramic radiographic features for machine learning based detection of mandibular third molar root and inferior alveolar canal contact - Nature.com - February 5th, 2025 [February 5th, 2025]
- AI and machine learning: revolutionising drug discovery and transforming patient care - Roche - February 5th, 2025 [February 5th, 2025]
- Development of a machine learning model related to explore the association between heavy metal exposure and alveolar bone loss among US adults... - February 5th, 2025 [February 5th, 2025]
- Identification of therapeutic targets for Alzheimers Disease Treatment using bioinformatics and machine learning - Nature.com - February 5th, 2025 [February 5th, 2025]
- A novel aggregated coefficient ranking based feature selection strategy for enhancing the diagnosis of breast cancer classification using machine... - February 5th, 2025 [February 5th, 2025]
- Performance prediction and optimization of a high-efficiency tessellated diamond fractal MIMO antenna for terahertz 6G communication using machine... - February 5th, 2025 [February 5th, 2025]
- How machine learning and AI can be harnessed for mission-based lending - ImpactAlpha - January 27th, 2025 [January 27th, 2025]
- Machine learning meta-analysis identifies individual characteristics moderating cognitive intervention efficacy for anxiety and depression symptoms -... - January 27th, 2025 [January 27th, 2025]
- Using robotics to introduce AI and machine learning concepts into the elementary classroom - George Mason University - January 27th, 2025 [January 27th, 2025]
- Machine learning to identify environmental drivers of phytoplankton blooms in the Southern Baltic Sea - Nature.com - January 27th, 2025 [January 27th, 2025]
- Why Most Machine Learning Projects Fail to Reach Production and How to Beat the Odds - InfoQ.com - January 27th, 2025 [January 27th, 2025]
- Exploring the intersection of AI and climate physics: Machine learning's role in advancing climate science - Phys.org - January 27th, 2025 [January 27th, 2025]
- 5 Questions with Jonah Berger: Using Artificial Intelligence and Machine Learning in Litigation - Cornerstone Research - January 27th, 2025 [January 27th, 2025]
- Modernizing Patient Support: Harnessing Advanced Automation, Artificial Intelligence and Machine Learning to Improve Efficiency and Performance of... - January 27th, 2025 [January 27th, 2025]
- Param Popat Leads the Way in Transforming Machine Learning Systems - Tech Times - January 27th, 2025 [January 27th, 2025]
- Research on noise-induced hearing loss based on functional and structural MRI using machine learning methods - Nature.com - January 27th, 2025 [January 27th, 2025]
- Machine learning is bringing back an infamous pseudoscience used to fuel racism - ZME Science - January 27th, 2025 [January 27th, 2025]
- How AI and Machine Learning are Redefining Customer Experience Management - Customer Think - January 27th, 2025 [January 27th, 2025]
- Machine Learning Data Catalog Software Market Strategic Insights and Key Innovations: Leading Companies and... - WhaTech - January 27th, 2025 [January 27th, 2025]
- How AI and Machine Learning Will Influence Fintech Frontend Development in 2025 - Benzinga - January 27th, 2025 [January 27th, 2025]
- The Nvidia AI interview: Inside DLSS 4 and machine learning with Bryan Catanzaro - Eurogamer - January 22nd, 2025 [January 22nd, 2025]
- The wide use of machine learning VFX techniques on Here - befores & afters - January 22nd, 2025 [January 22nd, 2025]
- .NET Core: Pioneering the Future of AI and Machine Learning - TechBullion - January 22nd, 2025 [January 22nd, 2025]
- Development and validation of a machine learning-based prediction model for hepatorenal syndrome in liver cirrhosis patients using MIMIC-IV and eICU... - January 22nd, 2025 [January 22nd, 2025]
- A comparative study on different machine learning approaches with periodic items for the forecasting of GPS satellites clock bias - Nature.com - January 22nd, 2025 [January 22nd, 2025]
- Machine learning based prediction models for the prognosis of COVID-19 patients with DKA - Nature.com - January 22nd, 2025 [January 22nd, 2025]
- A scoping review of robustness concepts for machine learning in healthcare - Nature.com - January 22nd, 2025 [January 22nd, 2025]
- How AI and machine learning led to mind blowing progress in understanding animal communication - WHYY - January 22nd, 2025 [January 22nd, 2025]
- 3 Predictions For Predictive AI In 2025 - The Machine Learning Times - January 22nd, 2025 [January 22nd, 2025]
- AI and Machine Learning - WEF report offers practical steps for inclusive AI adoption - SmartCitiesWorld - January 22nd, 2025 [January 22nd, 2025]
- Learnings from a Machine Learning Engineer Part 3: The Evaluation | by David Martin | Jan, 2025 - Towards Data Science - January 22nd, 2025 [January 22nd, 2025]
- Google AI Research Introduces Titans: A New Machine Learning Architecture with Attention and a Meta in-Context Memory that Learns How to Memorize at... - January 22nd, 2025 [January 22nd, 2025]
- Improving BrainMachine Interfaces with Machine Learning ... - eeNews Europe - January 22nd, 2025 [January 22nd, 2025]
- Powered by machine learning, a new blood test can enable early detection of multiple cancers - Medical Xpress - January 15th, 2025 [January 15th, 2025]
- Mapping the Edges of Mass Spectral Prediction: Evaluation of Machine Learning EIMS Prediction for Xeno Amino Acids - Astrobiology News - January 15th, 2025 [January 15th, 2025]
- Development of an interpretable machine learning model based on CT radiomics for the prediction of post acute pancreatitis diabetes mellitus -... - January 15th, 2025 [January 15th, 2025]
- Understanding the spread of agriculture in the Western Mediterranean (6th-3rd millennia BC) with Machine Learning tools - Nature.com - January 15th, 2025 [January 15th, 2025]
- "From 'Food Rules' to Food Reality: Machine Learning Unveils the Ultra-Processed Truth in Our Grocery Carts" - American Council on Science... - January 15th, 2025 [January 15th, 2025]
- AI and Machine Learning in Business Market is Predicted to Reach $190.5 Billion at a CAGR of 32% by 2032 - EIN News - January 15th, 2025 [January 15th, 2025]
- QT Imaging Holdings Introduces Machine Learning-Enabled Image Interpolation Algorithm to Substantially Reduce Scan Time - Business Wire - January 15th, 2025 [January 15th, 2025]
- Global Tiny Machine Learning (TinyML) Market to Reach USD 3.4 Billion by 2030 - Key Drivers and Opportunities | Valuates Reports - PR Newswire UK - January 15th, 2025 [January 15th, 2025]
- Machine learning in mental health getting better all the time - Nature.com - January 15th, 2025 [January 15th, 2025]
- Signature-based intrusion detection using machine learning and deep learning approaches empowered with fuzzy clustering - Nature.com - January 15th, 2025 [January 15th, 2025]
- Machine learning and multi-omics in precision medicine for ME/CFS - Journal of Translational Medicine - January 15th, 2025 [January 15th, 2025]
- Exploring the influence of age on the causes of death in advanced nasopharyngeal carcinoma patients undergoing chemoradiotherapy using machine... - January 15th, 2025 [January 15th, 2025]
- 3D Shape Tokenization - Apple Machine Learning Research - January 9th, 2025 [January 9th, 2025]
- Machine Learning Used To Create Scalable Solution for Single-Cell Analysis - Technology Networks - January 9th, 2025 [January 9th, 2025]
- Robotics: machine learning paves the way for intuitive robots - Hello Future - January 9th, 2025 [January 9th, 2025]
- Machine learning-based estimation of crude oil-nitrogen interfacial tension - Nature.com - January 9th, 2025 [January 9th, 2025]
- Machine learning Nomogram for Predicting endometrial lesions after tamoxifen therapy in breast Cancer patients - Nature.com - January 9th, 2025 [January 9th, 2025]
- Staying ahead of the automation, AI and machine learning curve - Creamer Media's Engineering News - January 9th, 2025 [January 9th, 2025]
- Machine Learning and Quantum Computing Predict Which Antibiotic To Prescribe for UTIs - Consult QD - January 9th, 2025 [January 9th, 2025]
- Machine Learning, Innovation, And The Future Of AI: A Conversation With Manoj Bhoyar - International Business Times UK - January 9th, 2025 [January 9th, 2025]
- AMD's FSR 4 will use machine learning but requires an RDNA 4 GPU, promises 'a dramatic improvement in terms of performance and quality' - PC Gamer - January 9th, 2025 [January 9th, 2025]
- Explainable artificial intelligence with UNet based segmentation and Bayesian machine learning for classification of brain tumors using MRI images -... - January 9th, 2025 [January 9th, 2025]
- Understanding the Fundamentals of AI and Machine Learning - Nairobi Wire - January 9th, 2025 [January 9th, 2025]
- Machine learning can help blood tests have a separate normal for each patient - The Hindu - January 1st, 2025 [January 1st, 2025]
- Artificial Intelligence and Machine Learning Programs Introduced this Spring - The Flash Today - January 1st, 2025 [January 1st, 2025]
- Virtual reality-assisted prediction of adult ADHD based on eye tracking, EEG, actigraphy and behavioral indices: a machine learning analysis of... - January 1st, 2025 [January 1st, 2025]
- Open source machine learning systems are highly vulnerable to security threats - TechRadar - December 22nd, 2024 [December 22nd, 2024]
- After the PS5 Pro's less dramatic changes, PlayStation architect Mark Cerny says the next-gen will focus more on CPUs, memory, and machine-learning -... - December 22nd, 2024 [December 22nd, 2024]
- Accelerating LLM Inference on NVIDIA GPUs with ReDrafter - Apple Machine Learning Research - December 22nd, 2024 [December 22nd, 2024]
- Machine learning for the prediction of mortality in patients with sepsis-associated acute kidney injury: a systematic review and meta-analysis - BMC... - December 22nd, 2024 [December 22nd, 2024]
- Machine learning uncovers three osteosarcoma subtypes for targeted treatment - Medical Xpress - December 22nd, 2024 [December 22nd, 2024]
- From Miniatures to Machine Learning: Crafting the VFX of Alien: Romulus - Animation World Network - December 22nd, 2024 [December 22nd, 2024]
- Identification of hub genes, diagnostic model, and immune infiltration in preeclampsia by integrated bioinformatics analysis and machine learning -... - December 22nd, 2024 [December 22nd, 2024]
- This AI Paper from Microsoft and Novartis Introduces Chimera: A Machine Learning Framework for Accurate and Scalable Retrosynthesis Prediction -... - December 18th, 2024 [December 18th, 2024]
- Benefits and Challenges of Integrating AI and Machine Learning into EHR Systems - Healthcare IT Today - December 18th, 2024 [December 18th, 2024]
- The History Of AI: How Machine Learning's Evolution Is Reshaping Everything Around Us - SlashGear - December 18th, 2024 [December 18th, 2024]
- AI and Machine Learning to Enhance Pension Plan Governance and the Investor Experience: New CFA Institute Research - Fintech Finance - December 18th, 2024 [December 18th, 2024]
- Address Common Machine Learning Challenges With Managed MLflow - The New Stack - December 18th, 2024 [December 18th, 2024]
- Machine Learning Used To Classify Fossils Of Extinct Pollen - Offworld Astrobiology Applications? - Astrobiology News - December 18th, 2024 [December 18th, 2024]
- Machine learning model predicts CDK4/6 inhibitor effectiveness in metastatic breast cancer - News-Medical.Net - December 18th, 2024 [December 18th, 2024]
- New Lockheed Martin Subsidiary to Offer Machine Learning Tools to Defense Customers - ExecutiveBiz - December 18th, 2024 [December 18th, 2024]
- How Powerful Will AI and Machine Learning Become? - International Policy Digest - December 18th, 2024 [December 18th, 2024]
- ChatGPT-Assisted Machine Learning for Chronic Disease Classification and Prediction: A Developmental and Validation Study - Cureus - December 18th, 2024 [December 18th, 2024]
- Blood Tests Are Far From Perfect But Machine Learning Could Change That - Inverse - December 18th, 2024 [December 18th, 2024]
- Amazons AGI boss: You dont need a PhD in machine learning to build with AI anymore - Fortune - December 18th, 2024 [December 18th, 2024]