Researchers find AI is bad at predicting GPA, grit, eviction, job training, layoffs, and material hardship – VentureBeat
A paper coauthored by over 112 researchers across 160 data and social science teams found that AI and statistical models, when used to predict six life outcomes for children, parents, and households, weren't very accurate even when trained on nearly 13,000 variables describing each of more than 4,000 families. The authors assert that the work is a cautionary tale on the use of predictive modeling, especially in the criminal justice system and social support programs.
"Here's a setting where we have hundreds of participants and a rich data set, and even the best AI results are still not accurate," said study co-lead author Matt Salganik, a professor of sociology at Princeton and interim director of the Center for Information Technology Policy at the Woodrow Wilson School of Public and International Affairs. "These results show us that machine learning isn't magic; there are clearly other factors at play when it comes to predicting the life course."
The study, which was published this week in the journal Proceedings of the National Academy of Sciences, is the fruit of the Fragile Families Challenge, a multi-year collaboration that sought to recruit researchers to complete a predictive task by predicting the same outcomes using the same data. Over 457 groups applied, of which 160 were selected to participate, and their predictions were evaluated with an error metric that assessed their ability to predict held-out data (i.e., data held by the organizer and not available to the participants).
The Challenge was an outgrowth of the Fragile Families Study (formerly Fragile Families and Child Wellbeing Study) based at Princeton, Columbia University, and the University of Michigan, which has been studying a cohort of about 5,000 children born in 20 large American cities between 1998 and 2000. It's designed to oversample births to unmarried couples in those cities, and to address four questions of interest to researchers and policymakers.
"When we began, I really didn't know what a mass collaboration was, but I knew it would be a good idea to introduce our data to a new group of researchers: data scientists," said Sara McLanahan, the William S. Tod Professor of Sociology and Public Affairs at Princeton. "The results were eye-opening."
The Fragile Families Study data set consists of modules, each of which is made up of roughly 10 sections, where each section includes questions about a topic asked of the children's parents, caregivers, teachers, and the children themselves. For example, a mother who recently gave birth might be asked about relationships with extended kin, government programs, and marriage attitudes, while a 9-year-old child might be asked about parental supervision, sibling relationships, and school. In addition to the surveys, the corpus contains the results of in-home assessments, including psychometric testing, biometric measurements, and observations of neighborhoods and homes.
The goal of the Challenge was to predict the social outcomes of children aged 15 years, which encompass 1,617 variables. From those variables, six were selected to be the focus: GPA, grit, eviction, job training, layoffs, and material hardship.
Contributing researchers were provided anonymized background data from 4,242 families and 12,942 variables about each family, as well as training data incorporating the six outcomes for half of the families. Once the Challenge was completed, all 160 submissions were scored using the holdout data.
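The evaluation setup described here, where outcomes are released for half of the families and predictions are scored on the withheld half, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, the single-predictor least-squares fit stands in for a simple benchmark, and the names `split_half`, `fit_line`, and `mean_squared_error` are hypothetical helpers, not anything taken from the Challenge itself.

```python
import random


def split_half(ids, seed=0):
    """Shuffle ids reproducibly and split them into training and holdout halves."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (a, b) for y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_squared_error(ys, preds):
    """Average squared difference between observed and predicted outcomes."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Synthetic stand-in data (an assumption): one predictor and one noisy outcome per family.
rng = random.Random(42)
predictor = {i: rng.uniform(0, 10) for i in range(200)}
outcome = {i: 2.0 + 0.3 * predictor[i] + rng.gauss(0, 1) for i in range(200)}

# Participants only ever see outcomes for the training half.
train_ids, holdout_ids = split_half(predictor.keys())
a, b = fit_line([predictor[i] for i in train_ids], [outcome[i] for i in train_ids])

# The organizer scores the predictions against the withheld outcomes.
holdout_mse = mean_squared_error(
    [outcome[i] for i in holdout_ids],
    [a + b * predictor[i] for i in holdout_ids],
)
```

Because the holdout outcomes never enter the fit, the resulting error reflects how well a model generalizes to families it has not seen rather than how closely it memorized the training half.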
In the end, even the best of the over 3,000 models submitted, which often used complex AI methods and had access to thousands of predictor variables, weren't spot on. In fact, they were only marginally better than linear regression and logistic regression, which are long-established statistical methods rather than complex machine learning techniques.
"Either luck plays a major role in people's lives, or our theories as social scientists are missing some important variable," added McLanahan. "It's too early at this point to know for sure."
Measured by the coefficient of determination (R²), which captures how much of the variation in the ground truth data the best model's predictions account for, material hardship (i.e., whether 15-year-old children's parents suffered financial issues) scored 0.23, meaning about 23% of the variation was explained. GPA predictions scored 0.19 (19%), while grit, eviction, job training, and layoffs ranged from 0.03 to 0.06 (3% to 6%).
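To make the metric concrete, here is a minimal Python sketch of a coefficient of determination computed on held-out data. It assumes the baseline is a constant prediction (for example, the mean outcome in the training data); the function name `r_squared_holdout` and the toy numbers are illustrative and are not drawn from the study.

```python
def r_squared_holdout(y_true, y_pred, baseline_mean):
    """Coefficient of determination relative to a constant baseline prediction.

    1.0 means perfect predictions, 0.0 means no better than always predicting
    baseline_mean, and negative values mean worse than that baseline.
    """
    ss_model = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_baseline = sum((t - baseline_mean) ** 2 for t in y_true)
    return 1.0 - ss_model / ss_baseline


# Toy held-out outcomes and two sets of predictions (made-up numbers).
held_out = [1.0, 2.0, 3.0, 4.0]
model_predictions = [1.5, 2.5, 2.5, 3.5]
constant_predictions = [2.5, 2.5, 2.5, 2.5]

model_score = r_squared_holdout(held_out, model_predictions, baseline_mean=2.5)
constant_score = r_squared_holdout(held_out, constant_predictions, baseline_mean=2.5)
```

Read this way, a score of 0.23 says the best model cut squared prediction error by about 23% relative to the constant baseline, which is a different statement from getting 23% of cases right.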
"The results raise questions about the relative performance of complex machine-learning models compared with simple benchmark models. In the Challenge, the simple benchmark model with only a few predictors was only slightly worse than the most accurate submission, and it actually outperformed many of the submissions," concluded the study's coauthors. "Therefore, before using complex predictive models, we recommend that policymakers determine whether the achievable level of predictive accuracy is appropriate for the setting where the predictions will be used, whether complex models are more accurate than simple models or domain experts in their setting, and whether possible improvement in predictive performance is worth the additional costs to create, test, and understand the more complex model."
The research team is currently applying for grants to continue studies in this area, and they've also published 12 of the teams' results in a special issue of Socius, a new open-access journal from the American Sociological Association. To support additional research, all the submissions to the Challenge, including the code, predictions, and narrative explanations, will be made publicly available.
The Challenge isn't the first to expose the predictive shortcomings of AI and machine learning models. The Partnership on AI, a nonprofit coalition committed to the responsible use of AI, concluded in its first-ever report last year that algorithms are unfit to automate the pre-trial bail process or label some people as high-risk and detain them. Algorithms used to inform judges' decisions have been known to produce racially biased results, being more likely to label African-American inmates as at risk of recidivism.
It's well-understood that AI has a bias problem. For instance, word embedding, a common algorithmic training technique that involves linking words to vectors, unavoidably picks up and at worst amplifies prejudices implicit in source text and dialogue. A recent study by the National Institute of Standards and Technology (NIST) found that many facial recognition systems misidentify people of color more often than Caucasian faces. And Amazon's internal recruitment tool, which was trained on resumes submitted over a 10-year period, was reportedly scrapped because it showed bias against women.
A number of solutions have been proposed, from algorithmic tools to services that detect bias by crowdsourcing large training data sets.
In June 2019, working with experts in AI fairness, Microsoft revised and expanded the data sets it uses to train Face API, a Microsoft Azure API that provides algorithms for detecting, recognizing, and analyzing human faces in images. Last May, Facebook announced Fairness Flow, which automatically sends a warning if an algorithm is making an unfair judgment about a person based on their race, gender, or age. Google recently released the What-If Tool, a bias-detecting feature of the TensorBoard web dashboard for its TensorFlow machine learning framework. Not to be outdone, IBM last fall released AI Fairness 360, a cloud-based, fully automated suite that continually provides [insights] into how AI systems are making their decisions and recommends adjustments such as algorithmic tweaks or counterbalancing data that might lessen the impact of prejudice.