Archive for the ‘Machine Learning’ Category

DeepDive: estimating global biodiversity patterns through time using deep learning – Nature.com

Prediction of incomplete immunization among under-five children in East Africa from recent demographic and health … – Nature.com

Optimization of wear parameters for ECAP-processed ZK30 alloy using response surface and machine learning … – Nature.com

Experimental results

Microstructure evolution

The inverse pole figure (IPF) coloring maps and associated band contrast (BC) maps of the ZK30 alloy in its AA and ECAPed conditions are shown in Fig. 2. High-angle grain boundaries (HAGBs) are colored black, while low-angle grain boundaries (LAGBs) are colored white for the AA condition and red for the 1P and 4Bc conditions, as shown in Fig. 2. The grain size and misorientation angle distributions of the AA and ECAPed ZK30 samples are shown in Fig. 3. From Fig. 2a, it is clear that the AA condition exhibited a bimodal structure in which almost equiaxed refined grains coexist with coarse grains; the grain size ranged from 3.4 to 76.7 µm (Fig. 3a), with an average grain size of 26.69 µm. On the other hand, a low fraction of LAGBs was observed, as depicted in Fig. 3b. Accordingly, the grain boundary (GB) map (Fig. 2b) showed minimal LAGBs due to the recrystallization resulting from the annealing process. ECAP processing through 1P produced elongated grains alongside refined grains, and the grain size ranged from 1.13 to 38.1 µm with an average grain size of 3.24 µm, indicating that 1P resulted in partial recrystallization, as shown in Fig. 2c,d. As indicated in Fig. 2b, 1P processing produced a refinement in the average grain size of 87.8% compared with the AA condition. In addition, Fig. 2b shows that ECAP processing via 1P resulted in a significant increase in the grain aspect ratio due to the incomplete recrystallization process. In terms of the LAGB distribution, the GB maps of the 1P condition revealed a significant increase in the LAGB fraction (Fig. 2d): the LAGB density increased by 225% after processing via 1P compared with the AA sample (Fig. 2c). Accordingly, the ultrafine-grained (UFG) structure resulting from ECAP processing through 1P increased the fraction of LAGBs, in agreement with previous studies35,36. Shana et al.35 reported that during the early passes of ECAP dislocations are generated and multiplied, followed by entanglement of these dislocations to form LAGBs; hence, the density of LAGBs increased after processing through 1P. Accumulation of plastic strain up to 4Bc produced an almost fully UFG structure, indicating that 4Bc led to a complete dynamic recrystallization (DRX) process (Fig. 2e). The grain size ranged from 0.23 to 11.7 µm, with an average grain size of 1.94 µm (a decrease of 92.7% compared with the AA condition). On the other hand, 4Bc showed a 25.4% decrease in LAGB density compared with the 1P condition due to dynamic recovery. The decrease in LAGB density after processing through 4Bc was coupled with a 4.4% increase in HAGBs compared with the 1P condition (Figs. 2f, 3b). Accordingly, the rise in HAGBs after multiple passes can be attributed to the transformation of LAGBs into HAGBs during the DRX process.

IPF coloring maps and their corresponding BC maps, superimposed, for the ZK30 billets in the AA condition (a,b) and ECAP processed through (c,d) 1P and (e,f) 4Bc (HAGBs are shown as black lines; LAGBs are shown as white lines for AA and red lines for 1P and 4Bc).

Relative frequency of (a) grain size and (b) misorientation angle of all ZK30 samples.
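As a quick arithmetic check, the refinement percentages quoted above follow directly from the reported average grain sizes; the short Python sketch below reproduces them (values taken from the text; rounding differs slightly for 1P).

```python
# Reported average grain sizes (µm), taken from the text above.
avg_grain_um = {"AA": 26.69, "1P": 3.24, "4Bc": 1.94}

for condition in ("1P", "4Bc"):
    refinement = (avg_grain_um["AA"] - avg_grain_um[condition]) / avg_grain_um["AA"] * 100
    print(f"{condition}: average grain size reduced by {refinement:.1f}% relative to AA")
# Prints ~87.9% for 1P (quoted as 87.8% in the text) and ~92.7% for 4Bc.
```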

Similar findings were reported in previous studies. Dumitru et al.36 reported that ECAP processing results in the accumulation and re-arrangement of dislocations, which form subgrains and equiaxed grains with a UFG structure, and that a fully homogeneous and equiaxed grain structure for the ZK30 alloy was attained after the third pass. Furthermore, they reported that LAGBs are transformed into HAGBs during multiple passes, which leads to a decrease in LAGB density. Figueiredo et al.37 reported that the grains evolve during the early passes of ECAP into a bimodal structure, while further processing passes result in a homogeneous UFG structure. Zhou et al.38 reported that increasing the number of processing passes generates new grain boundaries, which increases the misorientation needed to accommodate the deformation; the geometrically necessary dislocations (GNDs) constitute part of the total dislocation content associated with HAGBs and thus develop misorientations between neighboring grains. Tong et al.39 reported that the fraction of LAGBs decreases during multiple passes for an Mg–Zn–Ca alloy.

Figure 4a displays the X-ray diffraction (XRD) patterns of the AA ZK30 alloy and the 1P and 4Bc samples, revealing peaks corresponding to the primary α-Mg phase and the Mg7Zn3 and MgZn2 phases in all conditions, with no diffraction peaks corresponding to oxide inclusions. Following 1P ECAP, the α-Mg peak intensity exhibits an initial increase, followed by a decrease and fluctuations, signaling texture alterations along the Bc route. The identification of the MgZn2 phase is supported by the equilibrium Mg–Zn binary phase diagram40. However, the weakened peak intensity detected for the MgZn2 phase after the 4Bc ECAP process indicates that a significant portion of the MgZn2 dissolved into the Mg matrix, attributed to its poor thermal stability. Furthermore, the atomic ratio of Mg/Zn for the other secondary phase is approximately 2.33, leading to the deduction that this second phase is the Mg7Zn3 compound. This finding aligns with recent research on Mg–Zn alloys41. Additionally, the diffraction patterns of the ECAP-processed samples exhibit peak broadening and shifting, indicative of microstructural adjustments during plastic deformation. These alterations were analyzed for crystallite size and micro-strain using the modified Williamson–Hall (W–H) method42, as illustrated in Fig. 4b. After a single pass of ECAP, there is a reduction in crystallite size and an increase in induced micro-strain. After four passes via route Bc, further reductions in crystallite size and a further rise in micro-strain (36 nm and 1.94 × 10⁻³, respectively) are observed. Divergent shearing patterns among the four processing routes, stemming from differences in sample rotation, result in distinct evolutions of the subgrain boundaries. Route Bc, characterized by the most extensive angular range of slip, generates subgrain bands along two shearing directions, expediting the transition of subgrain boundaries into high-angle grain boundaries43,44. Consequently, the dislocation density and induced micro-strain reach their maximum in route Bc, potentially influenced by texture modifications linked to orientation differences among the processing routes. Hence, as the number of ECAP passes increases, a more intense level of deformation is observed, promoting dynamic recrystallization and grain refinement, particularly after four passes; this enhanced deformation effectively impedes grain growth. Consequently, the number of ECAP passes is intricately linked to the equivalent strain, inducing grain-boundary pinning and resulting in the formation of finer grains. The grain refinement process can be conceptualized as a repetitive sequence of dynamic recovery and recrystallization in each pass. In the 4Bc condition, dynamic recrystallization dominates, leading to highly uniform grain refinement and causing the grain boundaries to become less distinct45. Figure 4b indicates that the microstructural features vary with the ECAP processing route, in good agreement with the grain size and mechanical properties.

(a) XRD patterns for the AA ZK30 alloy and after 1P and 4Bc ECAP processing; (b) variation of crystallite size and lattice strain as a function of processing condition using the Williamson–Hall method.
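For readers who want to reproduce the crystallite-size and micro-strain analysis, the sketch below applies the classical Williamson–Hall construction (a simplified stand-in for the modified W–H method cited above): the intercept of β·cosθ versus 4·sinθ gives the crystallite size and the slope gives the lattice strain. The peak positions, widths, and the Cu Kα wavelength are illustrative assumptions, not the measured ZK30 data.

```python
import numpy as np

# Classical Williamson-Hall analysis: beta*cos(theta) = K*lam/D + 4*eps*sin(theta)
K, lam = 0.9, 0.15406        # Scherrer shape factor and assumed Cu K-alpha wavelength (nm)
two_theta_deg = np.array([32.2, 34.4, 36.6, 47.8, 57.4])   # hypothetical peak positions
fwhm_deg = np.array([0.26, 0.28, 0.30, 0.36, 0.42])        # hypothetical peak widths (FWHM)

theta = np.radians(two_theta_deg) / 2
beta = np.radians(fwhm_deg)                  # peak breadth (FWHM) in radians
x = 4 * np.sin(theta)
y = beta * np.cos(theta)

slope, intercept = np.polyfit(x, y, 1)       # linear fit of beta*cos(theta) vs 4*sin(theta)
crystallite_size_nm = K * lam / intercept    # intercept -> crystallite size D
micro_strain = slope                         # slope -> lattice micro-strain
print(f"D ≈ {crystallite_size_nm:.0f} nm, strain ≈ {micro_strain:.2e}")
```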

Figure 5 shows the volume loss (VL) and average coefficient of friction (COF) of the AA and ECAPed ZK30 alloy. The AA billets exhibited the highest VL at all wear parameters compared with the ECAPed billets, as shown in Fig. 5. Figure 5a reveals that the wear test performed at an applied load of 1 N produced a higher VL than the other applied loads. Increasing the applied load to 3 N gave a lower VL than the 1 N counterpart at all wear speeds, and a further increase in the applied load to 5 N produced a notable decrease in VL. Similar behavior was obtained for the ECAP-processed billets after 1P (Fig. 5c) and 4Bc (Fig. 5e). The VL improved (decreased) with increasing applied load for all samples, as shown in Fig. 5, indicating an enhancement in wear resistance. Increasing the applied load increases the strain hardening of the contacting ZK30 surfaces, as reported by Yasmin et al.46 and Kori et al.47. Accordingly, increasing the applied load increases the friction force, which in turn hinders dislocation motion and results in higher deformation, so that ZK30 experiences strain hardening; hence, the resistance to abrasion increases, leading to improved wear resistance48. Furthermore, increasing the applied load increases the surface area in contact with the wear ball and hence the gripping action of asperities, which helps reduce the wear rate of the ZK30 alloy, as reported by Thuong et al.48. In contrast, increasing the wear speed increased the VL of the AA billets at all wear loads. For the ECAPed billet processed through 1P, the wear speed of 125 mm/s produced the lowest VL, while the wear speed of 250 mm/s showed the highest VL (Fig. 5c); similar behavior was recorded for the 4Bc condition. In addition, Fig. 5c shows that the 1P condition exhibited a higher VL than 4Bc (Fig. 5e) at all wear parameters, indicating that processing via multiple passes resulted in significant grain refinement (Fig. 2); hence, higher hardness and better wear behavior were attained, in agreement with a previous study7. In addition, Fig. 5 shows that increasing the wear speed generally increased the VL. For the AA billets tested at a 1 N load, the VL was 1.52 × 10⁻⁶ m³. ECAP processing via 1P significantly improved the wear behavior, as the VL was reduced by 85% compared with the AA condition, while straining through 4Bc improved the VL by 99.8% compared with the AA condition, which is accounted for by the considerable refinement that 4Bc provides. A similar trend was observed for the ECAPed ZK30 samples tested at loads of 3 and 5 N (Fig. 5). Accordingly, the significant grain refinement after ECAP processing (Fig. 2) increased the grain boundary area; hence, a thicker protective oxide layer can form, leading to improved wear resistance of the ECAPed samples. It is worth mentioning that the grain refinement, coupled with the refinement and redistribution of the secondary-phase particles produced by ECAP processing through multiple passes, improves the hardness, wear behavior, and mechanical properties in accordance with the Hall–Petch equation7,13,49. Similar findings were noted for the ZK30 billets tested at a 3 N load: processing through 1P and 4Bc decreased the VL by 85% and 99.85%, respectively, compared with the AA counterpart. A similar result was recorded for the ZK30 billets tested at a 5 N load.

Volume loss of ZK30 alloy (a,c,e) and the average coefficient of friction (b,d,f) in its (a,b) AA, (c,d) 1P and (e,f) 4Bc conditions as a function of different wear parameters.

From Fig. 5, it can be noticed that the COF curves exhibited notable fluctuations even after smoothing the data with a least-squares method, confirming that the friction during testing of the ECAPed ZK30 alloy was not steady over time. The remarkable variation in the COF can be attributed to the relatively small loads applied to the surface of the ZK30 samples. Furthermore, the results in Fig. 5 reveal that ECAP processing reduced the COF and, hence, better wear behavior was attained. In addition, for all ZK30 samples, the highest applied load (5 N) coupled with the lowest wear time (110 s) exhibited a better COF and better wear behavior. These findings agree with Farhat et al.50, who reported that decreasing the grain size improves the COF and hence the wear behavior. They also reported that plastic deformation occurs due to friction between the contacting surfaces and is resisted by the grain boundaries and fine secondary phases; in addition, the strain hardening resulting from ECAP processing decreases the COF and improves the VL50. Sankuru et al.43 reported that ECAP processing of pure Mg results in substantial grain refinement, which was reflected in improvements in both the microhardness and the wear rate of the ECAPed billets; furthermore, they found that increasing the number of passes up to 4Bc reduced the wear rate by 50% compared with the AA condition. Based on the applied load, wear velocity, and distance, the wear mechanism can be classified into mild and severe wear regimes49. The wear test parameters in the present study (loads up to 5 N and speeds up to 250 mm/s) fall in the mild wear regime, where delamination wear and oxidation wear mechanisms would predominantly take place43,51.

The worn surface morphologies of the ZK30 AA billet and the ECAPed billet processed through 4Bc are shown in Fig. 6. Figure 6 reveals numerous wear grooves aligned parallel to the sliding direction on the worn surfaces of both the AA (Fig. 6a) and 4Bc (Fig. 6b) conditions. Accordingly, the worn surface included a combination of adhesion regions and plastic deformation bands along the sliding direction. Furthermore, wear debris can be observed adhering to the ZK30 worn surface, indicating that an abrasion wear mechanism had occurred52. Lim et al.53 reported that hard particles between contacting surfaces scratch the samples and remove small fragments, and hence the wear process occurs. In addition, Fig. 6a,b shows that the wear grooves on the AA billet were much wider than those on the 4Bc sample, which confirms the effectiveness of ECAP processing in improving the wear behavior of the ZK30 alloy. Based on the aforementioned findings, it can be concluded that the ECAP-processed billets exhibited enhanced wear behavior, which can be attributed to the obtained UFG structure52.

SEM micrograph of the worn surface after the wear test: (ac) AA alloy; (b) ECAP-processed through 4Bc.

Several regression transformation approaches and associations among the independent variables were investigated in order to model the wear output responses. The association between the supplied parameters and the resulting responses was modeled using quadratic regression. The models created in the course of the experiment are considered statistically significant and can be used to forecast the response parameters in relation to the input control parameters when the coefficient of determination (R2) is as close as possible to 1. The regression Eqs. (9)–(14) represent the predicted non-linear models of volume loss (VL) and coefficient of friction (COF) at the different passes as a function of velocity (V) and applied load (P), with their associated determination and adjusted coefficients. The adjusted R2 and R2 values in the current study ranged between 95.67% and 99.97%, which is very close to unity.

AA condition:

$$VL = 1.52067\times 10^{-6} - 1.89340\times 10^{-9}\,P - 4.81212\times 10^{-11}\,V + 8.37361\times 10^{-12}\,P V - 2.91667\times 10^{-10}\,P^{2} - 2.39989\times 10^{-14}\,V^{2} \quad (9)$$

$$\frac{1}{COF} = 2.72098 + 0.278289\,P - 0.029873\,V - 0.000208\,P V + 0.047980\,P^{2} + 0.000111\,V^{2} - 0.000622\,P^{2} V + 6.39031\times 10^{-6}\,P V^{2} \quad (10)$$

1P condition:

$$VL = 2.27635\times 10^{-7} + 7.22884\times 10^{-10}\,P - 2.46145\times 10^{-11}\,V - 1.03868\times 10^{-11}\,P V - 1.82621\times 10^{-10}\,P^{2} + 6.10694\times 10^{-14}\,V^{2} + 8.76819\times 10^{-13}\,P^{2} V + 2.48691\times 10^{-14}\,P V^{2} \quad (11)$$

$$\frac{1}{COF} = -0.383965 + 1.53600\,P + 0.013973\,V - 0.002899\,P V - 0.104246\,P^{2} - 0.000028\,V^{2} \quad (12)$$

4Bc condition:

$$VL = 2.29909\times 10^{-8} - 2.29012\times 10^{-10}\,P + 2.46146\times 10^{-11}\,V - 6.98269\times 10^{-12}\,P V - 1.98249\times 10^{-11}\,P^{2} - 7.08320\times 10^{-14}\,V^{2} + 3.23037\times 10^{-13}\,P^{2} V + 1.70252\times 10^{-14}\,P V^{2} \quad (13)$$

$$\frac{1}{COF} = 2.77408 - 0.010065\,P - 0.020097\,V - 0.003659\,P V + 0.146561\,P^{2} + 0.000099\,V^{2} \quad (14)$$
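The quadratic response-surface form of Eqs. (9)–(14) can be fitted by ordinary least squares once a design matrix with the linear, interaction, and squared terms is assembled. The sketch below shows a minimal NumPy version; the P, V, and VL arrays are placeholders standing in for the experimental wear table, and the third-order P²V and PV² terms present in some of the equations are omitted for brevity.

```python
import numpy as np

# Least-squares fit of a quadratic response surface:
# y = b0 + b1*P + b2*V + b3*P*V + b4*P^2 + b5*V^2
P_data = np.array([1, 1, 1, 3, 3, 3, 5, 5, 5], dtype=float)
V_data = np.array([64.5, 125, 250] * 3, dtype=float)
y_data = np.array([1.55e-6, 1.54e-6, 1.53e-6,          # hypothetical VL values
                   1.53e-6, 1.52e-6, 1.51e-6,
                   1.51e-6, 1.50e-6, 1.50e-6])

X = np.column_stack([np.ones_like(P_data), P_data, V_data,
                     P_data * V_data, P_data**2, V_data**2])
coef, *_ = np.linalg.lstsq(X, y_data, rcond=None)

y_pred = X @ coef
r2 = 1 - np.sum((y_data - y_pred)**2) / np.sum((y_data - y_data.mean())**2)
print("coefficients:", coef)
print(f"R^2 = {r2:.4f}")
```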

The experimental data are plotted in Fig. 7 against the corresponding predicted values of VL and COF for zero, one, and four passes. The minimum output value is indicated by blue dots, which gradually change to red points at the maximum output value. The effectiveness of the developed regression models is supported by the analysis of these maps, which shows that the experimental and predicted values match remarkably well and that most of the points lie close to the median line.

Comparison between VL and COF of experimental and predicted values of ZK30 at AA, 1P, and 4Bc.

Figure 8 displays 3D response plots created from the regression models to assess the changes in VL and COF with the wear parameters (P and V) at the various ECAP passes. For VL, the volume loss and applied load exhibit an inverse proportionality at the various ECAP passes, as is apparent in Fig. 8a–c. Increasing the applied load in the wear process minimizes VL, so the optimum (minimum) VL was obtained at an applied load of 5 N. There is also an inverse relation between the wear speed V and VL at the different ECAP passes, and the wear speed clearly needs to be adjusted for billets with different numbers of passes: a higher number of passes requires a lower wear speed to minimize VL. The minimum VL at zero pass is 1.50085 × 10⁻⁶ m³, obtained at 5 N and 250 mm/s. At a single pass, the optimal VL is 2.2266028 × 10⁻⁷ m³, obtained at 5 N and 148 mm/s. Finally, the minimum VL at four passes is 2.07783 × 10⁻⁸ m³ at 5 N and 64.5 mm/s.

Three-dimensional plots of VL (a–c) and COF (d–f) of ZK30 at AA, 1P, and 4Bc.

Figure 8d–f presents the effect of the wear parameters P and V on the COF of the ECAPed ZK30 billets at zero, one, and four passes. The coefficient of friction is inversely proportional to the applied load in the wear process; as a result, the minimum (optimum) COF of the ZK30 billets at the different numbers of passes was obtained at 5 N. On the other hand, the optimum wear speed decreased with the number of billet passes: the optimum wear speeds for the billets at zero, one, and four passes are 250, 64.5, and 64.5 mm/s, respectively. The minimum COF at zero pass is 0.380134639, obtained at 5 N and 250 mm/s. At 5 N and 64.5 mm/s, the lowest COF at one pass is 0.220277466. Finally, the minimum COF at four passes is 0.23130154 at 5 N and 64.5 mm/s.
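A surface like the panels of Fig. 8 can be regenerated by evaluating a fitted model on a grid of (P, V) values. The sketch below does this for the AA-condition VL model of Eq. (9) using Matplotlib; it illustrates the plotting step only and is not the software actually used by the authors.

```python
import numpy as np
import matplotlib.pyplot as plt

# Evaluate the fitted AA-condition VL model (Eq. 9) on a P-V grid and draw a 3D surface.
def vl_aa(P, V):
    return (1.52067e-6 - 1.89340e-9 * P - 4.81212e-11 * V
            + 8.37361e-12 * P * V - 2.91667e-10 * P**2 - 2.39989e-14 * V**2)

P, V = np.meshgrid(np.linspace(1, 5, 50), np.linspace(64.5, 250, 50))
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(P, V, vl_aa(P, V), cmap="viridis")
ax.set_xlabel("Load P (N)")
ax.set_ylabel("Speed V (mm/s)")
ax.set_zlabel("Volume loss (m$^3$)")
plt.show()
```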

The previously mentioned modern ML algorithms have been used here to provide a solid foundation for analyzing the obtained data and gaining significant insights. The following section will give the results acquired by employing these approaches and thoroughly discuss the findings.

The correlation plots and correlation coefficients (Fig. 9) between the input variables (force and speed) and the six output variables (VL_P0, VL_P1, VL_P4, COF_P0, COF_P1, and COF_P4), computed as part of the data preprocessing for the ML models, give valuable insights into the interactions between these variables. Correlation charts help to investigate the strength and direction of a linear relationship between model input and output variables. We can initially observe whether there is a positive, negative, or no correlation between each pair of variables by inspecting the scatterplots; this knowledge aids in comprehending how changes in one variable affect the other. In contrast, the correlation coefficient offers a numerical assessment of the strength and direction of the linear relationship. It ranges from −1 to 1, with values near −1 indicating a strong negative correlation, values close to 1 indicating a strong positive correlation, and values close to 0 indicating no or weak association. It is critical to examine the size and importance of the correlation coefficients when examining the correlation between the force and speed input variables and the six output variables (VL_P0, VL_P1, VL_P4, COF_P0, COF_P1, and COF_P4). A high positive correlation coefficient implies that a rise in one variable is associated with an increase in the other, whereas a high negative correlation coefficient indicates that an increase in one variable is associated with a decrease in the other. From Fig. 9 it is clear that for all ZK30 billets, both the VL and the COF were inversely proportional to the applied load (in the range of 1 to 5 N). Regarding the wear speed, the VL of both the AA and 1P conditions was inversely proportional to the wear speed, while that of 4Bc was directly proportional to the wear speed (in the range of 64.5 to 250 mm/s), whereas the COF of all samples was inversely proportional to the wear speed. The VL of the AA condition (P0) showed a strong negative correlation coefficient of −0.82 with the applied load, while it displayed an intermediate negative coefficient of −0.49 with the wear speed. For the 1P condition, the VL showed a strong negative correlation of −0.74 with the applied load and a very weak negative correlation coefficient of −0.13 with the speed. Furthermore, the VL of the 4Bc condition displayed a strong negative correlation of −0.99 with the applied load and a weak positive correlation coefficient of 0.08 with the speed. A similar trend was observed for the COF: the AA, 1P, and 4Bc samples displayed intermediate negative coefficients of −0.047, −0.65, and −0.61, respectively, with the applied load, while they showed weak negative coefficients of −0.4, −0.05, and −0.22, respectively, with the wear speed.

Correlation plots of input and output variables showcasing the strength and direction of the relationship between each input–output variable pair using correlation coefficients.
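The correlation analysis above is a standard Pearson-correlation preprocessing step. A minimal sketch is given below, assuming the experimental table has been exported to a hypothetical CSV file with the column names used in the text (force, speed, VL_P0, …, COF_P4); the file name and column names are assumptions, not the authors' data files.

```python
import pandas as pd

# Pearson correlation between the wear inputs (force, speed) and the six outputs, as in Fig. 9.
df = pd.read_csv("zk30_wear_data.csv")   # hypothetical file: columns force, speed, VL_P0, ..., COF_P4

corr = df.corr(method="pearson")
# Show only the correlations of each output with the two inputs:
print(corr.loc[["force", "speed"],
               ["VL_P0", "VL_P1", "VL_P4", "COF_P0", "COF_P1", "COF_P4"]].round(2))
```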

Figure 10 shows the predicted training and test VL values compared with the actual data, indicating that the VL prediction models built with the linear regression (LR) technique performed well. The R2-score is a popular statistic for assessing the goodness of fit of a regression model; it runs from 0 to 1, with higher values indicating better performance. In this case, the R2-scores for the training and test datasets range from 0.55 to 0.99, indicating that the ML models have established a significant correlation between the predicted VL values and the actual data and can account for a considerable percentage of the variability in VL.

Predicted train and predicted test VL versus actual data computed for different applied loads and number of passes of (a) 0P (AA), (b) 1P, and (c) 4Bc: evaluating the performance of the VL prediction best model achieved using LR algorithm.
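A minimal version of the train/test evaluation described above is sketched below using scikit-learn (the library choice is an assumption; the paper does not name one). The (P, V) inputs and VL targets are placeholder values, and the split ratio and random seed are illustrative rather than the authors' settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Train/test evaluation of a linear-regression VL model, mirroring Fig. 10.
X = np.array([[1, 64.5], [1, 125], [1, 250],
              [3, 64.5], [3, 125], [3, 250],
              [5, 64.5], [5, 125], [5, 250]], dtype=float)   # (P, V) placeholders
y = np.array([1.55e-6, 1.54e-6, 1.53e-6,
              1.53e-6, 1.52e-6, 1.51e-6,
              1.51e-6, 1.50e-6, 1.50e-6])                    # placeholder VL values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)
model = LinearRegression().fit(X_tr, y_tr)
print(f"train R2 = {r2_score(y_tr, model.predict(X_tr)):.2f}")
print(f"test  R2 = {r2_score(y_te, model.predict(X_te)):.2f}")
```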

The R2-scores for training and testing three distinct ML models for the output variables VL_P0, VL_P1, and VL_P4 are summarized in Fig. 11. The R2-score, also known as the coefficient of determination, is a number ranging from 0 to 1 that indicates how well the model fits the data. For VL_P0, the R2 for testing is 0.69 and that for training is 0.96, indicating that the ML model predicts the VL_P0 variable with reasonable accuracy on unseen data, while the R2 value of 0.96 for training suggests that the model fits the training data rather well. In summary, the performance of the ML models changes depending on the output variable. With R2 values of 0.98 for both training and testing, the model predicts 'VL_P4' with great accuracy. The model's performance for 'VL_P0' is reasonable, with an R2 score of 0.69 for testing and a high R2 score of 0.96 for training, whereas the model's performance for 'VL_P1' is relatively poor, with R2 values of 0.55 for testing and 0.57 for training. Additional assessment measures must be considered to understand the models' prediction capabilities well. Therefore, as presented in the following section, we performed non-linear polynomial fitting and extracted equations that accurately link the output and input variables.

Result summary of ML train and test sets displaying R2-score for each model.

Furthermore, the data were subjected to polynomial fitting with first- and second-degree models (Fig. 12). The fitting accuracy was assessed using the R2-score, which ranged from 0.92 to 0.98, indicating a good fit. The following equations (Eqs. (15)–(17)) were extracted by fitting the experimental volume-loss dataset at different conditions of applied load (P) and speed (V):

$$VL_{P0} = 1.519\times 10^{-6} - 2.417\times 10^{-9}\,P - 3.077\times 10^{-11}\,V \quad (15)$$

$$VL_{P1} = 2.299\times 10^{-7} - 5.446\times 10^{-10}\,P - 5.431\times 10^{-11}\,V - 5.417\times 10^{-11}\,P^{2} + 2.921\times 10^{-12}\,P V + 1.357\times 10^{-13}\,V^{2} \quad (16)$$

$$VL_{P4} = 2.433\times 10^{-8} - 6.200\times 10^{-10}\,P + 1.042\times 10^{-12}\,V \quad (17)$$

Predicted versus actual (a) VL_P0 fitted to Eq.15 with R2-score of 0.92, (b) VL_P1 fitted to Eq.16 with R2-score of 0.96, (c) VL_P4 fitted to Eq.17 with R2-score of 0.98.
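The degree-1 and degree-2 fits of Eqs. (15)–(17) correspond to ordinary least squares on polynomial feature expansions of (P, V). A hedged scikit-learn sketch is shown below; the data arrays are placeholders, so the printed coefficients and R² values will not reproduce the published ones.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# First- and second-degree polynomial fits in (P, V), analogous to Eqs. (15)-(17).
X = np.array([[1, 64.5], [1, 125], [1, 250],
              [3, 64.5], [3, 125], [3, 250],
              [5, 64.5], [5, 125], [5, 250]], dtype=float)   # placeholder (P, V)
y = np.array([1.55e-6, 1.54e-6, 1.53e-6,
              1.53e-6, 1.52e-6, 1.51e-6,
              1.51e-6, 1.50e-6, 1.50e-6])                    # placeholder VL

for degree in (1, 2):
    poly = PolynomialFeatures(degree=degree, include_bias=True)
    Xp = poly.fit_transform(X)
    reg = LinearRegression(fit_intercept=False).fit(Xp, y)   # bias is already a column
    print(f"degree {degree}: R2 = {r2_score(y, reg.predict(Xp)):.3f}")
    print("  terms:", poly.get_feature_names_out(["P", "V"]))
    print("  coefs:", np.round(reg.coef_, 12))
```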

Figure 13 depicts the predicted training and test coefficient of friction (COF) values plotted against the actual data. The figure seeks to assess the performance of the best models obtained using the SVM (support vector machine) and GPR (Gaussian process regression) algorithms for the various applied loads and numbers of passes (0, 1P, and 4Bc). By showing the predicted training and test COF values alongside the actual data, the figure assesses the accuracy and efficacy of the COF prediction models; comparing predicted and actual data points shows how closely the models match the true values. The ML models trained and evaluated on the output variables 'COF_P0', 'COF_P1', and 'COF_P4' using the SVM and GPR algorithms show high accuracy and performance, as summarized in Fig. 13. The R2 scores for testing vary from 0.97 to 0.99, showing that the models efficiently capture the variability of the predicted variables. Furthermore, the training R2 scores are consistently high at 0.99, demonstrating a solid fit to the training data. These findings imply that the ML models can accurately predict the values of 'COF_P0', 'COF_P1', and 'COF_P4' and generalize well to new unseen data.

Predicted train and predicted test COF versus actual data computed for different applied loads and number of passes of (a) 0P (AA), (b) 1P, and (c) 4Bc: evaluating the performance of the COF prediction best model achieved using SVM and GPR algorithms.
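For completeness, the sketch below shows how COF models of the kind described above could be fitted with scikit-learn's SVR and GaussianProcessRegressor. The kernel choices, hyperparameters, and data values are assumptions made for illustration; the paper does not specify these settings.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import r2_score

# COF regression with the two model families named above (placeholder data).
X = np.array([[1, 64.5], [1, 125], [1, 250],
              [3, 64.5], [3, 125], [3, 250],
              [5, 64.5], [5, 125], [5, 250]], dtype=float)
y = np.array([0.52, 0.49, 0.46, 0.45, 0.43, 0.41, 0.38, 0.36, 0.34])

models = {
    "SVR": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01)),
    "GPR": make_pipeline(StandardScaler(),
                         GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                                                  alpha=1e-4)),
}
for name, model in models.items():
    model.fit(X, y)
    print(f"{name}: R2 = {r2_score(y, model.predict(X)):.3f}")
```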

Figure 14 presents a summary of the results obtained through machine learning modeling. The R2 values achieved for COF modeling using SVM and GPR are 0.99 for the training set and range from 0.97 to 0.99 for the testing dataset. These values indicate that the models have successfully captured and accurately represented the trends in the dataset.

Result summary of ML train and test sets displaying R2-score for each model.

The results of the RSM optimization carried out on the volume loss and coefficient of friction at zero pass (AA), along with the corresponding variables, are shown in Appendix A-1. The red and blue dots represent the wear conditions (P and V) and the responses (VL and COF) for each of the resulting optimization findings. The optimization objectives for the volume loss and coefficient of friction were set to 'in range', with 'minimize' as the solution target, and the desirability function was of the smaller-is-better type. The optimal conditions for volume loss were (A) P = 5 N and (B) V = 250 mm/s; Appendix A-1(a) shows that this resulted in the lowest attainable volume loss of 1.50127 × 10⁻⁶ m³. The optimal conditions for the coefficient of friction were (A) P = 2.911 N and (B) V = 250 mm/s, which led to the lowest possible coefficient of friction of 0.324575, as shown in Appendix A-1(b).

Appendix A-2 displays the outcomes of the RSM optimization performed on the volume loss and coefficient of friction at one pass, together with the corresponding variables. The optimization objectives for the volume loss and coefficient of friction were set to 'in range', with 'minimize' as the solution objective, and the desirability function was of the smaller-is-better type. The optimal conditions for volume loss were (A) P = 4.95 N and (B) V = 136.381 mm/s, yielding the lowest feasible volume loss of 2.22725 × 10⁻⁷ m³, as seen in Appendix A-2(a). The optimal P and V values for the coefficient of friction were (A) P = 5 N and (B) V = 64.5 mm/s, which, as demonstrated in Appendix A-2(b), resulted in the lowest achievable coefficient of friction of 0.220198.

Similarly, Appendix A-3 displays the outcomes of the RSM optimization performed on the volume loss and coefficient of friction at four passes, together with the corresponding variables. The optimization objectives for the volume loss and coefficient of friction were set to 'in range', with 'minimize' as the solution objective, and the desirability function was of the smaller-is-better type. The optimal conditions for volume loss were (A) P = 5 N and (B) V = 77.6915 mm/s, yielding the lowest feasible volume loss of 2.12638 × 10⁻⁸ m³, as seen in Appendix A-3(a). The optimal P and V values for the coefficient of friction were (A) P = 4.95612 N and (B) V = 64.9861 mm/s, which, as seen in Appendix A-3(b), resulted in the lowest achievable coefficient of friction of 0.235109.
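Because the RSM optimization amounts to minimizing the fitted models over the bounded wear window, the same optima can be approximated numerically. The sketch below minimizes the AA-condition models of Eqs. (9) and (10) with SciPy under the bounds 1 ≤ P ≤ 5 N and 64.5 ≤ V ≤ 250 mm/s; note that Eq. (10) models 1/COF, so COF is taken as its reciprocal. This is a cross-check of the idea, not the RSM desirability workflow itself.

```python
import numpy as np
from scipy.optimize import minimize

def vl_aa(x):
    P, V = x
    return (1.52067e-6 - 1.89340e-9*P - 4.81212e-11*V + 8.37361e-12*P*V
            - 2.91667e-10*P**2 - 2.39989e-14*V**2)            # Eq. (9)

def cof_aa(x):
    P, V = x
    inv = (2.72098 + 0.278289*P - 0.029873*V - 0.000208*P*V + 0.047980*P**2
           + 0.000111*V**2 - 0.000622*P**2*V + 6.39031e-6*P*V**2)  # Eq. (10) = 1/COF
    return 1.0 / inv

bounds = [(1, 5), (64.5, 250)]
for name, fun in [("VL", vl_aa), ("COF", cof_aa)]:
    res = minimize(fun, x0=[3.0, 150.0], bounds=bounds, method="L-BFGS-B")
    print(f"min {name} = {res.fun:.6g} at P = {res.x[0]:.3f} N, V = {res.x[1]:.3f} mm/s")
```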

The most appropriate combination of independent wear factors contributing to the minimum feasible volume loss and coefficient of friction was determined using a genetic algorithm (GA). In the GA approach, the objective function for each response was obtained by taking Eqs. (9)–(14) and subjecting them to the wear boundary conditions on P and V. The objective functions can be expressed as: minimize (VL, COF), subject to the ranges of wear conditions 1 ≤ P ≤ 5 N and 64.5 ≤ V ≤ 250 mm/s.
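A genetic algorithm for this bounded minimization can be written in a few lines. The sketch below is a minimal real-coded GA (truncation selection, intermediate crossover, Gaussian mutation) applied to the AA-condition VL model of Eq. (9); the population size and generation count loosely follow the MOGA settings reported later, while the selection and mutation details are simplifications and not the MATLAB GA Toolbox configuration used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def vl_aa(P, V):
    # AA-condition VL model, Eq. (9)
    return (1.52067e-6 - 1.89340e-9*P - 4.81212e-11*V + 8.37361e-12*P*V
            - 2.91667e-10*P**2 - 2.39989e-14*V**2)

lo, hi = np.array([1.0, 64.5]), np.array([5.0, 250.0])   # bounds on (P, V)
pop = rng.uniform(lo, hi, size=(50, 2))                  # initial population

for gen in range(300):
    fitness = vl_aa(pop[:, 0], pop[:, 1])
    parents = pop[np.argsort(fitness)[:25]]              # keep the better half
    # intermediate crossover between randomly paired parents
    a = parents[rng.integers(25, size=50)]
    b = parents[rng.integers(25, size=50)]
    w = rng.uniform(size=(50, 1))
    children = w * a + (1 - w) * b
    # Gaussian mutation, clipped back into the bounds
    children += rng.normal(scale=0.02 * (hi - lo), size=children.shape)
    pop = np.clip(children, lo, hi)

best = pop[np.argmin(vl_aa(pop[:, 0], pop[:, 1]))]
print("best (P, V):", best, " VL:", vl_aa(*best))
```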

Figures 15 and 16 show the performance of the GA optimization technique in terms of fitness value and the running solver view, as obtained from MATLAB, together with the corresponding wear conditions for the lowest VL and COF at zero pass. Equations (9) and (10) were used as the fitness functions for minimizing VL and COF, subject to the wear boundary limits. According to Fig. 15a, the lowest VL value found by the GA was 1.50085 × 10⁻⁶ m³ at P = 5 N and V = 249.993 mm/s. Furthermore, the GA yielded a minimum COF value of 0.322531 at P = 2.91 N and V = 250 mm/s (Fig. 15b).

Optimum VL (a) and COF (b) by GA at AA condition.

Optimum VL (a) and COF (b) by hybrid DOE-GA at AA condition.

The hybrid DOE–GA analysis was carried out to enhance the GA outcomes. The optimal wear conditions for VL and COF at zero pass were used to determine the initial population of the hybrid DOE–GA. The hybrid DOE–GA yielded a minimum VL value of 1.50085 × 10⁻⁶ m³ at a speed of 249.993 mm/s and a load of 5 N (Fig. 16a). Similarly, at a load of 2.91 N and a speed of 250 mm/s, the hybrid DOE–GA yielded a minimum COF of 0.322531 (Fig. 16b).

The fitness functions, as defined by Eqs. (11) and (12), were the minimization of VL and COF at 1P, subject to the wear boundary conditions. Figure 17a,b displays the optimal values of VL and COF obtained by the GA, which were 2.2266 × 10⁻⁷ m³ and 0.220278, respectively. The lowest VL was recorded at 147.313 mm/s and 5 N, whereas 5 N and 64.5 mm/s were the optimum wear conditions for COF as determined by the GA. The hybrid DOE–GA results for the minimum VL and COF at a single pass were 2.2266 × 10⁻⁷ m³ and 0.220278, respectively, obtained at 147.313 mm/s and 5 N for VL, as shown in Fig. 18a, and at 5 N and 64.5 mm/s for COF, as shown in Fig. 18b.

Optimum VL (a) and COF (b) by GA at 1P condition.

Optimum VL (a) and COF (b) by hybrid DOE-GA at 1P condition.

Subject to the wear boundary conditions, the fitness functions were the minimization of VL and COF at four passes, as defined by Eqs. (13) and (14). The optimum values of VL and COF obtained by the GA, shown in Fig. 19a,b, were 2.12638 × 10⁻⁸ m³ and 0.231302, respectively. The lowest VL was recorded at 5 N and 77.762 mm/s, whereas the GA found that the optimal wear conditions for COF were 5 N and 64.5 mm/s. As shown in Fig. 20a,b, the hybrid DOE–GA findings for the minimum VL and COF at four passes were 2.12638 × 10⁻⁸ m³ and 0.231302, respectively; these results were achieved at 77.762 mm/s and 5 N for VL and at 5 N and 64.5 mm/s for COF.

Optimum VL (a) and COF (b) by GA at 4Bc condition.

Optimum VL (a) and COF (b) by hybrid DOE-GA at 4Bc condition.

The multi-objective genetic algorithm (MOGA) technique solves a mathematical model in which the input process parameters influence the quality of the output responses54. In the current study, MOGA was implemented using the GA Toolbox in MATLAB 2020, with the regression models serving as the objective functions; the P and V input wear parameter values served as the lower and upper bounds, and the number of parameters was set to three. The following MOGA settings were then selected: an initial population of fifty individuals, 300 generations, a migration interval of 20, a migration fraction of 0.2, and a Pareto fraction of 0.35. Constraint-dependent mutation and intermediate crossover with a crossover probability of 0.8 were used for optimization. The outcome of MOGA is the Pareto optimum, also known as a non-dominated solution set: a group of solutions that considers all of the objectives without sacrificing any of them55.
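To make the notion of a non-dominated (Pareto) set concrete, the short sketch below filters candidate (VL, COF) pairs and keeps only those for which no other candidate is at least as good in both objectives and strictly better in one; the candidate values are illustrative, not the study's tabulated Pareto points.

import numpy as np

def pareto_front(points):
    # Return the non-dominated rows of an (n, 2) array of objectives to minimize.
    points = np.asarray(points)
    keep = []
    for i, p in enumerate(points):
        dominated = np.any(np.all(points <= p, axis=1) & np.any(points < p, axis=1))
        if not dominated:
            keep.append(i)
    return points[keep]

candidates = np.array([
    [1.50096e-6, 0.403],   # low VL, high COF
    [1.50250e-6, 0.370],   # trade-off point
    [1.50541e-6, 0.341],   # high VL, low COF
    [1.50600e-6, 0.350],   # dominated by the row above
])
print(pareto_front(candidates))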

MOGA, treating both responses as a multi-objective function, was utilized to identify the lowest possible values of the volume loss and coefficient of friction at zero pass. Equations (9) and (10) were the fitness functions for volume loss and coefficient of friction at zero pass for ZK30. The Pareto front values for the volume loss and coefficient of friction at zero pass, as determined by MOGA, are listed in Table 2. The Pareto chart points for volume loss (Objective 1) and coefficient of friction (Objective 2) at zero pass are shown in Fig. 21. The coefficient of friction decreases as the volume loss increases; consequently, a reduction in the coefficient of friction can only be obtained by accepting a higher volume loss. For zero pass, the best volume loss was 1.50096E-06 m3 at a sacrificed coefficient of friction of 0.402941, whereas the worst volume loss was 1.50541E-06 m3 at the best coefficient of friction of 0.341073.

The genetic algorithm was also used for the multi-objective minimization of volume loss and coefficient of friction at one pass, with Eqs. (11) and (12) as the respective fitness functions. Table 3 displays the Pareto front points of volume loss and coefficient of friction at one pass, and Fig. 22 presents the corresponding Pareto chart points for volume loss (Objective 1) and coefficient of friction (Objective 2). Again, the coefficient of friction decreases as the volume loss increases, so the volume loss can be reduced only at the expense of a higher coefficient of friction. The best volume loss for a single pass was 2.22699E-07 m3 with a worst (maximum) coefficient of friction of 0.242371, while the best (minimum) coefficient of friction of 0.224776 occurred at a volume loss of 2.23405E-07 m3.

The multi-objective minimization of volume loss and coefficient of friction at four passes used Eqs. (13) and (14), respectively, as the fitness functions. The Pareto front points of volume loss and coefficient of friction at four passes are shown in Table 4, and the corresponding Pareto chart points for volume loss (Objective 1) and coefficient of friction (Objective 2) are shown in Fig. 23. As before, the coefficient of friction decreases as the volume loss increases, so the volume loss can be decreased only at the expense of an increased coefficient of friction. For four passes, the best minimum coefficient of friction was 0.2313046 at a volume loss of 2.12663E-08 m3, and the best minimum volume loss was 2.126397E-08 m3 at a coefficient of friction of 0.245145. In addition, Table 5 compares the wear response values obtained by DOE, RSM, GA, hybrid RSM-GA, and MOGA.

This section proposed the optimal wear parameters for the different responses, namely the VL and COF of ZK30. The presented optimal wear parameters, P and V, are based on previous studies of ZK30 that recommended applied loads from 1 to 30 N and speeds from 64.5 to 1000 mm/s. Table 6 presents the optimal conditions of the wear process for the different responses as determined by the genetic algorithm (GA).

Table 7 displays the validity of the wear regression model for VL under several conditions. The wear models were validated under various load and speed conditions. Based on the validation data, the volume loss response models had the lowest error percentage between the experimental and regression values and were the most accurate. Table 7 shows that the predictive modeling performance has been validated, as indicated by the reasonably high accuracy obtained, ranging from 69.7 to 99.9%.

Equations (15)-(17) provide insights into the relationship linking the volume loss with applied load and speed, allowing us to understand how changes in these factors affect the volume loss in the given system. The validity of this modeling was further examined using a new, unseen dataset from which the prediction error and accuracy were calculated, as shown in Table 8. Table 8 demonstrates that the predictive modeling performance has been validated, as evidenced by the obtained accuracy of 69.7 to 99.9%, which is reasonably high.

Original post:
Optimization of wear parameters for ECAP-processed ZK30 alloy using response surface and machine learning ... - Nature.com

Machine learning approach predicts heart failure outcome risk – HealthITAnalytics.com

April 22, 2024 - Researchers from the University of Virginia (UVA) have developed a machine learning tool designed to assess and predict adverse outcome risks for patients with advanced heart failure with reduced ejection fraction (HFrEF), according to a recent study published in the American Heart Journal.

The research team indicated that risk models for HFrEF exist, but few are capable of addressing the challenge of missing data or incorporating invasive hemodynamic data, limiting their ability to provide personalized risk assessments for heart failure patients.

"Heart failure is a progressive condition that affects not only quality of life but quantity as well," explained Sula Mazimba, MD, an associate professor of medicine at UVA and cardiologist at UVA Health, in the news release. "All heart failure patients are not the same. Each patient is on a spectrum along the continuum of risk of suffering adverse outcomes. Identifying the degree of risk for each patient promises to help clinicians tailor therapies to improve outcomes."

Outcomes like weakness, fatigue, swollen extremities and death are of particular concern for heart failure patients, and the risk model is designed to stratify the risk of these events.

The tool was built using anonymized data pulled from thousands of patients enrolled in heart failure clinical trials funded by the National Institutes of Health (NIH) National Heart, Lung and Blood Institute (NHLBI).

Patients in the training and validation cohorts were categorized into five risk groups based on left ventricular assist device (LVAD) implantation or transplantation, rehospitalization within six months of follow-up and death, if applicable.

To make the model robust in the presence of missing data, the researchers trained it to predict patients' risk categories using either invasive hemodynamics alone or a feature set incorporating noninvasive hemodynamics data.

Prediction accuracy for each category was determined separately using area under the curve (AUC).

Overall, the model achieved high performance across all five categories. The AUCs ranged from 0.896 +/- 0.074 to 0.969 +/- 0.081 for the invasive hemodynamics feature set and 0.858 +/- 0.067 to 0.997 +/- 0.070 for the set incorporating all features.

The research team underscored that the inclusion of hemodynamic data significantly aided the model's performance.

"This model presents a breakthrough because it ingests complex sets of data and can make decisions even among missing and conflicting factors," said Josephine Lamp, a doctoral researcher in the UVA School of Engineering's Department of Computer Science. "It is really exciting because the model intelligently presents and summarizes risk factors, reducing decision burden so clinicians can quickly make treatment decisions."

The researchers have made their tool freely available online for researchers and clinicians in the hopes of driving personalized heart failure care.

In pursuit of personalized and precision medicine, other institutions are also turning to machine learning.

Last week, a research team from Clemson University shared how a deep learning tool can help researchers better understand how gene-regulatory network (GRN) interactions impact individual drug response.

GRNs map the interactions between genes, proteins and other elements. These insights are crucial for exploring how genetic variations influence a patient's phenotypes, such as drug response. However, many genetic variants linked to disease are in areas of DNA that don't directly code for proteins, creating a challenge for those investigating the role of these variants in individual health.

The deep learning-based Lifelong Neural Network for Gene Regulation (LINGER) tool helps address this by using single-cell multiome data to predict how GRNs work, which can shed light on disease drivers and drug efficacy.

View original post here:
Machine learning approach predicts heart failure outcome risk - HealthITAnalytics.com

Practical approaches in evaluating validation and biases of machine learning applied to mobile health studies … – Nature.com

In this section, we first describe how Ecological Momentary Assessments work and how they differentiate from assessments that are collected within a clinical environment. Second, we present the studies and ML use cases for each dataset. Next, we introduce the non-ML baseline heuristics and explain the ML preprocessing steps. Finally, we describe existing train-test-split approaches (cross-validation) and the splitting approaches at the user- and assessment levels.

Within this context, ecological means "within the subject's natural environment", and momentary means "within this moment" and, ideally, in real time16. Assessments collected in research or clinical environments may cause recall bias in the subjects' answers and are not primarily designed to track changes in mood or behavior longitudinally. Ecological Momentary Assessments (EMA) thus increase validity and decrease recall bias. They are suitable for asking users in their daily environment about their state of being, which can change over time, by random or interval time sampling. Combining EMAs and mobile crowdsensing sensor measurements allows for multimodal analyses, which can yield new insights into, e.g., chronic diseases8,15. The datasets used within this work have EMA in common and are described in the following subsection.

From ongoing projects of our team, we are constantly collecting mHealth data as well as Ecological Momentary Assessments6,17,18,19. To investigate how the machine learning performance varies based on the splits, we wanted different datasets with different use cases. However, to increase comparability between the use cases, we created multi-class classification tasks.

We train each model using historical assessments; the oldest assessment was collected at time tstart and the latest historical assessment at time tlast. A current assessment is created and collected at time tnow, and a future assessment at time tnext. Depending on the study design, the actual point in time tnext may be a few hours or a few weeks after tnow. For each dataset and for each user, we want to predict a feature (synonym: a question of an assessment) at time tnext using the features at time tnow. This feature at time tnext is then called the target. For each use case, a model is trained using data between tstart and tlast, and given the input data from tnow, it predicts the target at tnext. Figure 1 gives a schematic representation of the relevant points in time tstart, tlast, tnow, and tnext.

At time tstart, the first assessment is given; tlast is the last known assessment used for training, whereas tnow is the currently available assessment used as input for the classifier, and the target is predicted at time tnext.
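A minimal sketch of this prediction setup, assuming a long-format DataFrame with columns user_id, timestamp and the assessment answers (column names are assumptions), pairs each assessment at tnow with the same question answered at the user's next assessment tnext:

import pandas as pd

def make_next_assessment_pairs(df: pd.DataFrame, target_col: str) -> pd.DataFrame:
    # Sort each user's history by time and shift the target one assessment ahead.
    df = df.sort_values(["user_id", "timestamp"]).copy()
    df["target_next"] = df.groupby("user_id")[target_col].shift(-1)
    # The last assessment of every user has no t_next and is dropped.
    return df.dropna(subset=["target_next"])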

To increase comparability between the approaches, we used the same model architecture with the same pseudo-random initialisation. The model is a Random Forest classifier with 100 trees and the Gini impurity as the splitting criterion. All code was written in Python 3.9, using mostly scikit-learn, pandas, and Jupyter Notebooks. Details can be found on GitHub in the supplementary material.
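In scikit-learn, the stated configuration would look roughly like this (the random_state value is an assumed stand-in for the fixed seed):

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(
    n_estimators=100,   # 100 trees
    criterion="gini",   # Gini impurity as splitting criterion
    random_state=42,    # fixed pseudo-random initialisation (assumed value)
)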

For all datasets that we used in this study, we have ethical approvals (UNITI No. 20-1936-101, TYT No. 15-101-0204, Corona Check No. 71/20-me, and Corona Health No. 130/20-me). The following section provides an overview of the studies and the available datasets with their characteristics, and then describes each use case in more detail. A brief overview is given in Table 1, with baseline statistics for each dataset in Table 2.

To provide some more background information about the studies: for all apps, the analyses are based on the so-called EMA questionnaires (synonym: assessments), i.e., the questionnaires that are filled out multiple times in all apps and the respective studies. This can happen several times a day (e.g., for the tinnitus study TrackYourTinnitus (TYT)) or at weekly intervals (e.g., the studies in the Corona Health (CH) app). In all cases, the analysis is performed on the recurring questionnaires, which collect symptoms over time and in the real environment through unforeseen (i.e., random) notifications.

The TrackYourTinnitus (TYT) dataset has the most filled-out assessments, with more than 110,000 questionnaires as of 2022-10-24. The Corona Check (CC) study has the most users. This is because each time an assessment is filled out, a new user can optionally be created. Notably, this app has the largest ratio of non-German users and the youngest user group with the largest standard deviation. The Corona Health (CH) app, with its studies on mental health for adults and adolescents and physical health for adults, has the highest proportion of German users because it was developed in collaboration with the Robert Koch Institute and was primarily promoted in Germany. Unification of Treatments and Interventions for Tinnitus Patients (UNITI) is a European Union-wide project whose overall aim is to deliver a predictive computational model based on existing and longitudinal data19. The dataset from the UNITI randomized controlled trial is described by Simoes et al.20.

With this app, it is possible to record the individual fluctuations in tinnitus perception. With the help of a mobile device, users can systematically measure the fluctuations of their tinnitus. Via the TYT website or the app, users can also view the progress of their own data and, if necessary, discuss it with their physician.

The ML task at hand is a classification task with the target variable Tinnitus distress at time tnow and the questions from the daily questionnaire as the features of the problem. The target's values range in [0,1] on a continuous scale. To make it a classification task, we created bins with a step size of 0.2, resulting in 5 classes. The features are perception, loudness, and stressfulness of tinnitus, as well as the current mood, arousal and stress level of a user, the concentration level while filling out the questionnaire, and perception of the worst tinnitus symptom. A detailed description of the features was already given in previous works21. Of note, the time delta between two assessments of one user at tnext and tnow varies between users; its median value is 11 hours.
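The binning step can be sketched as follows; the sample values are made up for illustration:

import numpy as np
import pandas as pd

distress = pd.Series([0.05, 0.19, 0.20, 0.55, 0.81, 1.00])  # continuous target in [0, 1]
classes = pd.cut(distress,
                 bins=np.linspace(0.0, 1.0, 6),   # step size 0.2 -> 5 bins
                 labels=[0, 1, 2, 3, 4],
                 include_lowest=True)
print(classes.tolist())  # [0, 0, 0, 2, 4, 4]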

The overall goal of UNITI is to treat the heterogeneity of tinnitus patients on an individual basis. This requires understanding more about the patient-specific symptoms that are captured by EMA in real time.

The use case we created for UNITI is similar to that of TYT. The target variable encumbrance, coded as cumberness, which was also continuously recorded, was divided into an ordinal scale from 0 to 1 in 5 steps. The features also include momentary assessments of the user during completion, such as jawbone, loudness, movement, stress, emotion, and questions about momentary tinnitus. The data was collected using our mobile apps7. Of note here: on average, the median time gap between two assessments is 24 hours for each user.

At the beginning of the COVID-19 pandemic, it was not easy to get initial feedback about an infection, given the lack of knowledge about the novel virus and the absence of widely available tests. To assist all citizens in this regard, we launched the mobile health app Corona Check together with the Bavarian State Office for Health and Food Safety22.

The Corona Check dataset is used to predict whether a user has a Covid infection based on a list of given symptoms23. It was developed in the early pandemic, back in 2020, and helped people get a quick estimate of an infection without having an antigen test. The target variable has four classes: first, "suspected coronavirus (COVID-19) case"; second, "symptoms, but no known contact with confirmed corona case"; third, "contact with confirmed corona case, but currently no symptoms"; and last, "neither symptoms nor contact".

The features are a list of Boolean variables that were known at the time to be typically related to a Covid infection, such as fever, a sore throat, a runny nose, cough, loss of smell, loss of taste, shortness of breath, headache, muscle pain, diarrhea, and general weakness. Depending on the answers given by a user, the application programming interface returned one of the classes. The median time gap between two assessments for the same user is 8 hours on average, with a much larger standard deviation of 24.6 days.

The last four use cases are all derived from a bigger Covid-related mHealth project called Corona Health6,24. The app was developed in collaboration with the Robert Koch Institute and was primarily promoted in Germany; it includes several studies about the mental or physical health or the stress level of a user. A user can download the app and then sign up for a study. He or she will then receive a one-time baseline questionnaire, followed by recurring follow-ups with time gaps that vary between studies. The follow-up assessment of CHA has a total of 159 questions, including a full PHQ9 questionnaire25. We then used the nine questions of PHQ9 as features at tnow to predict the level of depression for this user at tnext. Depression levels are ordinally scaled from None to Severe in a total of 5 classes. The median time gap between two assessments for the same user is 7.5 days. That is, the models predict the future over this time interval.

Similar to the adult cohort, the mental health of adolescents during the pandemic and its lock-downs is also captured by our app using EMA.

A lightweight version of the mental health questionnaire for adults was also offered to adolescents. However, this did not include a full PHQ9 questionnaire, so we created a different use case. The target variable to be classified on a 4-level ordinal scale is perceived dejection, coming from the PHQ instruments; the features are a subset of quality of life assessments and PHQ questions, such as concernment, tremor, comfort, leisure quality, lethargy, prostration, and irregular sleep. For this study, the median time gap between two follow-up assessments is 7.3 days.

Analogous to the mental health of adults, this study aims to track how the physical health of adults changes during the pandemic period.

Adults had the option to sign up for a study with recurring assessments asking about their physical health. The target variable to be classified asks about the constraints in everyday life that arise due to physical pain at tnext. The features for this use case include aspects like sport, nutrition, and pain at tnow. The median time gap between two assessments for the same user is 14.0 days.

This additional study within the Corona Health app asks users about their stress level on a weekly basis. Both features and target are assessed on a five-level ordinal scale from "never" to "very often". The target asks about the ability to manage stress; the features include the first nine questions of the Perceived Stress Scale instrument26. The median time gap between two assessments for the same user is 7.0 days on average.

We also want to compare the ML approaches with a baseline heuristic (synonym: baseline model). A baseline heuristic can be a simple ML model like a linear regression or a small decision tree, or alternatively, depending on the use case, it could also be a simple statement like "The next value equals the last one". The typical approach for improving ML models is to estimate the generalization error of the model on a benchmark data set when compared to a baseline heuristic. However, it is often not clear which baseline heuristic to consider, i.e.: the same model architecture as the benchmark model, but without tuned hyperparameters? A simple, intrinsically explainable model with or without hyperparameter tuning? A random guess? A naive guess, in which the majority class is predicted? Since we have approaches on a user level (i.e., we consider users when splitting) and on an assessment level (i.e., we ignore users when splitting), we should also create baseline heuristics on both levels. We additionally account for within-user variance in Ecological Momentary Assessments by averaging a user's previously known assessments. Previously known here means that we calculate the mode or median of all assessments of a user that are older than the given timestamp. In total, this leads to four baseline heuristics (user-level latest, user-level average, assessment-level latest, assessment-level average) that do not use any machine learning but simple heuristics. On the assessment level, the latest known target or the mean of all known targets so far is taken to predict the next target, regardless of the user id of the assessment. On the user level, either the last known value or the median or mode value of this user is taken to predict the target. This, in turn, leads to a cold-start problem for users that appear for the first time in a dataset. In this case, either the last known value or the mode or median of all assessments known so far is taken to predict the target.
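A minimal sketch of these four baselines, assuming a time-sorted DataFrame with columns user_id, timestamp and target (names are assumptions), could look like this; only strictly older assessments are used for every prediction:

import pandas as pd

def add_baseline_predictions(df: pd.DataFrame) -> pd.DataFrame:
    df = df.sort_values("timestamp").copy()
    # Assessment level: ignore who answered.
    df["assess_latest"] = df["target"].shift(1)
    df["assess_average"] = df["target"].expanding().mean().shift(1)
    # User level: only this user's older assessments count.
    grp = df.groupby("user_id")["target"]
    df["user_latest"] = grp.shift(1)
    df["user_average"] = grp.transform(lambda s: s.expanding().median().shift(1))
    # Cold start: a user's first assessment falls back to the assessment-level value.
    df["user_latest"] = df["user_latest"].fillna(df["assess_latest"])
    df["user_average"] = df["user_average"].fillna(df["assess_average"])
    return df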

Before the data and approaches could be compared, it was necessary to homogenize them. In order for all approaches to work on all data sets, at least the following information is necessary: assessment_id, user_id, timestamp, features, and the target. Any other information, such as GPS data or additional answers to questions of the assessment, was not included in the ML pipeline. Additionally, targets that were collected on a continuous scale had to be binned into an ordinal scale of five classes. For easier interpretation and readability of the outputs, we also created label encodings for each target. To ensure consistency of the pre-processing, we created helper utilities within Python so that the same functions were applied to each dataset. For missing values, we created a user-wise missing value treatment. More precisely, if a user skipped a question in an assessment, we filled the missing value with the mean or mode (mode = most common value) of all other answers of this user for this assessment. If a user had only one assessment, we filled it with the overall mean for this question.
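Read as filling a skipped question from the same user's other answers to that question (our interpretation, given the overall-mean fallback), the treatment can be sketched like this; column names and the numeric-feature assumption are ours:

import pandas as pd

def impute_user_wise(df: pd.DataFrame, feature_cols: list) -> pd.DataFrame:
    df = df.copy()
    for col in feature_cols:
        # First try the user's own mean for this question ...
        df[col] = df[col].fillna(df.groupby("user_id")[col].transform("mean"))
        # ... and fall back to the overall mean if the user has no other answers.
        df[col] = df[col].fillna(df[col].mean())
    return df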

For each dataset and for each script, we set random states and seeds to enhance reproducibility. For the outer validation split, we assigned the first 80% of all users that signed up for a study to the train set and the latest 20% to the test set. To ensure comparability, the test users were the same for all approaches. We did not shuffle the users, in order to simulate a deployment scenario where new users join the study. This also adds potential concept drift from the train to the test set and thus improves the simulation quality.
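A sketch of this outer split, using the first recorded assessment of each user as a proxy for the sign-up time (an assumption; column names are also assumptions):

import pandas as pd

def outer_user_split(df: pd.DataFrame, train_frac: float = 0.8):
    users = (df.groupby("user_id")["timestamp"].min()   # proxy for registration time
               .sort_values()
               .index)
    n_train = int(len(users) * train_frac)
    train_users, test_users = users[:n_train], users[n_train:]
    return df[df["user_id"].isin(train_users)], df[df["user_id"].isin(test_users)]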

For the cross-validation within the training set, which we call internal validation, we chose a total of 5 folds with 1 validation fold. We then applied the four baseline heuristics (on the user level and the assessment level, with either the latest target or the average target as prediction) to calculate the within-train-set standard deviation and mean of the weighted F1 scores across the training folds. The mean and standard deviation of the weighted F1 score then serve as the estimator of the performance of our model on the test set.

We call one approach superior to another if its final score is higher. The final score used to evaluate an approach is calculated as:

$$f_{1}^{final}=f_{1}^{test}-\alpha \,\sigma \left(f_{1}^{train}\right)$$

(1)

If the standard deviation between the folds during training is large, the final score is lower. The test set must not contain any selection bias with respect to the underlying population. The pre-factor α of the standard deviation is another hyperparameter: the more important model robustness is for the use case, the higher α should be set.
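Written out as a small helper, the final score of Eq. (1) is simply the test score penalized by the spread of the training-fold scores:

import numpy as np

def final_score(f1_test: float, f1_train_folds, alpha: float = 1.0) -> float:
    return f1_test - alpha * float(np.std(f1_train_folds))

# Illustrative numbers only: a test F1 of 0.70 and five training-fold scores.
print(final_score(0.70, [0.72, 0.68, 0.71, 0.69, 0.70], alpha=1.0))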

Within cross-validation, there exist several approaches for splitting the data into folds and validating them, such as the k-fold approach, with k being the number of folds in the training set. Here, k-1 folds form the training folds and one fold is the validation fold27. One can then calculate k performance scores and their standard deviation to get an estimator for the performance of the model on the test set, which itself is an estimator for the model's performance after deployment (see also Fig. 2).

Schematic visualisation of the steps required to perform a k-fold cross-validation, here with k=5.

In addition, the following strategies exist: first, (repeated) stratified k-fold, in which the target distribution is retained in each fold, which can also be seen in Fig. 3; after shuffling the samples, the stratified split can be repeated3. Second, leave-one-out cross-validation28, in which the validation fold contains only one sample while the model has been trained on all other samples. And third, leave-p-out cross-validation, in which \(\binom{n}{p}\) train-test pairs are created, with n equal to the number of assessments (synonym: samples)29.

While this approach retains the class distribution in each fold, it still ignores user groups. Each color represents a different class or user id.
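The cross-validation strategies listed above are available directly in scikit-learn; the constructor arguments below are illustrative:

from sklearn.model_selection import KFold, StratifiedKFold, LeaveOneOut, LeavePOut

kfold = KFold(n_splits=5)                                                # plain k-fold, k = 5
stratified = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)  # stratified k-fold with shuffled samples
loo = LeaveOneOut()                                                      # validation fold of size 1
lpo = LeavePOut(p=2)                                                     # all C(n, p) train-test pairs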

These approaches, however, do not account for the peculiarities of our mHealth data. To be more specific, they do not account for users (syn. groups, subjects) that generate daily assessments (syn. samples) with high variance.

To explain the splitting approaches precisely, we would like to differentiate between the terms fold and set. We call a chunk of samples (synonym: assessments, filled-out questionnaires) a set in the outer split of the data, from which we cut off the final test set. Within the training set, however, we split further to create training and validation folds. That is, when using the term fold, we are in the context of cross-validation; when we use the term set, we are in the outer split of the ML pipeline. Figure 4 visualizes this approach. Following this, we define 4 different approaches to split the data. For one of them we ignore the fact that there are users; for the other three we do not. We call these approaches user-cut, average-user, user-wise and time-cut. All approaches have in common that the first 80% of all users are always in the training set and the remaining 20% are in the test set. A schematic visualization of the splitting approaches is shown in Fig. 5. Within the training set, we then split on the user level for the approaches user-cut, average-user and user-wise, and on the assessment level for the approach time-cut.

In the second step, users are ordered by their study registration time, with the initial 80% designated as training users and the remaining 20% as test users. Subsequently, assessments by training users are allocated to the training set, and those by test users to the test set. Within the training set, user grouping dictates the validation approach: group cross-validation is applied if users are declared as a group; otherwise, standard cross-validation is utilized. We compute the average F1 score from the training folds, \(f_{1}^{train}\), and the F1 score on the test set, \(f_{1}^{test}\). The standard deviation of \(f_{1}^{train}\), \(\sigma (f_{1}^{train})\), indicates model robustness. The hyperparameter α adjusts the emphasis on robustness, with higher values prioritizing it. Ultimately, \(f_{1}^{final}\), which is a more precise estimate if group cross-validation is applied, offers a refined measure of model performance in real-world scenarios.

Yellow means that this sample is part of the validation fold, green means it is part of a training fold. Crossed out means that the sample has been dropped in that approach because it does not meet the requirements. Users can be sorted by time to accommodate any concept drift.

In the following section, we explain the splitting approaches in more detail. The time-cut approach ignores the grouping of the data by users and simply creates validation folds based on the time the assessments arrive in the database. In this example, the month in which a sample was collected is known; more precisely, all samples from January until April are in the training set, while May is in the test set. The user-cut approach shuffles all user ids and creates five data folds with distinct user groups. It ignores the time dimension of the data but provides user-distinct training and validation folds, which is similar to the GroupKFold cross-validation approach implemented in scikit-learn30. The average-user approach is very similar to the user-cut approach; however, each answer of a user is replaced by the median or mode answer of this user up to the point in question, to reduce within-user variance. While all the above-mentioned approaches require only a single model to be trained, the user-wise approach requires as many models as there are distinct users in the dataset. Therefore, for each user, 80% of his or her assessments are used to train a user-specific model, and the remaining 20% of the time-sorted assessments are used to test the model. This means that for this approach, we can directly evaluate on the test set, as each model is user-specific and the cold-start problem is solved by training the model on the first assessments of this user. If a user has fewer than 10 assessments, he or she is not evaluated with that approach.
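For the user-cut approach, scikit-learn's GroupKFold keeps all assessments of a user in the same fold; the tiny synthetic arrays below only illustrate the call:

import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(20).reshape(10, 2)                         # 10 assessments, 2 features
y = np.random.default_rng(0).integers(0, 5, size=10)     # 5 ordinal classes
users = np.repeat(["u1", "u2", "u3", "u4", "u5"], 2)     # user id per assessment

gkf = GroupKFold(n_splits=5)
for train_idx, val_idx in gkf.split(X, y, groups=users):
    # All assessments of a given user land either in the training folds or in
    # the validation fold, never in both.
    pass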

Approval for the UNITI randomized controlled trial and the UNITI app was obtained from the Ethics Committee of the University Clinic of Regensburg (ethical approval No. 20-1936-101). All users read and approved the informed consent before participating in the study. The study was carried out in accordance with relevant guidelines and regulations. The procedures used in this study adhere to the tenets of the Declaration of Helsinki. The Track Your Tinnitus (TYT) study was approved by the Ethics Committee of the University Clinic of Regensburg (ethical approval No. 15-101-0204). The Corona Check (CC) study was approved by the Ethics Committee of the University of Würzburg (ethical approval No. 71/20-me) and the university's data protection officer, and was carried out in accordance with the General Data Protection Regulation of the European Union. The procedures used in the Corona Health (CH) study were in accordance with the 1964 Helsinki Declaration and its later amendments and were approved by the ethics committee of the University of Würzburg, Germany (No. 130/20-me). The ethical approvals include secondary use. The data from this study are available on request from the corresponding author. The data are not publicly available, as the informed consent of the participants did not provide for public publication of the data.

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Go here to read the rest:
Practical approaches in evaluating validation and biases of machine learning applied to mobile health studies ... - Nature.com