Archive for the ‘Machine Learning’ Category

Application of machine learning techniques to predict bone m | CMAR – Dove Medical Press

Wen-Cai Liu,1,2 Ming-Xuan Li,2 Wen-Xing Qian,3 Zhi-Wen Luo,1,4 Wei-Jie Liao,1,4 Zhi-Li Liu,1,4 Jia-Ming Liu1,4

1Department of Orthopaedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, 330006, People's Republic of China; 2The First Clinical Medical College of Nanchang University, Nanchang, 330006, People's Republic of China; 3School of Computer and Information Technology, Beijing Jiaotong University, Beijing, 100044, People's Republic of China; 4Institute of Spine and Spinal Cord, Nanchang University, Nanchang, 330006, People's Republic of China

Correspondence: Jia-Ming Liu, Department of Orthopaedic Surgery, The First Affiliated Hospital of Nanchang University, No. 17 Yongwaizheng Street, Donghu District, Nanchang, Jiangxi Province, People's Republic of China. Tel/Fax +86-791-86319815. Email [emailprotected]

Objective: This study aimed to develop and validate a machine learning model for predicting bone metastases (BM) in prostate cancer (PCa) patients.

Methods: Demographic and clinicopathologic variables of PCa patients in the Surveillance, Epidemiology and End Results (SEER) database from 2010 to 2017 were retrospectively analyzed. We used six different machine learning algorithms, namely Decision tree (DT), Random forest (RF), Multilayer Perceptron (MLP), Logistic regression (LR), Naive Bayes classifiers (NBC), and eXtreme gradient boosting (XGB), to build prediction models. External validation was performed using data from 644 PCa patients of the First Affiliated Hospital of Nanchang University from 2010 to 2016. The performance of the models was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy score, sensitivity (recall rate) and specificity. A web predictor was developed based on the best-performing model.

Results: A total of 207,137 PCa patients from SEER were included in this study, of whom 6725 (3.25%) developed BM. Gleason score, prostate-specific antigen (PSA) value, T stage, N stage and age were found to be risk factors for BM. The XGB model offered the best predictive performance among the six models (AUC: 0.962, accuracy: 0.884, sensitivity (recall rate): 0.906, and specificity: 0.879). An XGB model-based web predictor was developed to predict BM in PCa patients.

Conclusion: This study developed a machine learning model and a web predictor for predicting the risk of BM in PCa patients, which may help physicians make personalized clinical decisions and treatment strategies for patients.

Keywords: prostate cancer, bone metastasis, machine learning, prediction model, SEER

Prostate cancer (PCa) is the most common non-skin cancer among men globally, with approximately 1.6 million cases and 366,000 deaths reported each year.1,2 Metastatic prostate cancer has important clinical implications, and metastatic disease may already be present at the initial clinical diagnosis.3,4 Bone metastases (BM) account for approximately 16.7% of all metastatic cases, and these patients have significantly reduced 5-year survival and quality of life.5,6

Recently, prostate-specific membrane antigen (PSMA) ligands have shown good results in the diagnosis and treatment of BM from PCa.7 However, this method is still at the clinical trial stage and is not acceptable for most patients because of the radiation injury and high cost.8,9 The current method for detecting BM in PCa patients is mainly bone scan, but it is only suggested for patients with suspected skeletal-related events (SRE) because of the severe radiation injury. In addition, the median time to SRE has been reported to be 5 months after bone metastases.10 An assisted decision-making system that helps determine which cancer patients should receive a bone scan would help address these issues. Moreover, with the development of precision medicine, the diagnosis and treatment of cancer should be individualized. The Surveillance, Epidemiology and End Results (SEER) database is a publicly available, federally funded cancer reporting system that provides raw material for insights into complex diseases.11 With advances in computer hardware, machine learning can facilitate the diagnosis of cancer metastasis by processing and analyzing large, heterogeneous and complex clinical data and building predictive models. Studies in this area have already reported good results.12–14

Therefore, in this study, we aimed to build a prediction model to evaluate the risk of BM in PCa patients based on machine learning techniques, and to develop a web-based predictor that can be easily used by physicians and patients. This study may help clinicians make personalized decisions for the treatment of patients with PCa BM.

The study was carried out based on the SEER database. Patient data were obtained from the SEER Research Plus Data, 18 Registries, Nov 2019 Sub (2000–2017) and downloaded using SEER*Stat 8.3.9 software. Patients who were diagnosed with PCa from 2010 to 2017 were included in this study. Exclusion criteria were as follows: (1) PCa was not the first tumor; (2) information on race, grade, prostate-specific antigen (PSA), Gleason score, T stage, N stage, metastatic status or marital status was missing or unknown. Additionally, 644 PCa patients from the First Affiliated Hospital of Nanchang University from 2010 to 2016 were included for external validation. The case screening process is shown in Figure 1.

Figure 1 Flow diagram of the study population selected from the Surveillance, Epidemiology, and End Results (SEER) database and the First Affiliated Hospital of Nanchang University. According to the inclusion and exclusion criteria, a total of 207,137 patients from SEER were included in this study, and they were randomly split into the training and internal test sets in a 7:3 ratio. Data from the First Affiliated Hospital of Nanchang University served as an external test set.

Eight variables from the SEER database that may affect BM in PCa patients were selected in this study: age at diagnosis, race, grade, PSA value, Gleason score, T stage, N stage and marital status. In addition, we only included patients with PSA values between 0.1 and 98.0 ng/mL, because the SEER database does not provide specific values above 98.0 ng/mL. All patients enrolled in this study were staged using the 7th edition of the AJCC TNM staging system and relevant guidelines of the SEER program.

All statistical analysis was performed in Python (version 3.8, Python Software Foundation) and SPSS (version 26, IBM, USA).15 All machine learning algorithms were built based on scikit-learn (version 0.24.1). The patient data were randomly split into training and internal test sets in a ratio of 7:3 using Python. The training set was used to build the model, and the internal test set was used for model validation and evaluation.
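The 7:3 random split described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the authors' code: the function name `split_indices`, the seed, and the choice to round the test-set size up are assumptions, and the text does not say whether the split was stratified by BM status.

```python
import math
import random


def split_indices(n_samples, test_fraction=0.3, seed=42):
    """Shuffle row indices and split them into (train, test) lists.

    The seed and the rounding rule are illustrative assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_samples * test_fraction)  # round the test size up
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(207_137)
print(len(train_idx), len(test_idx))  # 144995 62142
```

With 207,137 records, rounding the 30% test share up gives 62,142 test rows and 144,995 training rows, which agrees with the set sizes reported in the Results.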

To determine the variables included in the machine learning model, we conducted a univariate analysis to compare these variables between patients with and without BM. The Chi-square test was used for categorical data, and the Wilcoxon rank-sum test was used for continuous non-normally distributed data. Variables with P < 0.05 in univariate analysis were included in the construction of the machine learning models, and multivariate logistic regression was performed to identify the risk factors for BM. Meanwhile, based on the Permutation Importance principle,16 we performed feature importance analysis on the variables in each machine learning model. Correlation analysis was performed on the screened variables to test whether the variables would affect each other.
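For a two-level categorical variable, the Chi-square comparison between patients with and without BM can be illustrated as below. This is a hedged sketch: the counts are invented purely for illustration and are not study data, the function name `chi_square_2x2` is hypothetical, and the p-value shortcut via `math.erfc` is valid only for 1 degree of freedom (a 2x2 table) without continuity correction.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = category A / category B, columns = BM / no BM
stat, p = chi_square_2x2([[30, 70], [10, 90]])
print(round(stat, 2), p < 0.05)  # 12.5 True
```

Variables with more than two levels need the general chi-square distribution for the p-value, which a statistics library would normally provide.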

This study used six different machine learning algorithms to model the data: Decision tree (DT), Random Forest (RF), Multilayer Perceptron (MLP), Logistic regression (LR), Naive Bayes classifiers (NBC) and eXtreme gradient boosting (XGB).17–22 The ML algorithms were trained in Python to predict BM in PCa patients. Model parameter settings are detailed in the Supplementary Materials. The parameter settings are also accessible from https://share.streamlit.io/liuwencaincu/prostate-cancer/main/prostate.py. Then, the predictive power of the machine learning models was evaluated in internal ten-fold cross-validation of the training set, in the internal test set and in the external test set. The area under the receiver operating characteristic curve (AUC), the sensitivity (recall rate), specificity, and accuracy score were calculated. The best-performing model was selected to build a web predictor.
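The four evaluation measures named above can be computed from labels, predicted classes and predicted scores as in the following sketch. It is an illustration only: the function names, the toy labels and scores, and the 0.5 decision threshold are assumptions, and the AUC is computed through its rank interpretation (the probability that a randomly chosen BM case scores higher than a randomly chosen non-BM case, counting ties as one half).

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity and accuracy for binary labels (1 = BM)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, not from the study
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

Because only 3.25% of the SEER cohort had BM, accuracy alone can look high for a model that rarely predicts BM, which is one reason sensitivity and specificity are reported alongside it.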

A total of 207,137 PCa patients from the SEER database were included in the study, of whom 6725 (3.25%) developed BM and 200,412 (96.75%) had no BM. The demographic and clinicopathological characteristics of these patients are shown in detail in Table 1. All patients were randomly split into a training set (n = 144,995) and an internal test set (n = 62,142) in a ratio of 7:3. External validation was conducted using data of 644 PCa patients from the First Affiliated Hospital of Nanchang University. The details of the training and test sets are shown in Table 2.

Table 1 Clinical and Pathological Characteristics of Study Population

Table 2 Clinical and Pathological Characteristics of Training Set and Test Set

Based on the univariate analysis, age, race, grade, T stage, N stage, PSA value, Gleason score and marital status were significantly associated with BM in PCa patients (P < 0.05) (Table 3). Variables with a P value <0.05 between these two groups were selected for multivariate logistic regression analysis. Based on this analysis, T stage, N stage, Gleason score and PSA value were found to be independent risk factors for BM in PCa patients (Table 3).

Table 3 Univariate Analysis and Multivariate Logistic Regression Analysis of Variables

Correlation tests were performed among different variables identified from the univariate analysis. The correlation heat map showed that variables were not significantly correlated with each other (Figure 2), which indicated that they were mutually independent.

Figure 2 Results of correlation analysis between all variables. The heat map shows the correlation between the variables.

The importance of features in each machine learning model for predicting BM is shown in Figure 3. Although the importance of features varied slightly among the machine learning algorithms, PSA value, Gleason score and N stage ranked in the top three in five of the models, which is similar to the results of the multivariate logistic analysis. In contrast, marital status ranked last in most algorithms, although it still made some contribution to the models. In the XGB model, the features ranked in descending order of importance were PSA value, Gleason score, N stage, T stage, age, grade, race and marital status.
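The idea behind a permutation-based ranking like this one can be sketched in a few lines: shuffle one feature column at a time and record how much the model's score drops. The code below is a simplified illustration, not the procedure used in the study. The function names, the toy model and data, and the use of accuracy as the score are all assumptions, and it omits the correction described in the cited Permutation Importance paper.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy setup: the label depends only on feature 0; feature 1 is noise
X = [[i % 2, (i * 7) % 5] for i in range(20)]
y = [row[0] for row in X]


def toy_model(row):
    """Stand-in for a fitted classifier; it looks only at feature 0."""
    return int(row[0] > 0.5)


print(permutation_importance(toy_model, X, y))
```

A feature the model ignores gets an importance of exactly zero, while a feature the model relies on shows a clear drop.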

Figure 3 Feature importance of different models. The plot shows the ranking of the relevant importance of features in all models.

The predictive performance of different models was compared using the internal ten-fold cross-validation of the training set and the internal and external test sets, as detailed in Figures 4–6 and Table 4. Among these models, the XGB model showed the best performance with an average AUC of 0.951 in the internal ten-fold cross-validation (Figure 4). In the internal test set, the XGB model gained the best score with an AUC of 0.955, an accuracy of 0.881, a sensitivity (recall rate) of 0.905 and a specificity of 0.880. In the external test set, the XGB model also showed excellent performance with an AUC of 0.962, an accuracy of 0.884, a sensitivity (recall rate) of 0.906 and a specificity of 0.879 (Figures 5 and 6 and Table 4). Meanwhile, the prediction results of different models were presented with a heat map in Figure 7.

Table 4 Comparison Prediction Performances of Different Models for Bone Metastasis

Figure 4 Ten-fold cross-validation results of different machine learning models in the training set.

Abbreviations: DT, Decision tree; LR, Logistic regression; MLP, Multilayer Perceptron; NBC, Naive Bayes classification; RF, Random Forest; XGB, eXtreme gradient boosting.

Figure 5 The ROC curves of different machine learning models in the internal test set and external test set.

Figure 6 Prediction performances of different models.

Figure 7 Prediction results of the different models. The heat map shows the predicted results of all models versus the actual situation in the internal test set and external test set. Each column in the heat map represents one model's predicted results of bone metastases for all patients in the dataset. Dark colors represent bone metastases cases and light colors represent non-bone metastases cases.

A web predictor based on the best predictive performance of the XGB model was developed to predict BM in PCa patients. The risk of BM from PCa could be easily predicted by simply setting the variables in the sidebar of the web page (https://share.streamlit.io/liuwencaincu/prostate-cancer/main/prostate.py) (Figure 8).

Figure 8 The machine learning model-based web predictor for predicting bone metastases in prostate cancer patients.

Prostate cancer is the second leading cause of cancer-related death among men in the world. Bone-related events usually occur after bone metastases, which reduce patients' quality of life and survival.23 The current diagnostic methods for BM from PCa are bone scan and prostate-specific membrane antigen examination combined with nuclear imaging. However, bone scan involves radiation injury and high cost, and not all PCa patients are recommended for BM screening.24–26 Tissue biopsy is another diagnostic method, but it may increase the risk of further tumor invasion. Although skeletal-related events are considered a sign of BM, relying on them to screen for BM from PCa would not be reasonable because it may delay treatment. Thus, it makes sense to develop a model that enables early attention to and screening of PCa patients at high risk of BM. In this study, we built a predictive model using machine learning technologies to predict BM in PCa patients and identify patients at high risk of BM.

With the development of the computer technique, machine learning technology has been widely used in different fields. And it also shows great promise for application in the biomedical science.13,27 Studies have already used machine learning technology to predict the development of diseases.12,28 In this study, several widely used machine learning algorithms were developed and validated to predict the risk of BM in PCa patients. After the comparison of algorithms with several evaluation indicators, the XGB algorithm-based prediction model showed the best performance among these models. The XGB model achieved better performance than others probably because it uses a number of strategies to prevent overfitting, exploits the second-order derivatives of the loss function and supports parallelization, and has a fast data processing speed.29 These results can provide clinicians with more accurate prediction outcomes and help them to make personalized decisions for the treatment of PCa.

In the present study, four risk factors, namely T stage, N stage, PSA and Gleason score, were screened out of the eight pre-selected factors by univariate and multivariate analyses. They were in high agreement with the models' feature importance ranking (Table 4, Figure 3). However, the machine learning algorithms indicate that other variables that are not statistically significant can also make some contribution to the prediction. This may be because the algorithms can make predictions by exploring intrinsic connections between data that cannot be discovered through traditional statistical methods.

Studies have found that PCa patients with high PSA values have a higher likelihood of developing BM. It is recommended that PCa patients with PSA >20 ng/mL undergo a bone scan to check for BM.30,31 In this study, we also found that PSA values were important in predicting BM from PCa. It has been shown that advanced age and clinical stage were both correlated with BM from PCa and poor prognosis.32,33 Chen et al34 found that age greater than 70 years was the threshold for significantly higher risk of BM for PCa patients. However, Stolzenbach et al35 found that age and race were not informative in predicting the progression of PCa metastasis, which was in line with the results of our study. Patients diagnosed at the T4 stage had the highest risk of BM. The same trend was seen for grade. These results were consistent with the findings of Lu et al and Guo et al.32,36 Patients with regional lymph node metastases tend to be more likely to develop BM than those with N0 stage.37 The Gleason score also played an important role in the prediction model. The degree of BM in patients varies greatly with the change of Gleason score.38 It was reported that a Gleason score greater than 6 had a high specificity (88.9%) for the diagnosis of BM.39 In addition, unmarried male patients had a higher risk of BM from PCa, which was in line with previous studies.36,40 Our model incorporates various risk factors that may affect BM in PCa patients and achieves excellent predictive performance.

Based on machine learning algorithms and the huge amount of data in the SEER database, a model was constructed to predict BM in PCa patients and a web page predictor was developed. However, there are still some limitations in this study. First, the population in this study was obtained from the SEER database and externally validated using data just from a single center, which will be limitations for the application. Second, due to the inherent black-box properties of machine learning algorithms, it may pose some difficulties for the interpretation of the model. Third, the SEER database just reports the initial diagnostic information of PCa patients, and further therapeutic information is missing. We cannot access this information for further analysis.

In conclusion, we developed a prediction model to predict the risk of BM in PCa patients based on the XGB algorithm with machine learning techniques, and developed a web predictor in this study. And those at high risk of BM were recommended for further detailed screening based on the web predictor. This may help physicians to individualize the treatment of BM in patients with PCa.

The datasets generated and/or analyzed during the current study are available in the SEER database (https://seer.cancer.gov/).

We received permission to access the research data file in the SEER program from the National Cancer Institute, US. Approval was waived by the local ethics committee, as SEER data is publicly available and de-identified. This study was approved by the Ethics Committee of the First Affiliated Hospital of Nanchang University, and patients from the First Affiliated Hospital of Nanchang University provided written informed consent. This study followed the guidelines outlined in the Declaration of Helsinki.

This work is supported by the Department of Science and Technology Program of Jiangxi Province, China (No. 20202BBGL73015, 20203BBG73045) and the project of Jiangxi Provincial Health Commission (No. 20161024).

The authors report no conflicts of interest in this work.

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin. 2020;70(1):7–30. doi:10.3322/caac.21590

2. Steele CB, Li J, Huang B, Weir HK. Prostate cancer survival in the United States by race and stage (2001–2009): findings from the CONCORD-2 study. Cancer. 2017;123:5160–5177. doi:10.1002/cncr.31026

3. McDougall JA, Bansal A, Goulart BH, et al. The clinical and economic impacts of skeletal-related events among medicare enrollees with prostate cancer metastatic to bone. Oncologist. 2016;21(3):320. doi:10.1634/theoncologist.2015-0327

4. Nørgaard M, Jensen A, Jacobsen JB, Cetin K, Fryzek JP, Sørensen HT. Skeletal related events, bone metastasis and survival of prostate cancer: a population based cohort study in Denmark (1999 to 2007). J Urol. 2010;184(1):162–167. doi:10.1016/j.juro.2010.03.034

5. Sanjaya I, Mochtar CA, Umbas R. Correlation between low Gleason score and prostate specific antigen levels with incidence of bone metastases in prostate cancer patients: when to omit bone scans? Asian Pac J Cancer Prev. 2013;14(9):4973–4976. doi:10.7314/APJCP.2013.14.9.4973

6. Saad F, Lipton A, Cook R, Chen YM, Smith M, Coleman R. Pathologic fractures correlate with reduced survival in patients with malignant bone disease. Cancer. 2007;110(8):1860–1867. doi:10.1002/cncr.22991

7. Haberkorn U, Eder M, Kopka K, Babich JW, Eisenhut M. New strategies in prostate cancer: prostate-specific membrane antigen (PSMA) ligands for diagnosis and therapy. Clin Cancer Res. 2016;22(1):9–15. doi:10.1158/1078-0432.CCR-15-0820

8. Fendler WP, Rahbar K, Herrmann K, Kratochwil C, Eiber M. 177Lu-PSMA radioligand therapy for prostate cancer. J Nucl Med. 2017;58(8):1196–1200. doi:10.2967/jnumed.117.191023

9. Barrio M, Fendler WP, Czernin J, Herrmann K. Prostate specific membrane antigen (PSMA) ligands for diagnosis and therapy of prostate cancer. Expert Rev Mol Diagn. 2016;16(11):1177–1188. doi:10.1080/14737159.2016.1243057

10. Farooki A, Leung V, Tala H, Tuttle RM. Skeletal-related events due to bone metastases from differentiated thyroid cancer. J Clin Endocrinol Metab. 2012;97(7):2433–2439. doi:10.1210/jc.2012-1169

11. Doll KM, Rademaker A, Sosa JA. Practical guide to surgical data sets: surveillance, epidemiology, and end results (SEER) database. JAMA Surg. 2018;153(6):588–589. doi:10.1001/jamasurg.2018.0501

12. Liu WC, Li ZQ, Luo ZW, Liao WJ, Liu ZL, Liu JM. Machine learning for the prediction of bone metastasis in patients with newly diagnosed thyroid cancer. Cancer Med. 2021;10(8):2802–2811. doi:10.1002/cam4.3776

13. Goecks J, Jalili V, Heiser LM, Gray JW. How machine learning will transform biomedicine. Cell. 2020;181(1):92–101. doi:10.1016/j.cell.2020.03.022

14. Darcy AM, Louie AK, Roberts LW. Machine learning and the profession of medicine. JAMA. 2016;315(6):551–552. doi:10.1001/jama.2015.18421

15. Oliphant TE. Python for scientific computing. Comput Sci Eng. 2007;9(3):10–20. doi:10.1109/MCSE.2007.58

16. Altmann A, Toloşi L, Sander O, Lengauer T. Permutation importance: a corrected feature importance measure. Bioinformatics. 2010;26(10):1340–1347. doi:10.1093/bioinformatics/btq134

17. Chen T, He T, Benesty M, Khotilovich V, Tang Y, Cho H. Xgboost: extreme gradient boosting. R Package Version 04-2. 2015;1(4):1–4.

18. Qi Y. Random forest for bioinformatics. In: Ensemble Machine Learning. Springer; 2012:307–323.

19. Tang J, Deng C, Huang G-B. Extreme learning machine for multilayer perceptron. IEEE Trans Neural Netw Learn Syst. 2015;27(4):809–821. doi:10.1109/TNNLS.2015.2424995

20. Sperandei S. Understanding logistic regression analysis. Biochem Med. 2014;24(1):12–18. doi:10.11613/BM.2014.003

21. Myles AJ, Feudale RN, Liu Y, Woody NA, Brown SD. An introduction to decision tree modeling. J Chemom. 2004;18(6):275–285. doi:10.1002/cem.873

22. Rish I, editor. An empirical study of the naive Bayes classifier. In: IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence; 2001.

23. Fornetti J, Welm AL, Stewart SA. Understanding the bone in cancer metastasis. J Bone Miner Res. 2018;33(12):2099–2113. doi:10.1002/jbmr.3618

24. Kuten J, Fahoum I, Savin Z, et al. Head-to-head comparison of 68Ga-PSMA-11 with 18F-PSMA-1007 PET/CT in staging prostate cancer using histopathology and immunohistochemical analysis as a reference standard. J Nucl Med. 2020;61(4):527–532. doi:10.2967/jnumed.119.234187

25. Afshar-Oromieh A, Babich JW, Kratochwil C, et al. The rise of PSMA ligands for diagnosis and therapy of prostate cancer. J Nucl Med. 2016;57(Supplement 3):79S–89S. doi:10.2967/jnumed.115.170720

26. Lenzo NP, Meyrick D, Turner JH. Review of gallium-68 PSMA PET/CT imaging in the management of prostate cancer. Diagnostics. 2018;8(1):16. doi:10.3390/diagnostics8010016

27. Camacho DM, Collins KM, Powers RK, Costello JC, Collins JJ. Next-generation machine learning for biological networks. Cell. 2018;173(7):1581–1592. doi:10.1016/j.cell.2018.05.015

28. Zhu J, Zheng J, Li L, et al. Application of machine learning algorithms to predict central lymph node metastasis in T1-T2, non-invasive, and clinically node negative papillary thyroid carcinoma. Front Med. 2021;8. doi:10.3389/fmed.2021.635771

29. Ogunleye A, Wang Q-G. XGBoost model for chronic kidney disease diagnosis. IEEE/ACM Trans Comput Biol Bioinform. 2019;17(6):2131–2140. doi:10.1109/TCBB.2019.2911071

30. Greene KL, Albertsen PC, Babaian RJ, et al. Prostate specific antigen best practice statement: 2009 update. J Urol. 2013;189(1):S2–S11. doi:10.1016/j.juro.2012.11.014

31. Network NCC. NCCN clinical practice guidelines in oncology. Prostate cancer V. 4; 2011. Available from: http://www.nccn.org/professionals/physician_gls/pdf/prostate.pdf. Accessed November 17, 2021.

32. Guo X, Zhang C, Guo Q, et al. The homogeneous and heterogeneous risk factors for the morbidity and prognosis of bone metastasis in patients with prostate cancer. Cancer Manag Res. 2018;10:1639. doi:10.2147/CMAR.S168579

33. Briganti A, Suardi N, Gallina A, et al. Predicting the risk of bone metastasis in prostate cancer. Cancer Treat Rev. 2014;40(1):3–11. doi:10.1016/j.ctrv.2013.07.001

34. Chen S, Wang L, Qian K, et al. Establishing a prediction model for prostate cancer bone metastasis. Int J Biol Sci. 2019;15(1):208. doi:10.7150/ijbs.27537

35. Stolzenbach LF, Rosiello G, Deuker M, et al. The impact of race and age on distribution of metastases in patients with prostate cancer. J Urol. 2020;204(5):962–968. doi:10.1097/JU.0000000000001131

36. Lu YJ, Duan WM. Establishment and validation of a novel predictive model to quantify the risk of bone metastasis in patients with prostate cancer. Transl Androl Urol. 2021;10(1):310–325. doi:10.21037/tau-20-1133

37. Wilczak W, Wittmer C, Clauditz T, et al. Marked prognostic impact of minimal lymphatic tumor spread in prostate cancer. Eur Urol. 2018;74(3):376–386. doi:10.1016/j.eururo.2018.05.034

38. Battisti V, Maders LD, Bagatini MD, et al. Ectonucleotide pyrophosphatase/phosphodiesterase (E-NPP) and adenosine deaminase (ADA) activities in prostate cancer patients: influence of Gleason score, treatment and bone metastasis. Biomed Pharmacother. 2013;67(3):203–208. doi:10.1016/j.biopha.2012.12.004

39. Zaman MU, Fatima N, Sajjad Z. Metastasis on bone scan with low prostate specific antigen (≤ 20 ng/mL) and Gleason's score (< 8) in newly diagnosed Pakistani males with prostate cancer: should we follow Western guidelines. Asian Pac J Cancer Prev. 2011;12(6):1529–1532.

40. Vaarala MH, Hirvikoski P, Kauppila S, Paavonen TK. Identification of androgen-regulated genes in human prostate. Mol Med Rep. 2012;6(3):466–472. doi:10.3892/mmr.2012.956

SD Times Open-Source Project of the Week: KServe – SDTimes.com

KServe is a tool for serving machine learning models on Kubernetes. It encapsulates the complexity of tasks like autoscaling, networking, health checking, and server configuration. This allows users to provide their machine learning deployments with features like GPU Autoscaling, Scale to Zero, and Canary Rollouts.

Created by IBM and Bloomberg's Data Science and Compute Infrastructure team, KServe was previously known as KFServing. It was inspired when IBM presented the idea to serve machine learning models in a serverless way using Knative. Bloomberg and IBM met at the Kubeflow Contributor Summit 2019; at the time, Kubeflow didn't have a model serving component, so the companies worked together on a new project to provide a model serving deployment solution.

The new project debuted at KubeCon + CloudNativeCon North America in 2019. It was moved from the Kubeflow Serving Working Group into an independent organization in order to grow the project and broaden the contributor base. At this point the project became known as KServe.

KServe provides model explainability through integrations with Alibi, AI Explainability 360, and Captum. It also provides monitoring for models in production through integrations with Alibi-detect, AI Fairness 360, and Adversarial Robustness Toolbox (ART).

The project has been adopted by a number of organizations, including Nvidia, Cisco, Zillow, and more.

Your neighborhood matters: A machine-learning approach to the geospatial and social determinants of health in 9-1-1 activated chest pain – DocWire…

Res Nurs Health. 2021 Nov 24. doi: 10.1002/nur.22199. Online ahead of print.

ABSTRACT

Healthcare disparities in the initial management of patients with acute coronary syndrome (ACS) exist. Yet, the complexity of interactions between demographic, social, economic, and geospatial determinants of health hinders incorporating such predictors in existing risk stratification models. We sought to explore a machine-learning-based approach to study the complex interactions between the geospatial and social determinants of health to explain disparities in ACS likelihood in an urban community. This study identified consecutive patients transported by Pittsburgh emergency medical service for a chief complaint of chest pain or ACS-equivalent symptoms. We extracted demographics, clinical data, and location coordinates from electronic health records. Median income was based on US census data by zip code. A random forest (RF) classifier and a regularized logistic regression model were used to identify the most important predictors of ACS likelihood. Our final sample included 2400 patients (age 59 ± 17 years, 47% Females, 41% Blacks, 15.8% adjudicated ACS). In our RF model (area under the receiver operating characteristic curve of 0.71 ± 0.03) age, prior revascularization, income, distance from hospital, and residential neighborhood were the most important predictors of ACS likelihood. In regularized regression (Akaike information criterion = 1843, Bayesian information criterion = 1912, χ² = 193, df = 10, p < 0.001), residential neighborhood remained a significant and independent predictor of ACS likelihood. Findings from our study suggest that residential neighborhood constitutes an upstream factor to explain the observed healthcare disparity in ACS risk prediction, independent from known demographic, social, and economic determinants of health, which can inform future work on ACS prevention, in-hospital care, and patient discharge.

PMID:34820853 | DOI:10.1002/nur.22199

Machine learning optimization of an electronic health record audit for heart failure in primary care – DocWire News

ESC Heart Fail. 2021 Nov 23. doi: 10.1002/ehf2.13724. Online ahead of print.

ABSTRACT

AIMS: The diagnosis of heart failure (HF) is an important problem in primary care. We previously demonstrated a 74% increase in registered HF diagnoses in primary care electronic health records (EHRs) following an extended audit procedure. What remains unclear is the accuracy of registered HF pre-audit and which EHR variables are most important in the extended audit strategy. This study aims to describe the diagnostic HF classification sequence at different stages, assess general practitioner (GP) HF misclassification, and test the predictive performance of an optimized audit.

METHODS AND RESULTS: This is a secondary analysis of the OSCAR-HF study, a prospective observational trial including 51 participating GPs. OSCAR used an extended audit based on typical HF risk factors, signs, symptoms, and medications in GPs' EHR. This resulted in a list of possible HF patients, which participating GPs had to classify as HF or non-HF. We compared registered HF diagnoses before and after GPs' assessment. For our analysis of audit performance, we used GPs' assessment of HF as primary outcome and audit queries as dichotomous predictor variables for a gradient boosted machine (GBM) decision tree algorithm and logistic regression model. Of the 18 011 patients eligible for the audit intervention, 4678 (26.0%) were identified as possible HF patients and submitted for GPs' assessment in the audit stage. There were 310 patients with registered HF before GP assessment, of whom 146 (47.1%) were judged not to have HF by their GP (over-registration). There were 538 patients with registered HF after GP assessment, of whom 374 (69.5%) did not have registered HF before GP assessment (under-registration). The GBM and logistic regression model had a comparable predictive performance (area under the curve of 0.70 [95% confidence interval 0.65-0.77] and 0.69 [95% confidence interval 0.64-0.75], respectively). This was not significantly impacted by reducing the set of predictor variables to the 10 most important variables identified in the GBM model (free-text and coded cardiomyopathy, ischaemic heart disease and atrial fibrillation, digoxin, mineralocorticoid receptor antagonists, and combinations of renin-angiotensin system inhibitors and beta-blockers with diuretics). This optimized query set was enough to identify 86% (n = 461/538) of GPs' self-assessed HF population with a 33% reduction (n = 1537/4678) in screening caseload.

CONCLUSIONS: Diagnostic coding of HF in primary care health records is inaccurate with a high degree of under-registration and over-registration. An optimized query set enabled identification of more than 80% of GPs' self-assessed HF population.

PMID:34816632 | DOI:10.1002/ehf2.13724

Read this article:
Machine learning optimization of an electronic health record audit for heart failure in primary care - DocWire News

High-performance, low-cost machine learning infrastructure is accelerating innovation in the cloud – MIT Technology Review

Artificial intelligence and machine learning (AI and ML) are key technologies that help organizations develop new ways to increase sales, reduce costs, streamline business processes, and understand their customers better. AWS helps customers accelerate their AI/ML adoption by delivering powerful compute, high-speed networking, and scalable high-performance storage options on demand for any machine learning project. This lowers the barrier to entry for organizations looking to adopt the cloud to scale their ML applications.

Developers and data scientists are pushing the boundaries of technology and increasingly adopting deep learning, which is a type of machine learning based on neural network algorithms. These deep learning models are larger and more sophisticated, which raises the cost of running the underlying infrastructure needed to train and deploy them.

To enable customers to accelerate their AI/ML transformation, AWS is building high-performance and low-cost machine learning chips. AWS Inferentia is the first machine learning chip built from the ground up by AWS for the lowest cost machine learning inference in the cloud. In fact, Amazon EC2 Inf1 instances powered by Inferentia deliver 2.3x higher performance and up to 70% lower cost for machine learning inference than current generation GPU-based EC2 instances. AWS Trainium is the second machine learning chip by AWS that is purpose-built for training deep learning models and will be available in late 2021.

Customers across industries have deployed their ML applications in production on Inferentia and seen significant performance improvements and cost savings. For example, Airbnb's customer support platform enables intelligent, scalable, and exceptional service experiences for its community of millions of hosts and guests across the globe. It used Inferentia-based EC2 Inf1 instances to deploy natural language processing (NLP) models that supported its chatbots. This led to a 2x improvement in performance out of the box over GPU-based instances.

With these innovations in silicon, AWS is enabling customers to train and execute their deep learning models in production easily with high performance and throughput at significantly lower costs.

Machine learning is an iterative process that requires teams to build, train, and deploy applications quickly, as well as train, retrain, and experiment frequently to increase the prediction accuracy of the models. When deploying trained models into their business applications, organizations need to also scale their applications to serve new users across the globe. They need to be able to serve multiple requests coming in at the same time with near real-time latency to ensure a superior user experience.

Emerging use cases such as object detection, natural language processing (NLP), image classification, conversational AI, and time series data rely on deep learning technology. Deep learning models are exponentially increasing in size and complexity, going from having millions of parameters to billions in a matter of a couple of years.

Training and deploying these complex and sophisticated models translates to significant infrastructure costs. Costs can quickly snowball to become prohibitively large as organizations scale their applications to deliver near real-time experiences to their users and customers.

This is where cloud-based machine learning infrastructure services can help. The cloud provides on-demand access to compute, high-performance networking, and large data storage, seamlessly combined with ML operations and higher level AI services, to enable organizations to get started immediately and scale their AI/ML initiatives.

AWS Inferentia and AWS Trainium aim to democratize machine learning and make it accessible to developers irrespective of experience and organization size. Inferentia's design is optimized for high performance, throughput, and low latency, which makes it ideal for deploying ML inference at scale.

Each AWS Inferentia chip contains four NeuronCores that implement a high-performance systolic array matrix multiply engine, which massively speeds up typical deep learning operations, such as convolution and transformers. NeuronCores are also equipped with a large on-chip cache, which helps to cut down on external memory accesses, reducing latency and increasing throughput.
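To make the idea of a systolic array concrete, here is a toy simulation of the general technique in plain Python. It is only a sketch under stated assumptions: it models a generic output-stationary grid of multiply-accumulate cells, and the function name systolic_matmul and the register layout are hypothetical. It does not describe the actual NeuronCore design.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) on a simulated n x m grid of cells.

    Toy output-stationary systolic array: each cell keeps one accumulator,
    values of `a` flow left to right, values of `b` flow top to bottom.
    Returns the product and the number of simulated clock cycles.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]      # one running sum per cell
    a_reg = [[0] * m for _ in range(n)]    # operand registers moving right
    b_reg = [[0] * m for _ in range(n)]    # operand registers moving down
    cycles = n + m + k - 2                 # time for the last operands to reach the far corner
    for t in range(cycles):
        # Shift every operand one hop; iterate from the far edge so each
        # value moves exactly one cell per cycle.
        for i in range(n):
            for j in range(m - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
        for j in range(m):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
        # Feed new operands at the edges, skewed by row/column index so that
        # matching pairs a[i][s] and b[s][j] meet in cell (i, j).
        for i in range(n):
            s = t - i
            a_reg[i][0] = a[i][s] if 0 <= s < k else 0
        for j in range(m):
            s = t - j
            b_reg[0][j] = b[s][j] if 0 <= s < k else 0
        # Every cell performs one multiply-accumulate per cycle.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, cycles


product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, cycles)
```

In hardware all cells do their multiply-accumulate in the same clock cycle and operands are reused as they pass from neighbor to neighbor, so the product is ready after roughly n + m + k cycles with few memory accesses. The nested loops here only mimic that behavior sequentially.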

AWS Neuron, the software development kit for Inferentia, natively supports leading ML frameworks, like TensorFlow and PyTorch. Developers can continue using the same frameworks and lifecycle development tools they know and love. For many of their trained models, they can compile and deploy them on Inferentia by changing just a single line of code, with no additional application code changes.

The result is a high-performance inference deployment that can easily scale while keeping costs under control.

Sprinklr, a software-as-a-service company, has an AI-driven unified customer experience management platform that enables companies to gather and translate real-time customer feedback across multiple channels into actionable insights. This results in proactive issue resolution, enhanced product development, improved content marketing, and better customer service. Sprinklr used Inferentia to deploy its NLP and some of its computer vision models and saw significant performance improvements.

Several Amazon services also deploy their machine learning models on Inferentia.

Amazon Prime Video uses computer vision ML models to analyze video quality of live events to ensure an optimal viewer experience for Prime Video members. It deployed its image classification ML models on EC2 Inf1 instances and saw a 4x improvement in performance and up to a 40% savings in cost as compared to GPU-based instances.

Another example is Amazon Alexa's AI and ML-based intelligence, powered by Amazon Web Services, which is available on more than 100 million devices today. Alexa's promise to customers is that it is always becoming smarter, more conversational, more proactive, and even more delightful. Delivering on that promise requires continuous improvements in response times and machine learning infrastructure costs. By deploying Alexa's text-to-speech ML models on Inf1 instances, it was able to lower inference latency by 25% and cost-per-inference by 30% to enhance service experience for tens of millions of customers who use Alexa each month.

As companies race to future-proof their business by enabling the best digital products and services, no organization can fall behind on deploying sophisticated machine learning models to help innovate their customer experiences. Over the past few years, there has been an enormous increase in the applicability of machine learning for a variety of use cases, from personalization and churn prediction to fraud detection and supply chain forecasting.

Luckily, machine learning infrastructure in the cloud is unleashing new capabilities that were previously not possible, making it far more accessible to non-expert practitioners. That's why AWS customers are already using Inferentia-powered Amazon EC2 Inf1 instances to provide the intelligence behind their recommendation engines and chatbots and to get actionable insights from customer feedback.

With AWS cloud-based machine learning infrastructure options suitable for various skill levels, it's clear that any organization can accelerate innovation and embrace the entire machine learning lifecycle at scale. As machine learning continues to become more pervasive, organizations are now able to fundamentally transform the customer experience, and the way they do business, with cost-effective, high-performance cloud-based machine learning infrastructure.

Learn more about how AWS's machine learning platform can help your company innovate.

This content was produced by AWS. It was not written by MIT Technology Review's editorial staff.

Link:
High-performance, low-cost machine learning infrastructure is accelerating innovation in the cloud - MIT Technology Review