Archive for the ‘Machine Learning’ Category

Revolutionizing carbon neutrality: Machine learning paves the way for advanced CO2 reduction catalysts – EurekAlert

A perspective highlights the transformative impact of machine learning (ML) on enhancing carbon dioxide reduction reactions (CO2RR), steering us closer to carbon neutrality. It emphasizes ML's ability to swiftly predict catalyst properties, innovate in the design of catalysts and electrodes, and elevate experimental synthesis with heightened efficiency and precision.

The quest for carbon neutrality has led scientists to explore innovative strategies for reducing atmospheric carbon dioxide (CO2). The carbon dioxide reduction reaction (CO2RR) offers a promising avenue by converting CO2 into value-added chemicals. However, the traditional trial-and-error approach to catalyst development is time-consuming and costly, necessitating novel approaches for rapid and efficient advancement.

A perspective (doi: 10.1016/j.esci.2023.100136) published in the journal eScience highlights machine learning's (ML) capacity to accelerate the prediction of catalyst properties, enhance the design of novel catalysts and electrodes, and support experimental synthesis with greater efficiency and accuracy.

The research delves deeply into ML's revolutionary impact on enhancing and optimizing catalyst design for CO2RR, a key element in the quest for carbon neutrality. Leveraging advanced ML algorithms has allowed for a significant speed-up in identifying and refining catalysts, making the experimental synthesis process more streamlined than ever before. This methodology not only facilitates the rapid discovery of effective catalysts but also improves the accuracy of performance predictions, dramatically cutting the time and resources traditionally needed for catalyst development. In highlighting ML's capability, the study sets a new standard for sustainable environmental solutions, showcasing its potential to bring about faster, more precise advances in CO2RR catalyst technology and encouraging future explorations in this vital field.

Prof. Zongyou Yin, one of the study's lead authors, emphasized, "Machine learning revolutionizes our approach to developing CO2 reduction catalysts, enabling faster, data-driven decisions that drastically cut down research time and accelerate our progress towards carbon neutrality."

The integration of machine learning into the development of catalysts for carbon dioxide reduction is a promising step towards achieving carbon neutrality. As the world continues to seek sustainable and efficient solutions to combat climate change, the innovative application of ML in environmental science opens new horizons for research and development.

###

Media contact: Editorial Office of eScience. Email: eScience@nankai.edu.cn

The publisher KeAi was established by Elsevier and China Science Publishing & Media Ltd to unfold quality research globally. In 2013, our focus shifted to open access publishing. We now proudly publish more than 100 world-class, open access, English language journals, spanning all scientific disciplines. Many of these are titles we publish in partnership with prestigious societies and academic institutions, such as the National Natural Science Foundation of China (NSFC).

eScience is a Diamond Open Access journal (free for both readers and authors before 2025) published in cooperation with KeAi and hosted online on ScienceDirect. Founded by Nankai University, eScience aims to publish high-quality academic papers on the latest and finest scientific and technological research in interdisciplinary fields related to energy, electrochemistry, electronics, and the environment. eScience is indexed by DOAJ, Scopus, and ESCI, with a 2024 CiteScore of 33.5. The founding Editor-in-Chief is Professor Jun Chen of Nankai University. eScience has published 15 issues, which can be viewed at https://www.sciencedirect.com/journal/escience.

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.

Continue reading here:
Revolutionizing carbon neutrality: Machine learning paves the way for advanced CO2 reduction catalysts - EurekAlert

Construction of environmental vibration prediction model for subway transportation based on machine learning … – Nature.com

This chapter gives a brief overview of the prediction methods to be used and how they are constructed; a database was then created and combined with machine learning algorithms to build a new model for predicting track vibration in metro transportation. By combining machine learning algorithms with a database model, the new model is better suited to analyzing and predicting metro rail data.

"Database" is a general term for the whole database system, which includes the system services, program services, database administrators, and computer applications. Database construction is generally designed around a logical language and information parameters, with SQL usually serving as the design language, and the database is built on a computer. Fig. 1 shows the basic system construction model of the database [16].

Basic database framework.

The database management system in Fig. 1 includes the collections in the database, the software that manages the database, and the related programs, all administered by a data administrator, while user information and parameters are stored by the computer. Through data management and control, the database realizes the management and storage of data. Building the database requires a needs analysis to determine the target data and the information and entities to be stored. Predicting vibration in the subway traffic environment requires storing subway traffic and environmental vibration data, so the database mainly serves vibration measurement data and numerical calculation data [17]. The data structure covers several kinds of stored information: construction information, data detection information, vibration information, model-building information, and so on. Construction information is a collection of structure, soil, level, distance, and building information for the subway track; vibration information and modeling information are collections of actual on-site data. Database design usually uses the ER model, a conceptual model that can describe real-world data, as shown in Fig. 2.

ER model basic structure.

As shown in Fig. 2, the overall construction conditions summarize the tunnel structure and soil conditions, together with the construction distance and depth. The basic simulation model and the vibration conditions fall under the construction conditions: the vibration conditions require determining the vibration frequency, spectrum, and evaluation indexes, while experimental testing requires determining the target, location, instrumentation, and time. Designing the model requires software to build it.

When experimenting with the model of the subway traffic environment, managers need to administer the database, mobilize data, and analyze states such as the predicted vibration state. Data management mainly means querying and calling the actual measurements and numerical data on subway traffic environment vibration. Vibration prediction mainly applies machine learning algorithms to the subway environmental vibration field to learn from the data, so as to train and predict under simulated working conditions; data visualization mainly uses charts and graphs to analyze and display the data [18]. Fig. 3 introduces the various functions of the database.

Database function diagram.

As shown in Fig. 3, the system components of the database are mainly data management, vibration model predictive analysis, and data visualization and analysis. Using these three modules realizes the database's various functions, after which learning algorithms can calculate and analyze the data. Machine learning is a method that lets computers learn from data by issuing instructions to them; traditional algorithms include decision trees, clustering, Gaussian processes, and support vector machines. Deep learning is a sub-field of machine learning, a more specialized subset of its algorithms. To analyze vibration in the underground traffic environment, train parameters such as speed, vibration source depth, horizontal distance, shear wave velocity, damping ratio, and other factors affecting the vibration response of the subway traffic environment are taken as sample features, and the vibration response parameters as dependent variables. The parameter analysis therefore uses deep learning algorithms from among the machine learning algorithms: because neural networks consist of large numbers of neurons that transmit and process data, and because those neurons can be trained and strengthened into fixed patterns that respond more strongly to specific information, they allow better analysis and prediction of vibration in the metro traffic environment, which suggests that neural network models are a better method for prediction and analysis. For the traditional machine learning algorithm, model building is mainly divided into an input layer, an output layer, and a hidden layer.
Among these, the formula for the input layer is shown in Eq. (1) [19].

In Eq. (1), (x_{i}) denotes the value output to the next layer and (a_{i}) denotes the input data. The hidden layer is calculated as shown in Eq. (2).

$$y_{j} = \sum_{i = 1}^{M} w_{ij} x_{i}$$

(2)

In Eq. (2), (y_{j}) denotes the value passed to the next layer and (w_{ij}) denotes the weights between the input layer and the hidden layer. The formula for the output layer is shown in Eq. (3).

$$o_{k} = \sum_{j = 1}^{Q} w_{kj} s_{j}$$

(3)

In Eq. (3), (w_{kj}) denotes the weights between the hidden layer and the output layer, and (s_{j}) denotes the value passed from the hidden layer onward, where (s_{j} = f(y_{j})). Here the number of samples is (m), the number of neurons in the neural network is (M), and the number of hidden layers is (Q) [20]. The basic model of the machine learning algorithm obtained after calculating each value is shown in Fig. 4.

Neural network structure of machine learning algorithm.

In Fig. 4, the basic structure of the machine learning algorithm is visualized as neurons: the input layer receives the data and passes it to the next layer; the hidden layer transfers the data to the output layer through computation and analysis; and the output layer completes the final model output from the input and processed data. After constructing the machine learning algorithm, some necessary quantities must be calculated, such as the governing equation of the basic subway track units, shown in Eq. (4).
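The layer computations in Eqs. (2) and (3) can be sketched as a single forward pass in Python. The layer sizes, random weights, and the choice of tanh for the activation (f) below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def forward(x, W_hidden, W_out, f=np.tanh):
    """One forward pass: y_j = sum_i w_ij x_i (Eq. 2), s_j = f(y_j),
    o_k = sum_j w_kj s_j (Eq. 3)."""
    y = W_hidden @ x   # Eq. (2): weighted sum into the hidden layer
    s = f(y)           # s_j = f(y_j), as defined after Eq. (3)
    o = W_out @ s      # Eq. (3): weighted sum into the output layer
    return o

rng = np.random.default_rng(0)
x = rng.normal(size=5)               # five input features (assumed)
W_hidden = rng.normal(size=(8, 5))   # input -> hidden weights w_ij
W_out = rng.normal(size=(1, 8))      # hidden -> output weights w_kj
print(forward(x, W_hidden, W_out).shape)  # (1,)
```

Training would then adjust W_hidden and W_out so that the output matches the measured vibration response.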

$$E_{r}^{*} I_{r} \frac{\partial^{4} u_{r}}{\partial x^{4}} + m \frac{\partial^{2} u_{r}}{\partial t^{2}} = F e^{i \omega_{0} t} \delta (x - \overline{x_{0}} - vt)$$

(4)

In Eq. (4), (E_{r}^{*}) represents the elastic modulus of the rail material accounting for damping, (v) the moving speed of the load, (omega_{0}) the angular frequency of the load, (u_{r}) the vertical displacement of the track, (m) the mass of the track per unit length, (overline{x_{0}}) the train's position at the initial moment of loading, and (x) the position along the track. The damped modulus of the rail is given by Eq. (5).

$$E^{*} = E(1 + 2i\beta)$$

(5)

In Eq. (5), (beta) denotes the damping ratio of the medium, (E) the modulus of elasticity, and (E^{*}) the modulus of elasticity after accounting for the damping of the medium. The shear modulus is treated analogously, as shown in Eq. (6) [21].

$$G^{*} = G(1 + 2i\beta)$$

(6)

In Eq. (6), (G) denotes the shear modulus and (G^{*}) the shear modulus after accounting for medium damping. These two formulas make it possible to calculate track operation data in the subway environment, yielding the prediction and experimental data required by the machine learning algorithm. Fig. 5 shows the flow chart of the basic operation of the machine learning algorithm [22].

Basic operation process of machine learning algorithm.

As shown in Fig. 5, the machine learning algorithm first initializes and preprocesses the subway environment track data: the prepared data set is split into two parts, a training set and a test set. An appropriate machine learning algorithm is then applied to the different data in the data set, the data are analyzed, and the algorithm is trained on the results. The size and features of the training data follow from this split analysis. Finally, the model is checked to determine whether the data set needs further analysis and calculation before the algorithm model is output [23].
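The split-train-evaluate workflow just described can be sketched as follows. The synthetic data, the five features, and the linear least-squares model are assumptions chosen only to make the flow concrete; the paper's actual features are quantities like train speed, source depth, and distance.

```python
import numpy as np

# Fig. 5 workflow sketch: prepare a data set, split into training and
# test parts, fit a model, and evaluate before accepting it.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))                  # 5 vibration-related features (assumed)
w_true = np.array([1.5, -2.0, 0.5, 0.0, 3.0])  # assumed ground-truth weights
y = X @ w_true + 0.1 * rng.normal(size=200)    # vibration response target

split = int(0.8 * len(X))                      # 80/20 train/test split
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

w_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)  # train
rmse = np.sqrt(np.mean((X_test @ w_hat - y_test) ** 2))    # evaluate
print(f"test RMSE: {rmse:.3f}")
```

A neural network would replace the least-squares step, but the surrounding split-and-evaluate loop stays the same.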

Using machine learning algorithms and database technology to predict the subway transportation environment can improve prediction accuracy and reduce prediction cost by forecasting construction programs, drawings, operation periods, and program feasibility data. Algorithmic predictive modelling applications for databases are generally programmed in Python. Saving the data as a .py file makes the whole data set a single Python module, which can be imported into a different application with the import statement. Saving the algorithmic model and then calling it from a new program therefore ensures that the current data can be used by the algorithmic model, which in turn enables data to flow between the model predictions and the database.
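The save-then-reload pattern described above can be sketched as follows. Note this sketch uses pickle serialization, a common way to persist a trained model between Python programs, rather than a literal .py module; the file name and the stand-in model object are hypothetical.

```python
import pickle

# Stand-in for a trained model object; in practice this would be the
# fitted predictor produced by the training step.
model = {"weights": [1.5, -2.0, 0.5], "intercept": 0.1}

with open("vibration_model.pkl", "wb") as fh:
    pickle.dump(model, fh)        # save once, after training

with open("vibration_model.pkl", "rb") as fh:
    loaded = pickle.load(fh)      # reload from any other program

print(loaded == model)  # True
```

Any application that can open the file can then call the reloaded model on fresh database records.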

The construction process for the algorithmic model and the underground-vibration SQL database is as follows. First, when querying the data parameters of the underground traffic environment, a data prediction request is sent; the SQL database transmits the retrieved data to the server via its search engine, a Python program converts the received data into a data module, and the data is passed into the algorithmic prediction model. Next, the resulting prediction parameters are received. Finally, the Python program converts them for display on the front-end interface, giving the final prediction results. This completes the construction of the database-backed algorithmic prediction model.
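The query step of this pipeline can be sketched with Python's built-in sqlite3 module. The table name, columns, and the toy predict function are assumptions, not the paper's actual schema or model.

```python
import sqlite3

# Pull measurement records from a SQL database into Python and hand
# them to a prediction model (a stand-in function here).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE vibration
                (train_speed REAL, source_depth REAL, distance REAL,
                 vibration_level REAL)""")
conn.executemany("INSERT INTO vibration VALUES (?, ?, ?, ?)",
                 [(60.0, 15.0, 10.0, 72.3), (80.0, 20.0, 25.0, 65.1)])

rows = conn.execute(
    "SELECT train_speed, source_depth, distance FROM vibration").fetchall()

def predict(features):
    """Stand-in for the trained model: returns a dummy vibration level."""
    speed, depth, dist = features
    return 80.0 - 0.2 * depth - 0.3 * dist  # assumed toy relationship

predictions = [predict(r) for r in rows]
print(predictions)  # [74.0, 68.5]
```

In the real system, the prediction results would be written back or sent to the front-end for display.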

Vibration modeling refers to using formulas and measurements of vibration targets along the construction line to simulate their vibration speed, frequency, and magnitude. Prediction results from the model must, however, be referred to the values and significance of the subway construction environment [24]. Traditional subway vibration prediction models are built with empirical prediction methods, experimental prediction methods, and other approaches. The empirical prediction method is the most widely used, often achieving vibration prediction for the subway environment by fitting a formula, as shown in Eq. (7).

$$L_{a}(room) = L_{t}(tunnel\ wall) - C_{g} - C_{gb} - C_{b}$$

(7)

In Eq. (7), (L_{a}(room)) indicates the vibration acceleration level in a room of the building, and (L_{t}(tunnel wall)) indicates the vibration acceleration level at the tunnel wall. (C_{g}), (C_{gb}), and (C_{b}) indicate the attenuation of vibration through the ground, from the ground into the building, and within the building, respectively. Eq. (8) gives the formula for predicting vibration noise and vibration sound pressure.
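The arithmetic of Eq. (7) is a simple chain of subtractions, worked here with illustrative numbers; all level and attenuation values are assumptions, not measured data.

```python
# Worked example of Eq. (7): the level in the room is the tunnel-wall
# level minus the successive attenuations. Values are assumed.
L_t_tunnel_wall = 80.0  # vibration acceleration level at the tunnel wall (dB)
C_g = 10.0              # attenuation through the ground (dB)
C_gb = 8.0              # attenuation from ground into the building (dB)
C_b = 5.0               # attenuation within the building (dB)

L_a_room = L_t_tunnel_wall - C_g - C_gb - C_b  # Eq. (7)
print(L_a_room)  # 57.0
```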

$$L_{B} = L_{r} + R_{tr} + R_{tu} + R_{g} + R_{b}$$

(8)

In Eq. (8), (L_{B}) denotes the predicted sound pressure level of the subway environment, (L_{r}) the velocity level of the subway track, (R_{tr}) the vibration energy lost in the subway track, (R_{tu}) the reduction of vibration energy transmitted through the tunnel, (R_{g}) the energy lost in transmission at ground level, and (R_{b}) the vibration energy lost in the inner wall of the subway tunnel. The vibration prediction formula when the subway passes through a soft soil layer is shown in Eq. (9) [25].

$$V = F_{v} F_{R} F_{B} = [V_{T} F_{S} F_{D} ]F_{R} F_{B}$$

(9)

In Eq. (9), (F_{v}) represents the vibration function in the subway channel, (F_{R}) the change in the mass of the subway track, (F_{B}) the amplification by the subway building, (V_{T}) the type of train passing through the subway, (F_{S}) the speed of the train, and (F_{D}) the distance of the train. The vibration prediction data can also be obtained by the hammer-tapping method, as shown in Eq. (10) [26].

$$L_{v} = L_{F} + TM_{line} + C_{building}$$

(10)

In Eq.(10), (L_{v}) denotes the vibration velocity level of the predicted value, (L_{F}) denotes the force density level of the vibration generating source, (TM_{line}) denotes the linear transmission efficiency of the train vibration, and (C_{building}) denotes the corrected energy of the vibration transmitted from the ground to the inner wall. The formula for the change of vibration environmental impact during train operation is shown in Eq.(11).

$$VL_{Z\max} = VL_{Z0\max} + C_{VB}$$

(11)

In Eq. (11), (VL_{Zmax}) denotes the predicted maximum vibration level, (VL_{Z0max}) denotes the predicted vibration source intensity, and (C_{VB}) denotes the correction index of the vibration target. The improved form of Eq. (11) is shown in Eq. (12).

$$C_{VB} = C_{V} + C_{W} + C_{R} + C_{T} + C_{D} + C_{B} + C_{TD}$$

(12)

In Eq. (12), (C_{V}) represents the correction for the running speed of the train, (C_{W}) for the bearing weight and sprung mass of the train, (C_{R}) for the track condition, (C_{T}) for different track styles, (C_{D}) for distance attenuation, (C_{B}) for the building in the project, and (C_{TD}) for the density of train traffic. The formula for the vibration parameters differs between areas; for example, Eq. (13) gives the vibration value in another area.
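The bookkeeping in Eq. (12) amounts to summing the individual corrections; the numbers below are assumptions chosen only for illustration.

```python
# Illustration of Eq. (12): the total correction C_VB is the sum of the
# individual correction terms. All values are assumed.
corrections = {
    "C_V": 3.0,   # train speed
    "C_W": -1.0,  # bearing weight / sprung mass
    "C_R": 2.0,   # track condition
    "C_T": 0.5,   # track style
    "C_D": -6.0,  # distance attenuation
    "C_B": -2.5,  # building
    "C_TD": 1.0,  # traffic density
}
C_VB = sum(corrections.values())
print(C_VB)  # -3.0
```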

$$VL_{Z\max} = VL_{Z\max,0} + C$$

(13)

In Eq.(13), (VL_{Zmax ,0}) denotes the vibration level of the maximum measured vibration source of the train passing through the engineered tunnel, and (C) denotes the corrected value of vibration.

$$C = C_{v} + C_{g} + C_{l} + C_{b} + C_{L} + C_{B}$$

(14)

Equation (14) is the expression for (C). In Eq. (14), (C_{v}) represents the correction for the train's traveling speed, (C_{g}) for the mass of the train's bearings, (C_{l}) for the train's traveling curve, (C_{b}) for the train's track, (C_{L}) for the traveling distance, and (C_{B}) for the track's construction. The empirical method can calculate a fitted formula, so using it for prediction is effective and inexpensive; applying it during the construction phase of the subway lowers project cost, but its lack of accuracy is the method's big problem. Here the machine learning neuron method can greatly reduce construction cost while enhancing prediction accuracy.

Test prediction is the vibration assessment of the actual predicted vibration targets in the field, that is, field data are measured and the prediction is made from them. However, in actual field tests, because the subway has not yet been completed, the test results are mostly simulation data. The hybrid prediction method couples the accuracy of the prediction results and the multi-parameter system to simulation experiments under different methods; it can usually resolve complex parameter uncertainty and the systematic errors of mixed multi-source data. Fig. 6 shows the prediction and evaluation process for the subway construction environment.

Prediction and evaluation of subway construction environment.

Figure 6 shows that in the program feasibility stage, the subway line and construction phases must be analyzed for feasibility; the parameters of the subway environment model are predicted with the empirical method, achieving a preliminary, full-range environmental assessment. After this feasibility analysis, the actual data of the subway environment are tested and the sensitivity of the receiving environment is analyzed, achieving local prediction of the subway environmental parameters. In the construction design stage, sensitivity is judged with actual testing and in-tunnel measurement to determine whether it meets construction requirements, and finally the overall construction process evaluation is produced. In the prediction model built on machine learning and database technology, the selected characteristic parameters include train speed, vibration depth, distance, damping ratio, density, Poisson's ratio, shear wave speed, and other influencing factors, so multiple neuron parameters must be designed while building the model's data. The number of neurons determines the number of parameters selected: the more parameters, the more comprehensive the experimental data and the more accurate the overall prediction results.

See the original post:
Construction of environmental vibration prediction model for subway transportation based on machine learning ... - Nature.com

Introducing ‘Get started with generative AI on AWS: A guide for public sector organizations’ | Amazon Web Services – AWS Blog

Amazon Web Services (AWS) is excited to announce the release of a public-sector focused eBook on generative artificial intelligence (AI). Titled Get started with generative AI on AWS: A guide for public sector organizations, the new resource aims to help public sector leaders explore opportunities and best practices for adopting the technology responsibly.

Technologies like large language models (LLMs) have shown great potential to automate tasks, personalize experiences, and enhance analysis capabilities across industries. However, we recognize public sector work carries unique obligations around accountability, accuracy, and equitable outcomes that must guide any technology changes.

Our new eBook provides guidance for navigating technical, cultural, and strategic considerations involved in building generative AI applications. Whether automating workflows, responding to constituents, or unlocking creativity, the core question remains: how can we maximize benefits while avoiding potential downsides?

This cost-free resource aims to support leaders as they grapple with implementation challenges. In the eBook, our experts share the importance of starting with a clear goal and use case. Once that is defined, we dive deep into considerations like model selection, secure and responsible use, and staffing. We hope more organizations can benefit from AI safely by outlining a methodology informed by our experiences powering projects across government, education, and more.

The eBook advises starting by precisely defining how generative AI might fit your operations and advance your mission. What are processes that could be streamlined versus areas ripe for innovation? Opportunities range from automated document processing to personalized services to accelerating analysis.

Carefully scoping initial applications helps determine skills and resources required for success. Testing modest pilots avoids overcommitting while providing critical learnings to refine subsequent efforts. Continued evaluation ensures AI augments rather than replaces human judgement where risks outweigh rewards.

Foundation models (FMs) represent a major step towards off-the-shelf AI applications. However, public sector realities necessitate considering variables like accuracy assurances, explainability needs, and effective oversight.

Model options are evaluated based on required outputs, compatible input types from available data sources, and customization desires on a spectrum from simple prompting to full retraining. Costs stem from hosting requirements and technical headcounts to develop, deploy, and maintain each approach.

When building for public trust, data responsibility becomes paramount. The eBook outlines techniques for classification, access controls, and continuous adaptations to regulations. It also introduces tools supporting oversight needs around topics like detecting biases or inaccuracies over the long term.

Collaboration across legal, security, and technical roles proves crucial to scaling guardrails as AI interfaces with sensitive domains. Similarly, change-ready architectures allow iterative improvements without sacrificing compliance as technologies and standards evolve together.

Technical abilities required vary by project complexity, necessitating assessments of in-house skills versus outsourcing potential components. Basic applications may involve self-service prompt engineering through interfaces like Amazon Bedrock playgrounds.
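As a rough sketch of what such self-service prompting can look like, the snippet below builds a request in the shape used by the Amazon Bedrock Converse API. The model ID and prompt are assumptions, and the actual network call (which requires AWS credentials) is left commented out.

```python
# Build a Converse-API-shaped request; model ID and prompt are assumed.
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
prompt = "Summarize the main accessibility requirements for a public web form."

request = {
    "modelId": model_id,
    "messages": [{"role": "user", "content": [{"text": prompt}]}],
    "inferenceConfig": {"maxTokens": 300, "temperature": 0.2},
}

# With credentials configured, the call itself would be:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])

print(request["messages"][0]["content"][0]["text"])
```

The Bedrock playground exposes the same knobs (model choice, prompt, inference parameters) through a point-and-click interface.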

More advanced customizations demand resources for tasks like model training, configuration within machine learning (ML) pipelines, or DevOps maintenance after deploying onto infrastructure. Clear communication regarding AI's fit within broader strategic visions also galvanizes support.

In summary, the eBook provides an approachable framework and a compilation of best practices to help public sector organizations align generative AI capabilities with their constituents' needs responsibly. Its publication signifies our dedication to powering innovations that serve humanity.

If you're a public sector leader interested in learning more about how generative AI can help accelerate your mission, don't miss the upcoming Transform Government with Generative AI learning series March 25-29. Hosted by AWS experts and featuring insights from agencies already applying these technologies, the series will guide attendees through key considerations for adoption highlighted in our eBook. Each session will explore practical steps for unlocking generative AI's potential responsibly. Register today to reserve your spot.

Access our new eBook Get started with generative AI on AWS: A guide for public sector organizations.

Originally posted here:
Introducing 'Get started with generative AI on AWS: A guide for public sector organizations' | Amazon Web Services - AWS Blog

Generative deep learning for the development of a type 1 diabetes simulator | Communications Medicine – Nature.com

Kaizer, J. S., Heller, A. K. & Oberkampf, W. L. Scientific computer simulation review. Reliab. Eng. Syst. Saf. 138, 210–218 (2015).

Kadota, R. et al. A mathematical model of type 1 diabetes involving leptin effects on glucose metabolism. J. Theor. Biol. 456, 213–223 (2018).

Farmer Jr, T., Edgar, T. & Peppas, N. Pharmacokinetic modeling of the glucoregulatory system. J. Drug Deliv. Sci. Technol. 18, 387 (2008).

Nath, A., Biradar, S., Balan, A., Dey, R. & Padhi, R. Physiological models and control for type 1 diabetes mellitus: a brief review. IFAC-PapersOnLine 51, 289–294 (2018).

Mansell, E. J., Docherty, P. D. & Chase, J. G. Shedding light on grey noise in diabetes modelling. Biomed. Signal Process. Control 31, 16–30 (2017).

Mari, A., Tura, A., Grespan, E. & Bizzotto, R. Mathematical modeling for the physiological and clinical investigation of glucose homeostasis and diabetes. Front. Physiol. https://doi.org/10.3389/fphys.2020.575789 (2020).

Hovorka, R. et al. Nonlinear model predictive control of glucose concentration in subjects with type 1 diabetes. Physiol. Meas. 25, 905 (2004).

Man, C. D. et al. The UVA/PADOVA type 1 diabetes simulator: new features. J. Diabetes Sci. Technol. 8, 26–34 (2014).

Bergman, R. N. & Urquhart, J. The pilot gland approach to the study of insulin secretory dynamics. In Proceedings of the 1970 Laurentian Hormone Conference 583–605 (Elsevier, 1971).

Franco, R. et al. Output-feedback sliding-mode controller for blood glucose regulation in critically ill patients affected by type 1 diabetes. IEEE Trans. Control Syst. Technol. 29, 2704–2711 (2021).

Nielsen, M. A visual proof that neural nets can compute any function. http://neuralnetworksanddeeplearning.com/chap4.html (2016).

Zhou, D.-X. Universality of deep convolutional neural networks. Appl. Comput. Harmon. Anal. 48, 787–794 (2020).

Nikzad, M., Movagharnejad, K. & Talebnia, F. Comparative study between neural network model and mathematical models for prediction of glucose concentration during enzymatic hydrolysis. Int. J. Comput. Appl. 56, 1 (2012).

Nalisnick, E. T., Matsukawa, A., Teh, Y. W., Görür, D. & Lakshminarayanan, B. Do deep generative models know what they don't know? In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019. OpenReview.net, https://openreview.net/forum?id=H1xwNhCcYm (2019).

Noguer, J., Contreras, I., Mujahid, O., Beneyto, A. & Vehi, J. Generation of individualized synthetic data for augmentation of the type 1 diabetes data sets using deep learning models. Sensors https://doi.org/10.3390/s22134944 (2022).

Thambawita, V. et al. DeepFake electrocardiograms using generative adversarial networks are the beginning of the end for privacy issues in medicine. Sci. Rep. 11, 1–8 (2021).

Marouf, M. et al. Realistic in silico generation and augmentation of single-cell RNA-seq data using generative adversarial networks. Nat. Commun. 11, 1–12 (2020).

Festag, S., Denzler, J. & Spreckelsen, C. Generative adversarial networks for biomedical time series forecasting and imputation. J. Biomed. Inform. 129, 104058 (2022).

Xu, J., Li, H. & Zhou, S. An overview of deep generative models. IETE Tech. Rev. 32, 131–139 (2015).

Wan, C. & Jones, D. T. Protein function prediction is improved by creating synthetic feature samples with generative adversarial networks. Nat. Mach. Intell. 2, 540–550 (2020).

Choudhury, S., Moret, M., Salvy, P., Weilandt, D., Hatzimanikatis, V. & Miskovic, L. Reconstructing kinetic models for dynamical studies of metabolism using generative adversarial networks. Nat. Mach. Intell. 4, 710–719 (2022).

Dieng, A. B., Kim, Y., Rush, A. M. & Blei, D. M. Avoiding latent variable collapse with generative skip models. In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research (eds Chaudhuri, K. & Sugiyama, M.) Vol. 89, 2397–2405 (PMLR, 2019).

Ruthotto, L. & Haber, E. An introduction to deep generative modeling. GAMM-Mitteilungen 44, 202100008 (2021).

Xie, T. et al. Progressive attention integration-based multi-scale efficient network for medical imaging analysis with application to COVID-19 diagnosis. Comput. Biol. Med. 159, 106947 (2023).

Li, H., Zeng, N., Wu, P. & Clawson, K. Cov-Net: A computer-aided diagnosis method for recognizing COVID-19 from chest X-ray images via machine vision. Expert Syst. Appl. 207, 118029 (2022).

Li, K., Liu, C., Zhu, T., Herrero, P. & Georgiou, P. GluNet: a deep learning framework for accurate glucose forecasting. IEEE J. Biomed. Health Inform. 24, 414–423 (2019).

Rabby, M. F. et al. Stacked LSTM based deep recurrent neural network with Kalman smoothing for blood glucose prediction. BMC Med. Inform. Decis. Mak. 21, 1–15 (2021).

Article Google Scholar

Munoz-Organero, M. Deep physiological model for blood glucose prediction in T1DM patients. Sensors 20, 3896 (2020).

Article CAS PubMed PubMed Central ADS Google Scholar

Noaro, G., Zhu, T., Cappon, G., Facchinetti, A. & Georgiou, P. A personalized and adaptive insulin bolus calculator based on double deep q-learning to improve type 1 diabetes management. IEEE J. Biomed. Health Inform. 27, pp. 25362544 (2023).

Emerson, H., Guy, M. & McConville, R. Offline reinforcement learning for safer blood glucose control in people with type 1 diabetes. J. Biomed. Inform. 142, 104376 (2023).

Article PubMed Google Scholar

Lemercier, J.-M., Richter, J., Welker, S. & Gerkmann, T. Analysing diffusion-based generative approaches versus discriminative approaches for speech restoration. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 15 (2023).

Richter, J., Welker, S., Lemercier, J.-M., Lay, B. & Gerkmann, T. Speech enhancement and dereverberation with diffusion-based generative models. In IEEE/ACM Transactions on Audio, Speech, and Language Processing 113 (2023).

Yoo, T. K. et al. Deep learning can generate traditional retinal fundus photographs using ultra-widefield images via generative adversarial networks. Comput. Methods Prog. Biomed. 197, 105761 (2020).

Article Google Scholar

You, A., Kim, J. K., Ryu, I. H. & Yoo, T. K. Application of generative adversarial networks (GAN) for ophthalmology image domains: a survey. Eye Vis. 9, 119 (2022).

Article Google Scholar

Liu, M. et al. Aa-wgan: attention augmented Wasserstein generative adversarial network with application to fundus retinal vessel segmentation. Comput. Biol. Med. 158, 106874 (2023).

Article PubMed Google Scholar

Wang, S. et al. Diabetic retinopathy diagnosis using multichannel generative adversarial network with semisupervision. IEEE Trans. Autom. Sci. Eng. 18, 574585 (2021).

Article Google Scholar

Zhou, Y., Wang, B., He, X., Cui, S. & Shao, L. DR-GAN: conditional generative adversarial network for fine-grained lesion synthesis on diabetic retinopathy images. IEEE J. Biomed. Health Inform. 26, 5666 (2020).

Article CAS Google Scholar

Liu, S. et al. Prediction of OCT images of short-term response to anti-VEGF treatment for diabetic macular edema using different generative adversarial networks. Photodiagnosis Photodyn. Ther. 41, 103272 (2023).

Sun, L.-C. et al. Generative adversarial network-based deep learning approach in classification of retinal conditions with optical coherence tomography images. Graefes Arch. Clin. Exp. Ophthalmol. 261, 13991412 (2023).

Article Google Scholar

Zhang, J., Zhu, E., Guo, X., Chen, H. & Yin, J. Chronic wounds image generator based on deep convolutional generative adversarial networks. In Theoretical Computer Science: 36th National Conference, NCTCS 2018, Shanghai, China, October 1314, 2018, Proceedings 36, 150158 (Springer, 2018).

Cichosz, S. L. & Xylander, A. A. P. A conditional generative adversarial network for synthesis of continuous glucose monitoring signals. J. Diabetes Sci. Technol. 16, 12201223 (2022).

Article PubMed Google Scholar

Mujahid, O. et al. Conditional synthesis of blood glucose profiles for T1D patients using deep generative models. Mathematics. https://doi.org/10.3390/math10203741 (2022).

Eunice, H. W. & Hargreaves, C. A. Simulation of synthetic diabetes tabular data using generative adversarial networks. Clin. Med. J. 7, 4959 (2021).

Che, Z., Cheng, Y., Zhai, S., Sun, Z. & Liu, Y. Boosting deep learning risk prediction with generative adversarial networks for electronic health records. In 2017 IEEE International Conference on Data Mining (ICDM) 787792 (2017).

Noguer, J., Contreras, I., Mujahid, O., Beneyto, A. & Vehi, J. Generation of individualized synthetic data for augmentation of the type 1 diabetes data sets using deep learning models. Sensors 22, 4944 (2022).

Article CAS PubMed PubMed Central ADS Google Scholar

Lim, G., Thombre, P., Lee, M. L. & Hsu, W. Generative data augmentation for diabetic retinopathy classification. In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI) 10961103 (2020).

Zhu, T., Yao, X., Li, K., Herrero, P. & Georgiou, P. Blood glucose prediction for type 1 diabetes using generative adversarial networks. In CEUR Workshop Proceedings, Vol. 2675, 9094 (2020).

Zeng, A., Chen, M., Zhang, L., & Xu, Q. Are transformers effective for time series forecasting? In Proceedings of the AAAI conference on artificial intelligence.37, pp. 1112111128 (2023).

Zhu, T., Li, K., Herrero, P. & Georgiou, P. Glugan: generating personalized glucose time series using generative adversarial networks. IEEE J. Biomed. Health Inf. https://doi.org/10.1109/JBHI.2023.3271615 (2023).

Lanusse, F. et al. Deep generative models for galaxy image simulations. Mon. Not. R. Astron. Soc. 504, 55435555 (2021).

Article ADS Google Scholar

Ghosh, A. & ATLAS collaboration. Deep generative models for fast shower simulation in ATLAS. In Journal of Physics: Conference Series. IOP Publishing. 1525, p. 012077 (2020).

Borsoi, R. A., Imbiriba, T. & Bermudez, J. C. M. Deep generative endmember modeling: an application to unsupervised spectral unmixing. IEEE Trans. Comput. Imaging 6, 374384 (2019).

Article MathSciNet Google Scholar

Ma, H., Bhowmik, D., Lee, H., Turilli, M., Young, M., Jha, S., & Ramanathan, A.. Deep generative model driven protein folding simulations. In I. Foster, G. R. Joubert, L. Kucera, W. E. Nagel, & F. Peters (Eds.), Parallel Computing: Technology Trends (pp. 4555). (Advances in Parallel Computing; Vol. 36). IOS Press BV. https://doi.org/10.3233/APC200023 (2020)

Wen, J., Ma, H. & Luo, X. Deep generative smoke simulator: connecting simulated and real data. Vis. Comput. 36, 13851399 (2020).

Article Google Scholar

Mincu, D. & Roy, S. Developing robust benchmarks for driving forward AI innovation in healthcare. Nat. Mach. Intell. 4, 916921 (2022).

Mirza, M. & Osindero, S. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014).

Isola, P., Zhu, J.-Y., Zhou, T. & Efros, A. A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 11251134 (2017).

Ahmad, S. et al. Generation of virtual patient populations that represent real type 1 diabetes cohorts. Mathematics 9, 1200 (2021).

Bertachi, A. et al. Prediction of nocturnal hypoglycemia in adults with type 1 diabetes under multiple daily injections using continuous glucose monitoring and physical activity monitor. Sensors https://doi.org/10.3390/s20061705 (2020).

Marling, C. & Bunescu, R. The OhioT1DM dataset for blood glucose level prediction: update 2020. In CEUR Workshop Proceedings, Vol. 2675, 71 (NIH Public Access, 2020).

Estremera, E., Cabrera, A., Beneyto, A. & Vehi, J. A simulator with realistic and challenging scenarios for virtual T1D patients undergoing CSII and MDI therapy. J. Biomed. Inform. 132, 104141 (2022).

Article PubMed Google Scholar

Marin, I., Gotovac, S., Russo, M. & Boi-tuli, D. The effect of latent space dimension on the quality of synthesized human face images. J. Commun. Softw. Syst. 17, 124133 (2021).

Article Google Scholar

The Editorial Board. Into the latent space. Nat. Mach. Intell. 2, 151 (2020).

Battelino, T. et al. Continuous glucose monitoring and metrics for clinical trials: an international consensus statement. Lancet Diabetes Endocrinol. https://doi.org/10.1016/S2213-8587(22)00319-9 (2022).

Beneyto, A., Bertachi, A., Bondia, J. & Vehi, J. A new blood glucose control scheme for unannounced exercise in type 1 diabetic subjects. IEEE Trans. Control Syst. Technol. 28, 593600 (2020).

Article Google Scholar

Integrating core physics and machine learning for improved parameter prediction in boiling water reactor operations … – Nature.com

Low-fidelity and high-fidelity data

The LF model was built in the US NRC code Purdue Advanced Reactor Core Simulator (PARCS)19. The model comprises three different fuel bundle types, each with a different uranium enrichment and gadolinia concentration, and contains 560 fuel bundles encircled by reflectors. In addition to this radial layout, there are 26 axial planes: 24 fuel nodes plus one reflector node at the top and bottom planes.

In this work, the model was built in quarter symmetry to save computational time and further reduce data complexity20. Symmetry was applied in the radial direction only. The axial discretization was modeled explicitly from the bottom to the top of the reactor, reflector to reflector, because a BWR's axial variation is not symmetrical and must therefore be modeled in sufficient detail. Accordingly, the boundary conditions were set to be reflective on the west and north faces of the radial core and vacuum (zero incoming neutron current) in the other directions.

For developing the ML model, the number of depletion steps was reduced to 12, from the typical 30–40. The PARCS cross-section library was generated using CASMO-4 for the fuel lattices and reflectors. The library includes group constants from eight lattice simulations spanning control rod positions, coolant densities, and fuel temperatures. Lattices were simulated at 23 kW/g of heavy metal power density to a burnup of 50 GWd/MT of initial heavy metal.

The HF data were collected using Serpent21 Monte Carlo simulations. The model was created to reproduce the PARCS solutions at the same core conditions but with higher resolution, using a state-of-the-art simulation approach: no diffusion approximation, with continuous-energy neutron transport modeled in detailed geometry structures. Each Serpent calculation was run with 500,000 particles, 500 active cycles, and 100 inactive cycles. The other simulation settings were also optimized for depletion calculations.

The reactor model used in this work is based on cycle 1 of the Edwin Hatch Unit 1 nuclear power plant. The power plant, located near Baxley, Georgia, is a boiling water reactor of the BWR-4 design, developed by General Electric, with a net electrical output of approximately 876 MWe and a thermal output of 2436 MWth. Since its commissioning in 1975, Unit 1 has operated with a core design containing uranium dioxide fuel assemblies, utilizing a direct cycle in which water boils within the reactor vessel to generate steam that drives the turbines.

The specification of cycle 1 of Hatch Unit 1 is presented in Table 5. While it is a large commercial power plant, Hatch 1 is not as large as a typical 1,000 MWe LWR, and some BWR designs have about 700–800 assemblies. Nevertheless, given the availability of the core design for this work, the model is a viable test case.

There are 560 fuel bundles of 7×7 GE lattice size in the Hatch 1 Cycle 1 model. Among these bundles, there are three different fuel types with varying enrichments and burnable absorbers. Using the procedure for running the Serpent model, high-resolution simulations were obtained, as shown in the geometry representation in Fig. 6. In the figure, different colors represent different material definitions in Serpent. Because the materials were defined individually, the color scheme varies from pin to pin and assembly to assembly. Individual material definitions at the pin level were required to capture the isotopic concentrations and instantaneous state variables at different fuel exposures and core conditions.

Geometry representation of the full-size BWR core modeled in Serpent. Images were generated by the Serpent geometry plotter.

There are 2400 data points collected as samples for this work, covering various combinations of control blade patterns, core flow rates, and 12 different burnup steps. These data points come from 200 independent cycle runs of both PARCS and Serpent, providing the LF and HF simulation data, respectively. The collected data were processed into a single HDF5 file.

Data processing consists of a data split followed by normalization. The data are separated into different sets with a training-validation-test ratio of 70:15:15. The training data are used to teach the network, the validation data to tune hyperparameters and prevent overfitting, and the test data to evaluate the model's generalization performance on unseen data. From the 2400 data points (200 cycles), the dataset was separated into:

Train Dataset: 140 runs or 1680 data points

Validation Dataset: 30 runs or 360 data points

Test Dataset: 30 runs or 360 data points

The data splitting was not conducted randomly, but based on the average control blade position in a cycle run. Figure 7 presents the distribution of the average control rod insertion in the reactor. The maximum number of steps is 48, corresponding to fully withdrawn blades. From the plot, it can be seen that the test data have the lowest average CR position (largest insertion), followed by the validation set, while the training data have the highest average CR position (smallest insertion).

Train-validation-test data split based on average control blade position in the BWR core. Image was generated using Python Matplotlib Library.

The purpose of the CR-based split is to demonstrate the generalization of the model on out-of-sample CR position data. Random splitting is not preferred for small datasets such as this one, as the ML model tends to overfit (or imitate) the data. The fixed (CR-based) split used here verifies that the model can perform well on data with a different distribution than the training dataset.
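The CR-based split described above can be sketched as follows. The 200-run/70:15:15 counts follow the text; the synthetic CR positions are illustrative placeholders, not the actual cycle data.

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 cycle runs; each has an average control rod (CR) position in [0, 48]
# (48 steps = fully withdrawn). These values are synthetic placeholders.
avg_cr_position = rng.uniform(0, 48, size=200)

# Sort runs by average CR position: the most-inserted (lowest) runs go to
# the test set, the next to validation, and the most-withdrawn to training.
order = np.argsort(avg_cr_position)
test_runs = order[:30]     # 30 runs  -> 360 data points (12 burnup steps each)
val_runs = order[30:60]    # 30 runs  -> 360 data points
train_runs = order[60:]    # 140 runs -> 1680 data points

assert len(train_runs) == 140 and len(val_runs) == 30 and len(test_runs) == 30
# The test set ends up with the lowest average CR position (largest insertion)
print(avg_cr_position[test_runs].mean() < avg_cr_position[train_runs].mean())  # True
```

Sorting before slicing is what makes the split deterministic and distribution-shifted, in contrast to a random split.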

After splitting, normalizing the data is important for the ML model to ensure data integrity and avoid anomalies. Here, Min-Max scaling, a common normalization technique, rescales each feature to the range [0, 1] by subtracting the feature's minimum value and dividing by its range. The scaler is fit to the training data using the MinMaxScaler class from the scikit-learn package, and the same scaling is then applied to the validation and test data.
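This fit-on-train, transform-everywhere pattern can be sketched with scikit-learn; the small feature matrices below are stand-ins for the real dataset.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Placeholder feature matrices standing in for the real train/validation splits
X_train = np.array([[10.0, 0.5], [20.0, 1.5], [30.0, 2.5]])
X_val = np.array([[15.0, 1.0]])

# Fit the min/max on the training data only, then reuse the same transform,
# so no information from the validation/test sets leaks into preprocessing
scaler = MinMaxScaler(feature_range=(0, 1))
X_train_s = scaler.fit_transform(X_train)  # each training feature spans [0, 1]
X_val_s = scaler.transform(X_val)          # validation uses the training min/max

print(X_train_s.min(axis=0), X_train_s.max(axis=0))  # [0. 0.] [1. 1.]
print(X_val_s)  # [[0.25 0.25]]  i.e. (15-10)/20 and (1.0-0.5)/2.0
```

Fitting the scaler on the full dataset instead would silently leak test-set statistics into training, which is why the transform is reused rather than refit.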

The target parameters used here are the core eigenvalue (\(k_{\mathrm{eff}}\)) and the power distribution. The ML model provides corrections (via predicted errors) to the target parameters, which are used to obtain the predicted HF parameters of interest. The perturbed variables are the parameters that are varied and that govern the data collection and ML modeling; they are summarized in Table 6.

In this work, a neural network architecture called BWR-ComodoNet (Boiling Water Reactor Correction Model for Diffusion Solver Network) is built, based on a 3D-2D convolutional neural network (CNN) architecture. This means that the spatial input and output data are processed according to their actual dimensions, as 3D and 2D arrays, while the scalar data are processed using standard dense layers.

The architecture of BWR-ComodoNet is presented in Fig. 8. The three input features, core flow rate, control rod pattern, and nodal exposure, enter three different channels of the network. The scalar parameter goes directly into a dense layer in the encoding process, while the 2D and 3D parameters enter 2D and 3D CNN layers, respectively. The encoding stage ends when all channels are concatenated into one array and connected to dense layers.

Architecture of BWR-ComodoNet using 3D-2D CNN-based encoder-decoder neural networks. Image was generated using draw.io diagram application.

The decoding process follows the shape of the target data. In this case, the outputs are the \(k_{\mathrm{eff}}\) error (a scalar) and the 3D nodal power error. Since quarter symmetry is used in the calculation, the 3D nodal power has the shape (14, 14, 26) in the x, y, and z dimensions, respectively. BWR-ComodoNet outputs predicted errors, so an additional post-processing step adds the predicted error to the LF data to obtain the predicted HF data.
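A rough sketch of this 3D-2D encoder-decoder layout in Keras (the framework the article states it uses) is shown below. Only the (14, 14, 26) power-error shape comes from the text; the 2D control-rod map shape, filter counts, and dense-layer sizes are illustrative assumptions, not the published architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Three input channels: scalar, 2D map, 3D field
flow_in = layers.Input(shape=(1,), name="core_flow_rate")            # scalar
cr_in = layers.Input(shape=(14, 14, 1), name="cr_pattern")           # 2D (assumed shape)
exp_in = layers.Input(shape=(14, 14, 26, 1), name="nodal_exposure")  # 3D field

# Encoders: dense for the scalar, Conv2D/Conv3D for the spatial channels
f = layers.Dense(16, activation="relu")(flow_in)
c = layers.Conv2D(8, 3, activation="relu", padding="same")(cr_in)
c = layers.Flatten()(c)
e = layers.Conv3D(8, 3, activation="relu", padding="same")(exp_in)
e = layers.Flatten()(e)

# Concatenate all channels into one array connected to dense layers
z = layers.Concatenate()([f, c, e])
z = layers.Dense(128, activation="relu")(z)
z = layers.Dropout(0.2)(z)

# Decoders shaped like the targets: scalar k_eff error and 3D nodal power error
k_err = layers.Dense(1, activation="linear", name="k_eff_error")(z)
p = layers.Dense(14 * 14 * 26, activation="linear")(z)
p_err = layers.Reshape((14, 14, 26), name="power_error")(p)

model = tf.keras.Model([flow_in, cr_in, exp_in], [k_err, p_err])
print(model.output_shape)  # two heads: a scalar and a (14, 14, 26) field
```

The key design point the sketch illustrates is that each input keeps its native dimensionality until the concatenation bottleneck, and the decoder is shaped by the targets rather than the inputs.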

The output parameters from the neural network model comprise the errors in the effective neutron multiplication factor, \(k_{\mathrm{eff}}\), and in the nodal power, quantified as:

$$\begin{aligned} e_k &= k_H - k_L \\ \vec{e}_P &= \vec{P}_H - \vec{P}_L \end{aligned}$$

(4)

Here, \(e_k\) denotes the error in \(k_{\mathrm{eff}}\) and \(\vec{e}_P\) represents the nodal power error vector. The subscripts H and L indicate high-fidelity and low-fidelity data, respectively. According to the equation, the predicted high-fidelity data can be determined by adding the error predictions from the machine learning model to the low-fidelity solutions22.

Given the predicted errors, \(\hat{e}_k\) and \(\hat{\vec{e}}_P\), the predicted high-fidelity data, \(k_H\) and \(\vec{P}_H\), are defined as:

$$\begin{aligned} k_H &= k_L + \hat{e}_k = k_L + \mathscr{N}_k(\varvec{\theta}, \mathbf{x}) \\ \vec{P}_H &= \vec{P}_L + \hat{\vec{e}}_P = \vec{P}_L + \mathscr{N}_P(\varvec{\theta}, \mathbf{x}) \end{aligned}$$

(5)

where \(\mathscr{N}_k(\varvec{\theta}, \mathbf{x})\) and \(\mathscr{N}_P(\varvec{\theta}, \mathbf{x})\) are the neural networks for \(k_{\mathrm{eff}}\) and power, with optimized weights \(\varvec{\theta}\) and input features \(\mathbf{x}\). Although Eq. 5 appears to be a linear combination of low-fidelity parameters and predicted errors, it is important to note that the neural network that predicts the errors is inherently non-linear. As a result, the predicted error is expected to capture the non-linear discrepancies between the low-fidelity and high-fidelity data.
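At inference time, the correction step of Eqs. 4 and 5 reduces to adding the predicted errors to the low-fidelity solution. A minimal sketch with placeholder numbers (the LF values and predicted errors below are illustrative, standing in for real PARCS outputs and network predictions):

```python
import numpy as np

# Placeholder low-fidelity outputs from the diffusion solver
k_L = 1.00250
P_L = np.full((14, 14, 26), 1.0)   # quarter-core nodal power, flat for illustration

# Placeholder predicted errors, standing in for the neural-network outputs
e_k_hat = 0.00125
e_P_hat = np.full((14, 14, 26), 0.02)

# Predicted high-fidelity data: LF solution plus predicted error (Eq. 5)
k_H = k_L + e_k_hat
P_H = P_L + e_P_hat

print(round(k_H, 5))  # 1.00375
print(P_H.shape)      # (14, 14, 26)
```

The addition is elementwise for the nodal power, so the correction preserves the (14, 14, 26) quarter-core shape of the LF solution.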

The machine learning architecture for predicting reactor parameters is constructed using the TensorFlow Python library. Model optimization is performed through Bayesian optimization, a technique that models the objective function, here minimizing the validation loss, with a Gaussian process (GP); this surrogate model is then used to optimize the function efficiently23. Hyperparameter tuning was conducted over 500 trials to determine the optimal configuration, including the number of layers and nodes, dropout rates, and learning rates.

The activation function employed for all layers is the Rectified Linear Unit (ReLU), chosen for its effectiveness in introducing non-linearity without significant computational cost. The output layer utilizes a linear activation function to directly predict the target data.

Regularization is implemented through dropout layers to prevent overfitting and improve model generalizability. Additionally, early stopping with a patience of 96 epochs, based on the validation loss, halts training if no improvement is observed. A learning rate schedule is also applied, reducing the learning rate by a factor of 0.1 every 100 epochs from its initial rate. Training runs for a maximum of 512 epochs with a batch size of 64, allowing sufficient iterations to optimize the model while managing computational resources.
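The stepwise schedule described, a factor-of-0.1 reduction every 100 epochs, can be written as a small function. The initial rate of 1e-3 is an assumed placeholder, since the text does not state the starting value.

```python
import math

def lr_schedule(epoch: int, initial_lr: float = 1e-3) -> float:
    """Reduce the learning rate by a factor of 0.1 every 100 epochs.

    The initial rate of 1e-3 is an assumption; the source does not state it.
    """
    return initial_lr * (0.1 ** (epoch // 100))

# Within the 512-epoch budget, the rate steps down five times
assert lr_schedule(0) == 1e-3
assert lr_schedule(99) == 1e-3            # unchanged within the first 100 epochs
assert math.isclose(lr_schedule(100), 1e-4)
assert math.isclose(lr_schedule(511), 1e-8)

# In Keras this would be wired up roughly as (sketch, not executed here):
#   tf.keras.callbacks.LearningRateScheduler(lr_schedule)
#   tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=96)
```

Expressing the schedule as a pure function of the epoch index keeps it framework-agnostic: the same function can be handed to a Keras callback or called manually in a custom training loop.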

It is important to note that the direct ML model mentioned in the results, which outputs \(k_{\mathrm{eff}}\) and nodal power directly, follows a different architecture and is independently optimized with distinct hyperparameters compared to the LF+ML model. This differentiation allows tailored optimization for the specific objectives of each model.
