Quantum deep reinforcement learning for clinical decision support in oncology: application to adaptive radiotherapy | Scientific Reports – Nature.com
Quantum deep reinforcement learning
Quantum deep reinforcement learning (qDRL) is a novel action-value-based decision-making framework derived from QRL [23] and the deep Q-learning [10] framework. Like conventional RL [9,31], our qDRL-based CDSS framework comprises five main elements: the clinical AI agent, the ARTE, the radiation dose decision-making policy, the reward, and the q-value function. Here, the AI agent is a clinical decision-maker that learns to make dose decisions that achieve clinically desirable outcomes within the ARTE. Learning takes place through agent-environment interaction, which proceeds sequentially: the AI decides on a dose and executes it, and in response a patient (part of the ARTE) transitions from one state to the next. Each transition provides the AI with feedback on its decision in the form of an RT outcome and an associated reward value. The goal of RL is for the AI to learn a decision-making policy that maximizes the reward in the long run, defined in terms of a specified q-value function that assigns a value to every state-dose-decision pair, obtained from the accumulation of rewards over time (returns).
Assuming the Markov property (i.e., an environment's response at time \(t + 1\) depends only on the state and dose decision at time \(t\)), the qDRL task can be mathematically described as a 5-tuple \((S, \left| D \right\rangle, TF, P, R)\), where \(S\) is a finite set of patient states, \(\left| D \right\rangle\) is a superimposed quantum state representing the finite set of eigen-dose decisions, \(TF: S \times D \to S^{\prime}\) is the transition function that maps a patient's state \(s_{t}\) and eigen-dose \(\left| d \right\rangle_{t}\) to the next state \(s_{t + 1}\), \(P_{LC|RP2}: S^{\prime} \to \left[ 0,1 \right]\) is the RT outcome estimator that assigns probability values \(p_{LC}\) and \(p_{RP2}\) to the state \(s_{t + 1}\), and \(R: \left[ 0,1 \right] \times \left[ 0,1 \right] \to \mathbb{R}\) is the reward function that assigns a reward \(r_{t + 1}\) to the state-decision pair \(\left( s_{t}, \left| d \right\rangle_{t} \right)\) based on the outcome probability estimates.
An eigen-dose \(\left| d \right\rangle\) is a physically performable decision that is selected via quantum methods from the superimposed quantum state \(\left| D \right\rangle\), which represents all possible eigen-doses at once. In simple words, \(\left| D \right\rangle\) is the collection of all possible dose options and \(\left| d \right\rangle\) is the one option that is selected after a decision is made. Selecting the dose decision \(\left| d \right\rangle\) is carried out in two steps: (1) amplifying the optimal eigen-dose \(\left| d \right\rangle^{*}\) from the superimposed state \(\left| D \right\rangle\) (i.e., \(\left| D \right\rangle^{\prime} = \widehat{Amp}_{\left| d \right\rangle^{*}} \left| D \right\rangle\)) and (2) measuring the amplified state (i.e., \(\left| d \right\rangle = \widehat{Measure}(\left| D^{\prime} \right\rangle)\)).
The optimal eigen-dose \(\left| d \right\rangle^{*}\) is obtained from the deep Q-net, which is the AI's memory. The deep Q-net, \(DQN: S \to \mathbb{R}^{d}\), is a neural network that takes a patient's state as input and outputs a q-value for each eigen-dose (\(\left\{ q_{\left| d \right\rangle} \right\}\)). The optimal dose is then selected following a greedy policy, in which the dose with the maximum q-value is chosen (i.e., \(\left| d \right\rangle^{*} = \underset{\left| d^{\prime} \right\rangle}{\operatorname{argmax}} \left\{ q_{\left| d^{\prime} \right\rangle} \right\}\)). We applied a double Q-learning [32] algorithm in training the deep Q-net. The schematic of a training cycle is presented in Fig. 2, and additional technical details are presented in the Supplementary Material.
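The greedy selection and the double Q-learning target can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the q-values are plain lists standing in for deep Q-net outputs, and the discount factor `gamma`, the function names, and the example numbers are all assumptions made for the sketch.

```python
from typing import Sequence


def greedy_dose_index(q_values: Sequence[float]) -> int:
    """Return the index of the eigen-dose with the largest q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def double_q_target(reward: float,
                    q_online_next: Sequence[float],
                    q_target_next: Sequence[float],
                    gamma: float = 0.9,      # hypothetical discount factor
                    terminal: bool = False) -> float:
    """Double Q-learning target for one transition.

    The online network chooses the next dose (argmax), and the separate
    target network supplies the value of that chosen dose.
    """
    if terminal:
        return reward
    best = greedy_dose_index(q_online_next)
    return reward + gamma * q_target_next[best]


# Illustrative numbers only.
q_values = [0.2, 1.5, 0.7, 1.1]
best_index = greedy_dose_index(q_values)
target = double_q_target(1.0, [0.1, 0.9, 0.3], [0.5, 0.4, 0.8], gamma=0.5)
```

Separating the network that selects the dose from the network that evaluates it is what distinguishes double Q-learning from the standard update, where a single maximum over one set of q-values does both jobs.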
We initially employed Grover's amplification procedure [33,34] for the decision selection mechanism. While Grover's procedure works on a quantum simulator, it fails to work correctly on a quantum computer. The quantum circuit depth of Grover's procedure (for 4 or more qubits) is much greater than the coherence length of the current quantum processor [35]. Whenever the quantum circuit length exceeds the coherence length, the quantum state becomes significantly affected by system noise and loses vital information. Therefore, we designed a quantum controller circuit that is shorter than the coherence length and is suitable for the task of decision selection. The merit of our design is its fixed length: because the length is the same for any number of qubits, the design suits higher-qubit systems, up to the limit permitted by the circuit width. Technical details regarding its implementation on a quantum processor are presented in the Supplementary Materials.
An example of a controller circuit is given in Fig. 5. A controller circuit uses \(2n\) qubits, twice the number \(n\) needed to encode a decision, divided into a control set and a main set. Optimal eigen-states obtained from the deep Q-net are created in the control by selecting the appropriate pre-control gates. The control is then entangled with the main via controlled-NOT (CNOT) gates, each connecting one control qubit to one target qubit in the main. A CNOT gate flips its target qubit only when its control qubit is in the \(\left| 1 \right\rangle\) state and performs no operation otherwise. Because all the main qubits are prepared in the \(\left| 0 \right\rangle\) state, we introduced reverse gates (\(n\) X-gates in parallel) to flip them to \(\left| 1 \right\rangle\) before the CNOT gates; an X-gate flips \(\left| 0 \right\rangle\) to \(\left| 1 \right\rangle\) and vice versa. The CNOT gates then flip every main qubit whose control is in the \(\left| 1 \right\rangle\) state from \(\left| 1 \right\rangle\) to \(\left| 0 \right\rangle\), creating a state that is element-wise opposite to the marked state. Finally, another set of reverse gates is applied to the main before making a measurement, so that the measured state matches the marked state.
Quantum controller circuit for a 5-qubit (32-state) system. (a) Quantum controller circuit for the selection of the state \(\left| 10101 \right\rangle\). The probability distribution corresponding to (b) a failed Grover's amplification procedure for one iteration, run on the 5-qubit IBMQ Santiago quantum processor, and (c) a successful quantum controller selection, run on the 15-qubit IBMQ Melbourne quantum processor.
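The gate sequence of the controller circuit can be traced by hand on computational basis states, where X and CNOT gates act like classical bit operations. The sketch below does exactly that for an idealized, noiseless circuit; it is not a quantum simulation and not the authors' code, and the function name `controller_select` is hypothetical.

```python
def controller_select(marked: str) -> str:
    """Trace the noiseless controller circuit for a marked bit string.

    On basis states an X gate is a bit flip and a CNOT is an XOR of the
    target with its control, so plain integers are enough here.
    """
    n = len(marked)
    # Pre-control gates: prepare the marked state in the control qubits.
    control = [int(bit) for bit in marked]
    # Main qubits start in |0>.
    main = [0] * n
    # First set of reverse gates: X on every main qubit, |0> -> |1>.
    main = [bit ^ 1 for bit in main]
    # CNOT layer: flip a main qubit only when its control qubit is 1.
    main = [m ^ c for m, c in zip(main, control)]
    # Second set of reverse gates before measurement.
    main = [bit ^ 1 for bit in main]
    return "".join(str(bit) for bit in main)


selected = controller_select("10101")
```

Each of the four steps acts on all qubits in parallel, so the number of layers does not grow with the number of qubits, which is consistent with the fixed circuit length described above.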
Another advantage of the controller circuit is its controllable uncertainty level. The controller circuit has additional degrees of freedom that can set the level of uncertainty that might be needed to model a highly uncertain clinical situation. By replacing the CNOT gate with a more general \(CU3\left( \theta, \phi, \lambda \right)\) gate, we can control the level of additional stochasticity with the rotation angles \(\theta\), \(\phi\), and \(\lambda\), which correspond to angles on the Bloch sphere. The angles can either be fixed or, for additional control, changed with the training episode.
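To give a feel for how \(\theta\) sets the stochasticity, the sketch below computes the probability of measuring the marked state when every CNOT is replaced by a CU3 gate. It rests on several assumptions that are not stated in the text: the common U3 parametrization in which \(U3(\theta,\phi,\lambda)\left| 1 \right\rangle\) has a \(\left| 0 \right\rangle\) amplitude of magnitude \(\sin(\theta/2)\), a noiseless circuit, control qubits in basis states, and the same \(\theta\) on every gate. Under those assumptions \(\phi\) and \(\lambda\) only change phases and drop out of the measurement probabilities.

```python
import math


def selection_probability(marked: str, theta: float) -> float:
    """Probability that the idealized circuit returns the marked state.

    A main qubit whose control is 1 must be rotated from |1> to |0>,
    which happens with probability sin^2(theta / 2); main qubits whose
    control is 0 are untouched and always end up correct.
    """
    ones = marked.count("1")
    flip_probability = math.sin(theta / 2.0) ** 2
    return flip_probability ** ones


p_cnot_like = selection_probability("10101", math.pi)      # theta = pi
p_half = selection_probability("10101", math.pi / 2.0)     # theta = pi/2
```

With \(\theta = \pi\) the gate flips the target with certainty, as a CNOT does, while smaller angles spread probability onto neighbouring bit strings.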
The patient's state in the ARTE is defined by 5 biological features: a cytokine (IP10), a PET imaging feature (GLSZM-ZSV), radiation doses (tumor gEUD and lung gEUD), and genetics (cxcr1-Rs2234671). Their descriptions are presented in Table 2. These 5 variables were selected from a multi-objective Bayesian Network study [13], which considered over 297 biological features and found the best features for predicting the joint LC and RP2 RT outcomes.
The training data analyzed in this study are obtained from the University of Michigan study UMCC 2007.123 (NCI clinical trial NCT01190527) and the validation data analyzed in this study are obtained from the RTOG-0617 study (NCI clinical trial NCT00533949). Both trials were conducted in accordance with relevant guidelines and regulations and informed consent was obtained from all subjects and/or legal guardians. Details on training and validation datasets, and necessary model imputation carried out to accommodate the differences in the datasets are presented in the Supplementary Materials.
Deep Neural Networks (DNN) were applied as transition functions for IP10 and GLSZM-ZSV features. They were trained with a longitudinal (time-series) dataset, with the pre-irradiation patient state and corresponding radiation dose as input features and post-irradiation state as output. For lung and tumor gEUD, we utilized prior knowledge and applied a monotonic relationship for the transition function since we know that gEUD should increase with increasing radiation dose. We assumed that the change in gEUD is proportional to the dose fractionation and tissue radiosensitivity,
$$\frac{g\left( t_{n} \right) - g\left( t_{n - 1} \right)}{t_{n} - t_{n - 1}} \propto d_{n} \left( 1 + \frac{d_{n}}{\frac{\alpha}{\beta}} \right). \tag{1}$$
Here \(g\left( t_{n} \right)\) is the gEUD at time point \(t_{n}\), \(d_{n}\) is the radiation dose fractionation given during the \(n\)th time period, and the \(\alpha/\beta\) ratio is the radiosensitivity parameter, which differs between tissue types. Note that we first applied constrained training [42] to maintain monotonicity with the DNN model; however, the gEUD trend over time was flatter than anticipated, so we opted for a process-driven approach in the final implementation. The technical details on the NNs and their training are presented in the Supplementary Material.
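Equation (1) translates into a one-line update rule once a proportionality constant is chosen. The sketch below is only an illustration: the constant `k`, the default time step, and the numeric values (including the \(\alpha/\beta\) of 10) are hypothetical and are not taken from the study.

```python
def next_geud(g_prev: float,
              dose_per_fraction: float,
              alpha_beta: float,
              dt: float = 1.0,   # length of the time period (assumed unit)
              k: float = 1.0) -> float:   # hypothetical proportionality constant
    """Advance gEUD by one time period following Eq. (1).

    The change in gEUD per unit time is taken to be
    k * d_n * (1 + d_n / (alpha/beta)).
    """
    rate = k * dose_per_fraction * (1.0 + dose_per_fraction / alpha_beta)
    return g_prev + rate * dt


# Illustrative values only.
g_next = next_geud(g_prev=20.0, dose_per_fraction=2.0, alpha_beta=10.0)
g_next_higher_dose = next_geud(g_prev=20.0, dose_per_fraction=3.0, alpha_beta=10.0)
```

For positive `k`, dose, and \(\alpha/\beta\), the update is strictly increasing in the dose, which is the monotonic behaviour the process-driven approach is meant to guarantee.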
DNN classifiers were applied as the RT outcome estimators for the LC and RP2 treatment outcomes. They were trained with post-irradiation patient states as input and binary LC and RP2 outcomes as labels.
The RT outcome estimator must also satisfy a monotone condition between increasing radiation dose and increasing probability of local control, as well as increasing probability of radiation-induced pneumonitis. To maintain this monotonic relationship, we used a generic logistic function,
$$p_{LC|RP2} = \frac{1}{1 + \exp \left( \frac{g\left( t_{6} \right) - \mu}{T} \right)}, \tag{2}$$
where \(g\left( t_{6} \right)\) is the gEUD at week 6, and \(\mu\) and \(T\) are two patient-specific parameters that are learned from training the DNN. Here, \(\mu\) and \(T\) are the outputs of two neural networks that are fed into the logistic function and tuned one after the other, leaving the other fixed. The training details are presented in the Supplementary Materials.
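The logistic form of Eq. (2) can be evaluated directly once \(\mu\) and \(T\) are known. In the study these come from neural networks; in the sketch below they are plain numbers chosen for illustration. One observation that follows from the formula as written, rather than from the text: the probability rises with gEUD only when \(T\) is negative, so the example uses a negative \(T\).

```python
import math


def outcome_probability(g6: float, mu: float, T: float) -> float:
    """Evaluate Eq. (2): p = 1 / (1 + exp((g(t_6) - mu) / T)).

    g6 is the gEUD at week 6; mu and T stand in for the patient-specific
    outputs of the two networks (here just illustrative constants).
    """
    return 1.0 / (1.0 + math.exp((g6 - mu) / T))


# Illustrative parameters only.
p_mid = outcome_probability(g6=60.0, mu=60.0, T=-5.0)
p_low = outcome_probability(g6=50.0, mu=60.0, T=-5.0)
p_high = outcome_probability(g6=70.0, mu=60.0, T=-5.0)
```

Here \(\mu\) acts as the gEUD at which the probability equals 0.5, and the magnitude of \(T\) sets how steeply the probability changes around that point.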
The task of the agent is to determine the optimal dose that maximizes \(p_{LC}\) while minimizing \(p_{RP2}\). Accordingly, we built a reward function on the base function \(P^{+} = P_{LC} \left( 1 - P_{RP2} \right)\), as shown in Fig. 6. The algebraic form is as follows,
$$R = \left\{ \begin{array}{ll} P^{+} + 10 & \text{if } 70\% < p_{LC} < 100\% \;\text{and}\; 0\% < p_{RP2} < 17.2\% \\ P^{+} + 5 & \text{if } 50\% < p_{LC} < 70\% \;\text{and}\; 17.2\% < p_{RP2} < 50\% \\ P^{+} - 1 & \text{if } 0\% < p_{LC} < 50\% \;\text{and}\; 50\% < p_{RP2} < 100\% \end{array} \right. \tag{3}$$
Reward function for reinforcement learning. Contour plot of the reward function as a function of the probability of local control (\(P_{LC}\)) and of radiation-induced pneumonitis of grade 2 or higher (\(P_{RP2}\)). The area enclosed by the blue line corresponds to the clinically desirable outcome, i.e., \(P_{LC} > 70\%\) and \(P_{RP2} < 17.2\%\). Similarly, the area enclosed by the green lines corresponds to the computationally desirable outcome, i.e., \(P_{LC} > 50\%\) and \(P_{RP2} < 50\%\). Along with \(P_{LC} \times (1 - P_{RP2})\), the AI agent receives a +10 reward for achieving a clinically desirable outcome, +5 for achieving a computationally desirable outcome, and -1 when unable to achieve a desirable outcome.
Here the AI agent receives an additional 10 points for achieving a clinically desirable outcome (i.e., \(p_{LC} > 70\%\) and \(p_{RP2} < 17.2\%\)), 5 points for achieving a computationally desirable outcome (i.e., \(p_{LC} > 50\%\) and \(p_{RP2} < 50\%\)), and -1 point for failing to achieve a desirable outcome altogether. The negative point motivates the AI agent to search for the optimal dose as soon as possible.
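The reward can be written as a short function of the two outcome probabilities. The sketch below follows the prose description, with the clinically desirable region checked first and the computationally desirable region second; treating the regions as nested in this way, and the handling of values exactly on a threshold, are assumptions of the sketch rather than something the text specifies.

```python
def reward(p_lc: float, p_rp2: float) -> float:
    """Reward built on the base function P+ = p_lc * (1 - p_rp2).

    Probabilities are given as fractions in [0, 1].
    """
    base = p_lc * (1.0 - p_rp2)
    if p_lc > 0.70 and p_rp2 < 0.172:
        return base + 10.0   # clinically desirable outcome
    if p_lc > 0.50 and p_rp2 < 0.50:
        return base + 5.0    # computationally desirable outcome
    return base - 1.0        # no desirable outcome reached


# Illustrative probabilities only.
r_clinical = reward(0.80, 0.10)
r_computational = reward(0.60, 0.30)
r_undesirable = reward(0.40, 0.60)
```

Because the bonus dominates the base term, which never exceeds 1, the three regions are always ranked in the same order, and the base term only breaks ties within a region.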
To compensate for the low number of data points, we employed WGAN-GP [43], which learns the underlying data distribution and generates more data points. We generated 4000 additional data points for training the qDRL models. A larger training dataset helps the reinforcement learning algorithm represent the state space accurately. The training details are presented in the Supplementary Material.