Quantum deep reinforcement learning for clinical decision support in oncology: application to adaptive radiotherapy | Scientific Reports – Nature.com
Quantum deep reinforcement learning
Quantum deep reinforcement learning (qDRL) is a novel action-value-based decision-making framework derived from the QRL23 and deep Q-learning10 frameworks. Like conventional RL9,31, our qDRL-based CDSS framework comprises five main elements: a clinical AI agent, the ARTE, a radiation dose decision-making policy, a reward, and a q-value function. Here, the AI agent is a clinical decision-maker that learns to make dose decisions that achieve clinically desirable outcomes within the ARTE. Learning takes place through agent-environment interaction, which proceeds in sequence: the AI decides on a dose and executes it, and in response a patient (part of the ARTE) transitions from one state to the next. Each transition gives the AI feedback on its decision in the form of an RT outcome and an associated reward value. The goal of RL is for the AI to learn a decision-making policy that maximizes the reward in the long run, as measured by a q-value function that assigns a value to every state-dose-decision pair, obtained from the accumulation of rewards over time (returns).
Assuming the Markov property (i.e., the environment's response at time \(t + 1\) depends only on the state and dose decision at time \(t\)), the qDRL task can be described mathematically as a 5-tuple \((S, \left| D \right\rangle, TF, P, R)\), where \(S\) is a finite set of patient states, \(\left| D \right\rangle\) is a superimposed quantum state representing the finite set of eigen-dose decisions, \(TF: S \times D \to S^{\prime}\) is the transition function that maps the patient's state \(s_{t}\) and eigen-dose \(\left| d \right\rangle_{t}\) to the next state \(s_{t + 1}\), \(P_{LC|RP2}: S^{\prime} \to \left[ 0,1 \right]\) is the RT outcome estimator that assigns probability values \(p_{LC}\) and \(p_{RP2}\) to the state \(s_{t + 1}\), and \(R: \left[ 0,1 \right] \times \left[ 0,1 \right] \to \mathbb{R}\) is the reward function that assigns a reward \(r_{t + 1}\) to the state-decision pair \(\left( s_{t}, \left| d \right\rangle_{t} \right)\) based on the outcome probability estimates.
An eigen-dose \(\left| d \right\rangle\) is a physically performable decision that is selected via quantum methods from the superimposed quantum state \(\left| D \right\rangle\), which represents all possible eigen-doses simultaneously. In simple terms, \(\left| D \right\rangle\) is the collection of all possible dose options and \(\left| d \right\rangle\) is the one option selected once a decision is made. Selecting the dose decision \(\left| d \right\rangle\) is carried out in two steps: (1) amplifying the optimal eigen-dose \(\left| d \right\rangle^{*}\) from the superimposed state \(\left| D \right\rangle\) (i.e., \(\left| D \right\rangle^{\prime} = \widehat{Amp}_{\left| d \right\rangle^{*}} \left| D \right\rangle\)) and (2) measuring the amplified state (i.e., \(\left| d \right\rangle = \widehat{Measure}\left( \left| D^{\prime} \right\rangle \right)\)).
The optimal eigen-dose \(\left| d \right\rangle^{*}\) is obtained from the deep Q-net, which serves as the AI's memory. The deep Q-net, \(DQN: S \to \mathbb{R}^{d}\), is a neural network that takes the patient's state as input and outputs a q-value for each eigen-dose (\(\left\{ q_{\left| d \right\rangle} \right\}\)). The optimal dose is then selected following a greedy policy, in which the dose with the maximum q-value is chosen (i.e., \(\left| d \right\rangle^{*} = \underset{\left| d^{\prime} \right\rangle}{\operatorname{argmax}}\; q_{\left| d^{\prime} \right\rangle}\)). We applied a double Q-learning32 algorithm to train the deep Q-net. The schematic of a training cycle is presented in Fig. 2, and additional technical details are presented in the Supplementary Material.
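The greedy selection and the double Q-learning update can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common deep double Q-learning form of the target (an online network picks the next dose, a separate target network evaluates it), and the function names, the discount factor `gamma`, and the q-values are all hypothetical.

```python
from typing import Sequence


def greedy_dose_index(q_values: Sequence[float]) -> int:
    """Return the index of the eigen-dose with the largest q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def double_q_target(reward: float,
                    q_online_next: Sequence[float],
                    q_target_next: Sequence[float],
                    gamma: float = 0.5,
                    terminal: bool = False) -> float:
    """Double Q-learning target (assumed form): the online net selects
    the next dose, the target net supplies its value."""
    if terminal:
        return reward
    best = greedy_dose_index(q_online_next)
    return reward + gamma * q_target_next[best]


# Hypothetical q-values for four eigen-doses in the next state.
q_online = [0.2, 1.5, 0.7, 1.1]
q_target = [0.3, 1.0, 0.9, 1.4]

print(greedy_dose_index(q_online))               # 1
print(double_q_target(1.0, q_online, q_target))  # 1.5
```

Decoupling selection from evaluation is what distinguishes this target from plain Q-learning, where the same network would both pick and score the next dose.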
We initially employed Grover's amplification procedure33,34 for the decision selection mechanism. While Grover's procedure works on a quantum simulator, it fails to work correctly on a quantum computer. The quantum circuit depth of Grover's procedure (for 4 or more qubits) is much greater than the coherence length of the current quantum processor35. Whenever the quantum circuit length exceeds the coherence length, the quantum state becomes significantly affected by system noise and loses vital information. Therefore, we designed a quantum controller circuit that is shorter than the coherence length and is suitable for the task of decision selection. The merit of our design is its fixed length: because the length is the same for any number of qubits, it is suitable for higher-qubit systems, up to the limit permitted by the circuit width. Technical details regarding its implementation on a quantum processor are presented in the Supplementary Materials.
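To make the amplify-then-measure idea concrete, the sketch below simulates one Grover iteration on an ideal, noise-free state vector for a 5-qubit register. It is a textbook illustration rather than the authors' code: the helper name `grover_iteration` is hypothetical, amplitudes are kept real for simplicity, and no hardware noise or circuit depth is modeled.

```python
import math


def grover_iteration(amplitudes, marked):
    """One Grover iteration on real amplitudes: phase-flip the marked
    state, then reflect every amplitude about the mean."""
    flipped = [-a if i == marked else a for i, a in enumerate(amplitudes)]
    mean = sum(flipped) / len(flipped)
    return [2 * mean - a for a in flipped]


n_qubits = 5
size = 2 ** n_qubits                    # 32 basis states
marked = 0b10101                        # index of the eigen-dose to amplify
state = [1 / math.sqrt(size)] * size    # uniform superposition over all doses
state = grover_iteration(state, marked)

prob_marked = state[marked] ** 2        # chance of measuring the marked state
print(round(prob_marked, 4))            # 0.2583
```

On an ideal simulator a single iteration lifts the marked state from 1/32 to roughly 26%, and repeated iterations push it close to 1; each iteration adds circuit depth, which is the cost the text identifies as prohibitive on current hardware.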
An example of a controller circuit is given in Fig. 5. A controller circuit uses twice the number of qubits (n), divided into a control register and a main register. The optimal eigen-state obtained from the deep Q-net is created in the control register by selecting the appropriate pre-control gates. The control register is then entangled with the main register via controlled NOT (CNOT) gates, each connecting a control qubit in the control register to a target qubit in the main register. A CNOT gate flips its target qubit only when its control qubit is in the \(\left| 1 \right\rangle\) state and performs no operation otherwise; in this circuit the flip takes the target from \(\left| 1 \right\rangle\) to \(\left| 0 \right\rangle\). Because all the main qubits are prepared in the \(\left| 0 \right\rangle\) state, we introduced reverse gates (n X-gates in parallel) to flip them to \(\left| 1 \right\rangle\) before the CNOT gates act. X-gates flip \(\left| 0 \right\rangle\) to \(\left| 1 \right\rangle\), and vice versa. The CNOT gates then flip all the main qubits whose controls are in the \(\left| 1 \right\rangle\) state, creating a state that is element-wise opposite to the marked state. Finally, another set of reverse gates is applied to the main register before making a measurement.
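Because every gate in this circuit maps basis states to basis states when the control register holds a single marked bit string, a noise-free run can be traced with ordinary bit operations. The sketch below does that; the function name `controller_select` is hypothetical, qubit-ordering conventions are ignored, and real hardware would return a distribution around this result rather than a single string.

```python
def controller_select(marked: str) -> str:
    """Classically trace the controller circuit for a marked bit string."""
    n = len(marked)
    control = [int(b) for b in marked]             # pre-control gates prepare the marked state
    main = [0] * n                                 # main register starts in |0...0>
    main = [b ^ 1 for b in main]                   # first layer of reverse (X) gates
    main = [m ^ c for m, c in zip(main, control)]  # CNOT: flip where control is 1
    main = [b ^ 1 for b in main]                   # second layer of reverse (X) gates
    return "".join(str(b) for b in main)


print(controller_select("10101"))  # 10101
```

The trace applies three layers on the main register (X, CNOT, X) regardless of how many qubits are involved, which matches the fixed-length property claimed for the design.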
Fig. 5. Quantum controller circuit for a 5-qubit (32-state) system. (a) Quantum controller circuit for the selection of the state \(\left| 10101 \right\rangle\). The probability distribution corresponding to (b) a failed Grover's amplification procedure for one iteration run on the 5-qubit IBMQ Santiago quantum processor and (c) a successful quantum controller selection run on the 15-qubit IBMQ Melbourne quantum processor.
Another advantage of the controller circuit is a controllable uncertainty level. The controller circuit has additional degrees of freedom that can set the level of uncertainty needed to model a highly uncertain clinical situation. By replacing the CNOT gate with a more general \(CU3\left( \theta, \phi, \lambda \right)\) gate, we can control the level of additional stochasticity with the rotation angles \(\theta\), \(\phi\), and \(\lambda\), which correspond to angles on the Bloch sphere. The angles can either be fixed or, for additional control, changed with the training episode.
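The effect of the rotation angles can be illustrated with the single-qubit U3 matrix. The sketch below assumes the widely used parameterization in which \(U3(\pi, 0, \pi)\) equals the X gate, so a \(CU3\) with \(\theta = \pi\) behaves like a CNOT on basis states; the helper names are hypothetical and the angle schedule used by the authors is not specified here.

```python
import cmath
import math


def u3(theta, phi, lam):
    """2x2 matrix of the general single-qubit rotation U3 (assumed convention)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -cmath.exp(1j * lam) * s],
            [cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c]]


def flip_probability(theta, phi=0.0, lam=0.0, target_bit=1):
    """Probability that a target prepared in |target_bit> is measured
    flipped when its control qubit is in |1>."""
    u = u3(theta, phi, lam)
    amplitude = u[1 - target_bit][target_bit]  # row = output bit, column = input bit
    return abs(amplitude) ** 2


for theta in (0.0, math.pi / 2, math.pi):
    print(round(flip_probability(theta), 3))
```

The flip probability is \(\sin^{2}(\theta/2)\) and does not depend on \(\phi\) or \(\lambda\), so in this simplified picture \(\theta\) alone sets how often the selected dose bit deviates from the deterministic CNOT result.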
The patient's state in the ARTE is defined by five biological features: a cytokine (IP10), a PET imaging feature (GLSZM-ZSV), two radiation doses (tumor gEUD and lung gEUD), and a genetic marker (cxcr1-Rs2234671). Their descriptions are presented in Table 2. These five variables were selected from a multi-objective Bayesian Network study13, which considered over 297 biological features and identified the best features for predicting the joint LC and RP2 RT outcomes.
The training data analyzed in this study were obtained from the University of Michigan study UMCC 2007.123 (NCI clinical trial NCT01190527), and the validation data were obtained from the RTOG-0617 study (NCI clinical trial NCT00533949). Both trials were conducted in accordance with relevant guidelines and regulations, and informed consent was obtained from all subjects and/or legal guardians. Details on the training and validation datasets, and on the model imputation needed to accommodate the differences between them, are presented in the Supplementary Materials.
Deep Neural Networks (DNN) were applied as transition functions for IP10 and GLSZM-ZSV features. They were trained with a longitudinal (time-series) dataset, with the pre-irradiation patient state and corresponding radiation dose as input features and post-irradiation state as output. For lung and tumor gEUD, we utilized prior knowledge and applied a monotonic relationship for the transition function since we know that gEUD should increase with increasing radiation dose. We assumed that the change in gEUD is proportional to the dose fractionation and tissue radiosensitivity,
$$\frac{g\left( t_{n} \right) - g\left( t_{n - 1} \right)}{t_{n} - t_{n - 1}} \propto d_{n} \left( 1 + \frac{d_{n}}{\alpha / \beta} \right). \tag{1}$$
Here \(g\left( t_{n} \right)\) is the gEUD at time point \(t_{n}\), \(d_{n}\) is the radiation dose fractionation given during the nth time period, and the \(\alpha / \beta\) ratio is the radiosensitivity parameter, which differs between tissue types. Note that we first applied constrained training42 to maintain monotonicity with a DNN model. However, the resulting gEUD trend over time was flatter than anticipated, so we opted for a process-driven approach in the final implementation. The technical details on the NNs and their training are presented in the Supplementary Material.
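A discrete update consistent with Eq. (1) can be sketched as below. The proportionality constant `k`, the time step `dt`, and the example numbers are assumptions made for illustration; the source gives only the proportionality, not the fitted constant.

```python
def geud_step(g_prev, dose_per_fraction, alpha_beta, dt=1.0, k=1.0):
    """Advance gEUD by one time period following Eq. (1).

    k is a hypothetical proportionality constant; dt is t_n - t_{n-1}.
    """
    if dose_per_fraction < 0 or alpha_beta <= 0:
        raise ValueError("dose must be non-negative and alpha/beta positive")
    increment = k * dt * dose_per_fraction * (1.0 + dose_per_fraction / alpha_beta)
    return g_prev + increment


# Illustrative values only: previous gEUD of 20, dose of 2 per fraction, alpha/beta of 10.
print(geud_step(20.0, 2.0, 10.0))
```

Since the increment is non-negative and grows with `dose_per_fraction`, gEUD never decreases and rises faster at higher doses, which is the monotonic behaviour the text requires.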
DNN classifiers were applied as the RT outcome estimators for the LC and RP2 treatment outcomes. They were trained with post-irradiation patient states as input and binary LC and RP2 outcomes as labels.
The RT outcome estimator must also satisfy a monotonicity condition: increasing the radiation dose must increase both the probability of local control and the probability of radiation-induced pneumonitis. To maintain this monotonic relationship, we used a generic logistic function,
$$p_{LC|RP2} = \frac{1}{1 + \exp \left( \frac{g\left( t_{6} \right) - \mu}{T} \right)}, \tag{2}$$
where \(g\left( t_{6} \right)\) is the gEUD at week 6, and \(\mu\) and \(T\) are two patient-specific parameters that are learned by training the DNN. Here, \(\mu\) and \(T\) are the outputs of two neural networks that are fed into the logistic function and tuned alternately, each with the other held fixed. The training details are presented in the Supplementary Materials.
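Eq. (2) can be evaluated directly once \(\mu\) and \(T\) are available. In the sketch below they are plain numbers standing in for the two network outputs, and the values are hypothetical. With the equation written in this form, the probability rises with gEUD only when \(T\) is negative; whether the sign is absorbed into the learned \(T\) is not stated here, so the example simply uses a negative value.

```python
import math


def outcome_probability(geud_week6, mu, temperature):
    """Logistic outcome estimate of Eq. (2); mu and temperature stand in
    for the patient-specific network outputs."""
    if temperature == 0:
        raise ValueError("temperature must be non-zero")
    return 1.0 / (1.0 + math.exp((geud_week6 - mu) / temperature))


# Hypothetical parameters: midpoint at gEUD = 60, negative T so the curve rises.
for g in (50.0, 60.0, 70.0):
    print(round(outcome_probability(g, mu=60.0, temperature=-5.0), 3))
```

At `geud_week6 == mu` the estimate is exactly 0.5, and `abs(temperature)` sets how sharply the probability changes around that midpoint.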
The task of the agent is to determine the optimal dose that maximizes \(p_{LC}\) while minimizing \(p_{RP2}\). Accordingly, we built a reward function on the base function \(P^{+} = P_{LC} \left( 1 - P_{RP2} \right)\), as shown in Fig. 6. The algebraic form is as follows,
$$R = \begin{cases}
P^{+} + 10 & \text{if } 70\% < p_{LC} < 100\% \text{ and } 0\% < p_{RP2} < 17.2\% \\
P^{+} + 5 & \text{if } 50\% < p_{LC} < 70\% \text{ and } 17.2\% < p_{RP2} < 50\% \\
P^{+} - 1 & \text{if } 0\% < p_{LC} < 50\% \text{ and } 50\% < p_{RP2} < 100\%
\end{cases} \tag{3}$$
Fig. 6. Reward function for reinforcement learning. Contour plot of the reward function as a function of the probability of local control (\(P_{LC}\)) and of radiation-induced pneumonitis of grade 2 or higher (\(P_{RP2}\)). The area enclosed by the blue line corresponds to the clinically desirable outcome, i.e., \(P_{LC} > 70\%\) and \(P_{RP2} < 17.2\%\). Similarly, the area enclosed by the green lines corresponds to the computationally desirable outcome, i.e., \(P_{LC} > 50\%\) and \(P_{RP2} < 50\%\). Along with \(P_{LC} \times (1 - P_{RP2})\), the AI agent receives a +10 reward for achieving the clinically desirable outcome, +5 for achieving the computationally desirable outcome, and -1 when unable to achieve a desirable outcome.
Here the AI agent receives an additional 10 points for achieving the clinically desirable outcome (i.e., \(p_{LC} > 70\%\) and \(p_{RP2} < 17.2\%\)), 5 points for achieving the computationally desirable outcome (i.e., \(p_{LC} > 50\%\) and \(p_{RP2} < 50\%\)), and -1 point for failing to achieve a desirable outcome altogether. The negative point motivates the AI agent to search for the optimal dose as soon as possible.
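The reward can be sketched as a small function. This version follows the prose description (clinically desirable, then computationally desirable, otherwise penalized) so that every pair of probabilities receives a reward; treating the regions as nested in this way is an assumption, and the function name is hypothetical.

```python
def reward(p_lc: float, p_rp2: float) -> float:
    """Reward built on the base function P+ = p_lc * (1 - p_rp2)."""
    base = p_lc * (1.0 - p_rp2)
    if p_lc > 0.70 and p_rp2 < 0.172:   # clinically desirable outcome
        return base + 10.0
    if p_lc > 0.50 and p_rp2 < 0.50:    # computationally desirable outcome
        return base + 5.0
    return base - 1.0                   # no desirable outcome reached


# Illustrative probability pairs, one per region.
for p_lc, p_rp2 in [(0.8, 0.1), (0.6, 0.3), (0.4, 0.6)]:
    print(round(reward(p_lc, p_rp2), 2))
```

The large gaps between the bonuses make the region boundaries dominate the reward, while the base term \(P^{+}\) still ranks outcomes within a region.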
To compensate for the low number of data points, we employed WGAN-GP43, which learns the underlying data distribution and generates additional data points. We generated 4000 additional data points for training the qDRL models. A larger training dataset helps the reinforcement learning algorithm represent the state space more accurately. The training details are presented in the Supplementary Material.