US 12,068,065 B2
Methods and apparatus for controlling treatment delivery using reinforcement learning
Esa Heikki Kuusela, Espoo (FI); Shahab Basiri, Espoo (FI); Elena Czeizler, Helsinki (FI); Mikko Oskari Hakala, Rajamaki (FI); and Lauri Jaakonpoika Halko, Helsinki (FI)
Assigned to SIEMENS HEALTHINEERS INTERNATIONAL AG, Steinhausen (CH)
Filed by Siemens Healthineers International AG, Steinhausen (CH)
Filed on Apr. 27, 2023, as Appl. No. 18/140,114.
Application 18/140,114 is a continuation of application No. 16/832,115, filed on Mar. 27, 2020, granted, now Pat. No. 11,651,848.
Prior Publication US 2024/0079113 A1, Mar. 7, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G16H 20/40 (2018.01); A61N 5/10 (2006.01); G16H 30/40 (2018.01); G16H 50/20 (2018.01)
CPC G16H 20/40 (2018.01) [A61N 5/1038 (2013.01); A61N 5/1039 (2013.01); A61N 5/1067 (2013.01); A61N 5/1071 (2013.01); G16H 30/40 (2018.01); G16H 50/20 (2018.01); A61N 2005/1072 (2013.01); A61N 2005/1074 (2013.01)]
19 Claims
 
1. A method for training an artificial intelligence (AI) agent to control delivery of radiation treatment to a subject, comprising:
using a first data set as input data for a machine-learning process implemented in a computer processing device, the first data set including a set of observations regarding a state of a treatment device configured to deliver the radiation treatment to the subject and observations regarding geometry of the subject, the machine-learning process being configured to determine a treatment device state based on the first data set; and
iteratively executing the machine-learning process until a predefined set of objectives are achieved,
wherein the predefined set of objectives includes treatment objectives and constraints,
wherein the iteratively executing the machine-learning process includes:
determining whether there is a preceding time step in a radiation delivery fraction; and
determining whether the subject has been subjected to radiation treatment in a previous radiation delivery fraction,
wherein a current treatment state of the subject is determined:
based at least in part on a preceding treatment state of the subject determined as part of the preceding time step when there is a preceding time step in the radiation delivery fraction;
based at least in part on a preceding treatment state determined at a conclusion of the previous radiation delivery fraction when there is no preceding time step in the radiation delivery fraction and the subject has been subjected to radiation treatment in the previous radiation delivery fraction; and
based at least in part on defining an initial treatment state when there is no preceding time step in the radiation delivery fraction and the subject has not been subjected to radiation treatment in the previous radiation delivery fraction.
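
The three-way determination of the current treatment state recited above can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the claimed implementation: the names TreatmentState, delivered_dose, source, and basis_for_current_state are hypothetical, and the claim does not specify what a treatment state contains or what values an initial state takes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TreatmentState:
    """Hypothetical container; the claim does not define the state's contents."""
    delivered_dose: float  # illustrative field only (assumption)
    source: str            # records which branch supplied the basis


def basis_for_current_state(
    preceding_step_state: Optional[TreatmentState],
    previous_fraction_final_state: Optional[TreatmentState],
) -> TreatmentState:
    """Pick the state on which the current treatment state is based.

    None for preceding_step_state means there is no preceding time step in
    the current fraction; None for previous_fraction_final_state means the
    subject was not treated in a previous fraction.
    """
    # Branch 1: a preceding time step exists in this fraction.
    if preceding_step_state is not None:
        return TreatmentState(preceding_step_state.delivered_dose,
                              "preceding_time_step")
    # Branch 2: first time step of the fraction, but a previous fraction
    # was delivered, so its concluding state is carried over.
    if previous_fraction_final_state is not None:
        return TreatmentState(previous_fraction_final_state.delivered_dose,
                              "previous_fraction")
    # Branch 3: no preceding step and no previous fraction, so an initial
    # state is defined (zero dose is an assumption made for this sketch).
    return TreatmentState(0.0, "initial")
```

In this sketch the preceding time step takes priority over the previous fraction, mirroring the order of the "wherein" conditions; an actual agent would combine the selected basis with new observations of the treatment device and subject geometry, which the sketch leaves out.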