CPC G16H 20/40 (2018.01) [A61N 5/1047 (2013.01); G06F 18/217 (2023.01); G06N 20/00 (2019.01); G16H 10/60 (2018.01)] | 20 Claims |
1. A computer-implemented method comprising:
iteratively training, by a processor, a machine learning model, wherein in at least one iteration, the processor:
executes the machine learning model to ingest patient data to select a predicted radiotherapy treatment attribute from a plurality of treatment attribute options for a category of radiotherapy treatment from a plurality of categories of radiotherapy treatment corresponding to different radiation therapy treatment techniques;
in response to displaying the plurality of treatment attribute options on an electronic device, receives a selection of at least one attribute;
calculates a reward value for the predicted radiotherapy treatment attribute, wherein when the selection matches the predicted radiotherapy treatment attribute, the processor adjusts the reward value upwards;
generates a subsequent predicted radiotherapy treatment attribute corresponding to a subsequent category of radiotherapy treatment, the processor selecting the subsequent category of radiotherapy treatment based on the selection of at least one radiotherapy treatment attribute received from the electronic device; and
calculates a subsequent reward value for the subsequent predicted radiotherapy treatment attribute,
wherein the processor trains a policy, using the reward value or the subsequent reward value, to generate a combination of predicted radiotherapy treatment attributes that generates a cumulative reward value that satisfies a threshold.
|
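The training iteration recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the tabular epsilon-greedy policy, the option lists in `OPTIONS`, the patient labels, and the `simulated_selection` function (a stand-in for the selection received from the electronic device) are all hypothetical, chosen only to make the steps concrete. The sketch predicts an attribute for a first category, receives a selection, adjusts the reward upward on a match, picks the subsequent category from the selection, and trains until the cumulative reward satisfies a threshold.

```python
import random
from collections import defaultdict

# Hypothetical attribute options per category (illustrative values only).
OPTIONS = {
    "technique": ["VMAT", "IMRT", "3D-CRT"],
    "VMAT": ["1 arc", "2 arcs"],
    "IMRT": ["5 fields", "7 fields", "9 fields"],
    "3D-CRT": ["3 fields", "4 fields"],
}

# Hypothetical stand-in for the selection received from the electronic device.
PREFERRED = {
    ("prostate", "technique"): "VMAT",
    ("prostate", "VMAT"): "2 arcs",
    ("breast", "technique"): "3D-CRT",
    ("breast", "3D-CRT"): "4 fields",
}

def simulated_selection(patient, category):
    return PREFERRED[(patient, category)]

class TabularPolicy:
    """Assumed policy: one learned score per (patient, category, option)."""

    def __init__(self, epsilon=0.2, lr=0.5, seed=0):
        self.q = defaultdict(float)
        self.epsilon, self.lr = epsilon, lr
        self.rng = random.Random(seed)

    def predict(self, patient, category, explore=True):
        options = OPTIONS[category]
        if explore and self.rng.random() < self.epsilon:
            return self.rng.choice(options)  # occasional exploration
        return max(options, key=lambda o: self.q[(patient, category, o)])

    def update(self, patient, category, option, reward):
        key = (patient, category, option)
        self.q[key] += self.lr * (reward - self.q[key])

def reward_for(predicted, selected, base=0.0, bonus=1.0):
    # The reward is adjusted upward when the selection matches the prediction.
    return base + bonus if predicted == selected else base

def run_iteration(policy, patient, explore=True):
    # First category: predict, receive the selection, score the prediction.
    predicted = policy.predict(patient, "technique", explore)
    selected = simulated_selection(patient, "technique")
    reward = reward_for(predicted, selected)
    # The subsequent category follows from the received selection.
    next_category = selected
    next_predicted = policy.predict(patient, next_category, explore)
    next_selected = simulated_selection(patient, next_category)
    next_reward = reward_for(next_predicted, next_selected)
    if explore:  # only training iterations update the policy
        policy.update(patient, "technique", predicted, reward)
        policy.update(patient, next_category, next_predicted, next_reward)
    return reward + next_reward  # cumulative reward for the combination

def train(policy, patients, threshold=2.0, max_iterations=500):
    """Train until the greedy cumulative reward meets the threshold for all patients."""
    for i in range(max_iterations):
        for patient in patients:
            run_iteration(policy, patient)
        if all(run_iteration(policy, p, explore=False) >= threshold for p in patients):
            return i + 1
    return None

if __name__ == "__main__":
    policy = TabularPolicy(seed=0)
    print("iterations until threshold met:", train(policy, ["prostate", "breast"]))
```

In this sketch the threshold of 2.0 is met only when both predicted attributes in the combination match the received selections. A real system would replace the lookup table with a learned model over patient data and the simulated selection with actual user input.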