| CPC H04L 63/0272 (2013.01) [G06F 18/2415 (2023.01)] | 17 Claims |

|
1. A method for managing workload placement, the method comprising:
obtaining, by a variant selection agent of a workload placement service and from a front-end device, a request for an inferencing payload generated by an inferencing workload, wherein the request includes sensitive information;
in response to the request:
performing a payload classification on the request to determine a variant selection for the inferencing workload, wherein the variant selection is one selected from the group consisting of a secured variant and a public variant and is based on the inclusion of the sensitive information in the request;
selecting, based on the payload classification, the secured variant as the variant selection;
in response to the selection, transmitting the request to the secured variant, wherein the secured variant executes an instance of the inferencing workload on a secured production environment;
obtaining, from the secured variant, the requested inferencing payload, wherein the requested inferencing payload is generated by the secured variant executing a generative artificial intelligence (AI) model; and
providing, after the obtaining, the requested inferencing payload to the front-end device.
|
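
The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify how the payload classification detects sensitive information or how the variants are invoked, so everything here is an assumption made for illustration: the regular-expression patterns, the names `classify_payload`, `VariantSelectionAgent`, `secured_variant`, and `public_variant`, and the stub functions that stand in for workload instances running a generative AI model.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical stand-ins for "sensitive information". The claim does not
# say how sensitivity is detected; these patterns are illustrative only.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),         # SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # email-like address
]

SECURED = "secured"
PUBLIC = "public"


def classify_payload(request: str) -> str:
    """Payload classification: choose a variant based on whether the
    request appears to contain sensitive information."""
    if any(pattern.search(request) for pattern in SENSITIVE_PATTERNS):
        return SECURED
    return PUBLIC


def secured_variant(request: str) -> str:
    # Stub for an inferencing workload instance in a secured production
    # environment; a real variant would run a generative AI model.
    return f"[secured] payload for a {len(request)}-character request"


def public_variant(request: str) -> str:
    # Stub for an inferencing workload instance in a public environment.
    return f"[public] payload for a {len(request)}-character request"


@dataclass
class VariantSelectionAgent:
    """Hypothetical agent that routes each request to the selected variant."""
    variants: Dict[str, Callable[[str], str]]

    def handle(self, request: str) -> str:
        selection = classify_payload(request)        # classify the request
        payload = self.variants[selection](request)  # transmit to the variant
        return payload                               # provide to the front end


agent = VariantSelectionAgent(
    variants={SECURED: secured_variant, PUBLIC: public_variant}
)
```

Calling `agent.handle("My SSN is 123-45-6789")` routes the request to the secured stub, while a request that matches none of the patterns goes to the public stub. A production classifier would likely use something more robust than pattern matching.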