US 12,236,193 B1
Automated selection of large language models in cloud computing environments
Leonid Kuperman, Toronto (CA); Žilvinas Urbonas, Vilnius (LT); Laurynas Stasys, Vilnius (LT); and Kyrylo Yefimenko, Santo António da Serra (PT)
Assigned to CAST AI Group, Inc., Miami, FL (US)
Filed by CAST AI Group, Inc., North Miami Beach, FL (US)
Filed on Apr. 19, 2024, as Appl. No. 18/641,001.
Claims priority of provisional application 63/565,551, filed on Mar. 15, 2024.
Int. Cl. G10L 15/18 (2013.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01)

CPC G06F 40/284 (2020.01) [G06F 40/30 (2020.01); G10L 15/16 (2013.01); G10L 15/18 (2013.01); G10L 15/1822 (2013.01); G10L 2015/223 (2013.01); G10L 2015/225 (2013.01)]

18 Claims

1. A method for selection of Large Language Models (LLMs), the method comprising:

receiving a request from a service that hosts an application, wherein the request is configured to be processed by an LLM to generate a response;

applying a classification model to the request to determine a class of the request, wherein the classification model is trained to receive data examples and classify the data examples into a plurality of classes;

selecting an LLM from a plurality of candidate LLMs based at least in part on the determined class of the request; and

recommending the selected LLM to the application,

wherein the plurality of candidate LLMs comprise a first LLM hosted on an open source platform having a first user feedback score derived from the determined class of the request, and a second LLM hosted on a non-open source platform having a second user feedback score derived from the determined class of the request,

wherein the selecting the LLM from the plurality of candidate LLMs comprises:

for a difference between the first user feedback score and the second user feedback score being less than a threshold, selecting the first LLM as the LLM, and

for the difference being equal to or greater than the threshold, selecting the second LLM as the LLM.
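
The selection rule recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `CandidateLLM`, `keyword_classifier`, and `select_llm`, the class labels, the scores, and the threshold are all hypothetical. The keyword classifier is only a stand-in for a trained classification model, and the difference is assumed to be the second score minus the first, which the claim does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CandidateLLM:
    """Hypothetical record for one candidate LLM."""
    name: str
    open_source: bool
    # User feedback score per request class (illustrative values only).
    feedback_by_class: Dict[str, float]


def keyword_classifier(request: str) -> str:
    """Toy stand-in for the trained classification model in the claim."""
    text = request.lower()
    if "code" in text or "function" in text:
        return "coding"
    return "general"


def select_llm(
    request: str,
    classify: Callable[[str], str],
    first: CandidateLLM,
    second: CandidateLLM,
    threshold: float,
) -> CandidateLLM:
    """Pick the open source LLM unless the other one leads by the threshold."""
    request_class = classify(request)
    first_score = first.feedback_by_class[request_class]
    second_score = second.feedback_by_class[request_class]
    # Assumption: the difference is signed, second score minus first score.
    difference = second_score - first_score
    if difference < threshold:
        return first   # difference less than the threshold
    return second      # difference equal to or greater than the threshold


open_model = CandidateLLM("open-model", True, {"coding": 0.70, "general": 0.80})
closed_model = CandidateLLM("closed-model", False, {"coding": 0.90, "general": 0.82})

coding_choice = select_llm(
    "Write code to sort a list", keyword_classifier, open_model, closed_model, 0.1
)
general_choice = select_llm(
    "Summarize this email", keyword_classifier, open_model, closed_model, 0.1
)
print(coding_choice.name)
print(general_choice.name)
```

With these illustrative scores, the coding request goes to the non-open source model because its lead (about 0.2) meets the 0.1 threshold, while the general request stays on the open source model because the lead (about 0.02) does not.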