CPC G06F 40/284 (2020.01) [G06F 40/30 (2020.01); G10L 15/16 (2013.01); G10L 15/18 (2013.01); G10L 15/1822 (2013.01); G10L 2015/223 (2013.01); G10L 2015/225 (2013.01)] | 18 Claims |
1. A method for selection of Large Language Models (LLMs), the method comprising:
receiving a request from a service that hosts an application, wherein the request is configured to be processed by an LLM to generate a response;
applying a classification model to the request to determine a class of the request, wherein the classification model is trained to receive data examples and classify the data examples into a plurality of classes;
selecting an LLM from a plurality of candidate LLMs based at least in part on the determined class of the request; and
recommending the selected LLM to the application,
wherein the plurality of candidate LLMs comprise a first LLM hosted on an open source platform having a first user feedback score derived from the determined class of the request, and a second LLM hosted on a non-open source platform having a second user feedback score derived from the determined class of the request,
wherein the selecting the LLM from the plurality of candidate LLMs comprises:
for a difference between the first user feedback score and the second user feedback score being less than a threshold, selecting the first LLM as the LLM, and
for the difference being equal to or greater than the threshold, selecting the second LLM as the LLM.
|
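The selecting step recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent. The names `CandidateLLM`, `select_llm`, and `scores_by_class`, the example model names, and the numeric scores are all hypothetical. The claim does not state how the difference is signed, so the sketch assumes it is the second (non-open source) score minus the first (open source) score. The request class is taken as already determined by the classification model.

```python
from dataclasses import dataclass


@dataclass
class CandidateLLM:
    """Hypothetical record for one candidate LLM."""
    name: str
    open_source: bool
    scores_by_class: dict  # request class -> user feedback score


def select_llm(first: CandidateLLM, second: CandidateLLM,
               request_class: str, threshold: float) -> CandidateLLM:
    """Pick between the open source (first) and non-open source (second) LLM.

    Assumption: difference = second score - first score. The claim only
    says "a difference between" the two scores.
    """
    first_score = first.scores_by_class[request_class]
    second_score = second.scores_by_class[request_class]
    difference = second_score - first_score
    # Below the threshold: select the first LLM.
    # Equal to or greater than the threshold: select the second LLM.
    return first if difference < threshold else second


# Illustrative values only; no scores appear in the source.
open_llm = CandidateLLM("open-model", True, {"summarize": 0.5, "code": 0.5})
closed_llm = CandidateLLM("closed-model", False, {"summarize": 0.625, "code": 0.75})

chosen_summarize = select_llm(open_llm, closed_llm, "summarize", threshold=0.25)
chosen_code = select_llm(open_llm, closed_llm, "code", threshold=0.25)
```

Under these assumed values, the "summarize" class yields a difference of 0.125, below the threshold, so the first LLM is selected. The "code" class yields a difference equal to the threshold, so the second LLM is selected.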