CPC G06F 16/3344 (2019.01) [G06F 16/355 (2019.01)] | 15 Claims |
1. A method of instance-wise adaptive knowledge injection in a large scale pre-trained language model (PTLM), the method being executed by at least one processor, the method comprising:
determining whether external knowledge is needed for a respective query in a plurality of queries of a first dataset based on a thrust score of the respective query using internal knowledge of a target large scale pre-trained language model, wherein determining the thrust score comprises:
generating a query distribution based on the target large scale pre-trained language model;
generating one or more clusters based on the query distribution;
for the respective query among the plurality of queries, determining one or more unit vectors associated with the query, each pointing from a query vector of the query to a center of a respective cluster among the one or more clusters, wherein each unit vector is associated with the query and the respective cluster; and
determining the thrust score for the respective query based on a sum vector of the one or more unit vectors weighted by a size of each of the one or more clusters, and
wherein each query in the one or more clusters is represented using last layer hidden states of the target large scale pre-trained language model associated with each query;
based on determining that external knowledge is needed for one or more queries among the plurality of queries of the first dataset, augmenting the one or more queries with respective pieces of external knowledge;
generating a combined dataset based on combining the first dataset and the one or more augmented queries; and
applying the combined dataset to the target large scale pre-trained language model.
|
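The thrust score determination recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes the query representations (standing in for last layer hidden states) are already available as plain lists of floats and that cluster labels have already been assigned. The function names `cluster_centers_and_sizes`, `thrust_score`, and `needs_external_knowledge` are hypothetical. Taking the Euclidean norm of the sum vector as the score, and treating a score below a threshold as indicating that external knowledge is needed, are both assumptions; the claim only states that the score is based on the sum vector.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cluster_centers_and_sizes(
    vectors: Sequence[Sequence[float]], labels: Sequence[int]
) -> Tuple[List[List[float]], List[int]]:
    """Group query vectors by cluster label and return each cluster's
    center (mean vector) and size (member count), ordered by label."""
    groups: Dict[int, List[Sequence[float]]] = {}
    for vec, label in zip(vectors, labels):
        groups.setdefault(label, []).append(vec)
    centers: List[List[float]] = []
    sizes: List[int] = []
    for label in sorted(groups):
        members = groups[label]
        dim = len(members[0])
        centers.append(
            [sum(m[i] for m in members) / len(members) for i in range(dim)]
        )
        sizes.append(len(members))
    return centers, sizes


def thrust_score(
    query_vec: Sequence[float],
    centers: Sequence[Sequence[float]],
    sizes: Sequence[int],
) -> float:
    """Sum the unit vectors pointing from the query to each cluster center,
    each weighted by cluster size, and return the norm of that sum vector.
    Using the norm as the score is an assumption of this sketch."""
    total = [0.0] * len(query_vec)
    for center, size in zip(centers, sizes):
        diff = [c - q for c, q in zip(center, query_vec)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            # Query coincides with the center, so no direction is defined.
            continue
        for i, d in enumerate(diff):
            total[i] += size * d / norm
    return math.sqrt(sum(t * t for t in total))


def needs_external_knowledge(score: float, threshold: float) -> bool:
    """Hypothetical decision rule: a score below the threshold is taken
    to mean the query should be augmented with external knowledge."""
    return score < threshold


if __name__ == "__main__":
    # Toy 2-D vectors standing in for last layer hidden states.
    vectors = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]]
    labels = [0, 0, 1]
    centers, sizes = cluster_centers_and_sizes(vectors, labels)
    score = thrust_score([0.0, 1.0], centers, sizes)
    print(score, needs_external_knowledge(score, threshold=1.0))
```

In this sketch a query surrounded by large clusters in opposing directions receives a small score, because the weighted unit vectors cancel, while a query with large clusters lying in one direction receives a large score. Queries flagged by the decision rule would then be augmented and combined with the first dataset as recited in the remaining steps of the claim.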