CPC G06F 16/242 (2019.01) [G06F 16/24575 (2019.01)] | 20 Claims |
1. A system comprising:
one or more computing devices configured to:
receive programming related training data comprising queries and answers;
implement an encoder that is initially pre-trained to perform initial embedding of the queries and the answers;
determine contrastive loss values for query-answer pairs, wherein a query of a given query-answer pair is encoded as a vector in Euclidean space and a corresponding answer of the given query-answer pair is encoded as another vector in the Euclidean space, and wherein a contrastive loss value for the given query-answer pair is determined based on comparing a similarity score determined between the encoded query vector and the corresponding encoded answer vector and another similarity score determined between the encoded query vector and an encoded unrelated answer vector;
train transformer layers for the encoder using the query-answer pairs and the determined contrastive loss values for the respective query-answer pairs such that encoded query vectors are located in the Euclidean space proximate to encoded answer vectors for the respective query-answer pairs, as compared to the encoded unrelated answer vectors; and
perform, using the encoder comprising the trained transformer layers, query answering for programming related queries.
|
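The contrastive loss comparison recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify a similarity function or a loss formula, so this example assumes negative Euclidean distance as the similarity score and a margin-based (triplet-style) loss. The names `similarity`, `contrastive_loss`, and `margin` are hypothetical and do not come from the claim.

```python
import math

def similarity(u, v):
    # Assumed scoring function: negative Euclidean distance,
    # so a larger score means the two vectors are closer together.
    return -math.dist(u, v)

def contrastive_loss(query_vec, answer_vec, unrelated_vec, margin=1.0):
    # Compare the query/answer score with the query/unrelated-answer score.
    # The loss is zero once the matching answer beats the unrelated answer
    # by at least `margin` (an assumed hyperparameter).
    pos = similarity(query_vec, answer_vec)
    neg = similarity(query_vec, unrelated_vec)
    return max(0.0, margin - pos + neg)

# Toy 2-D vectors standing in for encoder outputs.
query = [0.0, 0.0]
answer = [0.0, 1.0]      # distance 1 from the query
unrelated = [3.0, 4.0]   # distance 5 from the query

loss_good = contrastive_loss(query, answer, unrelated)  # answer already closer
loss_bad = contrastive_loss(query, unrelated, answer)   # roles swapped
print(loss_good, loss_bad)
```

Under these assumptions, minimizing the loss over many query-answer pairs would pull each encoded query toward its corresponding answer and away from unrelated answers, which is the geometric outcome the training step describes.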