US 12,437,158 B2
Method for filtering and semi-automatically labeling training data
Ziaul Hasan Hashmi, Kirkland, WA (US); Mitul Tiwari, Santa Clara, CA (US); Soham Parikh, Santa Clara, CA (US); Quaizar Vohra, Santa Clara, CA (US); Jignesh Parmar, Santa Clara, CA (US); Shounak Purkayastha, Santa Clara, CA (US); Anil Madamala, Santa Clara, CA (US); Patrice Bechard, Montreal (CA); Orlando Marquez, Montreal (CA); Olivier Nguyen, Montreal (CA); and Srivatsava Daruru, Santa Clara, CA (US)
Assigned to ServiceNow, Inc., Santa Clara, CA (US)
Filed by ServiceNow, Inc., Santa Clara, CA (US)
Filed on Jul. 18, 2023, as Appl. No. 18/354,247.
Prior Publication US 2025/0028910 A1, Jan. 23, 2025
Int. Cl. G06F 40/40 (2020.01); G06F 16/355 (2025.01); G06F 40/35 (2020.01)
CPC G06F 40/40 (2020.01) [G06F 16/355 (2019.01); G06F 40/35 (2020.01)]
18 Claims
 
1. A method comprising:
obtaining user text;
applying an embedding model to the user text to generate an embedding vector in a vector space, wherein the embedding model was trained to generate respective embedding vectors for training data sets, and wherein the training data sets include a plurality of textual training examples each with respectively associated class labels;
identifying a particular textual training example, of the plurality of textual training examples, whose associated embedding vector is closest in the vector space to the embedding vector that is associated with the user text;
providing an indication of the user text and a class label respectively associated with the particular textual training example;
receiving a response that the class label correctly classifies the user text;
in response to receiving the response, training a production model using the user text associated with the class label;
obtaining additional user text; and
applying a natural language understanding (NLU) model to the additional user text to generate a predicted class label associated with the additional user text, wherein applying the embedding model to the user text to generate the embedding vector in the vector space is performed in response to determining that the NLU model was not applied to the user text to generate any class label for the user text.
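The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `embed` function is a toy bag-of-words stand-in for the trained embedding model, cosine distance is an assumed notion of "closest in the vector space", and the names `suggest_label`, `label_and_collect`, `confirm`, and `production_data` are hypothetical. Retraining of the production model is represented only by appending the confirmed pair to a list.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in (assumption) for the trained embedding model:
    an L2-normalized bag-of-words vector stored as a dict."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def cosine_distance(a, b):
    """Cosine distance between two normalized sparse vectors (assumed metric)."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    return 1.0 - dot


def suggest_label(user_text, training_examples):
    """Return the (text, label) training example nearest to the user text."""
    query = embed(user_text)
    return min(
        training_examples,
        key=lambda ex: cosine_distance(query, embed(ex[0])),
    )


def label_and_collect(user_text, training_examples, nlu_label, confirm, production_data):
    """Semi-automatic labeling step.

    nlu_label: label already produced by the NLU model, or None if the NLU
               model was not applied to this text.
    confirm:   callable(user_text, label) -> bool, standing in for the
               reviewer's response.
    """
    # The embedding path runs only when the NLU model produced no label.
    if nlu_label is not None:
        return nlu_label
    _, label = suggest_label(user_text, training_examples)
    # Keep the pair for production-model training only if it is confirmed.
    if confirm(user_text, label):
        production_data.append((user_text, label))
        return label
    return None
```

In this sketch, a reviewer who accepts the suggestion causes the pair to be added to `production_data`, which a separate training job (not shown) would consume; a rejected suggestion leaves the data set unchanged.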