US 11,868,727 B2
Context tag integration with named entity recognition models
Duy Vu, Melbourne (AU); Tuyen Quang Pham, Springvale (AU); Cong Duy Vu Hoang, Wantirna South (AU); Srinivasa Phani Kumar Gadde, Fremont, CA (US); Thanh Long Duong, Seabrook (AU); Mark Edward Johnson, Castle Cove (AU); and Vishal Vishnoi, Redwood City, CA (US)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Jan. 19, 2022, as Appl. No. 17/648,376.
Claims priority of provisional application 63/139,569, filed on Jan. 20, 2021.
Prior Publication US 2022/0229993 A1, Jul. 21, 2022
Int. Cl. G06F 40/295 (2020.01); G06F 40/205 (2020.01); G06V 30/19 (2022.01); G06F 40/40 (2020.01); G06F 40/35 (2020.01); G06F 40/279 (2020.01)
CPC G06F 40/295 (2020.01) [G06F 40/205 (2020.01); G06F 40/279 (2020.01); G06F 40/35 (2020.01); G06F 40/40 (2020.01); G06V 30/19147 (2022.01)]
20 Claims
 
1. A method, comprising:
receiving, at a chatbot system comprising a processor, at least one utterance comprising one or more words;
generating, by a transformer-based model of the chatbot system, a plurality of embeddings for the one or more words of the at least one utterance;
generating, by a first vectorizer of the chatbot system, at least one regular expression and gazetteer feature vector for the at least one utterance;
generating, by a second vectorizer of the chatbot system, at least one context tag distribution feature vector for the at least one utterance;
concatenating or interpolating the plurality of embeddings with the at least one regular expression and gazetteer feature vector and the at least one context tag distribution feature vector to generate a first set of feature vectors;
generating, by a main sequence model of the chatbot system, an encoded form of the at least one utterance based on the first set of feature vectors;
generating, by a discriminative model of the chatbot system, a plurality of log-probabilities for candidate entities based on the encoded form of the at least one utterance; and
identifying, using the plurality of log-probabilities, one or more constraints for the at least one utterance based on the candidate entities.
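
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it is hypothetical: the label set, the regular expression, the gazetteer, the context tag table, and the hand-set weights. The transformer-based model is replaced by a toy deterministic embedding, the main sequence model is reduced to an identity pass, and the discriminative model is a single linear layer followed by a log-softmax. Only the concatenation variant of the combining step is shown, and the final step is simplified to picking the highest-scoring label per word.

```python
import math
import re

LABELS = ["O", "CURRENCY", "CITY"]            # hypothetical candidate entity labels
AMOUNT_RE = re.compile(r"^\$?\d+(\.\d+)?$")   # hypothetical regular expression
CITY_GAZETTEER = {"melbourne", "fremont"}     # hypothetical gazetteer

# Hypothetical context tag table: preceding word -> distribution over LABELS.
CONTEXT_TAGS = {"pay": [0.3, 0.6, 0.1], "to": [0.2, 0.1, 0.7]}
UNIFORM = [1.0 / len(LABELS)] * len(LABELS)


def toy_embedding(word, dim=4):
    """Stand-in for a transformer embedding (assumption, not a real model)."""
    total = sum(map(ord, word))
    return [((total * (i + 1)) % 7) / 7.0 for i in range(dim)]


def regex_gazetteer_vector(word):
    """First vectorizer: binary regex-match and gazetteer-membership features."""
    return [1.0 if AMOUNT_RE.match(word) else 0.0,
            1.0 if word.lower() in CITY_GAZETTEER else 0.0]


def context_tag_vector(words, i):
    """Second vectorizer: label distribution looked up from the preceding word."""
    prev = words[i - 1].lower() if i > 0 else ""
    return CONTEXT_TAGS.get(prev, UNIFORM)


def build_features(words):
    """Concatenate embeddings with both feature vectors, one vector per word."""
    return [toy_embedding(w) + regex_gazetteer_vector(w) + context_tag_vector(words, i)
            for i, w in enumerate(words)]


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


# Hand-set weights, one row per label: 4 embedding + 2 regex/gazetteer + 3 context.
WEIGHTS = [
    [0.0] * 4 + [-2.0, -2.0] + [1.0, 0.0, 0.0],  # O
    [0.0] * 4 + [3.0, 0.0] + [0.0, 1.0, 0.0],    # CURRENCY
    [0.0] * 4 + [0.0, 3.0] + [0.0, 0.0, 1.0],    # CITY
]


def score_tokens(features, weights):
    """Linear layer plus log-softmax: log-probabilities per word and label."""
    return [log_softmax([sum(w * x for w, x in zip(row, f)) for row in weights])
            for f in features]


def decode(log_probs):
    """Pick the highest log-probability label for each word."""
    return [LABELS[max(range(len(LABELS)), key=row.__getitem__)] for row in log_probs]


if __name__ == "__main__":
    utterance = ["pay", "$20", "to", "Melbourne"]
    print(decode(score_tokens(build_features(utterance), WEIGHTS)))
```

In this sketch the regular expression, gazetteer, and context tag features carry all of the signal because the embedding weights are set to zero. A trained system would learn weights over the full concatenated vector.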