US 12,230,254 B2
Adversarial learning framework for persona-based dialogue modeling
Oluwatobi Olabiyi, Arlington, VA (US); Alan Salimov, San Bruno, CA (US); Anish Khazane, San Francisco, CA (US); and Erik Mueller, Chevy Chase, MD (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Jun. 1, 2023, as Appl. No. 18/204,746.
Application 18/204,746 is a continuation of application No. 17/228,158, filed on Apr. 12, 2021, granted, now 11,705,112.
Application 17/228,158 is a continuation of application No. 16/560,571, filed on Sep. 4, 2019, granted, now 10,978,051, issued on Apr. 13, 2021.
Claims priority of provisional application 62/737,089, filed on Sep. 26, 2018.
Prior Publication US 2023/0368778 A1, Nov. 16, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/00 (2013.01); G06N 3/08 (2023.01); G10L 15/16 (2006.01); G10L 15/183 (2013.01); G10L 15/22 (2006.01)
CPC G10L 15/16 (2013.01) [G06N 3/08 (2013.01); G10L 15/183 (2013.01); G10L 15/22 (2013.01)]
20 Claims
 
1. A computer-implemented method, comprising:
receiving, by at least one server communicatively coupled with a user device, a dialogue utterance;
applying, by the at least one server, a generative adversarial network (GAN) to the dialogue utterance to generate response candidates to the utterance and determine a response from the response candidates to respond to the utterance, the GAN comprising a generator and a discriminator, and wherein the applying the GAN comprises:
generating, by the generator, utilizing source attributes and target attributes, the response candidates responsive to the dialogue utterance, wherein the source attributes comprise a speaker identity, a speaker background, a speaker location, a speaker preference, a speaker sentiment, or a combination thereof, and the target attributes comprise a respondent identity, a respondent background, a respondent location, a respondent preference, a respondent sentiment, or a combination thereof;
determining, by the discriminator, the response to respond to the dialogue utterance from the response candidates based on discrimination metrics comprising human-likeness and persona, the discriminator comprising an attribute discriminator to utilize the target attributes as a discriminator target and a dialogue history for multi-label attribute classification to classify outputs from the generator with an attribute class, and an adversarial discriminator to determine a binary output for human-likeness to identify the response candidates as real or fake; and
causing, by the at least one server, communication of the response to the user device responsive to the dialogue utterance.
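
For illustration only, the data flow recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the names `Attributes`, `generate_candidates`, `attribute_score`, `adversarial_is_real`, and `select_response` are hypothetical, the generator and both discriminators are hand-written stand-ins for trained networks, only two of the recited attribute types (identity and sentiment) are modeled, and the rule for combining the two discriminator outputs (filter on the real/fake decision, then rank by attribute match) is an assumption that the claim does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Attributes:
    """Hypothetical container for a subset of the claimed attributes."""
    identity: str
    sentiment: str = "neutral"


def generate_candidates(utterance: str, source: Attributes,
                        target: Attributes) -> List[str]:
    """Stand-in generator: fills fixed templates from the attributes.

    A trained generator would also condition on the utterance; this toy
    version accepts it but does not use it.
    """
    templates = [
        "{tgt} here. Thanks for reaching out, {src}.",
        "As {tgt}, I feel {mood} about that.",
        "ok",
    ]
    return [
        t.format(tgt=target.identity, src=source.identity, mood=target.sentiment)
        for t in templates
    ]


def attribute_score(candidate: str, history: List[str],
                    target: Attributes) -> float:
    """Stand-in attribute discriminator (persona metric).

    Returns the fraction of target attribute values that appear in the
    candidate. The dialogue history is accepted but ignored by this toy.
    """
    values = [target.identity, target.sentiment]
    hits = sum(v.lower() in candidate.lower() for v in values)
    return hits / len(values)


def adversarial_is_real(candidate: str, history: List[str]) -> bool:
    """Stand-in adversarial discriminator (human-likeness metric).

    Binary output: treats candidates with at least four words as "real".
    """
    return len(candidate.split()) >= 4


def select_response(utterance: str, history: List[str],
                    source: Attributes, target: Attributes) -> str:
    """Generate candidates, then pick one using both discriminators."""
    candidates = generate_candidates(utterance, source, target)
    # Assumed combination rule: keep candidates judged real, falling back
    # to all candidates if none pass, then rank by attribute match.
    real = [c for c in candidates if adversarial_is_real(c, history)]
    pool = real or candidates
    return max(pool, key=lambda c: attribute_score(c, history, target))


if __name__ == "__main__":
    response = select_response(
        "Hi, who am I talking to?",
        history=[],
        source=Attributes("Alex"),
        target=Attributes("Sam", "cheerful"),
    )
    print(response)
```

In this toy run, the one-word candidate is rejected by the real/fake check, and of the remaining two the candidate that mentions both target attribute values ranks highest. In a deployed system the server would then send the selected response back to the user device, as in the final step of the claim.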