US 12,112,338 B2
Assistance for customer service agents
James Ellison, Issaquah, WA (US); Mark Hanson, Plano, TX (US); Joel Werdell, Seattle, WA (US); Stephen King, Seattle, WA (US); Christopher Mills, Bellevue, WA (US); Phoebe Parsons, Redmond, WA (US); Kasey Snow, Renton, WA (US); and Rudy Bourcelot, Everett, WA (US)
Assigned to T-Mobile USA, Inc., Bellevue, WA (US)
Filed by T-Mobile USA, Inc., Bellevue, WA (US)
Filed on May 11, 2021, as Appl. No. 17/317,751.
Claims priority of provisional application 63/032,438, filed on May 29, 2020.
Claims priority of provisional application 63/023,077, filed on May 11, 2020.
Prior Publication US 2021/0350384 A1, Nov. 11, 2021
Int. Cl. G06Q 30/016 (2023.01); G06N 20/00 (2019.01)
CPC G06Q 30/016 (2013.01) [G06N 20/00 (2019.01)] 13 Claims
OG exemplary drawing
 
1. A computer-implemented method, comprising:
establishing, by an audio service, a session initiation protocol (SIP) session with a contact center service;
during the SIP session and during an interaction between a first user and a second user, receiving, by a SIP module of the audio service, an incoming audio stream associated with the first user and an outgoing audio stream associated with the second user;
generating, by the SIP module, stream status events associated with the incoming audio stream and the outgoing audio stream;
providing, by the SIP module and to a stream handler ensemble, the stream status events;
based on the stream status events, selecting, by the stream handler ensemble, portions of the incoming audio stream and the outgoing audio stream for transcription by a transcription service;
generating, by the stream handler ensemble, a stateful connection node that is configured to maintain a state until removed or deactivated;
providing, by the stream handler ensemble, to the transcription service, and through the stateful connection node, the portions of the incoming audio stream and the outgoing audio stream;
providing, by the stream handler ensemble and to the transcription service over a stateless message bus, data related to when the portions of the incoming audio stream and the outgoing audio stream begin and end or data related to the opening and closing of the portions of the incoming audio stream and the outgoing audio stream;
receiving historical data that includes, for each previous interaction between various users, previous transcriptions from portions of previous incoming and outgoing audio streams of previous voice communications between previous users of the various users, previous characteristics of the communications between the previous users, previous actions performed by one of the previous users, and a previous customer summary file of another of the previous users;
training, using machine learning and the historical data, a first model and additional first models that are configured to identify a given characteristic of given first communications based on receiving (i) a given first transcription from given first portions of given first incoming and first outgoing audio streams of given first voice communications between given first users and (ii) a given first customer summary file of one of the given first users;
training, using machine learning and the historical data, a second model and additional second models that are configured to identify a given script and instructions to read the script based on receiving (i) a given second transcription from given second portions of given second incoming and second outgoing audio streams of given second voice communications between given second users, (ii) a given second characteristic of the given second voice communications, and (iii) a given second customer summary file of one of the given second users;
receiving, by the stream handler ensemble, from the transcription service, and through the stateful connection node, transcriptions of the portions of the incoming audio stream and the outgoing audio stream;
receiving a customer summary file that reflects characteristics of the first user;
providing the transcriptions of the portions of the incoming audio stream and the outgoing audio stream and the customer summary file to the first model that is configured to determine a first characteristic of the interaction between the first user and the second user;
receiving, from the first model, the first characteristic of the interaction between the first user and the second user;
based on the first characteristic of the interaction between the first user and the second user, selecting the second model from among the second model and the additional second models;
providing the transcriptions of the portions of the incoming audio stream and the outgoing audio stream and the customer summary file to the second model that is configured to determine a script for the second user to speak to the first user during the interaction between the first user and the second user;
receiving, from the second model, the script for the second user to speak to the first user during the interaction between the first user and the second user;
providing, for output to the second user, the script for the second user to speak to the first user during the interaction between the first user and the second user;
receiving, from the second user, feedback associated with the script that was for the second user to speak to the first user during the interaction between the first user and the second user;
retraining the second model and the additional second models using machine learning and using (i) the historical data, (ii) the script that was for the second user to speak to the first user during the interaction between the first user and the second user, (iii) the customer summary file, (iv) the transcriptions of the portions of the incoming audio stream and the outgoing audio stream, and (v) the feedback associated with the script that was for the second user to speak to the first user during the interaction between the first user and the second user;
during an additional interaction between an additional first user and an additional second user, receiving, by the SIP module of the audio service, an additional incoming audio stream associated with the additional first user and an additional outgoing audio stream associated with the additional second user;
selecting, by the stream handler ensemble, portions of the additional incoming audio stream and the additional outgoing audio stream for transcription by the transcription service;
providing, by the stream handler ensemble and to the transcription service, the portions of the additional incoming audio stream and the additional outgoing audio stream;
receiving, by the stream handler ensemble and from the transcription service, transcriptions of the portions of the additional incoming audio stream and the additional outgoing audio stream;
receiving an additional customer summary file that reflects characteristics of the additional first user;
providing the transcriptions of the portions of the additional incoming audio stream and the additional outgoing audio stream and the additional customer summary file to the first model;
receiving, from the first model, an additional first characteristic of the additional interaction between the additional first user and the additional second user;
based on the additional first characteristic of the additional interaction between the additional first user and the additional second user, selecting the retrained second model from among the retrained second model and the retrained additional second models;
providing the transcriptions of the portions of the additional incoming audio stream and the additional outgoing audio stream and the additional customer summary file to the retrained second model;
receiving, from the retrained second model, an additional script for the additional second user to speak to the additional first user during the additional interaction between the additional first user and the additional second user; and
providing, for output to the additional second user, the additional script for the additional second user to speak to the additional first user during the additional interaction between the additional first user and the additional second user.
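 
The sketches below are illustrative only and form no part of the claims; the Python language, class and function names, and interfaces are assumptions made for explanation. This first sketch follows the stream-handling path recited in claim 1: the SIP module emits stream status events, the stream handler ensemble selects portions of the incoming and outgoing audio streams for transcription, forwards audio through a stateful connection node, and publishes portion begin/end metadata over a stateless message bus.

    # Hypothetical sketch of the stream-handling path; the transcription
    # service and message bus are duck-typed stand-ins, not real APIs.
    from dataclasses import dataclass

    @dataclass
    class StreamStatusEvent:
        stream_id: str        # "incoming" (first user) or "outgoing" (second user)
        speaking: bool        # voice activity detected on this stream
        timestamp_ms: int

    class StatefulConnectionNode:
        """Maintains state (an open connection) until removed or deactivated."""
        def __init__(self, transcription_service):
            self.service = transcription_service
            self.active = True

        def send_audio(self, stream_id, audio_chunk):
            if self.active:
                self.service.transcribe_chunk(stream_id, audio_chunk)

        def deactivate(self):
            self.active = False

    class StreamHandlerEnsemble:
        def __init__(self, transcription_service, message_bus):
            self.node = StatefulConnectionNode(transcription_service)
            self.bus = message_bus          # stateless: fire-and-forget metadata
            self.open_portions = {}

        def on_status_event(self, event: StreamStatusEvent, audio_chunk: bytes):
            # Select for transcription only the portions where speech is present.
            if event.speaking:
                if event.stream_id not in self.open_portions:
                    self.open_portions[event.stream_id] = event.timestamp_ms
                    self.bus.publish("portion.open", {"stream": event.stream_id,
                                                      "start_ms": event.timestamp_ms})
                self.node.send_audio(event.stream_id, audio_chunk)
            elif event.stream_id in self.open_portions:
                start = self.open_portions.pop(event.stream_id)
                self.bus.publish("portion.close", {"stream": event.stream_id,
                                                   "start_ms": start,
                                                   "end_ms": event.timestamp_ms})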
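A minimal training sketch for the claimed first and second models, assuming scikit-learn as a stand-in machine-learning stack: the first model maps a transcript plus customer summary file to a characteristic of the interaction, and one second model per characteristic maps the same inputs to a recommended script. The record fields are hypothetical.

    from sklearn.pipeline import Pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def make_text_classifier():
        # Simple text classifier used for both model families in this sketch.
        return Pipeline([
            ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
            ("clf", LogisticRegression(max_iter=1000)),
        ])

    def train_models(historical_records):
        # Each record: transcript text, customer-summary text, the labeled
        # characteristic, and the script identifier the agent ultimately used.
        inputs = [r["transcript"] + " " + r["customer_summary"]
                  for r in historical_records]

        # First model: transcript + customer summary -> characteristic.
        first_model = make_text_classifier()
        first_model.fit(inputs, [r["characteristic"] for r in historical_records])

        # Second models: one script recommender per characteristic.
        second_models = {}
        for characteristic in {r["characteristic"] for r in historical_records}:
            subset = [r for r in historical_records
                      if r["characteristic"] == characteristic]
            model = make_text_classifier()
            model.fit([r["transcript"] + " " + r["customer_summary"] for r in subset],
                      [r["script_id"] for r in subset])
            second_models[characteristic] = model

        return first_model, second_models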
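A runtime sketch of the selection flow: the transcriptions and the customer summary file are provided to the first model, its predicted characteristic selects one of the second models, that model's script is output to the agent, and the agent's feedback is recorded for later retraining. The script table and field names are hypothetical.

    SCRIPTS = {
        "script:billing_credit": "I can apply a one-time credit to your account today...",
        "script:device_troubleshoot": "Let's restart the device together; first...",
    }

    def recommend_script(first_model, second_models, transcript, customer_summary):
        features = transcript + " " + customer_summary
        characteristic = first_model.predict([features])[0]   # e.g. "billing_dispute"
        second_model = second_models[characteristic]          # pick the matching recommender
        script_id = second_model.predict([features])[0]
        return characteristic, script_id, SCRIPTS.get(script_id, "")

    def record_feedback(feedback_log, characteristic, script_id,
                        transcript, customer_summary, helpful):
        # Agent feedback is stored alongside the inputs so the second models
        # can be retrained on historical data plus these new labeled examples.
        feedback_log.append({
            "characteristic": characteristic,
            "script_id": script_id,
            "transcript": transcript,
            "customer_summary": customer_summary,
            "helpful": helpful,
        })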
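A retraining sketch corresponding to the claim's retraining step, reusing train_models from the earlier sketch: feedback examples are folded into the historical data and the second models are refit. The filtering rule (keep only feedback marked helpful) is an assumption for illustration.

    def retrain_second_models(historical_records, feedback_log):
        # Treat agent-approved scripts as new positive training examples.
        augmented = historical_records + [
            {"transcript": f["transcript"],
             "customer_summary": f["customer_summary"],
             "characteristic": f["characteristic"],
             "script_id": f["script_id"]}
            for f in feedback_log if f["helpful"]
        ]
        _, retrained_second_models = train_models(augmented)
        return retrained_second_models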
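For the additional interaction recited at the end of the claim, the same first model and the retrained second models handle the new streams; a usage example under the same assumptions:

    characteristic, script_id, script_text = recommend_script(
        first_model, retrained_second_models,
        transcript=additional_transcript,
        customer_summary=additional_customer_summary)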