US 12,223,946 B2
Artificial intelligence voice response system for speech impaired users
Shikhar Kwatra, San Jose, CA (US); Laura Grace Ellis, Austin, TX (US); Kaitlin McGoldrick, New York, NY (US); and Sarbajit K. Rakshit, Kolkata (IN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Sep. 11, 2020, as Appl. No. 16/948,310.
Prior Publication US 2022/0084504 A1, Mar. 17, 2022
Int. Cl. G10L 15/08 (2006.01); G06F 3/16 (2006.01); G06N 3/04 (2023.01); G10L 15/06 (2013.01); G10L 15/30 (2013.01)

CPC G10L 15/08 (2013.01) [G06F 3/167 (2013.01); G06N 3/04 (2013.01); G10L 15/063 (2013.01); G10L 15/30 (2013.01); G10L 2015/088 (2013.01)]

23 Claims

1. A method for voice responses, the method comprising:

gathering user data from at least one connected device, wherein the connected device comprises an artificial intelligence (AI) system that can observe human conversations and detect behavioral signals or biometric patterns of a user within a range of sensors used to detect the user's behavioral signals or biometric patterns;

predicting when a user desires to submit a voice command and whether the user has the ability to submit the voice command, wherein heuristics and health conditions of the user are used to predict a topic of a voice command and voice request and to provide a spoken menu from which the user may choose at least one voice command;

analyzing user speech stream data via natural language processing (NLP) algorithms to dynamically determine a satisfaction or frustration level of a user;

passing the gathered user data to a random forest algorithm to perform a binary classification;

training a voice response system based on the gathered user data, wherein the trained voice response system includes a plurality of voice menus, one of the plurality of voice menus being dynamically modified to include a set of hierarchical questions relating to an observed interaction or set of interactions with a user, until a customized menu is determined;

using a deep reinforcement learning model, such as a long short term memory recurrent neural network model (LSTM-RNN), to improve a knowledge corpus by correlating gathered behavioral input, body language, and biometric signals with an intended topic or a hierarchical voice menu related to the intended topic;

identifying a wakeup signal for the connected device based on observed behavioral signals or biometric patterns of the user within the range of the sensors used to detect the user's behavioral signals or biometric patterns, wherein the identified wakeup signal comprises a change in a behavior or a biometric parameter of the user;

determining that user engagement is intended based on identifying the wakeup signal;

handling a non-routine event of the user by initiating a phone call to a live person who may assist in understanding the non-routine event;

determining the customized menu based on the intended user engagement; and

engaging with the user through the at least one connected device.
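
The wakeup-signal, engagement, and customized-menu steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The z-score threshold, the `MenuNode` type, and the function names `is_wakeup_signal`, `customize_menu`, and `engage` are hypothetical, introduced only for illustration. The claim says only that the wakeup signal "comprises a change in a behavior or a biometric parameter of the user"; the sketch assumes that such a change can be approximated as a deviation of a single biometric reading from a recent baseline, and that the hierarchical questions can be modeled as a tree walked by the user's answers.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Dict, List, Optional


@dataclass
class MenuNode:
    """One level of a hierarchical voice menu (hypothetical structure)."""
    prompt: str
    children: Dict[str, "MenuNode"] = field(default_factory=dict)


def is_wakeup_signal(baseline: List[float], reading: float,
                     z_threshold: float = 2.0) -> bool:
    """Treat a reading as a wakeup signal when it deviates from the
    baseline mean by more than z_threshold standard deviations.
    The threshold is an assumed value, not taken from the patent."""
    if len(baseline) < 2:
        return False  # not enough history to define a "change"
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return reading != mu
    return abs(reading - mu) / sigma > z_threshold


def customize_menu(root: MenuNode, answers: List[str]) -> MenuNode:
    """Walk the hierarchical questions using the user's answers and
    return the deepest menu reached; stop at the first unknown answer."""
    node = root
    for answer in answers:
        child = node.children.get(answer)
        if child is None:
            break
        node = child
    return node


def engage(baseline: List[float], reading: float,
           root: MenuNode, answers: List[str]) -> Optional[str]:
    """Return the prompt to speak if engagement appears intended, else None."""
    if not is_wakeup_signal(baseline, reading):
        return None
    return customize_menu(root, answers).prompt


# Illustrative data only: a resting heart-rate baseline and a two-level menu.
heart_rate_baseline = [70.0, 72.0, 71.0, 69.0, 70.0]
menu = MenuNode(
    "What do you need help with?",
    {
        "health": MenuNode(
            "Is this about medication or pain?",
            {"medication": MenuNode("Do you want a reminder or a refill?")},
        ),
        "home": MenuNode("Lights or temperature?"),
    },
)
```

With this data, `engage(heart_rate_baseline, 95.0, menu, ["health", "medication"])` returns the prompt of the deepest matching menu, while a reading close to the baseline returns `None` and the device stays idle. The random forest classification and LSTM-RNN steps recited in the claim are deliberately left out of the sketch.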