CPC G10L 17/04 (2013.01) [G06N 20/00 (2019.01); G10L 15/01 (2013.01); G10L 15/1822 (2013.01); G10L 15/19 (2013.01); G10L 15/26 (2013.01)] | 21 Claims |
1. A method comprising:
establishing a conference session with a plurality of participant user devices;
receiving, via the conference session, a digitized audio signal from a participant user device of the plurality of participant user devices;
establishing a user account identity associated with the participant user device;
determining reference speech mannerism features using a plurality of speech classifiers configured to evaluate multiple distinct features of speech mannerisms extracted from one or more digitized audio signals generated by a particular individual associated with the user account identity;
converting the digitized audio signal to text;
generating, based on the text, observed speech mannerism features that are exhibited by the digitized audio signal using the plurality of speech classifiers;
determining a similarity measure between the reference speech mannerism features and the observed speech mannerism features based on a number of instances and a frequency of occurrence of respective features of the multiple distinct features in the observed speech mannerism features compared to the reference speech mannerism features;
validating an integrity of the digitized audio signal based on the similarity measure; and
selectively maintaining the participant user device in the conference session based on the validating,
wherein determining the similarity measure includes at least:
identifying a first feature of the multiple distinct features evaluated by the plurality of speech classifiers; and
applying a mismatch frequency weight to the similarity measure as a scaling factor when the frequency of the first feature in the observed speech mannerism features does not correspond with a frequency of the first feature in the reference speech mannerism features associated with the user account identity.
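The similarity computation recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation; all names, the mismatch weight, the frequency tolerance, and the validation threshold are hypothetical assumptions, since the claim only requires that a mismatch frequency weight scale the measure when observed and reference feature frequencies do not correspond.

```python
# Illustrative sketch of the claimed similarity measure. All constants and
# names are hypothetical; the claim does not specify an implementation.

from collections import Counter

MISMATCH_FREQUENCY_WEIGHT = 0.5   # assumed scaling factor for mismatches
FREQUENCY_TOLERANCE = 0.25        # assumed tolerance for "corresponds with"

def similarity_measure(reference: Counter, observed: Counter) -> float:
    """Compare per-feature occurrence frequencies of speech mannerisms.

    `reference` and `observed` map feature names (e.g. filler words or
    characteristic phrases output by the speech classifiers) to the number
    of instances observed in the corresponding digitized audio signals.
    """
    ref_total = sum(reference.values()) or 1
    obs_total = sum(observed.values()) or 1
    score = 0.0
    for feature in reference:
        ref_freq = reference[feature] / ref_total
        obs_freq = observed.get(feature, 0) / obs_total
        # Contribution based on number of instances and frequency of
        # occurrence of the feature relative to the reference.
        contribution = min(ref_freq, obs_freq) / max(ref_freq, obs_freq) if obs_freq else 0.0
        # Apply the mismatch frequency weight as a scaling factor when the
        # observed frequency does not correspond with the reference frequency.
        if abs(obs_freq - ref_freq) > FREQUENCY_TOLERANCE * ref_freq:
            contribution *= MISMATCH_FREQUENCY_WEIGHT
        score += contribution
    return score / len(reference)

def validate_integrity(score: float, threshold: float = 0.6) -> bool:
    # Hypothetical threshold; the claim only requires validation
    # "based on the similarity measure".
    return score >= threshold
```

Under these assumptions, a device whose observed mannerisms match the reference profile scores near 1.0 and is maintained in the session, while a strongly mismatched profile is scaled down by the mismatch frequency weight and fails validation.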