US 12,437,776 B2
Automated classification of relative dominance based on reciprocal prosodic behaviour in an audio conversation
Baruchi Har-Lev, Herzliya (IL); Ori Manor Zuckerman, Tel Aviv (IL); Eran Yessodi, Kfar-Saba (IL); and Alessandro Vinciarelli, Glasgow (GB)
Assigned to SubStrata Ltd., Tel Aviv (IL)
Filed by SubStrata Ltd., Tel Aviv (IL)
Filed on Sep. 19, 2022, as Appl. No. 17/947,507.
Prior Publication US 2024/0105208 A1, Mar. 28, 2024
Int. Cl. G10L 25/63 (2013.01); G10L 17/02 (2013.01); G10L 17/04 (2013.01); G10L 25/03 (2013.01)
CPC G10L 25/63 (2013.01) [G10L 17/02 (2013.01); G10L 17/04 (2013.01); G10L 25/03 (2013.01)] 19 Claims
OG exemplary drawing
 
1. A system comprising a processor and memory circuitry (PMC) configured to, for at least one session comprising at least an audio content, the session involving at least a first participant and a second participant:
use one or more speech processing algorithms, from digitized audio content informative of the audio content of the session, to:
for the first participant:
extract features informative of an audio content associated with the first participant in a first initial period of time to generate first baseline data,
for at least one first period of time starting after an end of the first initial period of time, extract features informative of an audio content associated with the first participant in the first period of time to generate first updated baseline data,
for the second participant:
extract features informative of an audio content associated with the second participant in a second initial period of time to generate second baseline data,
for at least one second period of time starting after an end of the second initial period of time, extract features informative of an audio content associated with the second participant in the second period of time to generate second updated baseline data,
use one or more speech processing algorithms to determine, based on the digitized audio content, a vector VF1 comprising features informative of an audio content associated with the first participant in a first limited period of time being at least partially within the first period of time, wherein a duration of the first limited period of time is shorter than a duration of the first period of time,
use one or more speech processing algorithms to determine, based on the digitized audio content, a vector VF2 comprising features informative of an audio content associated with the second participant in a second limited period of time being at least partially within the second period of time, wherein a duration of the second limited period of time is shorter than a duration of the second period of time,
feed the first baseline data, the second baseline data, the first updated baseline data, the second updated baseline data, VF1 and VF2 to a machine learning deep neural network, and
determine, using the machine learning deep neural network, data Ddominance informative of a dominance of at least one of the first participant or the second participant in at least part of the session based on the first baseline data, the second baseline data, the first updated baseline data, the second updated baseline data, VF1 and VF2.
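The claim above walks through a concrete pipeline: per-participant prosodic features are extracted over an initial baseline period, re-extracted over a later period to form an updated baseline, and compared against a feature vector (VF1/VF2) computed over a short window inside that period, with a deep neural network producing the final dominance output. The sketch below illustrates only the data flow of that pipeline, not the patented system: the feature proxy (signal energy mean/variance), the window boundaries, and the `dominance_score` comparator standing in for the claimed machine learning deep neural network are all illustrative assumptions, not taken from the patent.

```python
import statistics

def extract_features(samples, start, end):
    """Prosodic feature proxy (assumption): mean and variance of
    absolute signal amplitude over the window [start, end)."""
    window = [abs(s) for s in samples[start:end]]
    return (statistics.mean(window), statistics.pvariance(window))

def dominance_score(baseline, updated, vf):
    """Hand-crafted stand-in for the claimed DNN (assumption):
    deviation of the short-window vector from the participant's own
    updated baseline, normalised by the initial baseline mean."""
    base_mean = baseline[0] or 1e-9
    return (vf[0] - updated[0]) / base_mean

def classify_dominance(audio_first, audio_second,
                       init_end=100, period=(100, 300), limited=(200, 250)):
    """Mirror the claim's structure for two participants:
    initial baseline -> updated baseline -> short-window vector -> score."""
    scores = {}
    for name, samples in (("first", audio_first), ("second", audio_second)):
        baseline = extract_features(samples, 0, init_end)  # initial period
        updated = extract_features(samples, *period)       # later, longer period
        vf = extract_features(samples, *limited)           # limited window within it
        scores[name] = dominance_score(baseline, updated, vf)
    dominant = max(scores, key=scores.get)
    return dominant, scores
```

For example, a participant whose short-window energy rises well above their own baseline scores higher than one whose energy stays flat, which is the kind of reciprocal, baseline-relative comparison the claim describes; the actual system learns this mapping with a trained network rather than a fixed rule.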