| CPC H04M 3/2281 (2013.01) [G06F 40/40 (2020.01); H04M 3/5175 (2013.01)] | 20 Claims |

|
1. A method for analyzing audio speech signals to detect fraudulent calls to a contact center, the method comprising:
splitting an audio recording of a call in real-time into a foreground speech signal attributed to a main speaker and a background audio signal;
extracting audio features from the foreground speech signal and the background audio signal;
inputting the extracted audio features into an ensemble model, wherein the ensemble model comprises multiple different machine learning models co-trained to cumulatively detect fraud, wherein the multiple different machine learning models include:
a speaker audio model to detect audio speech anomalies in the foreground speech signal attributed by clustering to the main speaker,
a speaker intent model to classify intent of the main speaker in the foreground speech signal using a large language model and call transcription, and
a prosody model to detect voice intonation of the main speaker in the foreground speech signal; and
outputting, by the ensemble model, a prediction of whether the call is fraudulent.
|
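For illustration only, the flow recited in claim 1 (split, extract features, score with several sub-models, output one prediction) can be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation. The claim does not say how the signals are separated, which features are extracted, or how the sub-model outputs are combined. The amplitude gate, the two energy features, the weighted average, the 0.5 threshold, and every name here (`split_signal`, `extract_features`, `SubModel`, `ensemble_predict`) are hypothetical stand-ins. The three sub-models are stubs that return fixed scores where trained speaker audio, speaker intent, and prosody models would sit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]


def split_signal(samples: List[float], gate: float = 0.5) -> Tuple[List[float], List[float]]:
    """Toy stand-in for foreground/background separation (assumption:
    simple amplitude gating; a real system would use source separation)."""
    foreground = [s if abs(s) >= gate else 0.0 for s in samples]
    background = [s if abs(s) < gate else 0.0 for s in samples]
    return foreground, background


def extract_features(foreground: List[float], background: List[float]) -> Features:
    """Compute one hypothetical feature per signal: mean energy."""
    def energy(signal: List[float]) -> float:
        return sum(s * s for s in signal) / len(signal) if signal else 0.0

    return {"fg_energy": energy(foreground), "bg_energy": energy(background)}


@dataclass
class SubModel:
    """One member of the ensemble; `score` returns a fraud score in [0, 1]."""
    name: str
    score: Callable[[Features], float]
    weight: float


def ensemble_predict(
    models: List[SubModel], features: Features, threshold: float = 0.5
) -> Tuple[float, bool]:
    """Combine sub-model scores by weighted average (an assumed rule) and
    flag the call as fraudulent when the result reaches the threshold."""
    total_weight = sum(m.weight for m in models)
    combined = sum(m.weight * m.score(features) for m in models) / total_weight
    return combined, combined >= threshold


if __name__ == "__main__":
    # Stub sub-models with fixed scores; real ones would be trained models.
    models = [
        SubModel("speaker_audio", lambda f: 0.8, 1.0),
        SubModel("speaker_intent", lambda f: 0.6, 1.0),
        SubModel("prosody", lambda f: 0.2, 1.0),
    ]
    fg, bg = split_signal([0.9, -0.8, 0.1, -0.1])
    score, is_fraud = ensemble_predict(models, extract_features(fg, bg))
    print(f"fraud score={score:.3f}, flagged={is_fraud}")
```

The weighted average is only one possible reading of "cumulatively detect fraud". A stacked meta-classifier or a jointly trained network would fit the same claim language, and the sketch does not model the co-training step at all.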