US 12,499,897 B2
System and method for authentication of processed audio files
Venkata Satya Sivajee Pinnamaneni, Dardenne Prairie, MO (US); Sachin Kumar Singh, Pune (IN); and Kaushal Shetty, Thane (W) Maharashtra (IN)
Filed by Mastercard International Incorporated, Purchase, NY (US)
Filed on Sep. 28, 2022, as Appl. No. 17/935,987.
Prior Publication US 2024/0105183 A1, Mar. 28, 2024
Int. Cl. G10L 17/06 (2013.01); G10L 17/02 (2013.01); G10L 17/04 (2013.01); H04L 9/32 (2006.01); G10L 17/18 (2013.01)

CPC G10L 17/06 (2013.01) [G10L 17/02 (2013.01); G10L 17/04 (2013.01); H04L 9/3242 (2013.01); G10L 17/18 (2013.01)]

11 Claims

1. An audio file authentication system comprising:

at least one processor configured to:

extract a plurality of audio features from a plurality of raw audio files containing at least a digital recording of a voice of a specific user;

provide an authenticated audio file by authenticating an audio file that includes a digital recording of the voice of the specific user based on the plurality of audio features and authentication information for the specific user;

identify a number of audio gaps within the recording of the voice of the specific user in the audio file;

generate a parent hash key unique to the authenticated audio file;

generate sequential child hash keys based on the parent hash key; and

generate a processed audio file in which the parent hash key and each of the sequential child hash keys are inserted sequentially as data in the audio gaps within the recording of the voice of the specific user in the audio file.
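
Illustrative note on claim 1: the claim recites a parent hash key unique to the authenticated audio file and sequential child hash keys based on it, but it does not name a hash function or a derivation rule. The sketch below is one possible reading, not the patented method. It assumes SHA-256 over the file bytes for the parent key and a simple chain in which each child is hashed from the key before it. The function names `parent_hash_key`, `child_hash_keys`, and `build_key_sequence` are hypothetical.

```python
import hashlib


def parent_hash_key(audio_bytes: bytes) -> str:
    """Assumed parent key: SHA-256 digest of the authenticated audio bytes."""
    return hashlib.sha256(audio_bytes).hexdigest()


def child_hash_keys(parent: str, count: int) -> list:
    """Derive `count` child keys; each one is hashed from the previous key.

    The chaining rule (previous key plus a position index) is an assumption
    made for illustration; the claim only says the children are "based on"
    the parent hash key.
    """
    keys = []
    previous = parent
    for index in range(count):
        material = f"{previous}:{index}".encode("utf-8")
        previous = hashlib.sha256(material).hexdigest()
        keys.append(previous)
    return keys


def build_key_sequence(audio_bytes: bytes, child_count: int) -> list:
    """Return the keys in insertion order: parent first, then the children."""
    parent = parent_hash_key(audio_bytes)
    return [parent] + child_hash_keys(parent, child_count)
```

Under these assumptions, anyone holding the parent key can recompute every child key in order, so a missing or reordered key in the audio gaps would be detectable by comparing against the recomputed sequence.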

7. An audio file authentication system comprising:

a sonic engine computing system comprising at least one processor;

a plurality of user devices, each of the plurality of user devices associated with a user that subscribes to the audio file authentication system;

a database configured to be in electronic communication with the sonic engine computing system and to store and retrieve user authentication information, raw audio files, extracted audio features, sequential hash key arrays, and processed user audio clips;

a network electronically connecting the plurality of user devices to the sonic engine computing device;

the at least one processor of the sonic engine computing system configured to:

receive, over the network, user authentication information from a specific one of the plurality of user devices;

authenticate a specific user using the user authentication information based on the user authentication information stored in the database;

receive, over the network, an audio file to be published from the specific user comprising at least a digital recording of a voice of the specific user;

input the raw audio files associated with the specific user from the database into a trained audio features extraction model;

obtain a plurality of extracted audio features for the specific user outputted by the trained audio features extraction model;

store the plurality of extracted audio features obtained from the trained audio features extraction model as the extracted audio features for the specific user in the database;

input the audio file from the specific user as well as the extracted audio features and the user authentication information stored in the database for the specific user into a trained audio file verification model;

obtain a probabilistic authentication from the trained audio file verification model that the audio file comprises a digital recording of the voice of the specific user;

identify a length of the audio file and a number of audio gaps within the audio file;

generate a parent hash key unique to the audio file;

generate sequential child hash keys based on the parent hash key, the number of sequential child hash keys being equal to the number of gaps minus one;

generate a sequential hash key array comprised of the parent hash key and the sequential child hash keys;

generate a processed audio file in which the parent hash key and each of the sequential child hash keys are each inserted sequentially as data within the audio gaps of the audio file;

assign a random alphanumeric code to the processed audio file;

store the processed audio file in the database in a manner that associates the processed audio file with the random alphanumeric code and the user authentication information for the specific user; and

store the sequential hash key array in the database in a manner that associates the sequential hash key array with the processed audio file and the random alphanumeric code for the processed audio file.
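
Illustrative note on claim 7: the claim ties the number of child hash keys to the number of audio gaps (gaps minus one, so that the parent plus the children fill every gap) and assigns a random alphanumeric code to the processed file. The sketch below shows one way those steps might fit together. It is written under stated assumptions and is not the claimed implementation: a gap is treated as a run of near-silent samples, the `threshold`, `min_len`, and code `length` values are arbitrary, and the names `find_audio_gaps`, `random_alphanumeric_code`, and `build_record` are hypothetical.

```python
import secrets
import string


def find_audio_gaps(samples, threshold=0.01, min_len=3):
    """Return (start, end) index pairs for runs of near-silent samples.

    `threshold` and `min_len` are assumed tuning values, not from the claim.
    The end index is exclusive.
    """
    gaps = []
    start = None
    for i, value in enumerate(samples):
        if abs(value) <= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                gaps.append((start, i))
            start = None
    # Close a gap that runs to the end of the recording.
    if start is not None and len(samples) - start >= min_len:
        gaps.append((start, len(samples)))
    return gaps


def random_alphanumeric_code(length=12):
    """Generate a random alphanumeric code; the length is an assumption."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def build_record(samples, key_array):
    """Pair each gap with one key (parent first) and attach a random code.

    `key_array` stands in for the sequential hash key array: one parent key
    followed by (number of gaps - 1) child keys.
    """
    gaps = find_audio_gaps(samples)
    if len(key_array) != len(gaps):
        raise ValueError("expected one parent key plus (gaps - 1) child keys")
    return {
        "code": random_alphanumeric_code(),
        "length": len(samples),
        "child_key_count": len(gaps) - 1,
        "placements": [
            {"gap": gap, "key": key} for gap, key in zip(gaps, key_array)
        ],
    }
```

The returned dictionary is only a stand-in for the database associations recited in the claim (processed audio file, random alphanumeric code, and sequential hash key array); how the keys are physically written into the gaps as data is left open here because the claim does not specify an encoding.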