US 12,142,083 B2
Audiovisual deepfake detection
Tianxiang Chen, Atlanta, GA (US); and Elie Khoury, Atlanta, GA (US)
Assigned to Pindrop Security, Inc., Atlanta, GA (US)
Filed by Pindrop Security, Inc., Atlanta, GA (US)
Filed on Oct. 15, 2021, as Appl. No. 17/503,152.
Claims priority of provisional application 63/092,956, filed on Oct. 16, 2020.
Prior Publication US 2022/0121868 A1, Apr. 21, 2022
Int. Cl. G06K 9/00 (2022.01); G06F 18/21 (2023.01); G06F 18/22 (2023.01); G06K 9/62 (2022.01); G06V 20/40 (2022.01); G06V 40/16 (2022.01); G06V 40/40 (2022.01); G06V 40/70 (2022.01); G10L 17/22 (2013.01)

CPC G06V 40/40 (2022.01) [G06F 18/21 (2023.01); G06F 18/22 (2023.01); G06V 20/49 (2022.01); G06V 40/168 (2022.01); G06V 40/70 (2022.01); G10L 17/22 (2013.01)]

22 Claims

1. A computer-implemented method comprising:

obtaining, by a computer, an audiovisual data sample containing audiovisual data;

applying, by the computer, a machine-learning architecture to the audiovisual data to generate a similarity score using a biometric embedding extracted from the audiovisual data, generate a lip-sync score using one or more lip-sync embeddings extracted from the audiovisual data, and generate a deepfake score using a speaker spoofprint embedding and a facial spoofprint embedding extracted from the audiovisual data; and

generating, by the computer, a final output score indicating a likelihood that the audiovisual data is genuine based upon algorithmically combining the similarity score, the lip-sync score, and the deepfake score.
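
As an illustration only, the score-combination step recited in claim 1 might be sketched as follows. The claim says the three scores are "algorithmically" combined but does not name a fusion rule, so the weighted logistic fusion here is an assumption, as are the function names `cosine_similarity` and `fuse_scores`, the weights, and the convention that a higher deepfake score means the sample is more likely fake. Cosine similarity is likewise one assumed way to compare a biometric embedding against a reference embedding.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embeddings (assumed comparison rule)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def fuse_scores(
    similarity: float,
    lip_sync: float,
    deepfake: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0),  # hypothetical, untrained weights
    bias: float = 0.0,                           # hypothetical bias term
) -> float:
    """Combine the three scores into a likelihood that the sample is genuine.

    Assumption: higher similarity and lip-sync scores support "genuine",
    while a higher deepfake score supports "fake", so it is subtracted.
    """
    w_sim, w_lip, w_fake = weights
    z = w_sim * similarity + w_lip * lip_sync - w_fake * deepfake + bias
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing into (0, 1)


# Hypothetical embeddings and scores, for illustration only.
enrolled = [0.2, 0.7, 0.1]
observed = [0.25, 0.65, 0.05]
similarity = cosine_similarity(enrolled, observed)
final_score = fuse_scores(similarity, lip_sync=0.8, deepfake=0.1)
print(f"final output score: {final_score:.3f}")
```

In practice the weights and bias would be learned from labeled genuine and spoofed samples rather than fixed by hand; the values above are placeholders.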