US 12,451,121 B2
System and method for translation of streaming encrypted content
Edwin Grappin, Seville (ES); and Jerome Verdier, Montreal (CA)
Assigned to COMMUNAUTE WOOPEN INC., Montreal (CA)
Appl. No. 18/282,112
Filed by COMMUNAUTE WOOPEN INC., Montreal (CA)
PCT Filed Mar. 31, 2022, PCT No. PCT/IB2022/053047 § 371(c)(1), (2) Date Sep. 14, 2023, PCT Pub. No. WO2022/208451, PCT Pub. Date Oct. 6, 2022.
Claims priority of application No. 21305426 (EP), filed on Apr. 1, 2021.
Prior Publication US 2024/0161734 A1, May 16, 2024
Int. Cl. G10L 15/06 (2013.01)

CPC G10L 15/063 (2013.01)

15 Claims

1. A method of generating a speech model, the speech model for generating signals representative of utterances in a first language and a second language based on respective signals representative of utterances in the second and first languages respectively, the speech model being hosted by a server communicatively coupled with a first device associated with a first user and a second device associated with a second user, the method executable by the server, the method comprising:

transmitting, by the server, a first speech model to the first device, the first speech model for locally generating by the first device signals representative of utterances in the second language based on signals representative of utterances in the first language;

transmitting, by the server, a second speech model to the second device, the second speech model for locally generating by the second device signals representative of utterances in the first language based on signals representative of utterances in the second language,

the first device being communicatively coupled with the second device by an encrypted communication link;

acquiring, by the server, a third speech model from the second device, the third speech model being the second speech model that has been locally trained on the second device based on a training set, the training set including:

a first decrypted signal being a given signal generated by the first device based on utterance of the first user in the first language and having been encrypted by the first device and decrypted by the second device,

a second decrypted signal being another given signal generated by the first speech model based on the given signal and having been encrypted by the first device and decrypted by the second device, the other given signal being representative of a translated utterance of the first user in the second language,

the third speech model having been trained to generate a training signal based on the second decrypted signal such that the training signal is similar to the first encrypted signal; and

locally generating, by the server, the speech model by combining the second speech model with the third speech model.
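
The local training and combining steps recited above can be illustrated with a minimal sketch. It is not taken from the patent: the claim does not say how a speech model is represented, how it is trained, or how the second and third speech models are combined. The sketch assumes a toy model with one scale and one bias parameter, plain gradient descent on a squared error, and linear interpolation of parameters as the combining rule. It also takes the first user's original signal as the training target. The names `Model`, `train_locally`, `combine_models`, and `weight` are hypothetical, and encryption and decryption over the device-to-device link are left out.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Model = Dict[str, List[float]]


def train_locally(model: Model, translated: List[float], original: List[float],
                  lr: float = 0.1, steps: int = 500) -> Model:
    """Run on the second device (assumed toy training loop).

    Fits scale and bias so that scale * translated + bias approximates the
    original signal, i.e. the training signal becomes similar to the target.
    """
    scale, bias = model["scale"][0], model["bias"][0]
    n = len(translated)
    for _ in range(steps):
        errors = [scale * x + bias - y for x, y in zip(translated, original)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_scale = 2 * sum(e * x for e, x in zip(errors, translated)) / n
        grad_bias = 2 * sum(errors) / n
        scale -= lr * grad_scale
        bias -= lr * grad_bias
    return {"scale": [scale], "bias": [bias]}


def combine_models(second: Model, third: Model, weight: float = 0.5) -> Model:
    """Run on the server (assumed rule): interpolate matching parameters."""
    if second.keys() != third.keys():
        raise ValueError("models must share the same parameter names")
    combined: Model = {}
    for name in second:
        a, b = second[name], third[name]
        if len(a) != len(b):
            raise ValueError(f"parameter {name!r} has mismatched sizes")
        combined[name] = [(1 - weight) * x + weight * y for x, y in zip(a, b)]
    return combined


# Server side: the second speech model as sent to the second device.
second_model: Model = {"scale": [1.0], "bias": [0.0]}

# Second device: the two signals after decryption (made-up sample values).
translated_signal = [0.0, 0.5, 1.0, -0.5, -1.0]
original_signal = [2 * x + 1 for x in translated_signal]

# Second device trains locally; only the resulting model goes to the server.
third_model = train_locally(second_model, translated_signal, original_signal)

# Server combines the second and third speech models.
speech_model = combine_models(second_model, third_model)
```

In this sketch only model parameters travel back to the server, so the server never handles the decrypted signals themselves. A `weight` of 0.5 gives both models equal influence; a real system might weight by the amount of local training data, which is likewise an assumption rather than something the claim recites.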