US 11,056,116 B2
Low latency nearby group translation
Shijing Xian, Sunnyvale, CA (US); and Deric Cheng, Bedford, TX (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Appl. No. 16/611,135
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Apr. 11, 2018, PCT No. PCT/US2018/027183
§ 371(c)(1), (2) Date Nov. 5, 2019,
PCT Pub. No. WO2019/199306, PCT Pub. Date Oct. 17, 2019.
Prior Publication US 2020/0194000 A1, Jun. 18, 2020
Int. Cl. G10L 15/30 (2013.01); G10L 13/08 (2013.01); G10L 15/00 (2013.01); G10L 15/26 (2006.01); G10L 15/22 (2006.01)
CPC G10L 15/30 (2013.01) [G10L 13/086 (2013.01); G10L 15/005 (2013.01); G10L 15/26 (2013.01); G10L 2015/227 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A computer-implemented method of group translation, comprising:
in a first computing device of a first user and configured to output spoken audio in a first language, joining a multi-language translation group additionally including second and third users by establishing a local wireless network connection to one or both of a second computing device of the second user and a third computing device of the third user, wherein the second and third computing devices are respectively configured to output spoken audio in second and third languages, and wherein the first, second and third languages differ from one another;
in the first computing device, and in response to a speech input in the second language received at the second computing device and directed to the multi-language translation group:
receiving non-audio data associated with the speech input; and
generating a spoken audio output in the first language and directed to the first user from the non-audio data associated with the speech input, wherein generating the spoken audio output in the first computing device includes locally translating the non-audio data associated with the speech input to the first language; and
in the first computing device, and in response to a second speech input in the first language received from the first user at the first computing device:
receiving the second speech input from the first user;
performing automated speech recognition on the second speech input and locally translating an output of the automated speech recognition from the first language to a different language to generate second non-audio data associated with the second speech input; and
sending the second non-audio data associated with the second speech input to the second and third computing devices via the multi-language translation group to cause at least one of the second computing device to locally generate a translation of the second non-audio data into the second language and the third computing device to locally generate a translation of the second non-audio data into the third language.
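The claim recites a peer-to-peer exchange in which nearby devices on a local wireless network trade small non-audio payloads (recognized or translated text) rather than audio streams. The following sketch, which is not part of the patent text, models that exchange in Python as JSON messages broadcast over UDP; the port number, message schema, and "join" announcement are illustrative assumptions only.

    # Hypothetical sketch of the group messaging layer implied by claim 1:
    # peers on the same local wireless network exchange small non-audio
    # payloads (recognized/translated text) instead of audio.  The port,
    # JSON schema, and join handshake are assumptions, not patent details.
    import json
    import socket

    GROUP_PORT = 50007                       # assumed port for the group
    BROADCAST = ("255.255.255.255", GROUP_PORT)

    def make_socket() -> socket.socket:
        """UDP socket able to broadcast on the local network."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", GROUP_PORT))
        return sock

    def join_group(sock: socket.socket, user: str, language: str) -> None:
        """Announce this device and its output language to nearby peers."""
        msg = {"type": "join", "user": user, "lang": language}
        sock.sendto(json.dumps(msg).encode("utf-8"), BROADCAST)

    def send_text(sock: socket.socket, user: str, lang: str, text: str) -> None:
        """Send non-audio data (recognized or translated text) to the group."""
        msg = {"type": "text", "user": user, "lang": lang, "text": text}
        sock.sendto(json.dumps(msg).encode("utf-8"), BROADCAST)

    def receive_message(sock: socket.socket) -> dict:
        """Block until a group message arrives and return its decoded payload."""
        data, _addr = sock.recvfrom(4096)
        return json.loads(data.decode("utf-8"))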
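The two branches of the claim describe, per device, a receive path (incoming non-audio data is translated locally into the device's language and rendered as spoken audio) and a send path (local speech is recognized, translated to a different language, and sent as non-audio data for the peers to translate again locally). The sketch below is a minimal illustration of those branches under stated assumptions: local_translate, recognize_speech, and synthesize_speech are hypothetical stand-ins for the on-device translation, automated speech recognition, and text-to-speech engines, which the claim leaves unspecified.

    # Hypothetical per-device pipeline for the two branches of claim 1.
    # The three stub functions stand in for on-device MT, ASR, and TTS
    # engines; the patent does not name specific models.

    def local_translate(text: str, source_lang: str, target_lang: str) -> str:
        """Placeholder for an on-device machine-translation model."""
        return f"[{source_lang}->{target_lang}] {text}"

    def recognize_speech(audio: bytes, lang: str) -> str:
        """Placeholder for on-device automated speech recognition."""
        return "recognized text"              # stand-in for the ASR output

    def synthesize_speech(text: str, lang: str) -> bytes:
        """Placeholder for on-device text-to-speech synthesis."""
        return text.encode("utf-8")           # stand-in for rendered audio

    def handle_incoming(payload: dict, my_lang: str) -> bytes:
        """Receive path: non-audio data from a peer is translated locally
        into this device's language and spoken audio is generated."""
        translated = local_translate(payload["text"], payload["lang"], my_lang)
        return synthesize_speech(translated, my_lang)

    def handle_local_speech(audio: bytes, my_lang: str, target_lang: str) -> dict:
        """Send path: local speech is recognized, translated to a different
        language, and packaged as non-audio data for the group; receiving
        devices then translate it again, locally, into their own languages."""
        recognized = recognize_speech(audio, my_lang)
        translated = local_translate(recognized, my_lang, target_lang)
        return {"type": "text", "lang": target_lang, "text": translated}

In use, a device would pass each payload returned by receive_message to handle_incoming and play the resulting audio, and would pass captured microphone audio to handle_local_speech before calling send_text; keeping only text on the network is what allows the low-latency, offline group translation the title describes.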