US 12,073,844 B2
Audio-visual hearing aid
Anatoly Efros, Rishon LeZion (IL); Noam Etzion-Rosenberg, Binyamina (IL); Tal Remez, Jerusalem (IL); Oran Lang, Givatayim (IL); Inbar Mosseri, Raanana (IL); Israel Or Weinstein, Tel Aviv (IL); Benjamin Schlesinger, Ramat Hasharon (IL); Michael Rubinstein, Natick, MA (US); Ariel Ephrat, Efrat (IL); Yukun Zhu, Shoreline, WA (US); Stella Laurenzo, Seattle, WA (US); Amit Pitaru, Brooklyn, NY (US); and Yossi Matias, Tel Aviv (IL)
Assigned to Google LLC, Mountain View, CA (US)
Appl. No. 17/601,042
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Oct. 1, 2020; PCT No. PCT/US2020/053843; § 371(c)(1), (2) Date Oct. 1, 2021; PCT Pub. No. WO2022/071959; PCT Pub. Date Apr. 7, 2022.
Prior Publication US 2023/0267942 A1, Aug. 24, 2023
Int. Cl. G10L 21/0208 (2013.01); G10L 17/00 (2013.01); G10L 21/0272 (2013.01); G10L 25/57 (2013.01)

CPC G10L 21/0208 (2013.01) [G10L 17/00 (2013.01); G10L 21/0272 (2013.01); G10L 25/57 (2013.01); G10L 2021/02087 (2013.01)]

20 Claims

1. A method comprising:

receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device;

in response to receiving the first indication, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of each of the one or more first speakers in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, wherein sending the isolated speech signals for each of the one or more first speakers to the listening device comprises, for each first speaker of the one or more first speakers:

identifying a respective location of the first speaker relative to a location of the listening device that is configured to receive audio input from a plurality of audio channels; and

sending an isolated speech signal to a respective audio channel of the plurality of audio channels in accordance with the respective location of the first speaker corresponding to the isolated speech signal;

while generating the respective isolated speech signal for each of the one or more first speakers, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device; and

in response to the second indication, generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
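
The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a two-channel (left/right) listening device, a speaker location expressed as a hypothetical azimuth angle, and a placeholder isolate_speech function standing in for whatever audio-visual separation model is actually used. All class, function, and parameter names are illustrative and do not appear in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Speaker:
    speaker_id: str
    # Hypothetical location encoding: angle relative to the listening
    # device, negative values meaning "to the left of the listener".
    azimuth_deg: float


def channel_for(speaker: Speaker) -> str:
    """Choose an audio channel from the speaker's relative location (assumed stereo)."""
    return "left" if speaker.azimuth_deg < 0 else "right"


def isolate_speech(mixed_audio: str, speaker: Speaker) -> str:
    """Placeholder for the isolation step; it only tags the frame with the speaker."""
    return f"{speaker.speaker_id}:{mixed_audio}"


@dataclass
class ListeningDevice:
    # One buffer per audio channel; two channels are an assumption.
    channels: dict = field(default_factory=lambda: {"left": [], "right": []})

    def send(self, channel: str, signal: str) -> None:
        self.channels[channel].append(signal)


class UserDevice:
    def __init__(self, listening_device: ListeningDevice) -> None:
        self.listening_device = listening_device
        self.active = {}  # speaker_id -> Speaker currently being isolated

    def on_indication(self, speakers: list) -> None:
        """Handle an indication of speakers visible in the current view.

        Speakers from earlier indications stay active, so a second
        indication can arrive while the first speakers are still processed.
        """
        for speaker in speakers:
            self.active[speaker.speaker_id] = speaker

    def process_frame(self, mixed_audio: str) -> None:
        """Isolate each active speaker and route the result by location."""
        for speaker in self.active.values():
            signal = isolate_speech(mixed_audio, speaker)
            self.listening_device.send(channel_for(speaker), signal)


device = ListeningDevice()
user = UserDevice(device)
user.on_indication([Speaker("A", -30.0)])  # first indication
user.process_frame("frame0")
user.on_indication([Speaker("B", 45.0)])   # second indication, A still active
user.process_frame("frame1")
```

In this sketch speaker A, assumed to be on the listener's left, is routed to the left channel for both frames, while speaker B joins on the right channel from the second frame onward.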