US 12,321,663 B2
Leveraging visual data to enhance audio reception
Zachary Cleigh Meredith, Eagle River, AK (US); Peter Hardie, Cumming, GA (US); and Sheldon Kent Meredith, Roswell, GA (US)
Assigned to AT&T Intellectual Property I, L.P., Atlanta, GA (US); and AT&T Mobility II LLC, Atlanta, GA (US)
Filed by AT&T Intellectual Property I, L.P., Atlanta, GA (US); and AT&T Mobility II LLC, Atlanta, GA (US)
Filed on May 31, 2022, as Appl. No. 17/804,722.
Prior Publication US 2023/0385015 A1, Nov. 30, 2023
Int. Cl. G06F 3/16 (2006.01); G10K 11/178 (2006.01)
CPC G06F 3/165 (2013.01) [G06F 3/162 (2013.01); G10K 11/17821 (2018.01)] 20 Claims
OG exemplary drawing
 
1. A method comprising:
calculating, by a processing system including at least one processor, a signal to noise ratio of a captured audio stream, wherein the captured audio stream comprises an utterance spoken in a first language;
determining, by the processing system, that the signal to noise ratio of the captured audio stream is lower than a predefined threshold;
acquiring, by the processing system, visual data of a source of the captured audio stream in response to the determining that the signal to noise ratio of the captured audio stream is lower than the predefined threshold;
using, by the processing system, the visual data to infer a sound that is being made by the source of the captured audio stream;
indexing, by the processing system, the sound that is being made by the source of the captured audio stream to a library index; and
transferring, by the processing system, the library index to a receiving user endpoint device, wherein the library index that is transferred corresponds to a sound that is used to reconstruct the utterance in a second language different from the first language.
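The steps recited in claim 1 can be sketched as a small pipeline: measure the signal-to-noise ratio, and only when it falls below a threshold fall back to visual inference, then map the inferred sound to a library index for transfer. Everything below is a hypothetical illustration — the threshold value, the viseme labels, the `SOUND_LIBRARY` mapping, and all function names are assumptions not taken from the patent, and the lip-reading step is a stub where a real system would run a trained model.

```python
import math

SNR_THRESHOLD_DB = 10.0  # hypothetical value; the claim only says "predefined threshold"

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical library: maps a sound inferred from visual data to an index
# shared with the receiving user endpoint device.
SOUND_LIBRARY = {"ah": 0, "ee": 1, "oo": 2, "m": 3}

def infer_sound_from_visual(frames):
    """Stub for visual inference (e.g., lip reading); returns a placeholder."""
    return "ah"

def process_stream(signal_power, noise_power, frames):
    """Follow the claimed steps: check SNR; if it is below the threshold,
    infer the sound from visual data, index it, and return the library
    index to be transferred to the receiving endpoint."""
    if snr_db(signal_power, noise_power) >= SNR_THRESHOLD_DB:
        return None  # audio is clean enough; no visual fallback needed
    sound = infer_sound_from_visual(frames)
    return SOUND_LIBRARY[sound]  # library index, not the audio itself, is sent

# With a low SNR (about 3 dB here), the visual fallback is triggered.
print(process_stream(signal_power=1.0, noise_power=0.5, frames=[]))  # → 0
```

Note that what crosses the network is the compact library index rather than the raw audio; per the claim, the receiving endpoint uses that index to reconstruct the utterance, optionally in a second language.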