US 12,148,443 B2
Speaker-specific voice amplification
Rachel Ostrand, Milford, PA (US); Sundar Saranathan, Framingham, MA (US); Fang Lu, Billerica, MA (US); and Carla Paola Agurto Rios, Ossining, NY (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Dec. 18, 2020, as Appl. No. 17/126,261.
Prior Publication US 2022/0199102 A1, Jun. 23, 2022
Int. Cl. G10L 21/0324 (2013.01); G06F 3/16 (2006.01); G10L 25/84 (2013.01); H04L 65/401 (2022.01)

CPC G10L 21/0324 (2013.01) [G06F 3/165 (2013.01); G10L 25/84 (2013.01); H04L 65/401 (2022.05)]

19 Claims

1. A method of using a computing device to amplify user voices, the method comprising:

generating a plurality of customized acoustic profiles for a first user of a conferencing system, wherein the generating comprises:

receiving, by a second microphone of the conferencing system, an audio sample of speech submitted by the first user for a first customized acoustic profile of the plurality of the customized acoustic profiles; and

generating, by the conferencing system, a user-specific acoustic model and a supplemental acoustic model for enhancement of speech by the first user based upon the audio sample, wherein the pre-trained acoustic model is adapted for a normal physical condition of the user and wherein the supplemental pre-trained acoustic model is adapted for a sick voice of the user;

receiving, in response to the generating the plurality of customized acoustic profiles, a live audiovisual stream from the first user at the conferencing system, the live audiovisual stream including live speech by the first user captured by a first microphone, wherein the live audiovisual stream includes background noise, and the first microphone is a non-directional microphone;

prompting, in response to the receiving the live audiovisual stream, the first user to select an acoustic profile for the live audiovisual stream;

receiving, by the conferencing system, a selection of the first customized acoustic profile, the selection input by the first user in response to the prompting;

in response to the selection of the first customized acoustic profile, applying the supplemental user-specific acoustic model by the conferencing system to selectively amplify live speech received from a user device during the live audiovisual stream without amplifying the background noise, wherein the conferencing system is configured to suppress the sick voice of the user when the pre-trained acoustic model is selected, and the amplification comprises recording the live speech and the amplification is based on the selected acoustic profile; and

retransmitting the amplified live speech of the first user to a plurality of other users of the conferencing system.
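
The profile selection and selective amplification steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the names `AcousticProfile`, `energy_detector`, and `amplify_selectively`, the "normal" and "sick" labels, and all thresholds and gains are assumptions introduced here for illustration. A toy energy threshold stands in for the user-specific acoustic models, which the claim does not specify at this level of detail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for an acoustic model: given one frame of samples,
# report whether the frame is judged to contain the user's speech.
SpeechDetector = Callable[[Sequence[float]], bool]


@dataclass
class AcousticProfile:
    name: str                       # assumed label, e.g. "normal" or "sick"
    is_user_speech: SpeechDetector  # toy replacement for the acoustic model
    gain: float                     # factor applied to frames judged as speech


def energy_detector(threshold: float) -> SpeechDetector:
    """Toy detector: mean absolute amplitude above the threshold is speech."""
    def detect(frame: Sequence[float]) -> bool:
        if not frame:
            return False
        return sum(abs(s) for s in frame) / len(frame) > threshold
    return detect


def amplify_selectively(
    frames: Sequence[Sequence[float]], profile: AcousticProfile
) -> List[List[float]]:
    """Amplify only frames the profile's detector flags; pass others through."""
    out: List[List[float]] = []
    for frame in frames:
        if profile.is_user_speech(frame):
            out.append([s * profile.gain for s in frame])
        else:
            out.append(list(frame))  # background noise is left unamplified
    return out


# Two customized profiles for one user (all values are illustrative).
profiles: Dict[str, AcousticProfile] = {
    "normal": AcousticProfile("normal", energy_detector(0.3), gain=2.0),
    "sick": AcousticProfile("sick", energy_detector(0.1), gain=2.0),
}

# The user's selection picks which profile drives amplification.
selected = profiles["sick"]
stream = [[0.01, -0.02, 0.01], [0.2, -0.2, 0.2]]  # quiet noise, then speech
amplified = amplify_selectively(stream, selected)
```

In this sketch the first frame falls below the selected profile's threshold and passes through unchanged, while the second frame is scaled by the profile's gain. A real system would replace the threshold with a trained speaker model and would then retransmit the result to the other participants.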