US 12,118,980 B2
Artificial intelligence-based text-to-speech system and method
Martin Reber, Remetschwil (CH); and Vijeta Avijeet, Dübendorf (CH)
Assigned to Telepathy Labs, Inc., Tampa, FL (US)
Filed by Telepathy Labs, Inc., Tampa, FL (US)
Filed on Jul. 3, 2023, as Appl. No. 18/346,657.
Application 18/346,657 is a continuation of application No. 17/589,449, filed on Jan. 31, 2022, granted, now 11,735,161.
Application 17/589,449 is a continuation of application No. 16/446,893, filed on Jun. 20, 2019, granted, now 11,244,670, issued on Jan. 19, 2022.
Application 16/446,893 is a continuation of application No. 16/022,823, filed on Jun. 29, 2018, granted, now 10,373,605, issued on Aug. 6, 2019.
Application 16/022,823 is a continuation of application No. 15/982,326, filed on May 17, 2018, granted, now 10,319,364, issued on Jun. 11, 2019.
Claims priority of provisional application 62/508,024, filed on May 18, 2017.
Prior Publication US 2023/0351999 A1, Nov. 2, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 25/30 (2013.01); G06F 18/10 (2023.01); G06F 18/21 (2023.01); G06F 18/2135 (2023.01); G06N 3/02 (2006.01); G06N 3/042 (2023.01); G06N 3/08 (2023.01); G06N 5/02 (2023.01); G10L 13/04 (2013.01); G10L 13/08 (2013.01); G10L 19/00 (2013.01)
CPC G10L 13/08 (2013.01) [G06F 18/10 (2023.01); G06F 18/2135 (2023.01); G06F 18/217 (2023.01); G06N 3/02 (2013.01); G06N 3/042 (2023.01); G06N 3/08 (2013.01); G06N 5/02 (2013.01); G10L 13/04 (2013.01); G10L 19/00 (2013.01)]
20 Claims
1. A text-to-speech (TTS) system including one or more processors and one or more memories configured to perform operations for converting text into a corrected speech signal comprising:
training a neural network based upon, at least in part, data of previously generated speech in a pre-existing knowledgebase of phonemes, wherein the previously generated speech has an inaccuracy;
generating a lossy representation of at least a portion of the data for use in the training; and
applying lossy representation of at least the portion of the data to the previously generated speech for correcting the inaccuracy of the previously generated speech in the pre-existing knowledgebase of phonemes.
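
Illustrative sketch (not part of the claim): the three operations of claim 1 can be pictured with a toy example. The code below is a minimal sketch under stated assumptions and is not the patented method. It assumes each item of "previously generated speech" is a short numeric feature vector, and it swaps the claimed neural network for a one-component linear projection (a principal-component fit by power iteration) as the lossy representation. All names (`fit_lossy_model`, `lossy_correct`, `frames`) are hypothetical.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_lossy_model(rows, iters=200):
    """'Training' step (assumption: a linear stand-in for a neural network).

    Returns the data mean and the leading principal direction,
    found by power iteration on the (unnormalized) covariance.
    """
    mu = mean_vector(rows)
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    dim = len(mu)
    v = [1.0 / dim ** 0.5] * dim  # arbitrary unit start vector
    for _ in range(iters):
        w = [0.0] * dim
        for r in centered:
            dot = sum(a * b for a, b in zip(r, v))
            for i in range(dim):
                w[i] += dot * r[i]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mu, v

def lossy_correct(row, mu, v):
    """Encode a vector as one number (lossy), then decode it back."""
    code = sum((x - m) * b for x, m, b in zip(row, mu, v))
    return [m + code * b for m, b in zip(mu, v)]

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

random.seed(0)
# Hypothetical knowledgebase: frames lying near the direction (1, 2, 3).
frames = [[t * k + random.gauss(0.0, 0.05) for k in (1, 2, 3)]
          for t in range(-10, 11)]

# One stored frame carries an inaccuracy (off-axis error).
target = [4.0, 8.0, 12.0]
inaccurate = [5.5, 6.5, 12.0]
frames.append(inaccurate)

mu, v = fit_lossy_model(frames)              # training on stored data
corrected = lossy_correct(inaccurate, mu, v)  # apply lossy representation

err_before = distance(inaccurate, target)
err_after = distance(corrected, target)
print("error before:", round(err_before, 3), "after:", round(err_after, 3))
```

In this toy setting the lossy encoding discards the part of the error that does not fit the dominant structure of the stored data, so the reconstructed frame lies closer to the intended one. A real system along the lines of the claim would presumably operate on speech features and a trained network rather than a single linear component.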