US 12,406,655 B2
Increased accessibility of synthesized speech by replacement of difficulty to understand words
Grzegorz Piotr Szczepanik, Cracow (PL); Piotr Kalandyk, Zielonki (PL); Łukasz Józef Matyasik, Cracow (PL); and Piotr Jan Kotara, Jodłownik (PL)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on May 20, 2022, as Appl. No. 17/664,260.
Prior Publication US 2023/0377559 A1, Nov. 23, 2023
Int. Cl. G10L 13/08 (2013.01); G06F 40/166 (2020.01); G06F 40/247 (2020.01); G10L 13/047 (2013.01); G10L 25/69 (2013.01)
CPC G10L 13/08 (2013.01) [G06F 40/166 (2020.01); G06F 40/247 (2020.01); G10L 13/047 (2013.01)]
9 Claims
 
1. A method of improving accessibility to computer reader tools, the method executable by a processor and comprising:
receiving data corresponding to one or more words to be displayed to a user;
filtering the received data to remove numbers, proper nouns, and non-standard characters;
converting the filtered data into synthesized speech;
identifying, from the synthesized speech, one or more words exceeding an understanding threshold value, the understanding threshold value corresponding to a probability of difficulty associated with understanding the one or more words, wherein the identifying comprises a neural network processing the synthesized speech and identifying a confidence score associated with each word within the synthesized speech;
retrieving one or more replacement words for the one or more words exceeding the understanding threshold value, wherein the one or more replacement words are synonyms retrieved from an electronic thesaurus;
updating the synthesized speech with the one or more replacement words, wherein the updating comprises generating a new waveform of updated synthesized speech with the one or more replacement words;
generating a report indicating the one or more words and the one or more replacement words, wherein the report comprises an audio file;
updating, via the neural network, the understanding threshold value based on the report; and
playing the updated synthesized speech and the report to the user.
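
The text-level steps of claim 1 (filtering, scoring words against the understanding threshold, thesaurus replacement, and report generation) can be illustrated with a minimal sketch. This is not the patented implementation. The names `THESAURUS`, `filter_words`, `difficulty`, `simplify`, and `Report` are hypothetical, the length-based `difficulty` score is only a stand-in for the neural network's per-word score, and the capitalization test is a crude proper-noun guess. Waveform synthesis, the audio report, playback, and the threshold update are omitted.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stand-in for the electronic thesaurus named in the claim.
THESAURUS = {
    "ubiquitous": "common",
    "ameliorate": "improve",
}


@dataclass
class Report:
    # Maps each replaced word to the replacement that was used.
    replacements: dict = field(default_factory=dict)


def filter_words(text):
    """Drop numbers, likely proper nouns, and non-standard characters."""
    kept = []
    for i, raw in enumerate(text.split()):
        if any(ch.isdigit() for ch in raw):
            continue  # token contains a number
        word = re.sub(r"[^A-Za-z']", "", raw)  # strip non-standard characters
        if not word:
            continue
        if i > 0 and word[0].isupper():
            continue  # crude proper-noun heuristic (assumption)
        kept.append(word.lower())
    return kept


def difficulty(word):
    """Assumed placeholder score in [0, 1]; the claim uses a neural network."""
    return min(1.0, len(word) / 12)


def simplify(text, threshold=0.7):
    """Replace words whose difficulty exceeds the threshold; build a report."""
    report = Report()
    out = []
    for word in filter_words(text):
        if difficulty(word) > threshold and word in THESAURUS:
            report.replacements[word] = THESAURUS[word]
            out.append(THESAURUS[word])
        else:
            out.append(word)
    return out, report


if __name__ == "__main__":
    words, report = simplify("We ameliorate ubiquitous errors in 2022 for IBM")
    print(" ".join(words))
    print(report.replacements)
```

In a fuller system the list returned by `simplify` would be passed to a speech synthesizer to generate the new waveform, and the report would be rendered as audio and fed back to adjust the threshold, as the claim recites.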