US 11,997,344 B2
Translating a media asset with vocal characteristics of a speaker
Vijay Kumar, Karnataka (IN); Rajendran Pichaimurthy, Karnataka (IN); and Madhusudhan Seetharam, Karnataka (IN)
Assigned to Rovi Guides, Inc., San Jose, CA (US)
Filed by Rovi Guides, Inc., San Jose, CA (US)
Filed on Oct. 25, 2021, as Appl. No. 17/509,401.
Application 17/509,401 is a continuation of application No. 16/152,017, filed on Oct. 4, 2018, granted, now U.S. Pat. No. 11,195,507.
Prior Publication US 2022/0044668 A1, Feb. 10, 2022
Int. Cl. G06F 40/40 (2020.01); G10L 13/027 (2013.01); G10L 13/033 (2013.01); G10L 15/07 (2013.01); G10L 15/19 (2013.01); G10L 25/63 (2013.01); H04N 21/43 (2011.01); H04N 21/81 (2011.01)
CPC H04N 21/43072 (2020.08) [G10L 13/027 (2013.01); G10L 15/07 (2013.01); G10L 15/19 (2013.01); G10L 25/63 (2013.01); H04N 21/8106 (2013.01)]
21 Claims
1. A method comprising:
accessing a media asset, the media asset featuring a speaker that utters a plurality of spoken words;
determining an identity of the speaker in the media asset based on metadata associated with the media asset;
identifying vocal characteristics of the identified speaker by:
    searching, based on the identity, for another media asset featuring the speaker;
    extracting a voice sample featuring the speaker from the another media asset;
    identifying the vocal characteristics based on the voice sample from the another media asset;
determining non-linguistic characteristics of the plurality of spoken words;
determining an emotional state expressed in the media asset featuring the speaker that utters the plurality of spoken words based on the non-linguistic characteristics; and
generating a translation of the plurality of spoken words of the media asset featuring the speaker that utters the plurality of spoken words using the identified vocal characteristics, and the determined emotional state.
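
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: every name in it (`MediaAsset`, `find_other_asset`, `emotional_state`, `TOY_LEXICON`, the prosody fields, and the thresholds) is a hypothetical stand-in chosen for illustration. Audio is replaced by plain dictionaries, the emotion rule is a toy threshold test, and the translation is a word-for-word lookup.

```python
from dataclasses import dataclass


@dataclass
class MediaAsset:
    """Hypothetical stand-in for a media asset; no real audio is processed."""
    title: str
    metadata: dict      # e.g. {"speaker": "Jane Doe"}
    words: list         # the spoken words, in order
    prosody: dict       # non-linguistic measurements (assumed fields and units)
    voice_sample: dict  # placeholder for extracted audio features


def identify_speaker(asset):
    # Determine the speaker's identity from the asset's metadata.
    return asset.metadata["speaker"]


def find_other_asset(speaker, catalog, exclude):
    # Search, based on the identity, for another asset featuring the speaker.
    for other in catalog:
        if other is not exclude and other.metadata.get("speaker") == speaker:
            return other
    return None


def vocal_characteristics(sample):
    # Identify vocal characteristics from the voice sample (here: copy pitch).
    return {"mean_pitch_hz": sample["mean_pitch_hz"]}


def emotional_state(prosody):
    # Toy rule mapping non-linguistic characteristics to an emotional state.
    # The thresholds are arbitrary assumptions, not values from the patent.
    if prosody["volume_db"] > 70 and prosody["words_per_min"] > 160:
        return "excited"
    if prosody["volume_db"] < 55 and prosody["words_per_min"] < 110:
        return "sad"
    return "neutral"


# Word-for-word lookup standing in for a real translation system.
TOY_LEXICON = {"hello": "hola", "friend": "amigo"}


def translate(asset, catalog):
    speaker = identify_speaker(asset)
    other = find_other_asset(speaker, catalog, exclude=asset)
    if other is None:
        raise LookupError(f"no other media asset found for {speaker!r}")
    voice = vocal_characteristics(other.voice_sample)
    emotion = emotional_state(asset.prosody)
    text = " ".join(TOY_LEXICON.get(w.lower(), w) for w in asset.words)
    # A real system would pass these to a speech synthesizer; the sketch
    # only returns the parameters the translation would be generated with.
    return {"speaker": speaker, "text": text, "voice": voice, "emotion": emotion}
```

In this sketch the voice profile comes from the second asset while the emotional state comes from the asset being translated, mirroring the separation of those two steps in the claim.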