US 12,431,119 B2
Systems and methods for providing notifications within a media asset without breaking immersion
Vikram Makam Gupta, Karnataka (IN); Prateek Varshney, Karnataka (IN); Madhusudhan Seetharam, Karnataka (IN); Ashish Kumar Srivastava, Karnataka (IN); and Harshith Kumar Gejjegondanahally Sreekanth, Karnataka (IN)
Assigned to Adeia Guides Inc., San Jose, CA (US)
Filed by Adeia Guides Inc., San Jose, CA (US)
Filed on Jun. 25, 2024, as Appl. No. 18/753,840.
Application 18/753,840 is a continuation of application No. 18/238,231, filed on Aug. 25, 2023, granted, now 12,046,229.
Application 18/238,231 is a continuation of application No. 17/497,225, filed on Oct. 8, 2021, granted, now 11,798,528, issued on Oct. 24, 2023.
Application 17/497,225 is a continuation of application No. 16/144,395, filed on Sep. 27, 2018, granted, now 11,170,758, issued on Nov. 9, 2021.
Prior Publication US 2024/0347040 A1, Oct. 17, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 13/08 (2013.01); G06F 40/205 (2020.01); G06F 40/279 (2020.01); G10L 13/00 (2006.01); G10L 13/033 (2013.01); H04M 1/72433 (2021.01); H04W 68/00 (2009.01); H04M 1/72442 (2021.01)

CPC G10L 13/08 (2013.01) [G06F 40/205 (2020.01); G06F 40/279 (2020.01); G10L 13/00 (2013.01); G10L 13/033 (2013.01); H04M 1/72433 (2021.01); H04W 68/005 (2013.01); H04M 1/72442 (2021.01); H04M 2201/39 (2013.01)]

18 Claims

1. A method comprising:

receiving notification data during a display of a media asset by a media device, wherein the notification data is unrelated to the media asset;

in response to receiving the notification data during the display of the media asset on the media device:

determining that the media asset comprises a voice;

determining that the notification data comprises non-textual visual information;

converting the non-textual visual information to text;

converting the text to synthesized speech using a text-to-voice model generated based on characteristics of the voice; and

generating, for output by the media device, the synthesized speech by:

determining a position in the media asset for outputting the synthesized speech, based on one or more of contextual features of the media asset and the notification data; and

generating, for output at the position in the media asset by the media device, the synthesized speech.
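
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`Segment`, `Notification`, `visual_to_text`, `synthesize`, `pick_position`, `handle_notification`, `min_gap`) is hypothetical. The captioning and text-to-voice stages are stubbed with string placeholders. The sketch also assumes that the "contextual feature" used to choose the position is simply a pause in dialogue of sufficient length, which is only one of the possibilities the claim language allows.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """A span of the media asset timeline (illustrative fields only)."""
    start: float      # seconds from the beginning of the asset
    end: float        # seconds from the beginning of the asset
    has_voice: bool   # True if a character or narrator speaks in this span


@dataclass
class Notification:
    """Notification data unrelated to the media asset."""
    text: Optional[str] = None         # textual payload, if any
    image_label: Optional[str] = None  # stand-in for non-textual visual information


def asset_has_voice(segments: List[Segment]) -> bool:
    # "determining that the media asset comprises a voice"
    return any(s.has_voice for s in segments)


def visual_to_text(note: Notification) -> str:
    # "converting the non-textual visual information to text"
    # Assumption: a real system would run image or emoji recognition here;
    # this stub joins the text with an already-known label for the visual part.
    return " ".join(p for p in (note.text, note.image_label) if p)


def synthesize(text: str, voice_profile: str) -> str:
    # "converting the text to synthesized speech using a text-to-voice model"
    # Assumption: returns a tagged string instead of audio samples.
    return f"<{voice_profile}> {text}"


def pick_position(segments: List[Segment], now: float,
                  min_gap: float = 2.0) -> Optional[float]:
    # "determining a position in the media asset for outputting the speech"
    # Assumption: segments are sorted by start time, and the first voice-free
    # span with at least min_gap seconds remaining is chosen.
    for s in segments:
        if s.has_voice:
            continue
        start = max(s.start, now)
        if s.end - start >= min_gap:
            return start
    return None


def handle_notification(note: Notification, segments: List[Segment],
                        now: float, voice_profile: str
                        ) -> Optional[Tuple[float, str]]:
    """Return (position, speech) or None when the claimed conditions are not met."""
    if not asset_has_voice(segments):
        return None
    if note.image_label is None:  # no non-textual visual information
        return None
    speech = synthesize(visual_to_text(note), voice_profile)
    position = pick_position(segments, now)
    if position is None:
        return None
    return position, speech


segments = [
    Segment(0.0, 10.0, True),
    Segment(10.0, 13.0, False),
    Segment(13.0, 20.0, True),
]
note = Notification(text="Sam sent", image_label="a thumbs-up")
result = handle_notification(note, segments, now=4.0, voice_profile="narrator")
print(result)
```

In this example the notification arrives at 4.0 seconds while a character is speaking, so the sketch defers output to the dialogue pause that begins at 10.0 seconds. A position policy that also weighs the notification data, for instance its urgency, would fit the same `pick_position` interface.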