US 11,669,683 B2
Speech recognition and summarization
Glen Shires, Danville, CA (US); Sterling Swigart, Mountain View, CA (US); Jonathan Zolla, Belmont, CA (US); and Jason J. Gauci, Mountain View, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on May 18, 2020, as Appl. No. 16/876,597.
Application 16/876,597 is a continuation of application No. 16/669,125, filed on Oct. 30, 2019, granted, now 10,679,005.
Application 16/669,125 is a continuation of application No. 16/216,565, filed on Dec. 11, 2018, granted, now 10,496,746, issued on Dec. 3, 2019.
Application 16/216,565 is a continuation of application No. 15/202,039, filed on Jul. 5, 2016, granted, now 10,185,711, issued on Jan. 22, 2019.
Application 15/202,039 is a continuation of application No. 14/078,800, filed on Nov. 13, 2013, granted, now 9,420,227, issued on Aug. 16, 2016.
Application 14/078,800 is a continuation of application No. 13/743,838, filed on Jan. 17, 2013, granted, now 8,612,211, issued on Dec. 17, 2013.
Claims priority of provisional application 61/699,072, filed on Sep. 10, 2012.
Prior Publication US 2020/0279074 A1, Sep. 3, 2020
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 40/00 (2020.01); G06F 40/279 (2020.01); G10L 15/26 (2006.01); H04N 7/15 (2006.01); G10L 21/10 (2013.01); G10L 15/18 (2013.01)
CPC G06F 40/279 (2020.01) [G10L 15/26 (2013.01); G10L 21/10 (2013.01); H04N 7/15 (2013.01); G10L 15/1815 (2013.01)]
20 Claims
 
1. A method comprising:
during a video conference session between two or more participants each associated with a respective participant computing device:
from each participant computing device, receiving, at data processing hardware of a video conference system, video conference data representing speech of the participant associated with the corresponding participant computing device;
generating, by the data processing hardware, using an automated speech recognizer, a coalesced transcript of the video conference session by transcribing the speech represented by the video conference data received from each participant computing device into text in real-time;
annotating, by the data processing hardware, a particular phrase in the coalesced transcript by analyzing one or more terms in the text of the coalesced transcript;
processing, by the data processing hardware, the annotated particular phrase to generate an event invitation; and
transmitting, by the data processing hardware, the event invitation to each respective participant computing device associated with the two or more participants of the video conference session.
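
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the speech has already been converted to timestamped text by some recognizer, and every name in it (`Utterance`, `coalesce`, `annotate`, `build_invitation`, `transmit`, `MEETING_PATTERN`) is hypothetical. The regular expression is only one possible way of "analyzing one or more terms"; the claim does not prescribe a technique.

```python
import re
from dataclasses import dataclass


@dataclass
class Utterance:
    participant: str  # hypothetical identifier of the speaking participant
    start: float      # seconds since the session began
    text: str         # text assumed to come from a speech recognizer


def coalesce(streams):
    """Merge per-participant utterance lists into one time-ordered transcript."""
    merged = [u for stream in streams.values() for u in stream]
    return sorted(merged, key=lambda u: u.start)


# Assumed trigger pattern for scheduling language (illustration only).
MEETING_PATTERN = re.compile(
    r"\b(?:let's|we should) meet "
    r"(?P<when>(?:on )?\w+ at \d{1,2}(?::\d{2})?\s?(?:am|pm))",
    re.IGNORECASE,
)


def annotate(transcript):
    """Return (utterance, phrase) pairs for phrases matching the pattern."""
    hits = []
    for utterance in transcript:
        match = MEETING_PATTERN.search(utterance.text)
        if match:
            hits.append((utterance, match.group(0)))
    return hits


def build_invitation(phrase, participants):
    """Turn an annotated phrase into a simple invitation record."""
    match = MEETING_PATTERN.search(phrase)
    return {
        "title": "Follow-up meeting",  # placeholder title
        "when": match.group("when"),
        "attendees": sorted(participants),
    }


def transmit(invitation, send):
    """Deliver the invitation to each attendee via a caller-supplied function."""
    for participant in invitation["attendees"]:
        send(participant, invitation)
```

In use, a caller would pass a mapping from participant to utterance list into `coalesce`, feed the result to `annotate`, build an invitation from each annotated phrase, and hand `transmit` whatever delivery function the conferencing system provides. A real-time system would run these steps incrementally as new utterances arrive rather than once over a finished transcript.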
 
11. A video conference system comprising:
data processing hardware; and
memory hardware in communication with the data processing hardware and storing instructions, that when executed by the data processing hardware, cause the data processing hardware to perform operations comprising:
during a video conference session between two or more participants each associated with a respective participant computing device:
from each participant computing device, receiving video conference data representing speech of the participant associated with the corresponding participant computing device;
generating, using an automated speech recognizer, a coalesced transcript of the video conference session by transcribing the speech represented by the video conference data received from each participant computing device into text in real-time;
annotating a particular phrase in the coalesced transcript by analyzing one or more terms in the text of the coalesced transcript;
processing the annotated particular phrase to generate an event invitation; and
transmitting the event invitation to each respective participant computing device associated with the two or more participants of the video conference session.