US 12,217,180 B2
Training text summarization neural networks with an extracted segments prediction objective
Mohammad Saleh, Santa Clara, CA (US); Jingqing Zhang, London (GB); Yao Zhao, San Carlos, CA (US); and Peter J. Liu, Santa Clara, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Oct. 12, 2023, as Appl. No. 18/485,950.
Application 18/485,950 is a continuation of application No. 17/140,863, filed on Jan. 4, 2021, granted, now 11,803,751.
Application 17/140,863 is a continuation of application No. 16/869,419, filed on May 7, 2020, granted, now 10,885,436, issued on Jan. 5, 2021.
Prior Publication US 2024/0185065 A1, Jun. 6, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/08 (2023.01); G06F 40/30 (2020.01); G06N 3/045 (2023.01)
CPC G06N 3/08 (2013.01) [G06F 40/30 (2020.01); G06N 3/045 (2023.01)]
18 Claims
1. A method performed by one or more computers, wherein the method comprises:
pre-training a neural network that has a plurality of network parameters to generate a pre-trained neural network, wherein the pre-training comprises:
obtaining a text document comprising a plurality of text segments;
determining, for each of the plurality of text segments, an importance score of the text segment that characterizes a relative importance of the segment with respect to other text segments in the text document;
selecting one or more text segments based on the importance scores;
generating a masked text document that replaces the one or more text segments in the text document with mask tokens;
processing, using the neural network and in accordance with current values of the plurality of network parameters, the masked text document to generate a prediction of the one or more text segments; and
determining, based on a difference between the prediction and the one or more text segments, an update to the current values of the plurality of network parameters; and
providing the pre-trained neural network for adaptation to perform a specific text processing task using labeled text data.
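
The data-preparation steps of claim 1 (scoring segments, selecting them, and generating the masked document) can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not say how the importance score is computed, so the unigram-overlap F1 between a segment and the rest of the document used here is an assumption made only for illustration. The names `MASK_TOKEN`, `overlap_f1`, `importance_scores`, `select_segments`, and `mask_document` are likewise hypothetical. The neural network itself is omitted: in an actual system the masked document would be fed to the network, and a loss measuring the difference between its prediction and the target text would drive the parameter update.

```python
from collections import Counter

MASK_TOKEN = "[MASK]"  # hypothetical mask token; the claim does not name one


def overlap_f1(segment: str, rest: str) -> float:
    """Unigram-overlap F1 between a segment and the rest of the document.

    This scoring function is an assumption made for illustration only.
    """
    seg_counts = Counter(segment.lower().split())
    rest_counts = Counter(rest.lower().split())
    overlap = sum((seg_counts & rest_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(seg_counts.values())
    recall = overlap / sum(rest_counts.values())
    return 2 * precision * recall / (precision + recall)


def importance_scores(segments: list[str]) -> list[float]:
    """Score each segment relative to the other segments in the document."""
    scores = []
    for i, seg in enumerate(segments):
        rest = " ".join(s for j, s in enumerate(segments) if j != i)
        scores.append(overlap_f1(seg, rest))
    return scores


def select_segments(scores: list[float], k: int = 1) -> list[int]:
    """Return the indices of the k highest-scoring segments, in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def mask_document(segments: list[str], selected: list[int]) -> tuple[str, str]:
    """Replace the selected segments with mask tokens.

    Returns (masked_document, target_text), where target_text is the
    concatenation of the removed segments that the network should predict.
    """
    chosen = set(selected)
    masked = " ".join(
        MASK_TOKEN if i in chosen else seg for i, seg in enumerate(segments)
    )
    target = " ".join(segments[i] for i in selected)
    return masked, target
```

A short usage example under the same assumptions:

```python
segments = [
    "The cat sat on the mat.",
    "Dogs bark.",
    "The cat likes the mat.",
]
scores = importance_scores(segments)
selected = select_segments(scores, k=1)
masked_doc, target = mask_document(segments, selected)
print(masked_doc)
print(target)
```

The pair `(masked_doc, target)` forms one self-supervised training example: the masked document is the network input and the removed segments are the prediction target, so no labeled data is needed at this stage.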