US 12,086,548 B2
Event extraction from documents with co-reference
Rishita Rajal Anubhai, Seattle, WA (US); Yahor Pushkin, Redmond, WA (US); Graham Vintcent Horwood, Centreville, VA (US); Yinxiao Zhang, Issaquah, WA (US); Ravindra Manjunatha, Seattle, WA (US); Jie Ma, Seattle, WA (US); Alessandra Brusadin, Mountain View, CA (US); Jonathan Steuck, New York, NY (US); Shuai Wang, Sunnyvale, CA (US); Sameer Karnik, Issaquah, WA (US); Miguel Ballesteros Martinez, New York, NY (US); Sunil Mallya Kasaragod, San Francisco, CA (US); and Yaser Al-Onaizan, Cortland Manor, NY (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Sep. 30, 2020, as Appl. No. 17/039,919.
Prior Publication US 2022/0100963 A1, Mar. 31, 2022
Int. Cl. G06F 40/30 (2020.01); G06F 40/295 (2020.01); G06N 20/00 (2019.01)
CPC G06F 40/30 (2020.01) [G06F 40/295 (2020.01); G06N 20/00 (2019.01)]
20 Claims
1. A system, comprising:
an event extraction service hosted by a provider network and Internet-accessible by a plurality of clients, wherein the provider network offers a plurality of services including the event extraction service, and wherein the event extraction service comprises one or more processors and one or more memories to store computer-executable instructions that, when executed, cause the one or more processors to:
receive a document, wherein the document is provided by an individual one of the clients;
encode the document using a pre-trained encoder for natural language processing tasks, wherein the encode generates an encoded representation of the document;
share the encoded representation across a plurality of different natural language processing tasks, wherein the sharing provides the encoded representation as a same input to different respective decoders for the plurality of different natural language processing tasks that respectively decode the encoded representation to:
identify one or more event groups in the document, wherein an individual one of the event groups comprises a plurality of textual references to an occurrence of an event type, and wherein the one or more event groups are associated with one or more argument slots representing one or more semantic roles for entities with respect to the one or more event groups;
identify one or more entity groups in the document, wherein an individual one of the entity groups comprises a plurality of textual references to a real-world object type;
assign one or more of the entity groups to one or more of the argument slots; and
provide, to the individual one of the clients, output indicating the one or more event groups and the one or more of the entity groups assigned to the one or more of the argument slots.
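The data flow recited in claim 1 (encode once, share the same encoded representation with several task-specific decoders, then return grouped events with filled argument slots) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only and is not taken from the patent: the names `Encoded`, `TRIGGERS`, `ENTITY_MENTIONS`, `SLOTS`, and `extract` are hypothetical, the "encoder" is a plain tokenizer standing in for a pre-trained model, and the "decoders" are lookup tables and a nearest-mention heuristic standing in for learned components.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Encoded:
    """Hypothetical stand-in for the encoded representation: lowercase tokens."""
    tokens: Tuple[str, ...]


# Hypothetical lexicons that replace learned decoders in this toy example.
TRIGGERS = {"acquired": "ACQUISITION", "acquisition": "ACQUISITION"}
ENTITY_MENTIONS = {"acme": "acme", "globex": "globex"}
# Argument slots per event type: (slot filled from the left, slot filled from the right).
SLOTS = {"ACQUISITION": ("acquirer", "acquiree")}


def encode(document: str) -> Encoded:
    """Toy encoder: a real system would run a pre-trained language model here."""
    return Encoded(tuple(document.lower().replace(".", " . ").split()))


def decode_events(enc: Encoded) -> Dict[str, List[int]]:
    """Group trigger mentions (token indices) by event type."""
    groups: Dict[str, List[int]] = {}
    for i, tok in enumerate(enc.tokens):
        if tok in TRIGGERS:
            groups.setdefault(TRIGGERS[tok], []).append(i)
    return groups


def decode_entities(enc: Encoded) -> Dict[str, List[int]]:
    """Group entity mentions (token indices) by the entity they refer to."""
    groups: Dict[str, List[int]] = {}
    for i, tok in enumerate(enc.tokens):
        if tok in ENTITY_MENTIONS:
            groups.setdefault(ENTITY_MENTIONS[tok], []).append(i)
    return groups


def decode_arguments(enc: Encoded,
                     events: Dict[str, List[int]],
                     entities: Dict[str, List[int]]) -> Dict[str, Dict[str, str]]:
    """Assign entity groups to slots using the nearest mention in the same sentence."""
    result: Dict[str, Dict[str, str]] = {}
    for etype, mentions in events.items():
        anchor = mentions[0]  # first trigger mention of the event group
        start = anchor
        while start > 0 and enc.tokens[start - 1] != ".":
            start -= 1
        end = anchor
        while end < len(enc.tokens) and enc.tokens[end] != ".":
            end += 1
        # (distance, entity key) pairs on each side of the trigger.
        left = [(anchor - m, key) for key, ms in entities.items()
                for m in ms if start <= m < anchor]
        right = [(m - anchor, key) for key, ms in entities.items()
                 for m in ms if anchor < m < end]
        left_slot, right_slot = SLOTS[etype]
        slots: Dict[str, str] = {}
        if left:
            slots[left_slot] = min(left)[1]
        if right:
            slots[right_slot] = min(right)[1]
        result[etype] = slots
    return result


def extract(document: str) -> dict:
    enc = encode(document)            # the document is encoded exactly once
    events = decode_events(enc)       # each decoder receives the same `enc`
    entities = decode_entities(enc)
    arguments = decode_arguments(enc, events, entities)
    return {"event_groups": events, "entity_groups": entities, "arguments": arguments}


if __name__ == "__main__":
    print(extract("Acme acquired Globex. The acquisition made Acme larger."))
```

In this sketch both mentions of the acquisition ("acquired" and "acquisition") fall into one event group, and both mentions of "Acme" fall into one entity group, which mirrors the co-reference grouping in the claim. The single call to `encode` is the point of the design: each decoder reads the same representation rather than re-encoding the document per task.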