US 11,750,547 B2
Multimodal named entity recognition
Vitor Rocha de Carvalho, San Diego, CA (US); Leonardo Ribas Machado das Neves, Marina Del Rey, CA (US); and Seungwhan Moon, Bellevue, WA (US)
Assigned to Snap Inc., Santa Monica, CA (US)
Filed by Snap Inc., Santa Monica, CA (US)
Filed on Aug. 27, 2021, as Appl. No. 17/459,161.
Application 17/459,161 is a continuation of application No. 16/125,615, filed on Sep. 7, 2018, granted, now 11,120,334.
Claims priority of provisional application 62/556,206, filed on Sep. 8, 2017.
Prior Publication US 2021/0390411 A1, Dec. 16, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. H04L 51/10 (2022.01); G06N 3/08 (2023.01); G06V 30/19 (2022.01); G06V 10/82 (2022.01); G06V 10/40 (2022.01); H04L 67/10 (2022.01)
CPC H04L 51/10 (2013.01) [G06N 3/08 (2013.01); G06V 10/40 (2022.01); G06V 10/82 (2022.01); G06V 30/19147 (2022.01); G06V 30/19173 (2022.01); H04L 67/10 (2013.01)] 16 Claims
OG exemplary drawing
 
1. A method comprising:
identifying a multimodal message comprising an image and a string, the string comprising one or more words;
generating, using an entity neural network, an indication that at least one of the one or more words is a named entity, the entity neural network comprising an attention neural network trained to increase emphasis on one of a plurality of embeddings based on relevance to the multimodal message, the plurality of embeddings comprising an image embedding from the image and a string embedding corresponding to the string in the multimodal message;
storing, using one or more processors of a machine, the named entity as being associated with the multimodal message; and
generating a combined embedding from the image embedding and the string embedding using the attention neural network,
wherein the entity neural network comprises a classification neural network that processes the combined embedding generated by the attention neural network.
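Illustrative sketch (not the patented implementation): the exemplary claim describes an attention neural network that weights an image embedding and a string embedding by their relevance, produces a combined embedding, and passes it to a classification neural network that labels words as named entities. The minimal PyTorch sketch below shows one way such a modality-attention-plus-classifier stack could be wired; the module names (ModalityAttention, EntityTagger), tensor dimensions, tag count, and the upstream encoders assumed to produce string_emb and image_emb are all assumptions, not details taken from the patent.

```python
# Minimal sketch (assumptions noted above): per-word modality attention over a
# string embedding and an image embedding, followed by a per-word tag classifier.
import torch
import torch.nn as nn


class ModalityAttention(nn.Module):
    """Learns a softmax weight per modality and returns the weighted sum."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # relevance score for each modality

    def forward(self, string_emb: torch.Tensor, image_emb: torch.Tensor) -> torch.Tensor:
        # string_emb, image_emb: (batch, seq_len, dim); the image embedding is
        # broadcast to every word position before stacking.
        stacked = torch.stack([string_emb, image_emb], dim=2)  # (B, T, 2, D)
        weights = torch.softmax(self.score(stacked), dim=2)    # (B, T, 2, 1)
        return (weights * stacked).sum(dim=2)                  # (B, T, D)


class EntityTagger(nn.Module):
    """Combined embedding from attention, then a classification layer per word."""

    def __init__(self, dim: int, num_tags: int):
        super().__init__()
        self.attention = ModalityAttention(dim)
        self.classifier = nn.Linear(dim, num_tags)  # e.g. BIO named-entity tags

    def forward(self, string_emb: torch.Tensor, image_emb: torch.Tensor) -> torch.Tensor:
        combined = self.attention(string_emb, image_emb)
        return self.classifier(combined)             # (B, T, num_tags)


if __name__ == "__main__":
    B, T, D, TAGS = 2, 5, 64, 9                       # toy sizes (assumptions)
    words = torch.randn(B, T, D)                      # stand-in string embeddings
    image = torch.randn(B, 1, D).expand(B, T, D)      # one image feature per message
    logits = EntityTagger(D, TAGS)(words, image)
    print(logits.shape)                               # torch.Size([2, 5, 9])
```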