US 12,450,932 B2
Text-based information extraction from images
Anamika Chatterjee, Kolkata (IN); Saurabh Jha, Austin, TX (US); and Rohitashwa Chakraborty, Konnagar (IN)
Assigned to Dell Products L.P., Round Rock, TX (US)
Filed by Dell Products L.P., Round Rock, TX (US)
Filed on Jun. 12, 2023, as Appl. No. 18/333,271.
Prior Publication US 2024/0412543 A1, Dec. 12, 2024
Int. Cl. G06V 30/00 (2022.01); G06V 10/82 (2022.01); G06V 30/146 (2022.01); G06V 30/16 (2022.01); G06V 30/19 (2022.01)
CPC G06V 30/19147 (2022.01) [G06V 10/82 (2022.01); G06V 30/1475 (2022.01); G06V 30/1607 (2022.01); G06V 30/19173 (2022.01)]
20 Claims
 
1. A method for extracting text information from images, comprising:
obtaining an extraction request associated with live data comprising an image;
generating, using a prediction model, rotational variant features and rotational invariant features associated with the live data;
generating, using the prediction model, text embeddings associated with the rotational variant features using overlapping kernel-based embedding on the live data;
generating, using the prediction model, attention values for each pixel in the live data using context attention;
applying a trained language model to the text embeddings, attention values, and the live data to generate predictions; and
performing extraction actions based on the predictions.
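
The embedding and attention steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify the kernel size, the stride, the projection, or the form of the context attention, so everything below is an assumption made for illustration: the function names (`overlapping_patches`, `embed`, `pixel_attention`) are hypothetical, "overlapping kernel-based embedding" is read as sliding a window whose stride is smaller than its size, and a plain softmax over pixel intensities stands in for the learned context attention. It is not the patented prediction model.

```python
import math
from typing import List

Image = List[List[float]]


def overlapping_patches(image: Image, kernel: int, stride: int) -> List[List[float]]:
    """Flatten every kernel x kernel window of the image.

    Assumption: "overlapping" means stride < kernel, so neighbouring
    windows share pixels.
    """
    if not 0 < stride < kernel:
        raise ValueError("overlap requires 0 < stride < kernel")
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - kernel + 1, stride):
        for left in range(0, width - kernel + 1, stride):
            patches.append(
                [image[top + r][left + c] for r in range(kernel) for c in range(kernel)]
            )
    return patches


def embed(patches: List[List[float]], weights: List[List[float]]) -> List[List[float]]:
    """Project each flattened patch linearly; weights has one row per output dimension.

    Assumption: a fixed matrix stands in for learned embedding parameters.
    """
    return [
        [sum(w * x for w, x in zip(row, patch)) for row in weights]
        for patch in patches
    ]


def pixel_attention(image: Image) -> Image:
    """Return one attention value per pixel, normalised to sum to 1.

    Assumption: a softmax over raw intensities replaces the learned
    context attention named in the claim.
    """
    flat = [value for row in image for value in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in flat]
    total = sum(exps)
    width = len(image[0])
    return [
        [exps[r * width + c] / total for c in range(width)]
        for r in range(len(image))
    ]


# Toy 4 x 4 "image" with intensities 0..15 (illustrative data only).
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
patches = overlapping_patches(image, kernel=2, stride=1)
weights = [[0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, -1.0]]
embeddings = embed(patches, weights)
attention = pixel_attention(image)
```

With a 2 x 2 kernel and a stride of 1, adjacent windows share a column or row of pixels, which is the overlap this sketch assumes. In the claimed method, the embeddings and attention values would then be passed, together with the live data, to the trained language model; that step is omitted here because the claim gives no detail on the model.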