US 12,444,215 B2
Method and system for detecting and extracting price region from digital flyers and promotions
Amit Kumar Agrawal, Kolkata (IN); Mantu Prasad Gupta, Mumbai (IN); Devang Jagdishchandra Patel, Mumbai (IN); and Pushp Kumar Jain, Pune (IN)
Assigned to TATA CONSULTANCY SERVICES LIMITED, Mumbai (IN)
Filed by Tata Consultancy Services Limited, Mumbai (IN)
Filed on Mar. 14, 2023, as Appl. No. 18/183,411.
Claims priority of application No. 202221029276 (IN), filed on May 20, 2022.
Prior Publication US 2023/0377356 A1, Nov. 23, 2023
Int. Cl. G06V 30/14 (2022.01); G06V 20/62 (2022.01); G06V 30/164 (2022.01); G06V 30/18 (2022.01); G06V 30/19 (2022.01)
CPC G06V 30/1444 (2022.01) [G06V 20/62 (2022.01); G06V 30/164 (2022.01); G06V 30/18086 (2022.01); G06V 30/19007 (2022.01)]
12 Claims
1. A processor implemented method for detecting and extracting price region from digital flyers and promotions, the method comprising:
acquiring via one or more hardware processors, a set of digital flyers and promotions as input images from external sources;
detecting via the one or more hardware processors, one or more text regions comprising a price information from each input image from the input images by creating a bounding box around the detected texts including individual text character and broken words in each input image and merging adjacent text regions bounding boxes to form text cluster and filtering the merged text regions when length to width ratio of the bounding box lies within a predefined range, wherein the price information includes different representation of values which varies with font, color, textual information and placement of the textual information;
converting via the one or more hardware processors, each text region into a two-color text comprising of a set of white pixels and a set of black pixels by,
(i) changing each text region into a gray scale image and selecting top ranked peaks from a histogram comprising one or more text regions,
(ii) calculating a mean value of boundary pixels of the gray scale image,
(iii) calculating a mean range based on the peaks closer to the boundary pixel mean value and the peaks farther to the boundary pixel mean value, and
(iv) converting one or more pixels in black color which lies within the mean range around peak closer to the boundary pixel mean value and the one or more pixels into white color which falls out of the boundary pixel mean value,
wherein converting the each text region into the two-color text for handling color variations in the input images;
detecting via the one or more hardware processors, price region from the two-color text by using a price tag flag status; and
extracting via the one or more hardware processors, the price from the price region of each input image by using a price tag flag status, wherein values of the price include an irregular font size.
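Illustrative sketch (not part of the claim). The text-region detection and two-color conversion steps recited above can be pictured with the short Python sketch below. It is one possible reading of the claim language, not the patented implementation. All function names, the box format (x1, y1, x2, y2), the `gap` adjacency threshold, the ratio limits, the choice of the most frequent gray levels as "top ranked peaks", and the use of half the peak distance as the "mean range" are assumptions introduced here for illustration only. The price tag flag status used in the detecting and extracting steps is not defined in the claim and is therefore not sketched.

```python
from collections import Counter


def _is_adjacent(a, b, gap):
    """True if boxes a and b (x1, y1, x2, y2) overlap or lie within `gap` pixels."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2]
            and a[1] - gap <= b[3] and b[1] - gap <= a[3])


def merge_adjacent_boxes(boxes, gap=5):
    """Repeatedly union adjacent boxes until no further merge occurs (assumed rule)."""
    boxes = [tuple(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        clusters = []
        for box in boxes:
            for i, c in enumerate(clusters):
                if _is_adjacent(box, c, gap):
                    # Replace the cluster with the union of the two boxes.
                    clusters[i] = (min(box[0], c[0]), min(box[1], c[1]),
                                   max(box[2], c[2]), max(box[3], c[3]))
                    changed = True
                    break
            else:
                clusters.append(box)
        boxes = clusters
    return boxes


def filter_by_ratio(boxes, lo=0.5, hi=5.0):
    """Keep boxes whose width/height ratio lies in [lo, hi] (assumed definition)."""
    kept = []
    for x1, y1, x2, y2 in boxes:
        height = y2 - y1
        if height > 0 and lo <= (x2 - x1) / height <= hi:
            kept.append((x1, y1, x2, y2))
    return kept


def to_two_color(gray, num_peaks=2):
    """Map a grayscale region (rows of ints 0..255) to 0 (black) / 255 (white)."""
    h, w = len(gray), len(gray[0])
    # (i) "Top ranked peaks": assumed to be the most frequent gray levels.
    hist = Counter(v for row in gray for v in row)
    peaks = [level for level, _ in hist.most_common(num_peaks)]
    # (ii) Mean of the pixels on the region boundary.
    border = [gray[y][x] for y in range(h) for x in range(w)
              if y in (0, h - 1) or x in (0, w - 1)]
    border_mean = sum(border) / len(border)
    # (iii) Assumed "mean range": half the distance between the peak nearest
    # to the boundary mean and the peak farthest from it.
    near = min(peaks, key=lambda p: abs(p - border_mean))
    far = max(peaks, key=lambda p: abs(p - border_mean))
    mean_range = abs(far - near) / 2
    # (iv) Pixels within the range around the near peak become black,
    # all remaining pixels become white.
    return [[0 if abs(v - near) <= mean_range else 255 for v in row]
            for row in gray]
```

Under these assumptions, the region border is treated as background, so background pixels map to black and text pixels map to white regardless of the original flyer colors. Picking the most frequent gray levels is a crude stand-in for peak detection; a real histogram would likely need smoothing so that neighboring levels of the same peak are not counted twice.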