US 12,423,118 B2
Robotic process automation using enhanced object detection to provide resilient playback capabilities
Sudhir Kumar Singh, Dublin, CA (US); Jesse Truscott, Petaluma, CA (US); Virinchipuram Anand, San Ramon, CA (US); and Harshil Lodhiya, Vadodara (IN)
Assigned to Automation Anywhere, Inc., San Jose, CA (US)
Filed by Automation Anywhere, Inc., San Jose, CA (US)
Filed on Dec. 31, 2020, as Appl. No. 17/139,842.
Claims priority of provisional application 63/060,541, filed on Aug. 3, 2020.
Prior Publication US 2022/0032471 A1, Feb. 3, 2022
Int. Cl. G06F 9/451 (2018.01); B25J 13/06 (2006.01); G06F 3/0485 (2022.01)
CPC G06F 9/451 (2018.02) [B25J 13/06 (2013.01); G06F 3/0485 (2013.01); G05B 2219/32128 (2013.01); G05B 2219/50391 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A computer-implemented method for facilitating execution of a robotic process automation, the method comprising:
capturing an initial image of an initial user interface (UI) presented on a display device by an application program operating on a computing device;
detecting a plurality of UI controls of the initial UI within the captured initial image by programmatic examination of the captured initial image;
detecting positioning data and sizing data for each of the UI controls of the initial UI;
detecting a plurality of UI text labels of the initial UI within the captured initial image by programmatic examination of the captured initial image;
detecting positioning data and sizing data for each of the UI text labels;
associating the UI text labels to the UI controls based on the positioning data and the sizing data for the UI text labels and for UI controls of the initial UI;
recording a series of user interactions with a plurality of different ones of the UI controls of the initial UI;
subsequently capturing a subsequent image of a subsequent UI presented on a display device, the subsequent UI being generated by the same application program as the application program generating the initial UI, with the same application program operating on a computing device and presenting the subsequent UI on a display device;
detecting a plurality of UI controls of the subsequent UI within the captured subsequent image by programmatic examination of the captured subsequent image;
detecting positioning data and sizing data for each of the UI controls of the subsequent UI;
detecting a plurality of UI text labels of the subsequent UI within the captured subsequent image by programmatic examination of the captured subsequent image;
detecting positioning data and sizing data for each of the UI text labels of the subsequent UI;
associating the UI text labels to the UI controls based on the positioning data and the sizing data for the UI text labels and for UI controls of the subsequent UI;
matching the UI controls of the subsequent UI to the UI controls of the initial UI, the matching using the UI text labels that are associated with the respective UI controls of the initial UI and the subsequent UI; and
programmatically causing one or more interactions of the series of user interactions previously recorded with respect to at least one of the UI controls of the initial UI to be provided to and induced on at least one of the UI controls of the subsequent UI that has been matched to the at least one of the UI controls of the initial UI,
wherein the detecting positioning data and sizing data for the UI controls of the subsequent UI identifies at least a boundary box for each of the UI controls,
wherein the associating of the UI text labels to the UI controls is based on a separation distance and a direction from each of one or more of the UI controls to each of one or more of the UI text labels, and
wherein at least one of the UI text labels being associated with at least one of the UI controls is positioned external to the boundary box for each of the corresponding one or more of the UI controls.
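The claim recites associating detected UI text labels to detected UI controls based on separation distance and direction, with the label allowed to sit outside the control's boundary box. The following is a minimal Python sketch of one way such an association could work; the dataclass names, the direction-penalty values, and the distance threshold are illustrative assumptions, not limitations taken from the patent.

from dataclasses import dataclass
from math import hypot

@dataclass
class Box:
    """Axis-aligned boundary box: left, top, width, height (pixels)."""
    x: int
    y: int
    w: int
    h: int

    @property
    def center(self):
        return (self.x + self.w / 2.0, self.y + self.h / 2.0)

@dataclass
class UiLabel:
    text: str
    box: Box

@dataclass
class UiControl:
    kind: str          # e.g. "textbox", "button", "checkbox"
    box: Box
    label: str = ""    # filled in by the association step

def direction(from_box, to_box):
    """Coarse direction of `to_box` relative to `from_box`."""
    fx, fy = from_box.center
    tx, ty = to_box.center
    dx, dy = tx - fx, ty - fy
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "below" if dy > 0 else "above"

# Assumed preference: labels placed to the left of or above a control are the
# most common form layout, so those directions get a smaller penalty.
DIRECTION_PENALTY = {"left": 0.0, "above": 10.0, "right": 30.0, "below": 40.0}

def associate_labels(controls, labels, max_distance=300.0):
    """Attach the closest plausible text label to each detected control.

    The chosen label may lie entirely outside the control's boundary box;
    the score combines center-to-center separation distance with a
    direction penalty.
    """
    for control in controls:
        best, best_score = None, float("inf")
        cx, cy = control.box.center
        for label in labels:
            lx, ly = label.box.center
            dist = hypot(lx - cx, ly - cy)
            if dist > max_distance:
                continue
            score = dist + DIRECTION_PENALTY[direction(control.box, label.box)]
            if score < best_score:
                best, best_score = label, score
        if best is not None:
            control.label = best.text
    return controls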
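The claim also recites detecting UI text labels, with positioning and sizing data, by programmatic examination of a captured screen image. The sketch below shows one plausible realization using Pillow for screen capture and Tesseract OCR (via pytesseract) for word-level boxes, reusing the Box and UiLabel classes from the previous sketch. The patent does not name these libraries; the confidence threshold and capture approach are assumptions, and ImageGrab is platform dependent.

# Requires: pillow, pytesseract, and a local Tesseract installation.
import pytesseract
from PIL import Image, ImageGrab

def capture_screen(path="screen.png"):
    """Grab the current screen contents to an image file."""
    img = ImageGrab.grab()
    img.save(path)
    return path

def detect_text_labels(image_path, min_confidence=60):
    """Return a UiLabel (text plus boundary box) for each word the OCR
    engine finds in the captured screenshot."""
    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    labels = []
    for i, text in enumerate(data["text"]):
        text = text.strip()
        try:
            conf = float(data["conf"][i])
        except ValueError:
            conf = -1.0          # non-word rows report no usable confidence
        if not text or conf < min_confidence:
            continue
        labels.append(
            UiLabel(
                text=text,
                box=Box(data["left"][i], data["top"][i],
                        data["width"][i], data["height"][i]),
            )
        )
    return labels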
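Finally, the claim recites matching controls of the subsequent UI to controls of the initial UI using their associated text labels, and then inducing the previously recorded interactions on the matched controls. A minimal sketch of that resilient-playback step follows; the fuzzy label comparison, similarity threshold, and the send_action callback (standing in for whatever mouse/keyboard injection the recorder uses) are hypothetical choices for illustration only.

from difflib import SequenceMatcher

def label_similarity(a, b):
    """Fuzzy similarity in [0, 1]; tolerates small OCR differences."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_controls(initial_controls, subsequent_controls, threshold=0.8):
    """Map each initial control to the subsequent control whose associated
    label (and control kind) agrees best, independent of screen position."""
    mapping = {}
    for i, init in enumerate(initial_controls):
        best_j, best_score = None, threshold
        for j, sub in enumerate(subsequent_controls):
            if sub.kind != init.kind:
                continue
            score = label_similarity(init.label, sub.label)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            mapping[i] = best_j
    return mapping

def replay(recorded_actions, initial_controls, subsequent_controls, send_action):
    """Re-issue each recorded interaction against its matched control.

    `recorded_actions` is a list of (control_index, action) pairs captured
    against the initial UI; `send_action(control, action)` is a hypothetical
    injection callback supplied by the caller.
    """
    mapping = match_controls(initial_controls, subsequent_controls)
    for control_index, action in recorded_actions:
        target_index = mapping.get(control_index)
        if target_index is None:
            raise LookupError(f"No match for recorded control {control_index}")
        send_action(subsequent_controls[target_index], action)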