CPC G06F 16/353 (2019.01) [G06F 16/2455 (2019.01); G06F 16/335 (2019.01)] | 20 Claims |
1. A data processing device comprising:
at least one processor; and
a machine-readable medium storing executable instructions that, when executed, cause the processor to perform operations comprising:
receiving job definitions including SQL queries for performing reprocessing operations on databases in a database system of a cloud-based service via a user input device of a modular selective recrawl system;
generating recrawl jobs based on the job definitions using a recrawl job generating module of the modular selective recrawl system;
flighting the recrawl jobs to the database system using a flighting system of the cloud-based service;
generating iterations of recrawl timer jobs for each of the databases in the database system based on a predefined recrawl timer job base class, each of the iterations being triggered based on a predefined schedule for the recrawl timer jobs, wherein, during each of the iterations, a recrawl timer job associated with a database of the database system is configured to perform functions comprising:
accessing a recrawl job list for the database, the recrawl job list including each of the recrawl jobs flighted to the database system;
accessing a property list of the database to identify recrawl job information stored in the property list during a previous iteration of the recrawl timer job;
based on the recrawl job information, selecting a respective batch of documents to be reprocessed in association with each of the recrawl jobs on the recrawl job list;
reprocessing each of the respective batches of documents using the reprocessing operation of the recrawl job associated with the batch of documents; and
once each of the batches of documents has been reprocessed, storing a last document identifier in the property list in association with each of the recrawl jobs.
|
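The per-iteration functions recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes, for illustration only, that document identifiers are ordered integers, that each batch has a fixed size, and that a plain Python callable stands in for the SQL-defined reprocessing operation. The names RecrawlJob, Database, run_timer_iteration, and batch_size are hypothetical and do not appear in the claim.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RecrawlJob:
    """Hypothetical stand-in for a recrawl job flighted to the database system."""
    job_id: str
    reprocess: Callable[[int], None]  # assumption: replaces the SQL-defined operation


@dataclass
class Database:
    """Hypothetical database holding documents, a job list, and a property list."""
    doc_ids: List[int]  # assumption: sorted integer document identifiers
    recrawl_job_list: List[RecrawlJob] = field(default_factory=list)
    property_list: Dict[str, int] = field(default_factory=dict)


def run_timer_iteration(db: Database, batch_size: int = 2) -> Dict[str, List[int]]:
    """Run one iteration of a recrawl timer job against a single database."""
    # Read the job list and the property list, then select one batch per job,
    # resuming after the last document identifier stored by the previous iteration.
    batches: Dict[str, List[int]] = {}
    for job in db.recrawl_job_list:
        last_id = db.property_list.get(job.job_id, -1)
        batches[job.job_id] = [d for d in db.doc_ids if d > last_id][:batch_size]

    # Reprocess every batch with the operation of its associated job.
    for job in db.recrawl_job_list:
        for doc_id in batches[job.job_id]:
            job.reprocess(doc_id)

    # Only after all batches are done, store the last document identifier per job.
    for job in db.recrawl_job_list:
        if batches[job.job_id]:
            db.property_list[job.job_id] = batches[job.job_id][-1]
    return batches
```

Calling run_timer_iteration repeatedly on the same Database object plays the role of the scheduled iterations: each call picks up where the stored identifier left off, and a job whose documents are exhausted receives an empty batch and leaves the property list unchanged.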