US 12,141,214 B2
Adversarial bandits policy for crawling highly dynamic content
Michael Bendersky, Cupertino, CA (US); Przemysław Gajda, Zurich (CH); Sergey Novikov, Zurich (CH); Marc Alexander Najork, Palo Alto, CA (US); and Shuguang Han, Sunnyvale, CA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Appl. No. 17/995,248
Filed by GOOGLE LLC, Mountain View, CA (US)
PCT Filed Mar. 30, 2020, PCT No. PCT/US2020/025757
§ 371(c)(1), (2) Date Sep. 30, 2022,
PCT Pub. No. WO2021/201825, PCT Pub. Date Oct. 7, 2021.
Prior Publication US 2023/0169128 A1, Jun. 1, 2023
Int. Cl. G06F 16/951 (2019.01); G06F 18/214 (2023.01); G06Q 30/0207 (2023.01); G06Q 30/0601 (2023.01)
CPC G06F 16/951 (2019.01) [G06F 18/214 (2023.01); G06Q 30/0239 (2013.01); G06Q 30/0601 (2013.01)] 18 Claims
 
1. A method, comprising:
receiving, from a repository, a plurality of entities, each of the plurality of entities having a respective value of a quantity that is accurate at a previous time step;
for each of the plurality of entities, generating associated values of a plurality of parameters at a current time step, the plurality of parameters including at least one of an access rate of that entity from the repository or a likelihood of a change in the respective value of the quantity of that entity;
selecting a refresh strategy of a plurality of refresh strategies for updating the respective values of the quantity for the plurality of entities according to a refresh policy, the refresh policy including a weight distribution representing a respective likelihood that each of the plurality of refresh strategies is selected, the plurality of refresh strategies including at least two of a uniform strategy, a change-weighted strategy, an access-weighted strategy, or a resource-optimized strategy;
generating a respective refresh rate for each of the plurality of entities according to the selected refresh strategy, the respective refresh rate for each entity of the plurality of entities being based on the associated values of the plurality of parameters at a sequence of times comprising the previous time step and the current time step;
performing a refresh operation on the repository based on the respective refresh rates for the plurality of entities, the refresh operation being configured to obtain the respective values of the quantity at the current time step; and
updating the refresh policy based on a difference between the respective value of the quantity at the previous time step and the respective value of the quantity at the current time step of each of the plurality of entities.
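
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not name a specific bandit algorithm, so the sketch assumes an EXP3-style exponential-weights update, one common choice for adversarial bandits. The names `refresh_rates`, `change_reward`, `RefreshPolicy`, the exploration parameter `gamma`, and the refresh `budget` are all hypothetical. The reward, taken here as the fraction of entities whose value changed between the previous and current time steps, is likewise an assumed stand-in for the claimed "difference" signal. The resource-optimized strategy is omitted for brevity.

```python
import math
import random

# Three of the strategies named in the claim; "resource_optimized" is omitted.
STRATEGIES = ("uniform", "change_weighted", "access_weighted")


def refresh_rates(strategy, access_rate, change_prob, budget):
    """Split a total refresh budget across entities under one strategy.

    access_rate[i] and change_prob[i] are the per-entity parameter values;
    budget is an assumed total number of refreshes per time step.
    """
    n = len(access_rate)
    if strategy == "uniform":
        scores = [1.0] * n
    elif strategy == "change_weighted":
        scores = list(change_prob)
    elif strategy == "access_weighted":
        scores = list(access_rate)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(scores)
    if total <= 0:  # degenerate scores: fall back to an even split
        scores, total = [1.0] * n, float(n)
    return [budget * s / total for s in scores]


def change_reward(previous_values, current_values):
    """Assumed reward in [0, 1]: fraction of entities whose value changed."""
    changed = sum(1 for p, c in zip(previous_values, current_values) if p != c)
    return changed / len(previous_values)


class RefreshPolicy:
    """Weight distribution over refresh strategies, updated EXP3-style."""

    def __init__(self, strategies=STRATEGIES, gamma=0.1, seed=None):
        self.strategies = list(strategies)
        self.gamma = gamma  # exploration rate, assumed to lie in (0, 1]
        self.weights = [1.0] * len(self.strategies)
        self.rng = random.Random(seed)

    def probabilities(self):
        """Mix the normalized weights with uniform exploration."""
        k = len(self.weights)
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / k
                for w in self.weights]

    def select(self):
        """Sample the index of a strategy from the current distribution."""
        probs = self.probabilities()
        return self.rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, index, reward):
        """Boost the chosen strategy using an importance-weighted reward."""
        k = len(self.weights)
        estimate = reward / self.probabilities()[index]
        self.weights[index] *= math.exp(self.gamma * estimate / k)
        # Rescale so the weights sum to k; probabilities are unchanged.
        total = sum(self.weights)
        self.weights = [w * k / total for w in self.weights]
```

One time step under this sketch would call `select()`, pass the chosen strategy name to `refresh_rates`, perform the refresh at those rates, and then feed `change_reward(previous_values, current_values)` to `update()`. Only the selected strategy's weight is boosted in each step, which is why the reward is divided by its selection probability before being applied.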