US 12,333,346 B2
Discovery and routing to inference servers based on monitored version information
Yuliya L. Feldman, Campbell, CA (US); Seyedshahin Ashrafzadeh, Foster City, CA (US); Alexandr Nikitin, El Sobrante, CA (US); Chirag Rajan, Burlingame, CA (US); and Swaminathan Sundaramurthy, Los Altos, CA (US)
Assigned to Salesforce, Inc., San Francisco, CA (US)
Filed by Salesforce, Inc., San Francisco, CA (US)
Filed on Jun. 2, 2021, as Appl. No. 17/337,390.
Prior Publication US 2022/0391749 A1, Dec. 8, 2022
Int. Cl. G06F 9/50 (2006.01); G06F 8/60 (2018.01); G06F 9/455 (2018.01); H04L 67/133 (2022.01); G06N 20/00 (2019.01)
CPC G06F 9/5072 (2013.01) [G06F 8/60 (2013.01); G06F 9/45558 (2013.01); G06F 9/5077 (2013.01); H04L 67/133 (2022.05); G06F 2009/45562 (2013.01); G06F 2009/45595 (2013.01); G06N 20/00 (2019.01)]
15 Claims
 
1. A method for service discovery in a machine learning serving infrastructure, the method comprising:
detecting initialization of at least one service container;
establishing a monitor for monitoring label information for the at least one service container using a container orchestration system (COS) application programming interface (API);
identifying, by the monitor, the label information in the at least one service container;
collecting the label information for the initializing at least one service container, wherein the label information includes version information of a scoring service provided by the at least one service container;
storing a mapping between the label information and the at least one service container in a routing information storage accessible to a routing service;
receiving a request for executing a particular version of the scoring service;
determining, by the routing service, that the request is to be routed to the at least one service container based on the particular version of the scoring service matching the version information included in the label information included in the mapping; and
routing, by the routing service, the request to the at least one service container.
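
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the event stream stands in for a COS watch API, and the names ContainerEvent, RoutingStore, VERSION_LABEL, monitor, and route are hypothetical, introduced only for illustration. The sketch assumes the version information is carried in a single label and that routing picks the first matching container.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional

# Assumed label key; the claim does not name one.
VERSION_LABEL = "scoring-service-version"


@dataclass(frozen=True)
class ContainerEvent:
    """Hypothetical stand-in for an event delivered by a COS watch API."""
    kind: str                # "ADDED" is assumed to mean the container is initializing
    container_id: str
    labels: Dict[str, str]


@dataclass
class RoutingStore:
    """Hypothetical in-memory routing information storage."""
    by_version: Dict[str, List[str]] = field(default_factory=dict)

    def put(self, version: str, container_id: str) -> None:
        # Store the mapping between label information and the container.
        ids = self.by_version.setdefault(version, [])
        if container_id not in ids:
            ids.append(container_id)

    def lookup(self, version: str) -> List[str]:
        return list(self.by_version.get(version, []))


def monitor(events: Iterable[ContainerEvent], store: RoutingStore) -> None:
    """Detect initializing containers, collect their labels, store the mapping."""
    for event in events:
        if event.kind != "ADDED":
            continue  # only initialization events are of interest here
        version = event.labels.get(VERSION_LABEL)
        if version is not None:
            store.put(version, event.container_id)


def route(store: RoutingStore, requested_version: str) -> Optional[str]:
    """Return a container whose stored version matches the request, if any."""
    candidates = store.lookup(requested_version)
    return candidates[0] if candidates else None


if __name__ == "__main__":
    store = RoutingStore()
    monitor(
        [
            ContainerEvent("ADDED", "container-a", {VERSION_LABEL: "1.2.0"}),
            ContainerEvent("ADDED", "container-b", {VERSION_LABEL: "2.0.0"}),
        ],
        store,
    )
    print(route(store, "2.0.0"))
```

A real deployment would replace the event list with a long-lived watch on the orchestration system and would also handle container removal, which claim 1 does not recite and the sketch omits.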