US 12,475,020 B2
Graph analysis and database for aggregated distributed trace flows
Hanzhang Wang, San Jose, CA (US); Huai Jiang, San Jose, CA (US); Liangfei Su, San Jose, CA (US); Selcuk Kopru, San Jose, CA (US); Sanjeev Katariya, San Jose, CA (US); and Wanxue Li, San Jose, CA (US)
Assigned to eBay Inc., San Jose, CA (US)
Filed by eBay Inc., San Jose, CA (US)
Filed on Aug. 10, 2023, as Appl. No. 18/232,525.
Application 18/232,525 is a continuation of application No. 17/209,633, filed on Mar. 23, 2021, granted, now Pat. No. 11,768,755.
Claims priority of provisional application 62/993,426, filed on Mar. 23, 2020.
Prior Publication US 2023/0385175 A1, Nov. 30, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 11/00 (2006.01); G06F 11/07 (2006.01); G06F 11/32 (2006.01); G06F 11/34 (2006.01); G06N 20/00 (2019.01)
CPC G06F 11/3476 (2013.01) [G06F 11/0772 (2013.01); G06F 11/323 (2013.01); G06F 11/3409 (2013.01); G06N 20/00 (2019.01)]
17 Claims
 
1. A computer-implemented method comprising:
obtaining, by an application server, raw distributed trace data for a large-scale distributed system from a plurality of distributed tracing clients in the large-scale distributed system;
aggregating, by the application server, the raw distributed trace data into aggregated distributed trace data;
pre-processing, by the application server, the aggregated distributed trace data to repair at least one trace that is incomplete, broken or incorrect using an infrastructure design for the large-scale distributed system, the infrastructure design comprising a dependency graph indicating dependencies among a plurality of devices and services in the large-scale distributed system independent of the raw distributed trace data;
generating, by the application server, a plurality of process flow graphs from the pre-processed aggregated distributed trace data;
storing, by the application server, the plurality of process flow graphs in graph-based storage in communication with the application server;
processing a graph query using the graph-based storage to determine a first critical path from the plurality of process flow graphs based on the infrastructure design for the large-scale distributed system including the dependency graph indicating dependencies among the plurality of devices and services in the large-scale distributed system; and
providing a process flow graph corresponding to the first critical path for graphical display.
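By way of non-limiting illustration only, the steps recited in claim 1 may be sketched in Python as follows. The sketch is not taken from the specification: the span fields (`span_id`, `parent_id`, `service`, `duration_ms`), the service names, the repair heuristic, and the function names are all hypothetical, and an in-memory dictionary stands in for the graph-based storage. The critical path is assumed here to be the path with the highest summed mean latency, restricted to edges declared in the dependency graph.

```python
from collections import defaultdict

# Hypothetical infrastructure design: caller service -> services it depends on.
DEPENDENCY_GRAPH = {
    "web": ["checkout", "search"],
    "checkout": ["payments", "inventory"],
}

# Hypothetical raw trace data; span "c" points at a parent that was never reported.
TRACES = [[
    {"span_id": "a", "parent_id": None, "service": "web", "duration_ms": 120.0},
    {"span_id": "b", "parent_id": "a", "service": "checkout", "duration_ms": 80.0},
    {"span_id": "c", "parent_id": "lost", "service": "payments", "duration_ms": 50.0},
    {"span_id": "d", "parent_id": "a", "service": "search", "duration_ms": 30.0},
]]


def repair_trace(spans, dependency_graph):
    """Re-attach spans whose parent is missing, using declared dependencies."""
    by_id = {s["span_id"]: s for s in spans}
    repaired = []
    for span in spans:
        span = dict(span)  # copy so the raw data is left untouched
        parent = span["parent_id"]
        if parent is not None and parent not in by_id:
            # Broken link: look for a span whose service is declared to call this one.
            candidates = [
                s["span_id"] for s in spans
                if span["service"] in dependency_graph.get(s["service"], [])
            ]
            if candidates:
                span["parent_id"] = candidates[0]
        repaired.append(span)
    return repaired


def build_flow_graph(traces, dependency_graph):
    """Aggregate repaired traces into (caller, callee) edges with mean latency."""
    totals = defaultdict(lambda: [0.0, 0])
    for spans in traces:
        spans = repair_trace(spans, dependency_graph)
        by_id = {s["span_id"]: s for s in spans}
        for s in spans:
            parent = by_id.get(s["parent_id"])
            if parent is not None:
                entry = totals[(parent["service"], s["service"])]
                entry[0] += s["duration_ms"]
                entry[1] += 1
    return {edge: total / count for edge, (total, count) in totals.items()}


def critical_path(flow_graph, root, dependency_graph):
    """Return (latency, path) for the slowest path from root over declared edges."""
    children = defaultdict(list)
    for (src, dst), latency in flow_graph.items():
        if dst in dependency_graph.get(src, []):
            children[src].append((dst, latency))

    def longest(node, seen):
        best = (0.0, [node])
        for child, latency in children[node]:
            if child in seen:
                continue  # guard against cycles
            sub_latency, sub_path = longest(child, seen | {child})
            if latency + sub_latency > best[0]:
                best = (latency + sub_latency, [node] + sub_path)
        return best

    return longest(root, {root})


flow_graph = build_flow_graph(TRACES, DEPENDENCY_GRAPH)
print(critical_path(flow_graph, "web", DEPENDENCY_GRAPH))
```

In this example the orphaned `payments` span is re-attached under `checkout` because the dependency graph declares that relationship, and the resulting path `web`, `checkout`, `payments` is reported as the critical path. A production system would replace the dictionary with queries against an actual graph database.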