| CPC G06N 3/08 (2013.01) [G06F 9/3877 (2013.01); G06F 9/4843 (2013.01); G06F 9/5066 (2013.01); G06F 18/29 (2023.01); G06N 5/04 (2013.01); G06F 2209/5019 (2013.01); G06F 2209/502 (2013.01)] | 21 Claims |

|
1. A method for executing a directed graph, wherein each step is conducted by at least one processor, comprising:
receiving at least two batches of indices including a first batch and a second batch, wherein the at least two batches of indices, when used to access a set of embeddings: (i) provide at least two batches of embedding outputs which correspond to the at least two batches of indices; and (ii) execute a layer of the directed graph, wherein each index of the first batch is applied to the layer of the directed graph prior to each index of the second batch being applied to the layer of the directed graph;
accessing the set of embeddings using the at least two batches of indices;
rearranging, based on a set of latencies for the accessing step, the at least two batches of embedding outputs into at least two batches of rearranged embedding outputs including a first rearranged batch and a second rearranged batch; and
providing the at least two batches of rearranged embedding outputs to a subsequent layer of the directed graph, wherein the first rearranged batch is entirely provided to the subsequent layer of the directed graph before the second rearranged batch is provided to the subsequent layer of the directed graph.
|
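The steps recited in claim 1 can be illustrated with a minimal sketch in Python. It is not the claimed implementation. The claim says only that rearrangement is "based on a set of latencies"; the policy used here (stable sort by per-index latency, then regrouping into batches of the original size) is an assumption chosen for illustration. All names (`access_embeddings`, `rearrange_by_latency`, `run_subsequent_layer`, the toy table, the latency values) are hypothetical, and the "subsequent layer" is stood in for by a simple sum over each embedding.

```python
from typing import Dict, List, Sequence, Tuple

Embedding = Tuple[float, ...]
# One record per accessed index: (source batch id, index, embedding, latency).
Record = Tuple[int, int, Embedding, int]


def access_embeddings(
    table: Dict[int, Embedding],
    latencies: Dict[int, int],
    batches: Sequence[Sequence[int]],
) -> List[Record]:
    """Access the set of embeddings using every batch of indices.

    The latency of each access is read from a lookup table here; in a
    real system it would be observed (assumption for this sketch).
    """
    records: List[Record] = []
    for batch_id, batch in enumerate(batches):
        for index in batch:
            records.append((batch_id, index, table[index], latencies[index]))
    return records


def rearrange_by_latency(records: List[Record], batch_size: int) -> List[List[Record]]:
    """Regroup embedding outputs so that low-latency outputs come first.

    Assumed policy: stable sort by latency, then cut into batches.
    """
    ordered = sorted(records, key=lambda record: record[3])
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def run_subsequent_layer(rearranged: List[List[Record]]) -> List[List[float]]:
    """Provide each rearranged batch, in order and in full, to the next layer.

    The layer is a placeholder that sums the components of each embedding.
    """
    outputs: List[List[float]] = []
    for batch in rearranged:  # the first batch is consumed entirely before the second
        outputs.append([sum(embedding) for (_, _, embedding, _) in batch])
    return outputs


# Toy data (hypothetical): four embeddings with differing access latencies.
table = {0: (1.0, 0.0), 1: (0.0, 2.0), 2: (2.0, 2.0), 3: (3.0, 5.0)}
latencies = {0: 5, 1: 1, 2: 1, 3: 7}
batches = [[0, 1], [2, 3]]  # first batch, second batch

records = access_embeddings(table, latencies, batches)
rearranged = rearrange_by_latency(records, batch_size=2)
rearranged_indices = [[index for (_, index, _, _) in batch] for batch in rearranged]
layer_outputs = run_subsequent_layer(rearranged)
```

In this toy run, indices 1 and 2 have the lowest latency, so they form the first rearranged batch even though they arrived in different input batches. Indices 0 and 3 form the second. The subsequent layer therefore does not wait on the slowest access of the original first batch before it starts work.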