US 12,321,855 B2
Graph execution using access request response dynamic batch assembly
Ljubisa Bajic, Toronto (CA); Davor Capalija, Toronto (CA); Ivan Matosevic, Toronto (CA); and Alex Cejkov, Toronto (CA)
Assigned to Tenstorrent AI ULC, Toronto (CA)
Filed by Tenstorrent AI ULC, Toronto (CA)
Filed on Apr. 2, 2021, as Appl. No. 17/221,469.
Prior Publication US 2022/0318614 A1, Oct. 6, 2022
Int. Cl. G06F 9/44 (2018.01); G06F 9/38 (2018.01); G06F 9/48 (2006.01); G06F 9/50 (2006.01); G06F 18/20 (2023.01); G06N 3/08 (2023.01); G06N 5/04 (2023.01)
CPC G06N 3/08 (2013.01) [G06F 9/3877 (2013.01); G06F 9/4843 (2013.01); G06F 9/5066 (2013.01); G06F 18/29 (2023.01); G06N 5/04 (2013.01); G06F 2209/5019 (2013.01); G06F 2209/502 (2013.01)]
21 Claims
 
1. A method for executing a directed graph, wherein each step is conducted by at least one processor, comprising:
receiving at least two batches of indices including a first batch and a second batch, wherein the at least two batches of indices, when used to access a set of embeddings: (i) provide at least two batches of embedding outputs which correspond to the at least two batches of indices; and (ii) execute a layer of the directed graph, wherein each index of the first batch is applied to the layer of the directed graph prior to each index of the second batch being applied to the layer of the directed graph;
accessing the set of embeddings using the at least two batches of indices;
rearranging, based on a set of latencies for the accessing step, the at least two batches of embedding outputs into at least two batches of rearranged embedding outputs including a first rearranged batch and a second rearranged batch; and
providing the at least two batches of rearranged embedding outputs to a subsequent layer of the directed graph, wherein the first rearranged batch is entirely provided to the subsequent layer of the directed graph before the second rearranged batch is provided to the subsequent layer of the directed graph.
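The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that all access requests are issued at once, that each index has a known, fixed access latency, and that rearranged batches are filled in order of response arrival. The names assemble_batches, latency, and batch_size, and all sample values, are hypothetical.

```python
from typing import Dict, List, Tuple

Embedding = List[float]


def assemble_batches(
    index_batches: List[List[int]],
    embeddings: Dict[int, Embedding],
    latency: Dict[int, int],
    batch_size: int,
) -> List[List[Tuple[int, Embedding]]]:
    """Rearrange embedding outputs into batches ordered by access latency.

    Assumption (not from the claim): responses are grouped strictly by
    arrival time, with ties broken by the original request order.
    """
    # Receiving step: flatten so every index of the first batch precedes
    # every index of the second batch.
    requests = [idx for batch in index_batches for idx in batch]

    # Accessing step: look up each embedding and record its simulated latency.
    responses = [
        (latency[idx], order, idx, embeddings[idx])
        for order, idx in enumerate(requests)
    ]

    # Rearranging step: order responses by latency, then by request order.
    responses.sort(key=lambda r: (r[0], r[1]))

    # Cut the reordered responses into fixed-size rearranged batches.
    rearranged = []
    for start in range(0, len(responses), batch_size):
        chunk = responses[start:start + batch_size]
        rearranged.append([(idx, emb) for _, _, idx, emb in chunk])
    return rearranged


# Hypothetical data: indices 1 and 2 are fast (e.g. held locally),
# indices 0 and 3 are slow (e.g. fetched from remote memory).
embeddings = {0: [0.1, 0.2], 1: [0.3, 0.4], 2: [0.5, 0.6], 3: [0.7, 0.8]}
latency = {0: 5, 1: 1, 2: 1, 3: 4}
index_batches = [[0, 1], [2, 3]]

rearranged = assemble_batches(index_batches, embeddings, latency, batch_size=2)

# Providing step: each rearranged batch is handed over whole before the next.
for batch in rearranged:
    print([idx for idx, _ in batch])
```

In this sketch the fast responses for indices 1 and 2 form the first rearranged batch, so the subsequent layer can start without waiting for the slow access to index 0, which was in the original first batch.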