US 12,271,802 B2
Computing system for implementing artificial neural network models and method for implementing artificial neural network models
Tianchan Guan, Shanghai (CN); Shengcheng Wang, Shanghai (CN); Dimin Niu, San Mateo, CA (US); and Hongzhong Zheng, Los Gatos, CA (US)
Assigned to ALIBABA DAMO (HANGZHOU) TECHNOLOGY CO., LTD., Hangzhou (CN)
Filed by Alibaba Damo (Hangzhou) Technology Co., Ltd., Zhejiang (CN)
Filed on Mar. 18, 2022, as Appl. No. 17/698,648.
Claims priority of application No. 202111345697.2 (CN), filed on Nov. 15, 2021.
Prior Publication US 2023/0153570 A1, May 18, 2023
Int. Cl. G06K 9/00 (2022.01); G06N 3/04 (2023.01); G06V 10/77 (2022.01); G06V 10/82 (2022.01); G06V 10/94 (2022.01)

CPC G06N 3/04 (2013.01) [G06V 10/7715 (2022.01); G06V 10/82 (2022.01); G06V 10/955 (2022.01)]

20 Claims

20. A computing system for implementing an artificial neural network model having multiple layers comprising first and second layers, first layer input data being provided as input for computations of the first layer to generate first layer output data, the first layer output data being used for computations of the second layer, and the computing system comprising:

first and second processing units configured to jointly perform computing operations of the first layer in parallel, wherein:

the first processing unit is configured to generate a first portion of the first layer output data based on a first portion of the first layer input data;

the second processing unit is configured to generate a second portion of the first layer output data in parallel to the first portion of the first layer output data based on a second portion of the first layer input data; and

the second processing unit comprises a structure that is same as a structure of the first processing unit; and

a third processing unit configured to perform the computing operations of the second layer using the first and second portions of the first layer output data as input, the third processing unit comprising a structure same as the structures of the first and second processing units,

wherein the computing system is configured to arrange the first processing unit, the second processing unit, and the third processing unit to improve performance and/or hardware utilization of the computing system when running the artificial neural network model.
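
The arrangement recited in claim 20 can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the patented implementation: the names ProcessingUnit, first_layer, second_layer, and run_model are hypothetical, the two layers are toy functions, and software threads stand in for hardware processing units. All three units are instances of the same class, mirroring the requirement that they share the same structure.

```python
from concurrent.futures import ThreadPoolExecutor


class ProcessingUnit:
    """Hypothetical processing unit; every instance has the same structure."""

    def __init__(self, name):
        self.name = name

    def run(self, layer, rows):
        # Apply one layer's computation to the rows assigned to this unit.
        return [layer(row) for row in rows]


def first_layer(row):
    # Toy first layer (assumption): scale each feature, then apply ReLU.
    return [max(0.0, 2.0 * x) for x in row]


def second_layer(row):
    # Toy second layer (assumption): sum the features of one row.
    return sum(row)


def run_model(rows):
    # Three structurally identical units: two share layer 1, one runs layer 2.
    pu1, pu2, pu3 = (ProcessingUnit(n) for n in ("pu1", "pu2", "pu3"))

    # Split the first layer input data into a first and a second portion.
    mid = len(rows) // 2
    first_part, second_part = rows[:mid], rows[mid:]

    # pu1 and pu2 generate their portions of the first layer output in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(pu1.run, first_layer, first_part)
        f2 = pool.submit(pu2.run, first_layer, second_part)
        out1, out2 = f1.result(), f2.result()

    # pu3 performs the second layer on both portions of the first layer output.
    return pu3.run(second_layer, out1 + out2)


result = run_model([[1.0, -2.0], [3.0, 4.0], [-1.0, 0.5], [2.0, 2.0]])
print(result)
```

In this sketch the first layer is split by rows because each row is processed independently, so the two portions of output can be concatenated before the second layer. How a real system partitions the data and assigns units to layers to improve performance or hardware utilization is not specified by this example.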