US 12,223,561 B2
Apparatus and method for analyzing graphics processor performance based on graphics processor analytical model
Youngsok Kim, Seoul (KR); Jounghoo Lee, Seoul (KR); Yeonan Ha, Seoul (KR); and Suhyun Lee, Seoul (KR)
Assigned to UNIVERSITY INDUSTRY FOUNDATION, YONSEI UNIVERSITY, Seoul (KR)
Filed by University Industry Foundation, Yonsei University, Seoul (KR)
Filed on Dec. 21, 2022, as Appl. No. 18/086,151.
Claims priority of application No. 10-2022-0151838 (KR), filed on Nov. 14, 2022.
Prior Publication US 2024/0161228 A1, May 16, 2024
Int. Cl. G06T 1/60 (2006.01); G06F 11/34 (2006.01); G06F 18/23213 (2023.01); G06T 1/20 (2006.01)
CPC G06T 1/60 (2013.01) [G06T 1/20 (2013.01); G06F 11/3447 (2013.01); G06F 11/3461 (2013.01); G06F 18/23213 (2023.01)]
13 Claims
 
1. A graphics processor performance analysis apparatus based on a graphics processor analytical model comprising:
a model definition unit configured to define a graphics processor core model for identifying structural stalls of computing and a memory, a data stall of the memory, and an idle stall;
a simulation execution unit configured to simulate an operation of a specific graphics processing unit (GPU) by using architecture parameters of a GPU application and the specific GPU as inputs on a basis of the graphics processor core model; and
a performance analysis unit configured to receive an output of the graphics processor core model as a result of the simulation and analyze performance of the specific GPU,
wherein the simulation execution unit is further configured to divide and define a total cycle of a sub-core as a second active cycle and a second idle cycle through a sub-core model of the graphics processor core model, calculate the second idle cycle through a difference between the second active cycles of a specific sub-core and a longest executing sub-core in a same streaming multiprocessor (SM), and divide and define the second active cycle as first to third partial active cycles,
wherein the simulation execution unit is further configured to apply a condition that all instructions require a single cycle and no hazard occurs, to calculate the first partial active cycle on a basis of a number of cycles required for the sub-core, and
wherein the model definition unit, the simulation execution unit, and the performance analysis unit are each implemented via at least one processor.
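
The sub-core cycle accounting recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that the three partial active cycles add up to the second active cycle, and that under the single-cycle, no-hazard condition the first partial active cycle equals the number of issued instructions divided by an issue width. The names `SubCore`, `issue_width`, `second_partial`, and `third_partial` are hypothetical; the claim does not say how the second and third partial active cycles are obtained, so they are plain inputs here.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class SubCore:
    # Number of instructions issued by this sub-core (hypothetical input).
    instructions: int
    # Second and third partial active cycles. The claim does not define how
    # they are computed, so this sketch takes them as given values.
    second_partial: int = 0
    third_partial: int = 0


def first_partial_active(sc: SubCore, issue_width: int = 1) -> int:
    """Cycles needed if every instruction takes one cycle and no hazard occurs.

    Assumption: the sub-core issues `issue_width` instructions per cycle.
    """
    return ceil(sc.instructions / issue_width)


def active_cycles(sc: SubCore, issue_width: int = 1) -> int:
    """Second active cycle, assumed to be the sum of the three partial cycles."""
    return first_partial_active(sc, issue_width) + sc.second_partial + sc.third_partial


def sm_breakdown(sub_cores: list[SubCore], issue_width: int = 1) -> list[dict]:
    """Split each sub-core's total cycle into active and idle cycles.

    The idle cycle is the gap between a sub-core's active cycle and that of
    the longest-executing sub-core in the same streaming multiprocessor (SM).
    """
    active = [active_cycles(sc, issue_width) for sc in sub_cores]
    longest = max(active)
    return [{"active": a, "idle": longest - a, "total": longest} for a in active]


if __name__ == "__main__":
    # Two sub-cores of one SM with made-up illustrative numbers.
    sm = [SubCore(100, second_partial=20, third_partial=5),
          SubCore(80, second_partial=10)]
    for i, row in enumerate(sm_breakdown(sm)):
        print(i, row)
```

In this sketch every sub-core of an SM ends up with the same total cycle, because the idle cycle pads each sub-core up to the longest-executing one.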