CPC G06F 9/3861 (2013.01) [G06F 9/3016 (2013.01); G06F 9/3834 (2013.01); G06F 9/3838 (2013.01); G06F 9/3867 (2013.01); G06F 9/3889 (2013.01)] | 18 Claims |
1. A method of processing instructions in a parallel processing unit comprising a plurality of instruction pipelines, the method comprising:
tracking data hazards using a plurality of counters, the plurality of counters comprising a first set of counters associated with high latency data hazards and a second set of counters associated with low latency data hazards; and
at an instruction decoder:
receiving an instruction for execution that indicates (i) whether the instruction is a primary instruction from which at least one other instruction is dependent and if so, a counter of the plurality of counters the primary instruction is associated with and (ii) whether the instruction is a secondary instruction and if so, the counters of the plurality of counters associated with the primary instructions from which the instruction depends,
determining whether the instruction is a secondary instruction,
if it is determined that the instruction is a secondary instruction, determining from the counters associated with the primary instructions from which the instruction depends, whether the instruction relates to at least one high latency data hazard,
if it is determined that the instruction relates to at least one high latency data hazard, determining from the plurality of counters whether each high latency data hazard related to the instruction has been resolved,
if it is determined that at least one high latency data hazard related to the instruction has not been resolved, causing the instruction to be de-scheduled until each high latency data hazard related to the instruction has been resolved, and
if it is determined that the instruction does not relate to at least one high latency data hazard or that all of the high latency data hazards related to the instruction have been resolved, forwarding the instruction to a queue preceding an appropriate instruction pipeline where the instruction stalls until all low latency data hazards related to the instruction have been resolved.
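The decoder steps recited above can be illustrated with a minimal sketch. The claim does not specify the details used here, so the following are all assumptions made for illustration only: the split of counter IDs between the high latency and low latency sets, the convention that a counter value of zero means the hazard has been resolved, and the names `HazardTracker`, `begin`, `complete`, `decode` and `can_issue`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical layout: counter IDs 0..1 form the first set (high latency
# hazards) and IDs 2..3 form the second set (low latency hazards).
NUM_HIGH = 2
NUM_LOW = 2


@dataclass
class Instruction:
    name: str
    # Counter associated with this instruction if it is a primary instruction.
    primary_counter: Optional[int] = None
    # Counters of the primary instructions this instruction depends on;
    # a non-empty tuple marks it as a secondary instruction.
    depends_on: Tuple[int, ...] = ()


class HazardTracker:
    """Tracks data hazards with two sets of counters (assumed: 0 == resolved)."""

    def __init__(self) -> None:
        self.counters = [0] * (NUM_HIGH + NUM_LOW)

    def is_high_latency(self, counter_id: int) -> bool:
        return counter_id < NUM_HIGH

    def resolved(self, counter_id: int) -> bool:
        return self.counters[counter_id] == 0

    def begin(self, instr: Instruction) -> None:
        # Assumed update rule: a primary instruction raises its counter.
        if instr.primary_counter is not None:
            self.counters[instr.primary_counter] += 1

    def complete(self, instr: Instruction) -> None:
        # Assumed update rule: the counter drops when the result is available.
        if instr.primary_counter is not None:
            self.counters[instr.primary_counter] -= 1


def decode(instr: Instruction, tracker: HazardTracker) -> str:
    """Decoder decision: de-schedule on unresolved high latency hazards,
    otherwise forward to the queue preceding the pipeline."""
    if instr.depends_on:  # secondary instruction
        high = [c for c in instr.depends_on if tracker.is_high_latency(c)]
        if any(not tracker.resolved(c) for c in high):
            return "descheduled"
    return "queued"


def can_issue(instr: Instruction, tracker: HazardTracker) -> bool:
    """Queue check: the instruction stalls until all of its low latency
    hazards have been resolved."""
    low = [c for c in instr.depends_on if not tracker.is_high_latency(c)]
    return all(tracker.resolved(c) for c in low)
```

In this sketch, a secondary instruction that depends on counter 0 (high latency) and counter 2 (low latency) is de-scheduled while counter 0 is non-zero. Once counter 0 returns to zero it is forwarded to the queue, where it stalls until counter 2 also returns to zero.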
|