US 12,131,155 B2
Apparatus and method for speculatively vectorising program code
Peng Sun, Cambridge (GB); Timothy Martin Jones, Cambridge (GB); and Giacomo Gabrielli, Cambridge (GB)
Assigned to Arm Limited, Cambridge (GB); and The Chancellor, Masters and Scholars of the University of Cambridge, Cambridge (GB)
Appl. No. 17/597,134
Filed by ARM LIMITED, Cambridge (GB); and THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF CAMBRIDGE, Cambridge (GB)
PCT Filed Mar. 25, 2020, PCT No. PCT/GB2020/050798
§ 371(c)(1), (2) Date Dec. 27, 2021,
PCT Pub. No. WO2021/001641, PCT Pub. Date Jan. 7, 2021.
Claims priority of application No. 1909465 (GB), filed on Jul. 1, 2019.
Prior Publication US 2022/0236990 A1, Jul. 28, 2022
Int. Cl. G06F 9/30 (2018.01); G06F 9/345 (2018.01); G06F 9/355 (2018.01); G06F 9/38 (2018.01)
CPC G06F 9/30036 (2013.01) [G06F 9/3004 (2013.01); G06F 9/3555 (2013.01); G06F 9/3842 (2013.01)] 26 Claims
 
1. An apparatus comprising:
processing circuitry to execute program code, the program code including an identified code region comprising at least a plurality of speculative vector memory access instructions, where execution of each speculative vector memory access instruction is employed to perform speculative vectorisation of a series of scalar memory access operations using a plurality of lanes of processing;
tracking storage to maintain, for each speculative vector memory access instruction, tracking information providing an indication of a memory address being accessed within each lane;
checking circuitry to reference the tracking information during execution of the identified code region by the processing circuitry, in order to detect any inter lane memory hazard resulting from the execution of the plurality of speculative vector memory access instructions;
a status storage element to maintain an indication of each lane for which the checking circuitry determines an inter lane memory hazard of at least a first type; and
replay determination circuitry arranged, when an end of the identified code region is reached, to be responsive to the status storage element identifying at least one lane as having an inter lane memory hazard, to trigger re-execution of the identified code region for each lane identified by the status storage element; wherein
the processing circuitry is arranged to perform out-of-order (OOO) processing of instructions;
the apparatus comprises at least one OOO tracking structure having tracking entries to track memory hazards introduced by instruction reordering; and
the tracking entries in the at least one OOO tracking structure are augmented such that the tracking storage useable by the checking circuitry to detect the inter lane memory hazards is incorporated within the at least one OOO tracking structure, to allow the checking circuitry to detect, using the tracking entries in a given OOO tracking structure, both:
memory hazards introduced by instruction reordering; and
the inter lane memory hazards that occur due to allocating the scalar memory access operations within the series to different lanes of processing.
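
The interplay of the tracking storage, checking circuitry, status storage element and replay determination recited in claim 1 can be illustrated with a small software model. This is a rough sketch only, not the claimed hardware: the names `TrackingEntry`, `HazardTracker`, `record` and `lanes_to_replay` are hypothetical, addresses are plain integers, lane order is assumed to follow the original scalar iteration order, and the only hazard type modelled is a read-after-write conflict in which a higher lane loaded an address that a lower lane subsequently stores to. The OOO tracking structures and the re-execution itself are not modelled.

```python
from dataclasses import dataclass


@dataclass
class TrackingEntry:
    """One entry per speculative vector memory access (hypothetical layout)."""
    is_store: bool
    lane_addrs: list  # address accessed by each lane, in scalar-iteration order


class HazardTracker:
    """Software stand-in for tracking storage, checking logic and status bits."""

    def __init__(self, num_lanes):
        self.num_lanes = num_lanes
        self.entries = []                        # tracking storage
        self.replay_mask = [False] * num_lanes   # status storage: one bit per lane

    def record(self, is_store, lane_addrs):
        """Record a speculative vector access and check it against earlier ones."""
        new = TrackingEntry(is_store, list(lane_addrs))
        if is_store:
            # Assumed hazard type: a store in lane i hits an address that an
            # earlier vector load already read in a higher lane j. In scalar
            # order, iteration j should have observed iteration i's store.
            for old in self.entries:
                if old.is_store:
                    continue
                for i, st_addr in enumerate(new.lane_addrs):
                    for j in range(i + 1, self.num_lanes):
                        if old.lane_addrs[j] == st_addr:
                            self.replay_mask[j] = True
        self.entries.append(new)

    def lanes_to_replay(self):
        """At the end of the region, list the lanes flagged for re-execution."""
        return [lane for lane, flagged in enumerate(self.replay_mask) if flagged]


# Scalar loop being vectorised (illustrative): a[idx[i]] += 1, idx = [5, 7, 5, 9]
tracker = HazardTracker(num_lanes=4)
tracker.record(is_store=False, lane_addrs=[5, 7, 5, 9])  # speculative vector load
tracker.record(is_store=True, lane_addrs=[5, 7, 5, 9])   # speculative vector store
print(tracker.lanes_to_replay())  # → [2]
```

In this example lanes 0 and 2 both touch address 5, so lane 2 loaded a value that lane 0's store should have updated first; only lane 2 is flagged, and lanes 0, 1 and 3 would not need to be re-executed.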