| CPC G06F 12/0802 (2013.01) [G06F 2212/60 (2013.01)] | 41 Claims |

|
1. A microprocessor, comprising:
a prediction unit (PRU) configured to predict a sequence of fetch blocks (FBlks) of architectural instructions of a program instruction stream, wherein each FBlk is specified by a fetch block start address (FBSA), the PRU comprising:
branch history state (BHS);
one or more branch predictors configured to provide, in response to a lookup of an FBSA in combination with the BHS, information usable by the PRU to predict branch history update information (BHUI) produced by the FBlk specified by the FBSA; and
a buffer for storing accumulated BHUI; and
a macro-op (MOP) cache (MOC) comprising MOC entries (MEs) that hold decoded MOPs of architectural instructions of FBlks, wherein the MEs include an unrolled loop multi-FBlk ME (ULP-MF-ME) that is built based on previously observed occurrences of a loop on a loop body that comprises one or more FBlks;
wherein the PRU is configured to perform a method comprising:
detecting a hit in the MOC on the ULP-MF-ME, wherein the hit on the ULP-MF-ME predicts a current occurrence of a loop on the loop body in the program instruction stream;
in parallel, for N initial iterations of the loop during the current occurrence, wherein N is an integer greater than zero:
updating the BHS with BHUI produced by each FBlk of the N initial iterations, wherein the BHUI produced by each FBlk is predicted using information provided in response to a lookup in the branch predictors; and
accumulating into the buffer the BHUI produced by each FBlk of the N initial iterations; and
using the accumulated BHUI from the buffer to update the BHS to reflect prediction of BHUI produced by subsequent iterations of the loop during the current occurrence rather than performing lookups in the branch predictors to update the BHS for the subsequent iterations.
|
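The method recited in claim 1 can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the claimed hardware: the BHS is modeled as a 16-bit shift register, each BHUI as a `(bits, nbits)` pair, and the names `PredictionUnit`, `toy_predictor`, `initial_iterations`, and `replay` are hypothetical. The claim performs the BHS update and the accumulation in parallel; the model does them sequentially. The toy predictor ignores the BHS so that replayed BHUI can be compared directly with looked-up BHUI.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

HISTORY_BITS = 16  # assumed BHS width; the claim does not specify one

BHUI = Tuple[int, int]  # (outcome bits, number of bits), a hypothetical encoding


@dataclass
class PredictionUnit:
    # predictor(fbsa, bhs) -> BHUI; stands in for the branch predictors
    predictor: Callable[[int, int], BHUI]
    bhs: int = 0                                       # branch history state
    buffer: List[BHUI] = field(default_factory=list)   # accumulated BHUI
    lookups: int = 0                                   # predictor lookups performed

    def _update_bhs(self, bhui: BHUI) -> None:
        """Shift the BHUI bits into the history register."""
        bits, nbits = bhui
        mask = (1 << HISTORY_BITS) - 1
        self.bhs = ((self.bhs << nbits) | (bits & ((1 << nbits) - 1))) & mask

    def initial_iterations(self, loop_body: List[int], n: int) -> None:
        """For N initial iterations: look up each FBlk, update BHS, accumulate."""
        for _ in range(n):
            for fbsa in loop_body:
                self.lookups += 1
                bhui = self.predictor(fbsa, self.bhs)
                self._update_bhs(bhui)
                self.buffer.append(bhui)

    def replay(self, times: int) -> None:
        """Subsequent iterations: update BHS from the buffer, with no lookups."""
        for _ in range(times):
            for bhui in self.buffer:
                self._update_bhs(bhui)


def toy_predictor(fbsa: int, bhs: int) -> BHUI:
    """Hypothetical predictor: FBlk 0x100 ends in a taken branch, others not."""
    return (1, 1) if fbsa == 0x100 else (0, 1)


loop_body = [0x100, 0x140]  # two FBlks per loop iteration (made-up addresses)

# After a hit on the ULP-MF-ME: one looked-up iteration, then three replayed.
pu = PredictionUnit(toy_predictor)
pu.initial_iterations(loop_body, n=1)
pu.replay(times=3)

# Reference: all four iterations use predictor lookups.
ref = PredictionUnit(toy_predictor)
ref.initial_iterations(loop_body, n=4)
```

In this model both units end with the same BHS, but `pu` performs lookups only for the initial iteration while `ref` performs one lookup per FBlk for every iteration. With N greater than one, the sketch replays the whole buffer, so each call step of `replay` covers N iterations; that grouping is an assumption of the model and is not stated in the claim.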