Speculative Execution and Reducing Branch Penalty on a Superscalar Processor

Hideki ANDO
Hirohisa MACHIDA
Tetsuya HARA

IEICE TRANSACTIONS on Electronics   Vol.E76-C    No.7    pp.1080-1093
Publication Date: 1993/07/25
Online ISSN: 
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Issue on New Architecture LSIs)
Category: Improved Binary Digital Architectures
superscalar,  VLIW,  speculative execution,  

Full Text: PDF>>
Buy this Article

Superscalar processors improve performance by exploiting instruction-level parallelism (ILP). ILP in a basic block is, however, not sufficient on non-numerical applications for gaining substantial speedup. Instructions across branches are required to be executed in parallel to dramatically improve performance. That is, speculative execution is strongly required. Boosting is a general solution to achieving speculative execution. Boosting labels an instruction to be speculatively executed, and the hardware handles side-effects. This paper describes the efficient implementation of boosting in terms of cost/performance trade-offs. Our policy in implementation is beneficial in code scheduling heuristics, penalties imposed by code duplication to maintain program semantics, and area cost. This paper also describes a branch scheme which minimizes branch penalty. Branch delay causes crucial penalties on the performance of superscalar processors since multiple delay slots exist even in a single delay cycle. Our scheme is the fetching of both sequential and target instructions, and either of them is selected on a branch. No delay cycle can be imposed. This scheme is realized by a combination of static code movement and hardware support. As a result, we reduce branch penalty with small cost. Simulation results show that our ideas are highly effective in improving the performance of a superscalar processor.