For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Speculative Execution and Reducing Branch Penalty on a Superscalar Processor
IEICE TRANSACTIONS on Electronics
Publication Date: 1993/07/25
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Issue on New Architecture LSIs)
Category: Improved Binary Digital Architectures
superscalar, VLIW, speculative execution,
Full Text: PDF>>
Superscalar processors improve performance by exploiting instruction-level parallelism (ILP). ILP in a basic block is, however, not sufficient on non-numerical applications for gaining substantial speedup. Instructions across branches are required to be executed in parallel to dramatically improve performance. That is, speculative execution is strongly required. Boosting is a general solution to achieving speculative execution. Boosting labels an instruction to be speculatively executed, and the hardware handles side-effects. This paper describes the efficient implementation of boosting in terms of cost/performance trade-offs. Our policy in implementation is beneficial in code scheduling heuristics, penalties imposed by code duplication to maintain program semantics, and area cost. This paper also describes a branch scheme which minimizes branch penalty. Branch delay causes crucial penalties on the performance of superscalar processors since multiple delay slots exist even in a single delay cycle. Our scheme is the fetching of both sequential and target instructions, and either of them is selected on a branch. No delay cycle can be imposed. This scheme is realized by a combination of static code movement and hardware support. As a result, we reduce branch penalty with small cost. Simulation results show that our ideas are highly effective in improving the performance of a superscalar processor.