A Systolic Array RLS Processor

Takahiro ASAI  Tadashi MATSUMOTO  

IEICE TRANSACTIONS on Communications, Vol.E84-B, No.5, pp.1356-1361
Publication Date: 2001/05/01
Print ISSN: 0916-8516
Type of Manuscript: PAPER
Category: Terrestrial Radio Communications
Keywords: RLS algorithm, channel estimation, QR decomposition, parallel pipelined processing, ASIC


This paper outlines a systolic array recursive least-squares (RLS) processor prototyped primarily for broadband mobile communication applications. To execute the RLS algorithm effectively, the processor uses an orthogonal triangularization technique, known in matrix algebra as QR decomposition, which allows parallel pipelined processing. The processor board comprises 19 application-specific integrated circuit (ASIC) chips, each with approximately one million gates. The processor performs 32-bit fixed-point signal processing, in which one cycle of internal cell signal processing requires approximately 500 ns and boundary cell signal processing requires approximately 80 ns. The processor board can estimate up to 10 parameters, and it takes approximately 35 µs to estimate 10 parameters using 41 known symbols. To evaluate the signal processing performance of the prototyped systolic array processor board, the processing time required to estimate a given number of parameters on the prototyped board was compared with that on a digital signal processing (DSP) board, which performed a standard form of the RLS algorithm. Additionally, we conducted minimum mean-squared error (MMSE) adaptive array in-lab experiments using a complex baseband fading/array response simulator. In terms of parameter estimation accuracy, the processor is found to produce virtually the same results as a conventional software engine using floating-point operations.
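
As an illustration of the QR-decomposition approach mentioned in the abstract, the following is a minimal software sketch of a QR-decomposition-based RLS update built from Givens rotations. It is not the authors' hardware design: it assumes real-valued data and floating-point arithmetic (the prototype uses 32-bit fixed-point and is evaluated with complex baseband signals), and it runs the cells sequentially rather than in a parallel pipeline. The function names (qrd_rls_update, back_substitute, estimate) and the forgetting-factor parameter lam are hypothetical, chosen only for this example. The "boundary cell" step computes the rotation for one row, and the "internal cell" step applies it to the rest of that row.

```python
import math


def qrd_rls_update(R, u, x, d, lam=1.0):
    """Absorb one sample (x, d) into the triangular state (R, u), in place.

    R is an n-by-n upper-triangular matrix (list of lists), u a length-n
    vector, x the input vector, d the desired (known) response, and lam
    an assumed forgetting factor in (0, 1].
    """
    n = len(x)
    x = list(x)  # working copy; it is rotated away as it moves down the rows
    root = math.sqrt(lam)
    for i in range(n):
        # Apply the forgetting factor to the stored row.
        for j in range(i, n):
            R[i][j] *= root
        u[i] *= root

        # Boundary cell: compute the Givens rotation that zeroes x[i].
        rho = math.hypot(R[i][i], x[i])
        if rho == 0.0:
            c, s = 1.0, 0.0
        else:
            c, s = R[i][i] / rho, x[i] / rho
        R[i][i] = rho

        # Internal cells: apply the same rotation along the rest of the row.
        for j in range(i + 1, n):
            r_old = R[i][j]
            R[i][j] = c * r_old + s * x[j]
            x[j] = -s * r_old + c * x[j]

        # The right-hand-side column is rotated like an internal cell.
        u_old = u[i]
        u[i] = c * u_old + s * d
        d = -s * u_old + c * d


def back_substitute(R, u):
    """Solve the upper-triangular system R w = u for the parameter vector w."""
    n = len(u)
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = u[i] - sum(R[i][j] * w[j] for j in range(i + 1, n))
        w[i] = acc / R[i][i]
    return w


def estimate(samples, n, lam=1.0):
    """Estimate n parameters from an iterable of (x, d) training samples."""
    R = [[0.0] * n for _ in range(n)]
    u = [0.0] * n
    for x, d in samples:
        qrd_rls_update(R, u, x, d, lam)
    return back_substitute(R, u)
```

Calling estimate(samples, 10) with 41 (x, d) pairs mirrors the 10-parameter, 41-known-symbol case quoted in the abstract, though only in problem size and not in timing or numerical format. In this sketch R starts at zero, so at least n linearly independent samples are needed before back-substitution is well defined.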