Adaptive Q-Learning Cell Selection Method for Open-Access Femtocell Networks: Multi-User Case

Chaima DHAHRI  Tomoaki OHTSUKI  

Publication
IEICE TRANSACTIONS on Communications   Vol.E97-B   No.8   pp.1679-1688
Publication Date: 2014/08/01
Online ISSN: 1745-1345
DOI: 10.1587/transcom.E97.B.1679
Type of Manuscript: PAPER
Category: Network Management/Operation
Keyword: 
open access femtocell networks, handover, reinforcement learning, Q-learning, fuzzy logic

Summary: 
Open-access femtocell networks assure cellular users of a better and stronger signal. However, because femto base stations (FBSs) have a small range, any movement of the user may trigger a handover, and in a dense environment such handovers are very frequent. To avoid frequent communication disruptions caused by phenomena such as the ping-pong effect, the cell selection method must be effective. Existing selection methods commonly use a measured channel/cell quality metric such as the channel capacity between the user and the target cell. However, the throughput experienced by the user varies over time with the channel condition, i.e., with propagation effects and receiver location, so the conventional approach does not reflect future performance. For cell selection to be efficient, the user's decision needs to depend not only on the current state of the network but also on its possible future states (the horizon). To this end, we implement a learning algorithm that predicts, based on past experience, the cell that will perform best in the future. In this paper, we present a reinforcement learning (RL) framework as a generic solution to the cell selection problem in a non-stationary femtocell network. Without prior knowledge of the environment, it selects a target cell by exploring the past behavior of cells and predicting their potential future states with the Q-learning algorithm. We then extend this proposal with a fuzzy inference system (FIS) that tunes the Q-learning parameters during the learning process to adapt to changes in the environment. Our solution aims to minimize the frequency of handovers without degrading the user experience in terms of channel capacity. Simulation results demonstrate that
· our solution comes very close to the performance of the opportunistic method in terms of capacity, while requiring fewer handovers on average;
· using fuzzy rules achieves better performance, in terms of received reward (capacity) and number of handovers, than fixing the values of the Q-learning parameters.
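
To make the idea of Q-learning-based cell selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-state formulation in which each action is the choice of a cell, an epsilon-greedy exploration rule, a reward equal to a noisy measured capacity, and fixed learning parameters. The cell names, the parameter values, the Gaussian noise model, and the function names are all hypothetical and chosen only for illustration. The fuzzy inference system that tunes the parameters during learning is not reproduced here.

```python
import random


def select_cell(q, epsilon, rng):
    """Epsilon-greedy choice: explore a random cell, else take the highest Q-value."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q))
    return max(q, key=q.get)


def q_update(q, cell, reward, alpha, gamma):
    """Single-state Q-learning step: Q(a) += alpha * (r + gamma * max Q - Q(a))."""
    best_next = max(q.values())
    q[cell] += alpha * (reward + gamma * best_next - q[cell])
    return q[cell]


def simulate(mean_capacity, steps=2000, alpha=0.1, gamma=0.5, epsilon=0.1, seed=0):
    """Run the learner on hypothetical cells and count handovers (cell changes)."""
    rng = random.Random(seed)
    q = {cell: 0.0 for cell in mean_capacity}
    current, handovers = None, 0
    for _ in range(steps):
        cell = select_cell(q, epsilon, rng)
        # A handover happens whenever the selected cell differs from the serving cell.
        if current is not None and cell != current:
            handovers += 1
        current = cell
        # Assumed reward model: capacity is the cell's mean plus Gaussian noise.
        reward = max(0.0, rng.gauss(mean_capacity[cell], 0.5))
        q_update(q, cell, reward, alpha, gamma)
    return q, handovers


if __name__ == "__main__":
    # Hypothetical mean capacities for three femto base stations.
    q_values, handover_count = simulate({"fbs1": 2.0, "fbs2": 5.0, "fbs3": 3.0})
    print(q_values, handover_count)
```

In this sketch, alpha (learning rate), gamma (discount factor), and epsilon (exploration rate) stay fixed for the whole run, which corresponds to the fixed-parameter baseline mentioned in the summary. An adaptive variant would adjust these values between steps.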