On the Concept of "Stability" in Asynchronous Distributed Decision-Making Systems

Tony S. LEE, Sumit GHOSH

Publication
IEICE TRANSACTIONS on Communications   Vol.E83-B   No.5   pp.1023-1038
Publication Date: 2000/05/25
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (IEICE/IEEE Joint Special Issue on Autonomous Decentralized Systems)
Category: Real Time Control
Keyword: 
stability, instability, reliability, catastrophic failure, performance, asynchronous distributed algorithms, complexity, decision-making

Summary: 
Asynchronous, distributed, decision-making (ADDM) systems constitute a special class of distributed problems. They are large, complex systems whose principal elements are geographically dispersed entities that communicate among themselves asynchronously through message passing and are permitted autonomy in local decision-making. A fundamental property of ADDM systems is stability, which refers to their behavior under representative perturbations to their operating environments; such systems are intended to be real, complex, and, to some extent, mission-critical, and are subject to unexpected changes in their operating conditions. ADDM systems are closely related to autonomous decentralized systems (ADS) in their principal elements, the difference being that the characteristics and boundaries of ADDM systems are defined rigorously. This paper introduces the concept of stability in ADDM systems and proposes an intuitive yet practical definition inspired by those used in Control Systems and Physics. A comprehensive stability analysis on an accurate simulation model will provide the necessary assurance, with a high level of confidence, that the system will perform adequately. An ADDM system is defined as stable if, having been initiated in a steady-state, it returns to a steady-state in finite time following a perturbation. Equilibrium, or steady-state, is defined by placing bounds on the measured error in the system. Where the final steady-state is equivalent to the initial one, the system is referred to as strongly stable. If the final steady-state is potentially worse than the initial one, the system is deemed marginally stable. When a system fails to return to steady-state following the perturbation, it is unstable. Perturbations are classified as either changes in the input pattern or changes in one or more environmental characteristics of the system, such as hardware failures.
Thus, the key elements in the study of stability include steady-state, perturbations, and stability. Since the development of rigorous analytical models for most ADDM systems is difficult, if not impossible, the definitions of the key elements proposed in this paper constitute a general framework for investigating stability. For a given ADDM system, the definitions are based on performance indices that must be judiciously identified by the system architect and are likely to be unique to that system. While a comprehensive study of all possible perturbations is too complex and time-consuming, this paper focuses on a key subset of perturbations that are important and are likely to occur with greater frequency. To facilitate the understanding of stability in representative real-world systems, this paper reports the analysis of two basic manifestations of ADDM systems that have been reported in the literature: (i) a decentralized military command and control problem, MFAD, and (ii) a novel distributed algorithm with soft reservation for efficient scheduling and congestion mitigation in railway networks, RYNSORD. Stability analysis of MFAD and RYNSORD yields key stable and unstable conditions.
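
The three-way classification described in the summary can be illustrated with a minimal sketch. It is not the procedure used in the paper; it assumes a sampled trace of a single measured error (a hypothetical performance index), treats "steady-state" as the error staying within a band of width `band` over `window` consecutive samples, and approximates "returns in finite time" by inspecting the tail of a finite trace. The names `is_steady`, `classify_stability`, `band`, `window`, and `tol` are illustrative assumptions.

```python
from statistics import mean


def is_steady(samples, band):
    """Assumed steady-state test: all samples lie within a band of width `band`."""
    return max(samples) - min(samples) <= band


def classify_stability(errors, perturb_at, band, window=5, tol=0.05):
    """Classify one simulated run from its measured-error trace.

    errors     -- measured error per time step (hypothetical performance index)
    perturb_at -- index of the first sample after the perturbation is applied
    band       -- assumed bound on error variation that defines steady-state
    window     -- number of consecutive samples examined for steady-state
    tol        -- assumed tolerance for calling two steady-states equivalent
    """
    if perturb_at < window or len(errors) - perturb_at < window:
        raise ValueError("trace too short around the perturbation")

    before = errors[perturb_at - window:perturb_at]
    after = errors[-window:]

    # The definition only applies to runs initiated in a steady-state.
    if not is_steady(before, band):
        raise ValueError("run must start in steady-state")

    # No steady-state by the end of the trace: treated here as unstable.
    if not is_steady(after, band):
        return "unstable"

    # Final steady-state no worse than the initial one (within tol).
    if mean(after) <= mean(before) + tol:
        return "strongly stable"

    # Settled, but at a worse error level than before the perturbation.
    return "marginally stable"


strong = [1.0] * 5 + [4.0, 3.0, 2.0] + [1.0] * 5
marginal = [1.0] * 5 + [4.0, 3.0] + [2.0] * 5
unstable = [1.0] * 5 + [2.0, 4.0, 8.0, 16.0, 32.0]

for name, trace in [("strong", strong), ("marginal", marginal), ("unstable", unstable)]:
    print(name, "->", classify_stability(trace, perturb_at=5, band=0.2))
```

In practice the error measure, the bounds, and the observation horizon would follow from the performance indices chosen by the system architect for the specific ADDM system; a finite trace can only suggest, not prove, that a system fails to return to steady-state.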