Distributed control of event floods in a large telecom network
Date Issued
01-03-2010
Author(s)
Jagadish, Chundury
Gonsalves, Timothy A.
Abstract
Events in a failing system can be generated so rapidly that they adversely impact the network as well as the network management system (NMS) manager. They may fail to get delivered and critical information may get lost. This problem becomes worse in a large and congested network. Today, in practice, a management station is often flooded with a huge number of redundant events, making it difficult for the operator to process them and take corrective actions. Methods are needed to limit the volume of event transmission and the number of events presented to the operator, while ensuring delivery of important information to the NMS manager. These methods need to take care of the operators' changing needs in monitoring abstraction level for various network elements (NEs), based on time and NE severity state. In this paper we propose novel techniques for distributed control of event floods, by suppressing transient events at the source. These techniques do not add any delay in communicating a failure, while ensuring that only the important events are presented to the operator. Also, the correctness of event state at the NMS is not compromised. Moreover, these methods give flexibility to the operator to dynamically change the abstraction level needed from a network element, and limit the number of events presented to the operator. The implementation of these techniques is tested with real field event traces from various telecom networks. Results show that there is a substantial reduction in the event traffic in the network. Copyright © 2009 John Wiley & Sons, Ltd.
Volume
20