Alarm and Event Handling Philosophy

by Jarrod Chesney on the 23rd Feb 2018

Purpose

Lack of alarm configuration can lead to alarm flooding causing increased outage minutes or Network Operator error.

This document outlines a philosophy of alarm and event configuration that supports the Network Operator’s correct assessment and interpretation of system conditions, resulting in efficient and consistent alarm and event management in Control Systems.

About Events

Any point can be configured to log a change of state. It is important to note that the transmitted changes of state, relating to the source device, are dependent on the protocol being used and the protocol’s configuration.

Systems can be configured to display the state of a point without logging events in the system. A ‘Tap Change In Progress’ indication, used for auto voltage regulation software or manual tap change, provides feedback to the software or operator but would be superfluous in the event logs. In this situation the tap raise/lower controls and ‘Tap Position’ point would log tap change events. Trending ‘Tap Position’ is useful, whereas logging ‘Tap Change In Progress’ is redundant information.

CONITEL Events

Polling slave devices using the CONITEL protocol returns a snapshot of the current state of all points in the group being scanned.

A Momentary Capture Detection (MCD) option allows changes of state between scans to be reported.
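The idea behind MCD can be illustrated with a small conceptual model. This is only a sketch: the class name `McdPoint` and its methods are hypothetical, and it does not represent the CONITEL wire format or any vendor implementation. It simply assumes the device latches a "changed" flag between scans and clears it once it has been reported.

```python
class McdPoint:
    """Hypothetical model of a status point with a latched change flag."""

    def __init__(self, state: bool) -> None:
        self.state = state
        self.mcd = False  # set when the state changes between scans

    def update(self, new_state: bool) -> None:
        """Called by the (simulated) field device whenever the input is sampled."""
        if new_state != self.state:
            self.state = new_state
            self.mcd = True

    def scan(self) -> tuple[bool, bool]:
        """Return (current state, change flag) and clear the flag."""
        result = (self.state, self.mcd)
        self.mcd = False
        return result


point = McdPoint(state=False)
point.update(True)   # contact operates...
point.update(False)  # ...and resets before the next scan
first_scan = point.scan()
second_scan = point.scan()
```

Here the first scan still shows the point in its original state, but the change flag reveals that it operated and reset between scans. The second scan reports no change.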

Blocks of points in the CONITEL scan can be configured to report Sequence Of Events (SOE) time tagging, to the nearest millisecond, for all points within the group. This information is reported independently of the points’ normal scan data.

DNP3 Events

DNP3 manages events based on priority if the classes are configured. A snapshot of the current state of all points can also be polled.

Every change of state of a digital point is logged in the device as an event until it is polled and the buffers are cleared. If the point is configured with time, the SOE time is also stored and transmitted along with the new state.

An analogue point records an event when its value has a change greater than the configured deadband. The correct choice of deadbands for analogue points is critical. Too low a deadband results in flooding the comms paths with events. Analogue points not registering events may have their deadbands configured too high.

Deadband count = (RAW count range / ENG range) * ENG minimum change - 1
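As a worked example of the formula above, the following sketch computes a deadband in raw counts and then shows which samples would be reported as events. The function names, the sample values and the 2000-count / 100-unit scaling are illustrative assumptions, not taken from any particular device. The subtraction of 1 is read here as following from the rule that an event is recorded only when the change is greater than the deadband.

```python
import math


def deadband_count(raw_range: float, eng_range: float, eng_min_change: float) -> int:
    """Deadband in raw counts: RAW range / ENG range * ENG minimum change - 1."""
    if eng_range <= 0:
        raise ValueError("eng_range must be positive")
    counts = raw_range / eng_range * eng_min_change - 1
    return max(0, math.floor(counts))  # never return a negative deadband


def deadband_events(samples: list[int], deadband: int) -> list[int]:
    """Return the samples that differ from the last reported value by more than the deadband."""
    events = []
    last_reported = None
    for value in samples:
        if last_reported is None:
            last_reported = value  # baseline, e.g. from a poll of current values
            continue
        if abs(value - last_reported) > deadband:
            events.append(value)
            last_reported = value
    return events


# Assumed scaling: 2000 raw counts span 100 engineering units,
# and the smallest change worth reporting is 0.5 units.
db = deadband_count(raw_range=2000, eng_range=100, eng_min_change=0.5)
reported = deadband_events([1000, 1005, 1009, 1010, 1012, 1025, 1030], db)
```

With these assumed figures the deadband is 9 counts, so a change of 10 counts (0.5 engineering units) is the smallest change that generates an event.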

About Alarms

Alarms are usually announced on becoming active. Their behaviour from there depends on their configuration. Most alarms need to be acknowledged whether their state returns to normal or not. In an unoptimised system, alarms can be:

meaningless (require no action),
stale (alarm state for over 24 hours) or
chattering (alarm and normal state 3 or more times a minute).
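The stale and chattering definitions above can be expressed as simple checks. This is a minimal sketch: the function names are hypothetical, and "3 or more times a minute" is interpreted here as three or more entries into the alarm state within the last 60 seconds.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=24)       # threshold from the list above
CHATTER_COUNT = 3                       # alarm activations...
CHATTER_WINDOW = timedelta(minutes=1)   # ...within this window


def is_stale(in_alarm_since: datetime, now: datetime) -> bool:
    """True if the point has been in the alarm state for over 24 hours."""
    return now - in_alarm_since > STALE_AFTER


def is_chattering(alarm_times: list[datetime], now: datetime) -> bool:
    """True if the point entered the alarm state 3 or more times in the last minute."""
    recent = [t for t in alarm_times if timedelta(0) <= now - t <= CHATTER_WINDOW]
    return len(recent) >= CHATTER_COUNT


now = datetime(2018, 2, 23, 12, 0, 0)
activations = [now - timedelta(seconds=s) for s in (50, 30, 10)]
chattering = is_chattering(activations, now)
stale = is_stale(now - timedelta(hours=25), now)
```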

These alarms can distract network operators from dealing with critical events. Points that are meaningless, stale or chattering can represent genuine system faults or conditions, but they require correct configuration to ensure proper handling.

A generic alarm returning to NORMAL is an example of a meaningless alarm. It is important that the point disappears from the abnormals list and that the event logs show when it returned to normal, but the Network Coordinator need not drop their current task to acknowledge such an alarm. Meaningless alarms also include authorised controls and points that toggle in relation to another alarm. For example, ‘EF Trip Operated’ does not require an alarm because the ‘CB Open’ alarm already requires the Network Coordinator’s attention. This is referred to as ‘Alarm Consolidating’.

Alarms that are going to remain in the alarms list for longer than a single shift should be managed from the abnormals list and not the alarms list.

Chattering alarms such as ‘Radio Comms Failed’, where a point is failing intermittently and the Network Coordinator has already actioned the alarm for investigation, should use a feature such as ‘Shelving Alarms’ until the point has returned to a stable state. The display of ‘shelved’ or ‘inhibited’ alarms requires constant checking to ensure alarms are re-enabled once the issue has been attended to.

Alarm Consolidating

An optimised system should only have a single alarm per event. These examples do not require an alarm in PoF:

Protection trips, because the Network Coordinator will already be acknowledging the circuit breaker open alarm,
Plant alarms when returning to their normal state,
Authorised or solicited events.
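The three cases above can be sketched as a single decision function. The `PointEvent` fields and the `requires_alarm` name are hypothetical and only illustrate the rule; a real system would express this through its own point configuration. Events that do not alarm are still expected to appear in the event log.

```python
from dataclasses import dataclass


@dataclass
class PointEvent:
    point_name: str
    point_type: str          # e.g. "Protection", "Plant", "Switching Device"
    abnormal: bool           # True if the new state is the point's abnormal state
    solicited: bool = False  # True if the change resulted from an authorised control


def requires_alarm(event: PointEvent) -> bool:
    """Decide whether an event should be announced as an alarm."""
    if event.solicited:
        return False  # authorised or solicited events
    if not event.abnormal:
        return False  # return to normal state
    if event.point_type == "Protection":
        return False  # consolidated into the circuit breaker open alarm
    return True


ef_trip = PointEvent("EF Trip Operated", "Protection", abnormal=True)
cb_open = PointEvent("CB Open", "Switching Device", abnormal=True)
results = [requires_alarm(ef_trip), requires_alarm(cb_open)]
```

In this sketch a protection trip produces exactly one alarm: the ‘CB Open’ alarm.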

Points that alarm must remain alarming until acknowledged by the Network Coordinator, regardless of whether the point returns to its normal state. Analogue limit alarms function differently: a HIGH / LOW alarm will be overwritten by a CRITICAL HIGH / CRITICAL LOW alarm, but the point will not alarm when returning to a previous state.

Alarm Delays

Alarm delays are configured to reduce chattering alarms. A good example is an alarm that floats between the alarm and normal states, such as ‘Radio Comms Failed’, where an intermittent signal occasionally succeeds. A delay ensures that the point has been in the alarm state for the set time before an alarm notifies the Network Coordinator.

Delays can also be useful for analogue alarms, for example requiring the point to be in the HIGH/LOW state for 30 seconds before alarming but alarming in the CRITICAL state immediately (0 seconds). Hysteresis may also be used.
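A delay of this kind can be sketched as follows. The function name and the sample format are assumptions made for illustration. The sketch works on discrete samples, so the alarm is announced at the first sample at which the point has been continuously in the alarm state for at least the configured delay.

```python
from typing import Optional


def first_alarm_time(samples: list[tuple[float, bool]], delay_s: float) -> Optional[float]:
    """Return the time at which the alarm would be announced, or None.

    samples: (timestamp in seconds, in_alarm) pairs in time order.
    """
    entered_at = None  # when the current continuous alarm period began
    for timestamp, in_alarm in samples:
        if not in_alarm:
            entered_at = None  # returning to normal resets the timer
            continue
        if entered_at is None:
            entered_at = timestamp
        if timestamp - entered_at >= delay_s:
            return timestamp
    return None


# An intermittent point: two short failures, then a sustained one.
radio_comms = [(0, True), (10, False), (20, True), (25, False),
               (40, True), (50, True), (70, True)]
announced = first_alarm_time(radio_comms, delay_s=30)
critical = first_alarm_time([(5, True)], delay_s=0)
```

The two short failures never reach the 30 second delay and are not announced. The sustained failure that starts at 40 seconds is announced at 70 seconds. A delay of 0 seconds announces immediately.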

Shelving Alarms

Shelving is a technique for temporarily suppressing alarms. It is used to prevent alarm fatigue by silencing chattering alarms. It is not a long-term or indefinite solution and is intended for points that are malfunctioning.
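One way to keep shelving temporary is to give every shelved point an expiry time. The sketch below assumes this approach; the `AlarmShelf` class and its methods are hypothetical and do not describe any particular product's shelving feature.

```python
class AlarmShelf:
    """Hypothetical shelf that suppresses alarms for a limited time."""

    def __init__(self) -> None:
        self._expiry = {}  # point name -> time (seconds) at which shelving ends

    def shelve(self, point: str, now: float, duration_s: float) -> None:
        self._expiry[point] = now + duration_s

    def unshelve(self, point: str) -> None:
        self._expiry.pop(point, None)

    def is_shelved(self, point: str, now: float) -> bool:
        expiry = self._expiry.get(point)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expiry[point]  # shelving has lapsed; alarms are re-enabled
            return False
        return True

    def shelved_points(self, now: float) -> list[str]:
        """List the points still shelved, for display to the operator."""
        return sorted(p for p in list(self._expiry) if self.is_shelved(p, now))


shelf = AlarmShelf()
shelf.shelve("Radio Comms Failed", now=0, duration_s=3600)
during = shelf.is_shelved("Radio Comms Failed", now=1800)
after = shelf.is_shelved("Radio Comms Failed", now=3600)
```

An automatic expiry does not replace reviewing the shelved list, but it limits how long a forgotten shelf can hide a point.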

Deleting Alarms

Alarms can be deleted from the alarms list. This technique is used to remove stale alarms from the active alarms window. Those points must then be monitored from the abnormals list.

Classing Points

Classing allows the network operator to further filter points by SCADA, NMS or OMS. The class does not appear in the alarm message; it provides a further level of point classification in addition to the point type, which is also used for filtering.

Categorising Points

Point Type	Description
Switching Device	Telemetered circuit breaker, isolator or earth switch.
Analogue	When an analogue value has reached trip settings.
Plant	Plant equipment that is critical to the operation of the network.
Protection	Results of network protection.
Comms	Communications system points.
Load Control	Load control equipment.
System	System Points.

Alarms can be categorised into specific alarm types. The alarm ‘Type’ column is located on the left of the Alarms window and Events window.

Prioritising Alarms and Events

Priority	Description
Critical	Requires immediate action. Severe consequences if not acted on in a timely manner.
High	Requires action. No immediate consequences.
Low	Minor consequences and no action required. Can affect operation of the network.
Event	Event only points.
Diagnostic	System information used by system support engineers. Network information for network operator diagnosis.
Solicited	Authorised point changes.

Frequent inundation with alarms causes alarm fatigue and the possibility of missing legitimate alarms. Two factors determine alarm priority: the severity of the consequences and the response time available to avoid those consequences. The priority of the alarm sets the colour of the alarm type text in the alarms window and a specific sound alert, giving the Network Coordinator an audible and visual representation of the urgency.

About Abnormals

The majority of points in a system will have a normal state for the operation of the network. The abnormals list will generally contain points from parts of the system where plant has failed, plant is temporarily out of service or switching is taking place. The Network Coordinator can better manage points in their abnormal state by location, or by checking which points are in their abnormal state when crews arrive at and leave a work site.
