Control rooms have day-to-day issues and systems that affect the performance of control room operators; the most common are defined as compromises to Situation Awareness. We will review each of the five and offer practical solutions for addressing them.
The first is providing Adequate Information. Control rooms are full of data, but little of it is translated into useful information. One of the first steps is to implement a High-Performance HMI. You may ask what the difference is between what you currently have and a High-Performance HMI, and why we add the term High Performance. We add it because we expect performance to improve when you change from traditional HMI schematics (lots of data, lots of colors, small text, multiple colored lines, black backgrounds, poor coding techniques, poor navigation techniques, and no hierarchy or disciplined coding system) to displays with light backgrounds that minimize glare and allow room lighting to be brighter, as recommended by International Standards (700 Lux), and easy-to-read text and numbers in suitable fonts based on viewing distances and angles. Data should be put into context so an operator can see the full range, the PV, the operating limits, and all the alarm limits associated with each tag.
The term High Performance is used to imply a performance improvement measured by three critical operations:
- The operator’s ability to detect a problem before the alarm limit is reached.
- The operator’s ability to quickly diagnose the problem using minimal control moves.
- The operator’s ability to react and resolve the abnormal situation in a timely manner.
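Detecting a problem before the alarm limit is reached depends on showing each value in context. The sketch below is a minimal Python illustration and is not taken from any HMI product: the `TagContext` class, its field names, the tag name, and the numbers are all hypothetical, and it assumes a single low and a single high alarm limit per tag for brevity. It places a PV within its full range and reports whether it is inside the operating limits, outside them but not yet in alarm, or in alarm.

```python
from dataclasses import dataclass


@dataclass
class TagContext:
    """Hypothetical container for the context shown alongside one tag."""
    name: str
    range_lo: float      # bottom of the full instrument range
    range_hi: float      # top of the full instrument range
    operating_lo: float  # lower operating limit
    operating_hi: float  # upper operating limit
    alarm_lo: float      # low alarm limit
    alarm_hi: float      # high alarm limit

    def percent_of_range(self, pv: float) -> float:
        """Position of the PV within the full range, as 0-100 %."""
        return 100.0 * (pv - self.range_lo) / (self.range_hi - self.range_lo)

    def status(self, pv: float) -> str:
        """Classify the PV against the operating and alarm limits."""
        if pv <= self.alarm_lo or pv >= self.alarm_hi:
            return "in alarm"
        if pv < self.operating_lo or pv > self.operating_hi:
            return "outside operating limits, not yet in alarm"
        return "normal"


# Illustrative numbers only; "LI-100" is a made-up tag.
level = TagContext("LI-100", 0.0, 100.0, 40.0, 60.0, 20.0, 80.0)
for pv in (50.0, 70.0, 85.0):
    print(f"{level.name} PV={pv}: "
          f"{level.percent_of_range(pv):.0f}% of range, {level.status(pv)}")
```

The middle state is the one that matters for early detection: the value has left its normal operating band, but no alarm has sounded yet.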
Another part of Adequate Information is not overwhelming operators with too many alarms. Every alarm should have an operator action, and its documentation should state the cause of the alarm, the expected action to resolve it, and the defined priority and the consequence of no action, so that the operator can prioritize their actions. This covers common issues such as data overload (mainly alarms, but potentially data points too) and misplaced salience, where overuse of bright colors limits the operator’s ability to discern which information on a screen is most important.
A second way to equip operators to respond in a timely manner is to ensure they are alert and not sleepy. This is addressed by industry recommended practices as defined in API 755, which provides guidance on hours-of-service rules, advice on handling special conditions that would exceed those rules through MOC policies, and required rest periods to help reset fatigue. Fatigue countermeasures are also provided to support 24/7 shift workers. One technique that is very popular today is to provide sit/stand consoles, allowing a fatigued operator to stand and pace up and down their area as operators used to in the old days with pneumatic instruments.
A third way is to provide the correct number and hierarchy of displays. Today we have introduced a problem we never had before computer systems came along: the loss of the big picture, the overview of the total system. Many accidents have happened since the introduction of computers because operators get tunnel vision and are not aware that other problems exist which may be more important than the one they are working on, or that the problems they are not seeing contribute directly to the one they are struggling to resolve. A good example is the Texaco Pembroke disaster in the UK, where an operator was struggling to address liquid levels in his towers on one unit while unaware of a stuck valve on another unit that was contributing to the problem.
The fourth way is to provide tools that help operators overcome common Human Capability problems. One of the most dominant is the short-term memory limitation we all battle against: at best we can remember 5 to 7 things, and under stress this is dramatically reduced.
During the early days of resolving alarm management issues, we identified that operators were using some alarms purely as memory prompts because of these short-term memory limitations. Many of these alarms had no defined operator action, only information. One of the rules we enforced to get a handle on alarm growth was that if no operator action was defined, it was not an alarm and would be removed from the system, reducing noise within the alarm system.
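The rule that an alarm without a defined operator action is not an alarm can be sketched as a simple check over the alarm documentation. This is a minimal illustration only: the `AlarmDefinition` fields, the tag names, the example entries, and the convention that priority 1 is most urgent are assumptions, not taken from any particular alarm management tool.

```python
from dataclasses import dataclass


@dataclass
class AlarmDefinition:
    """Hypothetical record of what is documented for one configured alarm."""
    tag: str
    cause: str
    operator_action: str  # empty string means no action was defined
    priority: int         # 1 = most urgent (assumed convention)
    consequence: str      # consequence of taking no action


def rationalize(alarms):
    """Split configured alarms into those kept and those removed.

    Rule from the text: with no defined operator action, it is not an alarm.
    """
    kept = [a for a in alarms if a.operator_action.strip()]
    removed = [a for a in alarms if not a.operator_action.strip()]
    return kept, removed


# Made-up entries for illustration.
configured = [
    AlarmDefinition("LAH-101", "Feed exceeds draw-off",
                    "Reduce feed or open draw-off valve", 2, "Tower floods"),
    AlarmDefinition("XI-205", "Pump started", "", 3, "None"),
]

kept, removed = rationalize(configured)
for alarm in sorted(kept, key=lambda a: a.priority):
    print(f"KEEP   {alarm.tag} (priority {alarm.priority}): {alarm.operator_action}")
for alarm in removed:
    print(f"REMOVE {alarm.tag}: information only, no operator action defined")
```

An entry removed this way does not have to disappear; if operators relied on it as a memory prompt, it is a candidate for a separate alert or reminder mechanism rather than the alarm system.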
Another consequence of our short-term memory limitations was inadequate records in shift handover logbooks. A common practice was for an operator to write up the logbook within the last hour of the shift, describing what happened during that shift. Unfortunately, the short-term memory issue kicked in, and operators forgot to record events.
It was not unusual for an operator to get home and suddenly remember they had not communicated that they had opened a drain valve and that it needed to be closed after emptying and before operations restarted. This was extremely important environmentally and could lead to an excursion. Hence, tools have been developed to capture that type of notification in some form of operator alert or notification system.
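As a rough sketch of what such an operator reminder tool might record, the Python below logs a temporary action when it is taken and lists whatever is still open at handover. The `Reminder` and `ReminderLog` classes, their methods, and the example entries are hypothetical; a real system would persist entries and tie them to tags or equipment.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Reminder:
    """Hypothetical note about a temporary action that must be undone."""
    created: datetime
    text: str
    closed: Optional[datetime] = None  # None while the item is still open


@dataclass
class ReminderLog:
    """Hypothetical in-memory log of reminders for one console."""
    reminders: List[Reminder] = field(default_factory=list)

    def add(self, text: str, when: datetime) -> Reminder:
        reminder = Reminder(created=when, text=text)
        self.reminders.append(reminder)
        return reminder

    def close(self, reminder: Reminder, when: datetime) -> None:
        reminder.closed = when

    def open_items(self) -> List[Reminder]:
        """Items the incoming operator still needs to know about."""
        return [r for r in self.reminders if r.closed is None]


log = ReminderLog()
drain = log.add("Drain valve opened; close after emptying and before restart",
                datetime(2024, 1, 1, 14, 30))
sample = log.add("Sample point left open for lab", datetime(2024, 1, 1, 15, 0))
log.close(sample, datetime(2024, 1, 1, 15, 20))

# At handover, only the drain valve reminder is still outstanding.
for item in log.open_items():
    print(f"{item.created:%H:%M} OPEN: {item.text}")
```

Recording the reminder at the moment the valve is opened means the handover no longer depends on what the outgoing operator happens to recall.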
The shift handover has also been improved: operators fill in the logbook during the shift and not at the end of it. Operators tend to take more notes and use these for the logbook, but companies are moving to electronic logbooks, which capture more information automatically or prompt the operator for an explanation for the shift handover. A good example is “shelving” (sometimes called eclipsing) an alarm that may be out for maintenance. As the operator shelves the alarm, the time, the reason, and the out-of-service period are captured as defined by a MOC policy.
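The shelving example can be sketched as a record that captures the time, the reason, and the out-of-service period when the alarm is shelved. This is an illustration under stated assumptions: `ShelvedAlarm`, `shelve`, and the `max_period` parameter (a stand-in for whatever limit a site's MOC policy sets) are hypothetical names, and the tag and durations are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ShelvedAlarm:
    """Hypothetical logbook entry created when an alarm is shelved."""
    tag: str
    reason: str
    shelved_at: datetime
    out_of_service: timedelta

    @property
    def expires_at(self) -> datetime:
        return self.shelved_at + self.out_of_service

    def is_expired(self, now: datetime) -> bool:
        return now >= self.expires_at


def shelve(tag: str, reason: str, now: datetime,
           out_of_service: timedelta, max_period: timedelta) -> ShelvedAlarm:
    """Create a shelving record, refusing incomplete or over-long requests.

    max_period is an assumed policy limit, not a value from any standard.
    """
    if not reason.strip():
        raise ValueError("a reason is required to shelve an alarm")
    if out_of_service > max_period:
        raise ValueError("requested period exceeds the policy limit")
    return ShelvedAlarm(tag, reason, now, out_of_service)


entry = shelve("PAH-300", "Transmitter out for maintenance",
               now=datetime(2024, 1, 1, 8, 0),
               out_of_service=timedelta(hours=8),
               max_period=timedelta(hours=12))
print(f"{entry.tag} shelved at {entry.shelved_at:%H:%M}, "
      f"returns at {entry.expires_at:%H:%M}: {entry.reason}")
```

Because the reason and expiry are captured at the moment of shelving, the entry can be carried into the shift handover without relying on the operator's memory.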
The fifth way is to avoid taking operators out of the loop (out-of-the-loop syndrome). This was very common during the initial introduction of computer control automation. Operators initially had shared responsibilities: some for outside equipment, and some for monitoring and supervising the control system. What was not considered was that the computer allowed a lot more automation, and there was a big difference between the old pneumatic days and the new computer automation days.
The operator that left the control room was “out-of-the-loop” and had no idea what had happened while they were outside, or if someone else had changed something by making a control move or silencing an alarm. This led to many accidents and near misses, and soon the industry realized that the control system requires permanent monitoring.
However, many bad practices still took operators out of the loop. Here are a few of them:
- Cleaning floors, bathrooms
- Attending meetings away from the control system
- Reading books and newspapers
- Moving to a different position to do IT PC work like logbooks, generating work orders, writing permits
- Smoke breaks and traveling to bathrooms a long way from the control room.
- Talking to operators at other consoles that should have been placed closer together (better adjacency) because they communicate a lot on a day-to-day basis.
- Many other reasons for being taken away from the console, one being a random drug test without a full shift handover to a replacement operator.
- Some operators switch roles partway through a shift: one goes inside while the other goes outside, and the one coming in has little understanding of the previous shift handover or of which control moves have already been made and are still waiting for the process to respond fully.