Communication networks for critical applications
July 2012, IT in Manufacturing
The rapid expansion of PCs and Ethernet networks to provide and control automation in industrial and utility environments has meant that the ideas and concepts behind IT have had to evolve to keep up. A corporate or home environment has much less stringent requirements for the IT infrastructure, so differences exist, particularly in the way the technology has evolved to cater for mission critical networks.
The biggest differentiating factor is the level of stability and uptime required from the network and attached devices. In a corporate environment, failure of the network is generally not a disaster: employees may be denied e-mail and Web access for a while, but unless the problem carries on for days this loss of functionality is more of an annoyance than a critical problem. The same cannot be said for mission critical sites such as industrial or utility sites. In these cases, a failure of the communications network can lead to downtime, possibly of the entire site depending on the severity of the network failure. This leads to loss of production and income, and in some cases, even once the communications network is back online, restarting the plant can be a costly affair. For this type of network, any amount of downtime is a concern and must be avoided.
Mission critical applications and hardened equipment
One of the differences between a mission critical and non-mission critical site is the hardware that must be used. In a corporate environment, PCs and networking hardware do not have to be able to handle extreme environments, and failure of one device is generally not a concern as a replacement can easily be swapped in without too much disruption. This is not the case on a mission critical site, where failure of a single device can cause shutdowns, damage to property and assets, and possibly endanger the lives of personnel. Swapping in spare hardware can also be a much larger undertaking, depending on the effect on the rest of the network.
This has led to an evolution of the hardware for use in these hazardous environments, known as hardened equipment. For instance, in a corporate network, PCs and networking hardware (switches and routers) contain fans for cooling. In a hazardous environment, fans can quickly become clogged and stop working, and the subsequent overheating can in turn result in downtime. Hardened devices should not contain any moving parts such as fans; instead they use internal components designed to withstand higher (and lower) temperatures, and incorporate heat sinks for dissipation.
Another mark of hardened equipment is dual power supplies; if one fails, the redundant supply can pick up the load without the device shutting down. Another evolution is resistance to the effects of EMI (electromagnetic interference). In an environment such as a substation, EMI can be high enough to cause data corruption and even loss of communications.
Hazardous area requirements
On mission critical sites, it is not only the communications hardware that requires hardening; end devices such as PCs must be able to operate in hazardous environments as well. In some cases, vendors have even taken to embedding hardened PCs within the Ethernet hardware. This means that a single device can now provide connectivity and also computing power within the application. It can thus be used for secure authentication, VPN server hosting or network monitoring. This can translate into easier remote access and control, whilst still upholding all the security requirements. A secure remote substation or site that relies on a central server for authentication can be extremely difficult to control and troubleshoot in the event of loss of an uplink to the central control room. Having a secondary backup PC within each substation can be a great help to engineers and technicians in this case, and can save time by assisting with local authentication and troubleshooting.
Network architecture and protocol
It is not only the hardware that has to change. A mission critical network has very different requirements from a corporate one and, as such, requires different planning and configuration. For instance, a corporate network generally focuses on high transfer rates rather than high availability, so the backbone switches will often aspire to be 10 Gigabit, and edge devices will often connect at gigabit speeds. These will often be flat networks with little or no sub-netting for traffic isolation, as it is not generally required. In a mission critical network the opposite is usually true. These networks do not require large amounts of data to cross the network at once, but rather are concerned with the reliability and latency of each individual data packet sent. There are also generally more requirements for separation of data using techniques such as IP sub-netting, layer 1 VLANs and layer 2 multicast controls.
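As an illustration of the IP sub-netting mentioned above, the following is a minimal sketch using Python's standard `ipaddress` module. The address range `10.10.0.0/24` and the traffic group names are assumptions made up for this example; they do not come from any particular site.

```python
import ipaddress

# Hypothetical site range and traffic groups, chosen only for illustration.
site = ipaddress.ip_network("10.10.0.0/24")
groups = ["control", "protection", "monitoring", "engineering"]

# Split the /24 into four /26 subnets, one per traffic group.
plan = dict(zip(groups, site.subnets(new_prefix=26)))

def group_of(address):
    """Return the traffic group whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, subnet in plan.items():
        if ip in subnet:
            return name
    return None

for name, subnet in plan.items():
    # Subtract the network and broadcast addresses to get usable hosts.
    print(f"{name:12s} {subnet}  ({subnet.num_addresses - 2} usable hosts)")
```

Keeping each class of traffic in its own subnet means that routing and filtering rules can be written per group rather than per device.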
Another important factor is path redundancy. Having hardened hardware is the first step, but is of little help if a single cable break can bring down the entire network. In these cases redundancy protocols such as RSTP (Rapid Spanning Tree Protocol) can be considered. These allow for redundant network paths that will be kept inactive until such time as they are needed. RSTP, a common open standard redundancy protocol, provides recovery times (worst case) of less than 30 seconds. In a mission critical network this can still be too much of a delay and so many hardware vendors implement their own proprietary protocols to provide quicker recovery. The solution is to look for a redundancy protocol providing faster recovery times, whilst still being backwards compatible with open standard protocols. This can allow a user to implement the faster recovery on critical network segments, whilst being able to use other hardware (with slower recovery) for less critical segments.
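The idea of a redundant path that stays inactive until it is needed can be sketched in a few lines of Python. This is not an implementation of RSTP or of any vendor protocol; it simply builds a loop-free tree over a hypothetical four-switch ring with a breadth-first search, then recomputes the tree after a simulated cable break. The switch names and the function `active_links` are assumptions for the example.

```python
from collections import deque

def active_links(switches, links, root):
    """Breadth-first search from root; return (links kept active, switches reached)."""
    neighbours = {s: [] for s in switches}
    for a, b in links:
        neighbours[a].append(b)
        neighbours[b].append(a)
    visited, active = {root}, set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for nxt in sorted(neighbours[current]):
            if nxt not in visited:
                visited.add(nxt)
                active.add(frozenset((current, nxt)))
                queue.append(nxt)
    return active, visited

# Hypothetical four-switch ring: one more cable than a loop-free tree needs.
switches = ["SW1", "SW2", "SW3", "SW4"]
ring = [("SW1", "SW2"), ("SW2", "SW3"), ("SW3", "SW4"), ("SW4", "SW1")]

active, reached = active_links(switches, ring, "SW1")
standby = {frozenset(link) for link in ring} - active  # the inactive redundant link

# Simulate a cable break on an active link and recompute the tree.
broken = ("SW1", "SW2")
remaining = [link for link in ring if frozenset(link) != frozenset(broken)]
active_after, reached_after = active_links(switches, remaining, "SW1")

print("standby before break:", [sorted(link) for link in standby])
print("all switches reachable after break:", reached_after == set(switches))
```

In a real network the time taken to detect the break and converge on the new tree is exactly the recovery time the paragraph above is concerned with; the sketch only shows that a spare link makes recovery possible at all.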
There are many protocols and networking functionalities available that are important for critical networks, including some of the following:
* Traffic prioritisation: queuing so that critical traffic is given higher priority across the network. This may be used in a basic format for voice data but is generally not implemented as well as it could be.
* Network monitoring: using SNMP (Simple Network Management Protocol) to assess the status of the network and attend to problems before they cause unnecessary downtime. This may be implemented in a small way on corporate networks, generally more for diagnostics than troubleshooting.
* Faster recovery times: ensuring that redundancy will allow recovery from any mission critical failure in a timely fashion.
* VLANs: layer 1 (port-based) Virtual Local Area Networks allow users to segment data based on the physical port from which it was received.
* Load balancing: MSTP (Multiple Spanning Tree Protocol) allows users to combine redundancy (RSTP) with load balancing. By making use of MSTP, different VLANs can be assigned to different MSTIs (Multiple Spanning Tree Instances). Different MSTIs will use different redundant links for their RSTP region, meaning that at any one time, all cables in the MST region will be sending traffic, but no loops will be created within any VLAN.
* Network traffic control: using protocols such as IGMP (Internet Group Management Protocol), the multicast and broadcast traffic on the network can be controlled, meaning less unwanted data travelling to each end device.
* Static addressing: most corporate networks will run DHCP (Dynamic Host Configuration Protocol) to assign IP addresses to end devices, and will communicate between end devices using host names. In a mission critical environment it is better to use statically assigned IP addressing, as this is much easier to troubleshoot and control.
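The traffic prioritisation item in the list above can be illustrated with a strict-priority queue. This is a minimal sketch, assuming three made-up traffic classes in which a lower number means more urgent; the class `PriorityEgressQueue` and the frame descriptions are hypothetical and do not reflect any particular switch's queuing scheme.

```python
import heapq
from itertools import count

class PriorityEgressQueue:
    """Strict-priority queue: the lowest priority number is sent first."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps arrival order within a class

    def enqueue(self, priority, frame):
        heapq.heappush(self._heap, (priority, next(self._order), frame))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)

# Hypothetical traffic classes; 0 is the most urgent in this sketch.
CONTROL, VOICE, BULK = 0, 1, 2

port = PriorityEgressQueue()
port.enqueue(BULK, "file transfer segment")
port.enqueue(CONTROL, "trip command")
port.enqueue(VOICE, "voice sample")
port.enqueue(CONTROL, "status update")

# Drain the queue: critical frames leave first, regardless of arrival order.
sent = [port.dequeue() for _ in range(len(port))]
print(sent)
```

Strict priority is the simplest scheme to reason about for latency-sensitive traffic, at the cost that lower classes can be starved while higher-priority frames keep arriving.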
A communications network for a mission critical application cannot afford only a basic setup. In-depth planning is required to guarantee that the network will provide the stability and redundancy to handle the critical nature of the traffic as well as the harsh environment. The technology and hardware have evolved for this specialised field, so the knowledge and understanding of the administrators and technicians in charge of the network must evolve to match.