March 28, 2024

Key Issues for Implementing a Prudent Control System Cyber Security Program

by Joe Weiss, PE, CISM, Applied Control Solutions, LLC

The goals of a prudent control system cyber security program should be to help make the utility/entity more secure, to maintain and, when possible, improve system reliability and availability, and to meet regulatory requirements. In the past, these requirements have been met by prudent engineering design considering all appropriate system challenges (this includes the N-1 criterion and appropriate redundancy), appropriate testing, appropriate policies and procedures, appropriate training, etc. However, cyber threats provide new challenges that I believe require a different approach than that being addressed by the NERC Critical Infrastructure Protection (CIP) cyber security standards [1]. This paper identifies several key areas that are often overlooked or not properly addressed. Utilities, regulators, insurance companies, and others can use them to assess whether utility cyber security programs are adequate to secure electric utility assets from intentional or unintentional cyber threats. Each of these areas has either potentially affected, or actually caused, control system cyber security incidents. Consequently, adequately addressing each of them is critical to securing electric industry operational assets.



Background

There are a number of organizations and standards for establishing a cyber security program. These include the NIST Federal Information Security Management Act (FISMA) [2] and its associated controls document, NIST Special Publication (SP) 800-53A [3], as well as ISO 17799 [4] and ISO 27001 [5]. These documents do not provide exclusions for assets such as telecom.



The NERC CIPs have now been ratified by FERC (with modifications). So why is there a question of prudency? Unlike IT standards, the NERC CIPs include specific exclusions (distribution, non-routable protocols, telecom, and nuclear plants). The NERC CIPs also specify the use of a risk assessment methodology to determine critical assets and critical cyber assets, but provide no details. These explicit exclusions and ill-defined requirements have enabled utilities to minimize the number of assets to address; in some cases, utilities have identified ZERO critical cyber assets. As a result, IT assets governed by SP800-53 or ISO-27001 are actually more secure than our most critical operational assets such as substations and power plants. How can that be? This paper therefore addresses key areas that may be overlooked in establishing and/or maintaining a prudent cyber security program. Many of these issues were identified in the FERC Technical Staff Assessment of the NERC CIPs [6] and the FERC Notice of Proposed Rulemaking (NOPR) RM06-22 [7].



Two caveats should be noted. First, there are no established metrics for performing a control system cyber security audit; the Industrial Control System version of NIST SP800-53 [8] provides arguably the best available. Second, many control systems have no logging for control system cyber security events. Consequently, it may not be possible to identify control system cyber incidents or their causes.



Based on experience and actual control system cyber security incidents, a prudent control system cyber security program should include the following:



Control System-Specific Cyber Security Policies and Procedures

The biggest payback in electric utility (and other industry) control system security programs comes from implementing comprehensive control system cyber security policies and procedures. To make sure they are taken seriously, adherence to these policies and procedures should be one of the performance goals of senior management (per the NERC CIPs). Almost all utilities have cyber security policies, but many are based on traditional IT policies and technologies. This can be a problem in the control system environment. While some components of an IT security program can be applied to control systems, many of these policies are not relevant to the real-time control system environment and are inappropriate when addressing legacy field devices. For example, there have been numerous cases where inappropriately applying traditional IT security technologies such as certificates, block encryption, or even anti-virus has impacted or completely obstructed control system operation.


 


NIST testing has demonstrated that updating antivirus definition files can cause a 2 to 6 minute denial of service on legacy control system processors. Traditional IT security testing can be even more problematic for legacy systems. Many legacy control systems were designed without a complete IP communication stack. Scanning legacy control system devices and/or networks with traditional IP scanning tools can lead to broadcast storms as the scanning tool attempts to locate devices that cannot adequately respond. A broadcast storm is a state in which a message that has been broadcast across a network results in even more responses, and each response results in still more responses in a snowball effect. A severe broadcast storm can block all other network traffic, resulting in a network meltdown [9]. There have been several actual cases where scanning control system networks and/or devices resulted in broadcast storms that significantly impacted control system performance. In at least one case, scanning damaged control system equipment (in this case variable speed drives), which had to be replaced before it could be returned to service. Consequently, it cannot be stressed enough how dangerous scanning can be to legacy systems if not performed knowledgeably and with caution. Scanning is not the only issue. A recent case involved the tripping of a 50 MW generator because of inappropriate policies. Inappropriate architecture has also led to cyber incidents, including the shutdown of a large nuclear power plant.

Another common problem is the security of dial-up modems. Many users believe that all of their modems have been identified and are disconnected when not needed. When visiting users (not just utilities), I have yet to meet one who hasn’t told me they know where all of their modems are and that the modems are disconnected when not in use. Yet after detailed discussions and walk-downs, I have yet to find a user who hasn’t discovered at least one modem they didn’t know they had, or at least one modem that was connected when they thought it was disconnected. Any modem that is not secured is a cyber security vulnerability. The recent Idaho National Laboratory (INL) demonstration shown on CNN destroyed a diesel generator by using dial-up modems [10].
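
The gap between a documented modem inventory and what a walk-down actually finds can be expressed as a simple reconciliation. The following is a minimal Python sketch under stated assumptions, not a tool described in this paper: the modem identifiers, the state labels, and the function name reconcile_modems are all hypothetical.

```python
def reconcile_modems(documented, walkdown):
    """Compare a documented modem inventory with walk-down findings.

    documented: dict of modem id -> state on paper ("connected"/"disconnected")
    walkdown:   dict of modem id -> state observed on site
    All identifiers and state labels here are hypothetical.
    """
    # Modems found on site that nobody had on record.
    unknown = sorted(set(walkdown) - set(documented))
    # Modems on record that the walk-down did not locate.
    not_found = sorted(set(documented) - set(walkdown))
    # Modems whose observed state differs from the documented state.
    state_mismatch = sorted(
        m for m in set(documented) & set(walkdown)
        if documented[m] != walkdown[m]
    )
    return {
        "unknown": unknown,
        "not_found": not_found,
        "state_mismatch": state_mismatch,
    }


# Hypothetical example data.
documented = {"M-01": "disconnected", "M-02": "disconnected"}
walkdown = {"M-01": "disconnected", "M-02": "connected", "M-03": "connected"}

report = reconcile_modems(documented, walkdown)
for finding, modems in report.items():
    print(f"{finding}: {modems}")
```

Any entry under "unknown" or "state_mismatch" corresponds to the two surprises described above: a modem nobody knew about, or one that was connected when it was believed to be disconnected.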



Without appropriate control system policies and procedures, you cannot secure your control system assets. Lacking them is also the surest way to fail a “real” control system audit (one that asks whether you are really more secure) and the quickest path to unintentional control system problems.



System integration

In the past, identifying relevant stakeholders for a SCADA or plant control system was easy: it was limited to facility and corporate operations and engineering. Today, it is much more complex and tomorrow will be even more so. Part of what makes control systems more productive is also what makes them more insecure – system integration. More and more organizations are finding their most valuable and useful data is the real-time control system data. This is leading to many internal organizations establishing, or wanting to establish, connections to a SCADA, plant control system, programmable logic controller, or control system database without the corporate or facility operations and engineering organizations even being aware. Additionally, productivity can be enhanced by integrating control systems such as SCADA with non-control systems such as customer management or geographic mapping programs. Depending on how the networks are configured, this can, and has, resulted in actual cyber incidents including the only case I know of where a SCADA system was targeted and incapacitated.

Performing vulnerability assessments to prudently identify all electronic connections

Utility organizations are beginning the process of assessing cyber vulnerabilities of their control systems to meet the NERC CIPs. These assessments need to be created and executed carefully, as there are several significant and frequently conflicting issues at play. The first is scope. NERC is focused on grid reliability. There are many specific scope exclusions in CIP-002 such as telecom, market functions, distribution, and non-routable protocols. Many utilities have left these systems out of their vulnerability assessments because the NERC CIPs exclude them. Yet many of these excluded systems are cyber vulnerable and communicate directly with systems that are in the CIP-002 scope. Consequently, an assessment that omits them cannot comprehensively identify the cyber vulnerabilities that can impact critical cyber assets. Implicitly, there is another exclusion: small facilities. The NERC CIPs imply that traditional reliability criteria can be followed in defining what equipment needs to be identified and addressed as critical assets, which in practice means large facilities. This makes sense from NERC’s traditional reliability perspective. Since most utilities have provided redundancy in substations, power plants, and sometimes even control centers, many utilities have identified very few critical cyber assets. In reality, the NERC CIPs are a cyber standard, not a traditional reliability standard. From a cyber perspective, it is the connectivity that determines criticality, not the size. The analogy is 9/11. Terrorists who hijacked a plane out of Boston did not begin their journey in Boston. Rather, they boarded at a smaller airport. The same philosophy applies here: the point of entry need not be the target. A very small facility that is connected to a larger facility can impact the larger facility or any other facility to which it is interconnected. A common control system network that can shut down all facilities, be they power plants or substations, can have an impact on the grid.



From a cyber security perspective, how large or critical a system is to normal reliability considerations is irrelevant. What matters is whether the equipment is electronically connected.

Even the smallest facility, if electronically connected to a control center, can be a pathway to compromise the control center. Conversely, a very large facility that is critical for reliability considerations but has no electronic connections is irrelevant from a cyber perspective. When addressing cyber security, it is not the size of the device or facility, but the connections that matter.
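
The point that connections, not size, determine cyber relevance can be illustrated with a reachability check over the electronic links between facilities. This is a minimal Python sketch under stated assumptions: the facility names, capacities, and links are hypothetical, links are treated as bidirectional, and a real assessment would also have to account for modems, radios, and other paths that may be missing from the documented topology.

```python
from collections import deque


def reachable_from(links, start):
    """Return every asset with an electronic path to `start`.

    links: iterable of (asset, asset) pairs, treated as bidirectional.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search outward from the starting asset.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


# Hypothetical facilities with rated capacity in MW.
capacity_mw = {
    "small_substation": 5,
    "mid_plant": 150,
    "large_isolated_plant": 1200,
}
# Hypothetical electronic links; the large plant has none.
links = [
    ("control_center", "mid_plant"),
    ("mid_plant", "small_substation"),
]

exposed = reachable_from(links, "control_center")
for name, mw in sorted(capacity_mw.items()):
    status = "path to control center" if name in exposed else "no electronic path"
    print(f"{name} ({mw} MW): {status}")
```

In this example the 5 MW substation is a pathway to the control center through an intermediate plant, while the 1200 MW plant with no electronic connections is not. Capacity never enters the reachability calculation.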



Another issue that must be considered is the exclusion of telecom. One of the most probable paths for cyber intrusions is the inherent vulnerability of the telecommunications environment. The NERC Electric Sector ISAC issued an advisory on the January 2003 Slammer worm, which affected a frame relay system [11]. The final report of the Northeast Outage also identified wireless and wireline communications [12] even though the NERC CIPs exclude them. One of the National Laboratories has demonstrated that 900 MHz spread spectrum, frequency hopping radios can be hacked. These radio systems provide the critical communications within the substation and provide input directly to SCADA. Compromise of these radio systems can lead to compromise of the devices within the substation. If the current exclusions in the NERC CIPs are followed, devices using non-routable protocols, which carry the vast majority of utility communications, will be excluded from the assessment process. This doesn’t make sense. It should be mentioned that small systems, utility telecom systems, and non-routable protocols have all experienced cyber incidents.



Distribution systems are excluded from NERC cyber assessments. However, because they have often undergone the most upgrades, distribution systems have arguably become the most cyber vulnerable part of the T&D system. As distribution systems are electronically connected with transmission systems, they should not be ignored. There have been several electric distribution cyber incidents that have resulted, or could have resulted, in cascading outages. The market function of an EMS receives data from insecure meters and also electronically connects with SCADA. As with distribution, the market functions are often excluded by the NERC CIPs. These vulnerabilities could lead to very significant economic impacts if meter or billing data is compromised. Additionally, several nuclear plants have experienced cyber incidents. Losing large central station nuclear plants does have a significant impact on grid reliability.



Therefore, it should be evident that when systems are excluded from NERC CIP programs, it is not possible to identify all of the critical cyber assets, much less the vulnerabilities that can impact them. Remember: the real cyber vulnerability lies in the connections.
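
The effect of scope exclusions can be made concrete by listing every excluded system that communicates directly with an in-scope system. This is a minimal Python sketch under stated assumptions: the category labels loosely mirror the exclusions named in this section, and the asset names, links, and function name are hypothetical. It is not a NERC methodology.

```python
# Hypothetical category labels, loosely mirroring the exclusions discussed above.
EXCLUDED_CATEGORIES = {"telecom", "distribution", "market", "non_routable"}


def excluded_links_to_scope(assets, links):
    """List (excluded_asset, in_scope_asset) pairs that are directly linked.

    assets: dict of asset name -> category label
    links:  iterable of (asset, asset) pairs, treated as bidirectional
    """
    excluded = {n for n, c in assets.items() if c in EXCLUDED_CATEGORIES}
    in_scope = set(assets) - excluded

    findings = set()
    for a, b in links:
        if a in excluded and b in in_scope:
            findings.add((a, b))
        elif b in excluded and a in in_scope:
            findings.add((b, a))
    return sorted(findings)


# Hypothetical inventory and connections.
assets = {
    "ems_scada": "transmission",
    "frame_relay": "telecom",
    "dist_scada": "distribution",
    "serial_rtu": "non_routable",
    "billing": "market",
}
links = [
    ("frame_relay", "ems_scada"),
    ("dist_scada", "ems_scada"),
    ("serial_rtu", "dist_scada"),
    ("billing", "ems_scada"),
]

findings = excluded_links_to_scope(assets, links)
for excluded_asset, scoped_asset in findings:
    print(f"{excluded_asset} is out of scope but connects to {scoped_asset}")
```

Each reported pair is a connection that a scope-limited assessment would never examine, even though it touches an in-scope asset. The sketch only reports direct links; an excluded asset several hops away, such as the serial RTU here, would need a reachability check.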



Perform risk assessments for business perspectives

Cyber risk needs to be addressed for grid reliability in order to meet NERC CIP requirements. However, cyber risk also applies to systems that can significantly affect the business without necessarily affecting grid reliability. Many systems that are critical to the economic health of the utility may not be critical to grid operations and are consequently excluded from the NERC CIPs. Small power plants, low to medium voltage distribution substations, and automated metering infrastructure are examples of facilities and systems that are “business critical” but not “grid critical”. There is a significant potential liability to a company for ignoring cyber risks to the business even though these systems are excluded by the NERC CIPs.



Interconnections and interdependencies

The last issue is possibly the most subtle, but certainly not the least important: the impact of interconnections between transmission systems. Electric utilities often share equipment such as RTUs. Utilities also interconnect with one another. There is an old saying in the cyber community that you are only as secure as your weakest link. In this case, your weakest link could be your neighbor. How this is addressed impacts not only you, but also your interconnection partner. These interconnections need to be addressed comprehensively. This issue becomes even more problematic when one of the interconnections is with a federal power agency such as TVA or BPA. Federal power agencies MUST meet NIST SP800-53, which is more comprehensive than the NERC CIPs. Consequently, any non-federal utility connecting to a federal power agency becomes a weak link. Why should a federal power agency be held to a higher standard?



Summary

The NERC CIPs have done the utility industry a great service by beginning the process of requiring cyber security to be specifically addressed. However, they have done so in a limited manner. Many of the identified limitations have already led to cyber events. In order to minimize risk to the utility infrastructure and business operations, it is incumbent on the utility to exercise due care and diligence in establishing and maintaining its cyber security program. Cyber issues can materially affect the utility industry’s bottom line in a positive direction (improving system reliability and availability) or in a negative direction (cyber impacts). The positive direction takes a comprehensive program that goes beyond “just meeting the NERC CIP requirements”. The negative direction can occur because the program was not sufficiently comprehensive, and can lead to punitive damages as suggested by NERC. The choice is up to you.



About the Author

Joe Weiss is an industry expert on control systems and electronic security of control systems, with more than 30 years of experience in the energy industry. He is a member of numerous organizations including the NERC CSSWG, IEEE, ISA, IEC, and CIGRE.