The exposure of industrial facilities to cybersecurity threats has never been higher. An analysis performed by IBM Security found that the number of attacks on SCADA systems increased 636% from 2012 to 2014, with 675,816 cybersecurity incidents in January 2014 [1]. Finding an effective method for evaluating the current level of risk in a facility and implementing additional security risk reduction as needed is becoming an essential part of managing the safety, security, and operability of industrial systems.
The three fundamental activities for the analysis of cybersecurity risk are High-Level Risk Assessments, Detailed Risk Assessments, and Security Level Verification. This is the second of a three-part blog series breaking down the IEC 62443 lifecycle steps for evaluating cybersecurity risk, with this one focusing on understanding the method for detailed risk assessments.
The Detailed Cybersecurity Risk Assessment is the second risk analysis performed for cybersecurity.
Its purpose is to:
- gain a definite understanding of the current level of risk within a facility, factoring in potential threat vectors and existing/planned countermeasures
- ensure that corporate risk criteria are met
- provide detailed input for the Cybersecurity Requirements Specification (CSRS)
- provide further granularity for zone security level targets (including the identification of subzones with higher security requirements).
The steps for completing a detailed risk assessment for a major process area are detailed in the following workflow.
Detailed Risk Assessment Inputs
The results of the high-level risk assessment are the starting point for the detailed risk assessment. The high-level risk assessment provides the segmentation strategy for the industrial automation and control system (IACS) and the identified areas of highest risk, and it includes considerations of the Process Hazard Analysis (PHA) for the site as well as the corporate risk criteria. In addition to the high-level risk assessment, the full PHA hazards and corporate risk criteria should be available if further questions regarding risk management for the site arise.
The final input for the detailed risk assessment is a vulnerability analysis. This can be done either as part of the detailed risk assessment method or before the detailed assessment begins. The vulnerability analysis reviews the existing network, connected devices, configurations, software versions, and additional factors to identify which vulnerabilities are currently present within a facility and could be targeted by attackers. This provides an important input to the detailed risk assessment when considering the entry points into the system and evaluating how likely a successful attack is and how easily an attacker can move between devices in the control network.
Document Threat Vectors
The first step in the completion of the detailed risk assessment is documenting the potential threat vectors that would provide attackers entry into the system. Depending on the approach taken, this can be a daunting task. I have seen evaluations where asset owners were given a list of over three hundred threat vectors to review in order to identify which could provide entry into the system.
Although this approach attempts to be comprehensive by considering detailed threat vectors, it fails to be effective for a few reasons. First, breaking threat vectors down into so many parts greatly increases the time required to complete the assessment, because every fine-grained variant must be reviewed even when the parent threat vector doesn’t apply to the system under consideration. Second, that level of detail overwhelms plant personnel, who are not familiar with the ins and outs of cybersecurity analysis and are suddenly given hundreds of new terms that they do not understand. Lastly, it does not result in a more complete analysis of the system, because the plant personnel with the knowledge required to evaluate the system can’t speak to the same level of granularity as the selected threat vectors.
Instead of overwhelming the first portion of the risk assessment with hundreds of individual threat vectors, it is helpful to look at manageable categories of attacks that provide a complete overview of the ways attackers could enter the system, yet remain intelligible to the plant personnel involved with the risk assessment.
The Common Attack Pattern Enumeration and Classification (CAPEC™) database provides common areas of attack that can greatly assist with this process [2]. If you are thinking that nothing about the “Common Attack Pattern Enumeration and Classification” database sounds less confusing, don’t worry: it provides six areas of attack that can be understood by anyone, regardless of their cybersecurity experience:
- Social Engineering: Getting into the system by manipulating or exploiting people.
- Supply Chain: Altering the system during production of components, storage, or delivery.
- Communications: Blocking, manipulating, or stealing communications.
- Physical Security: Getting into the system by overcoming weak security measures.
- Software: Getting into the system via vulnerabilities in software applications.
- Hardware: Getting into the system by manipulating the physical hardware of network devices.
By starting with broad categories and then moving to the level of detail necessary to evaluate the threats, a detailed risk assessment can be both more efficient and more thorough, because the personnel with critical knowledge of the control system can contribute more actively to the discussion. The level of granularity required is a key difference between the high-level and detailed risk assessments.
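As a minimal sketch of the category-first approach, the six areas of attack listed above can be used as the top-level key for recording threat vectors, so that the team can see at a glance which areas have not yet been discussed. The class names, field names, and example entries below are hypothetical and are not taken from CAPEC or IEC 62443.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class AttackArea(Enum):
    """The six broad areas of attack listed above."""
    SOCIAL_ENGINEERING = "Social Engineering"
    SUPPLY_CHAIN = "Supply Chain"
    COMMUNICATIONS = "Communications"
    PHYSICAL_SECURITY = "Physical Security"
    SOFTWARE = "Software"
    HARDWARE = "Hardware"


@dataclass(frozen=True)
class ThreatVector:
    """One documented threat vector (hypothetical record layout)."""
    area: AttackArea
    zone: str          # zone name from the zone and conduit diagram
    description: str


def group_by_area(vectors):
    """Group documented threat vectors under their broad area of attack."""
    grouped = defaultdict(list)
    for vector in vectors:
        grouped[vector.area].append(vector)
    return dict(grouped)


def areas_not_yet_reviewed(vectors):
    """Return the areas of attack with no documented threat vector so far."""
    covered = {vector.area for vector in vectors}
    return [area for area in AttackArea if area not in covered]


# Illustrative entries only; a real assessment would come from the workshop.
documented = [
    ThreatVector(AttackArea.SOCIAL_ENGINEERING, "Control Zone",
                 "Phishing email opened on an engineering workstation"),
    ThreatVector(AttackArea.SOFTWARE, "Control Zone",
                 "Unpatched HMI application reachable from the business LAN"),
]

remaining = areas_not_yet_reviewed(documented)
```

Starting from the enumeration keeps the discussion at six items; further detail is added under an area only when the team decides it applies to the zone being reviewed.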
Determine Threat Likelihood
Another difference from the high-level risk assessment, where the likelihood was assumed to be one, is that the detailed risk assessment requires the likelihood of each threat to be specified. When determining the likelihood of an attack, after considering the area of attack, it is typically helpful to start by asking key questions about the threat agent:
What threat agents could execute this attack?
- Internal or external?
- Skilled or unskilled?
- Nation-state-level resources required?
Considering these questions can be helpful for identifying how likely the attack would be, but it is also important to understand the differences between likelihood from a functional safety perspective and a cybersecurity perspective. A control engineer must consider a loss of containment event that has a tolerable frequency of 10⁻⁴ per year, whereas IT personnel must consider the 675,816 cybersecurity incidents that happened in January 2014 [1]. Considering both of these perspectives, it is important to identify realistic likelihoods for different types of cybersecurity attacks.
One method is to pull from the techniques used in functional safety and evaluate intentional targeted attacks in the low demand mode, while evaluating general malware attacks that could be executed by low-skill attackers in the continuous demand mode.
Because prepackaged ransomware attacks and many others are available for purchase on the dark web for low cost, the number of unskilled threat agents is constantly rising. For the facilities that still believe they could never be targeted by a cybersecurity attack, understanding how common and easy these attacks are should come as a much-needed wakeup call. Being able to accurately model both types of attacks is essential for determining the security of a facility.
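The distinction between the two modes can be sketched numerically. In functional safety, the frequency of a hazardous event in low demand mode is commonly approximated as the demand rate multiplied by the probability that the protection fails on demand, while in continuous demand mode the demand is treated as always present, so the event frequency is approximated by the failure rate of the protection itself. The sketch below applies that same arithmetic to attacks as one possible reading of the method; the function names and all numeric values are hypothetical.

```python
def low_demand_event_frequency(attempts_per_year, prob_countermeasures_fail):
    """Targeted attack treated as an infrequent demand.

    Event frequency = attack attempts per year
                      x probability the countermeasures fail on an attempt.
    """
    if attempts_per_year < 0:
        raise ValueError("attempts_per_year must be non-negative")
    if not 0.0 <= prob_countermeasures_fail <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return attempts_per_year * prob_countermeasures_fail


def continuous_demand_event_frequency(countermeasure_failures_per_year):
    """Commodity malware treated as a demand that is always present.

    Because an attempt is assumed to be waiting whenever the protection
    lapses, the event frequency is approximated by how often the
    countermeasures fail (for example, a missed patch or expired signature).
    """
    if countermeasure_failures_per_year < 0:
        raise ValueError("failure rate must be non-negative")
    return countermeasure_failures_per_year


# Hypothetical inputs, for illustration only.
targeted = low_demand_event_frequency(
    attempts_per_year=0.1,            # one targeted attempt per ten years
    prob_countermeasures_fail=0.05,   # countermeasures fail 1 time in 20
)
commodity = continuous_demand_event_frequency(
    countermeasure_failures_per_year=0.5,  # protection lapses every two years
)
```

With these illustrative inputs the commodity malware scenario dominates the targeted one by two orders of magnitude, which is why modeling both types of attack matters.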
Determine Possible Consequences
Another difference between high-level and detailed risk assessments is that for detailed assessments, all possible consequences are considered instead of just the worst-case consequence. This allows for a better understanding of how the compromise of critical devices impacts all relevant risk receptors (i.e., safety, business, and environment) and provides an understanding of what the overall impact would be. This is critical for calibrating the level of risk reduction needed for a given scenario.
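One way to keep every consequence visible while still summarizing the overall impact is to rank each consequence against each risk receptor and then report the worst rank per receptor. The sketch below assumes a severity scale of 0 (no impact) to 5, which is a placeholder for the corporate risk matrix; the record layout and example consequences are likewise hypothetical.

```python
from dataclasses import dataclass

# The three risk receptors named above.
RECEPTORS = ("safety", "business", "environment")


@dataclass
class Consequence:
    """One possible consequence of a threat, ranked per risk receptor."""
    description: str
    severity: dict  # receptor name -> rank on an assumed 0-5 scale


def worst_severity_by_receptor(consequences):
    """Return the highest severity rank seen for each risk receptor."""
    worst = {receptor: 0 for receptor in RECEPTORS}
    for consequence in consequences:
        for receptor, rank in consequence.severity.items():
            if receptor not in worst:
                raise ValueError(f"unknown risk receptor: {receptor}")
            worst[receptor] = max(worst[receptor], rank)
    return worst


# Illustrative consequences for a single compromised controller.
consequences = [
    Consequence("Unit trips and production is lost for a shift",
                {"safety": 0, "business": 3, "environment": 0}),
    Consequence("Setpoint manipulation leads to tank overfill",
                {"safety": 4, "business": 3, "environment": 2}),
]

summary = worst_severity_by_receptor(consequences)
```

The individual records remain available for the Cybersecurity Requirements Specification, while the per-receptor summary shows which receptor drives the risk reduction needed.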
After identifying threat-consequence pairs for a system, the next step is to identify which countermeasures are in place to prevent a successful attack. A countermeasure is any protection that reduces the likelihood of a successful attack. It can do so in one of three ways: 1) reducing the likelihood that the attacker can enter the system (e.g., properly configured firewalls and managed switches, devices with better security capabilities, or least-privilege assignment of access accounts), 2) increasing the likelihood that an attack is identified and stopped before it reaches its final objective (e.g., reviewing firewall logs for unusual access patterns, intrusion detection systems, or verifying code signatures before downloading to the logic solver), or 3) stopping or mitigating the end objective of an attack, including means of safety risk reduction that are not susceptible to attack (e.g., hard-coded endpoints in logic solver configurations, pressure relief valves, or pneumatic control loops).
Countermeasures in any of the categories described have the potential to stop an attack. During Security Level Verification, each of the countermeasures evaluated will be reviewed further to confirm independence and reach a determination on the overall risk reduction provided. Because a better understanding of countermeasure effectiveness can be achieved during Security Level Verification, it is recommended that the initial credit claimed for countermeasures not exceed one order of magnitude, so that estimates of the remaining risk are not overly optimistic.
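The effect of capping the initial credit can be illustrated with a short calculation. The sketch below assumes the cap applies to each countermeasure individually and that credited countermeasures are independent, so their risk reduction factors multiply; both are assumptions to be confirmed during Security Level Verification, and the function names and numbers are hypothetical.

```python
# One order of magnitude: the largest risk reduction factor credited to a
# single countermeasure before Security Level Verification (assumed to apply
# per countermeasure).
MAX_INITIAL_CREDIT = 10.0


def capped_credit(claimed_factor):
    """Limit a claimed risk reduction factor to one order of magnitude."""
    if claimed_factor < 1.0:
        raise ValueError("a risk reduction factor cannot be less than 1")
    return min(claimed_factor, MAX_INITIAL_CREDIT)


def mitigated_frequency(unmitigated_per_year, claimed_factors):
    """Divide the unmitigated attack frequency by each capped credit.

    Multiplying the credits assumes the countermeasures are independent.
    """
    frequency = unmitigated_per_year
    for factor in claimed_factors:
        frequency /= capped_credit(factor)
    return frequency


# Illustrative values: a firewall claimed at a factor of 100 is credited only
# a factor of 10, and log review claimed at a factor of 5 is credited in full.
remaining = mitigated_frequency(1.0, [100.0, 5.0])
```

With these inputs the estimate of remaining frequency is 0.02 per year rather than the 0.002 per year that the uncapped claims would give, which keeps the detailed assessment conservative until verification is complete.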
Once the detailed risk assessment has been completed, the zone and conduit diagrams and security level targets from the high-level risk assessment should be finalized. The high-level risk assessment serves to provide a quick understanding of high-risk areas, and the detailed risk assessment provides a robust understanding of what the threats and countermeasures in those high-risk areas are.
The results of the risk assessments provide the key inputs for the CSRS and the subsequent design phase of the IACS, including the Security Level Verification, and promote the effective flow of information between lifecycle steps. By understanding the level of risk for a facility, informed decisions addressing cybersecurity concerns can be made to promote safe and secure operation.
[1] Sayfayn, N., Madnick, S., Cybersecurity Analysis of the Maroochy Shire Sewage Spill, MIT, May 2017.
[2] Common Attack Pattern Enumeration and Classification: A Community Resource for Identifying and Understanding Attacks, MITRE Corporation, 2018.