Adopting strategic patch management tactics across ICS environments amid escalating cyber threats
Across the ICS (industrial control system) realm, ensuring security and operational integrity is crucial for organizations in various sectors amid a rising number of cybersecurity threats and attacks. Patch management strategies for ICS environments should therefore be carefully planned and executed to minimize operational disruption and risk.
Unlike typical IT environments, ICS systems usually run continuously and are sensitive to any downtime, which makes traditional patch management methods less practical. A successful ICS patching strategy starts with an asset inventory and a vulnerability assessment. System criticality and system interactions should then guide prioritization: patches go first to the systems with the highest risk exposure and the greatest operational importance.
Another good strategy would be to have a test environment established, paralleling the production system. In this way, patches can be deployed and tested without affecting the operational stability of the actual ICS environment. This is an important step to ensure that patches do not introduce any new vulnerabilities or disturb the functioning of ICS specialized equipment. Vendors also need to be coordinated since most ICS components are integrated with proprietary software and hardware. A good relationship with vendors would enable access to timely patches and other support.
Lastly, a progressive rollout may be appropriate. Operators can update one network segment at a time and monitor the effects of the patch for disruptions. With this segmented approach, a problematic patch can be rolled back quickly without affecting all functionality. In summary, patching ICS environments requires balancing stronger security against the operational requirements of industrial systems. Proper testing, coordination with vendors, and staggered deployment can keep an organization’s ICS assets well protected while operations remain uninterrupted.
Addressing patching frequency, challenges, best practices
Industrial Cyber reached out to cybersecurity experts to ascertain the frequency of patching in ICS environments and the factors affecting this schedule. They also explored the common tools and technologies employed for patch management in these settings, as well as the significance of automation and orchestration tools in streamlining this process.
Rick Kaun, vice president for solutions at Verve Industrial, told Industrial Cyber that patching varies widely based on industry, company, site, time of year, etc. “Some have not patched in years. Others only patch during small windows of downtime. Still, others patch only under specific conditions such as vendor approval, whether it’s been tested, or if there is a workaround. Free tools like WSUS and SCCM don’t scale well. Some may buy third-party patching tools, and yet others manually patch using network file shares, pull down, and executions.”
He added that automation and orchestration are the biggest pieces of the puzzle many haven’t yet discovered or leveraged fully. “I advocate a ‘Think Global, Act Local’ approach to OT risk reduction: A global view of all assets at all sites with contextual data of that asset (relative to available patches, vulns, exploits attached to vulns, filters to highlight specific vendors, asset criticality to operations, current status of fallback or restoration options). A list of patches (or vulnerabilities) without context is not very helpful. Using the data to create a prioritized path forward based on contextual risk allows for a consistent and measured approach to what to do, where, and how urgently.”
Kaun noted that the activity can also be facilitated with certain patch (or other compensating action) types of tools. “The ability to schedule patch file deployment (maybe just loading locally but not installing), or to enumerate and remove common exploit factors like the guest account or remote desktop services allows for the automation of consistent but also staged actions (i.e., load the patch, notify the user, but do nothing until the user accepts the action). This combination has been measured to reduce effort by up to 70% when compared to manual efforts and stand-alone tools,” he added.
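The staged action Kaun describes (load the patch, notify the user, do nothing until the user accepts) can be pictured as a small state machine. The sketch below is a hypothetical illustration only: the `Stage` names, the `StagedPatch` class, and the asset and patch identifiers are assumptions made for this example, not features of any particular tool.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    PENDING = auto()    # nothing has happened yet
    LOADED = auto()     # patch file copied to the asset, not installed
    NOTIFIED = auto()   # local operator has been told the patch is waiting
    INSTALLED = auto()  # operator accepted and the patch was applied


@dataclass
class StagedPatch:
    """Hypothetical record tracking one patch on one asset."""
    asset: str
    patch_id: str
    stage: Stage = Stage.PENDING
    log: list = field(default_factory=list)

    def _advance(self, expected: Stage, new: Stage) -> None:
        # Refuse to skip a stage, so install can never precede notification.
        if self.stage is not expected:
            raise RuntimeError(f"cannot move to {new.name} from {self.stage.name}")
        self.stage = new
        self.log.append(new.name)

    def load(self) -> None:
        self._advance(Stage.PENDING, Stage.LOADED)

    def notify(self) -> None:
        self._advance(Stage.LOADED, Stage.NOTIFIED)

    def accept(self) -> None:
        # Only an explicit operator acceptance triggers installation.
        self._advance(Stage.NOTIFIED, Stage.INSTALLED)


patch = StagedPatch(asset="hmi-01", patch_id="EXAMPLE-PATCH-001")
patch.load()
patch.notify()
# The patch now waits here until a local operator calls accept().
print(patch.stage.name)
```

Keeping acceptance as a separate, explicit step means central automation can do the repetitive work while the decision to change a running system stays with the people at the site.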
Greg Valentine, senior vice president of solution engineering at Industrial Defender, made the point that ICS environments are typically patched much less frequently than traditional IT environments due to several critical factors, particularly the need for ICS to operate continuously to maintain operations, especially in critical infrastructure. “Prioritizing availability and uptime, these environments operate with carefully planned maintenance windows. These maintenance windows are often when patches are planned, which can be monthly, quarterly, or even less frequent.”
He told Industrial Cyber that testing requirements are also a factor in patching timelines. Because of the critical nature of ICS environments, patches are often tested in a controlled environment to ensure they do not disrupt the operation of control systems. Certain regulatory requirements may also influence the patching schedule, depending on industry and region.
“Tools and technologies for patch management in OT environments include vulnerability management systems, patch management software, configuration management databases, and automated patch deployment tools. Automating these processes can significantly improve efficiency, especially with a reliable, up-to-date, and accurate asset inventory as the foundation,” according to Valentine. “However, automation must consider unique OT requirements and the critical processes they support. An OT-specific approach is necessary, as IT-type scans and communications can disrupt these devices and overall operations.”
He added that understanding criticality within the context of operations is important, so some tools may integrate threat intelligence and internal data about the system’s context. Reporting tools are also crucial, especially for compliance audits such as NERC CIP.
In ICS cybersecurity, the frequency of patching depends on factors such as vulnerability criticality, asset location, industry, regulations, and corporate oversight, Nick Cappi, vice president for portfolio strategy and enablement for OT cybersecurity in Hexagon’s Asset Lifecycle Intelligence division, told Industrial Cyber. “Industrial facilities generally follow ICS vendors’ patch bulletins, with deployment times ranging from days to years. Automation technologies, like Microsoft Windows Server Update Services (WSUS), are common in heavy processing industries and reduce manual effort. However, they can sometimes provide a false sense of security. Without checks and balances in your automated patch management process, gaps can go unnoticed for years.”
He added that a key aspect of a patching program is identifying how many devices have vulnerabilities and assessing the associated risks specific to each environment. This data helps prioritize patches to reduce operational and cybersecurity risks.
“Traditionally, patching is slow in industrial environments, though that does vary by industry and might be influenced by regulations or other factors,” Zane Blomgren, director for industrial cybersecurity at Belden, told Industrial Cyber. “For example, utilities in North America must adhere to NERC CIP which has clear guidance regarding patching. Because their guidance is based on knowledge and risk, it does not dictate that utility providers ‘apply every patch.’”
He added that when it comes to the tools and technologies for patching, that also varies across ICS environments. “For the true industrial devices like PLCs, VFDs, and the like, the option for many organizations is patching tools from the automation vendor that is tailored to their specific equipment. For IT devices located in the OT environment — such as engineering workstations based on Windows – traditional IT tools are often used.”
Testing and implementing patches in ICS environments
The executives also discussed strategies for managing scenarios where patches for critical vulnerabilities are unavailable, and they detailed the procedure for testing patches before deploying them in live ICS environments.
Kaun identified that contextual risk is most valuable when patches for critical vulnerabilities are unavailable. “By understanding how risk is applied to specific assets and context, you can design and deploy compensating controls like system hardening, registry edits, etc. Most organizations test patches in the lab or in production, though the lab is limited in systems and affordability, so the operations environment often hosts test scenarios. Testing on a lab system or a low-impact operational system is best.”
He added that deploying to high-stakes assets (like the primary HMI in redundant pairs) and critical systems (though some may wait until an outage or downtime) requires testing on all versions of in-scope assets (i.e., Windows 2016 testing is separate from, and a duplicate of, Windows 2012 testing). “Legacy equipment, multiple dependent processes (up and downstream from the target device) – zero tolerance for downtime.”
“When a patch is unavailable or you’re unable to patch right away, you should look at implementing compensating controls,” Valentine said. “This could involve enhancing network segmentation and access controls as well as deploying intrusion detection systems (IDS), which can help limit potential exposure and reduce the risk of exploitation. Monitoring would be also critical during this time, both for your systems and threat intelligence feeds. Virtual patching through firewalls and application-level security measures can also provide some protection.”
He added that the testing process for patches before deploying them in live ICS environments is rigorous due to complexity and high safety and operational concerns. “This involves building up a replica environment that mirrors the production settings. Patches are deployed in this test environment, going through functional, security, performance, and integration tests to ensure they do not disrupt operations in the actual environment. You have to test several different variables and ensure systems will be stable over time. There’s also the complexity of testing the effects on integrations with other systems, components, and other software. Changes to environments can ‘break’ the system, and patches are no exception. All of this is documented in detail and reviewed by the user before approval for deployment.”
Valentine also mentioned that the deployment is then carried out in stages, often with a pilot and gradually rolling out the patch. “It’s a resource-intensive and time-consuming process, sometimes subject to stringent regulatory requirements and audits, in order to ensure patches are safe for these complex ICS environments where safety and operational uptime are of utmost importance.”
“Patching may not always be the optimal way to reduce risk in ICS environments. Risk assessments should justify actions, as CVSS scores alone don’t represent risk,” Cappi said. “Companies need to evaluate the criticality of assets and underlying vulnerabilities to prioritize patches or use other mitigation strategies like upgrades, segmentation or whitelisting.”
He added that patch testing often involves ring deployments, a progressive method that minimizes operational risk and ensures system availability. Key aspects include sequential rollout, where updates are rolled out one after another to different asset groups, or ‘rings’; progressive deployment, which starts with low-criticality rings and moves to more critical ones; and testing and validation, in which each ring undergoes tests and a burn-in period before the rollout proceeds.
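The ring approach Cappi describes can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the `ring_rollout` function, its `apply_patch` and `is_healthy` hooks, and the asset names are all hypothetical, and a real health check would run only after a burn-in period rather than immediately.

```python
from typing import Callable, Dict, List


def ring_rollout(
    rings: List[List[str]],
    apply_patch: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> Dict[str, object]:
    """Patch rings in order; stop at the first ring that fails validation.

    `rings` is ordered from least to most critical. `apply_patch` and
    `is_healthy` are caller-supplied hooks (hypothetical here); in practice
    the health check would run after a burn-in period.
    """
    completed: List[int] = []
    for index, ring in enumerate(rings):
        for asset in ring:
            apply_patch(asset)
        failed = [asset for asset in ring if not is_healthy(asset)]
        if failed:
            # Halt before touching more critical rings.
            return {"completed": completed, "halted_at": index, "failed": failed}
        completed.append(index)
    return {"completed": completed, "halted_at": None, "failed": []}


# Hypothetical asset names, lowest criticality first.
rings = [
    ["test-workstation"],
    ["historian", "eng-workstation"],
    ["primary-hmi"],
]
patched: List[str] = []
result = ring_rollout(
    rings,
    apply_patch=patched.append,
    is_healthy=lambda asset: asset != "historian",  # simulate one failure
)
print(result)
print(patched)
```

In this simulated run the rollout halts at the second ring, so the most critical asset in the last ring is never touched by a patch that has already caused a problem elsewhere.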
Blomgren noted that a golden rule of industrial environments is to have a solid safety plan before it is needed. “The same holds true for network safety. Good network design puts an organization in a better position to defend itself when addressing critical vulnerabilities. For example, if the network design allows for system isolation — possibly by an industrial security appliance with native protocol understanding – it becomes easier for OT teams to address the vulnerability while protecting the rest of the environment.”
He also recommends in these situations that organizations increase their monitoring and scrutinize abnormalities more closely. “Once the patches are available, testing can take place. Yet, this is likely not a trivial task because it may not be possible to test in a lab given the cost and complexity of the equipment involved, as well as safety or compliance implications. Last, deploying will often involve waiting for–or creating–a maintenance window, which presents a challenge unique to OT environments.”
Crafting patch management protocols in ICS environments
The executives highlight best practices for developing patch management protocols in ICS environments, focusing on how organizations can prioritize updates, especially in critical infrastructure installations.
Kaun said the most valuable components of an ICS patch strategy are developing an understanding of contextual risk and being able to provide and track compensating controls, such as configuring micro-segmentation, system hardening, etc.
“Developing a robust patch management strategy in ICS environments is crucial for maintaining security while ensuring system availability and operational integrity,” Valentine said. “Key practices include conducting regular risk assessments, classifying and prioritizing assets based on their criticality, and thoroughly testing patches in a controlled environment.”
He also mentioned that security teams must partner with operations and that OT must collaborate with IT. “Effective patching requires strong collaboration and planning, with all sides of the organization understanding what’s involved in each maintenance window and also what the protocols are for responding to emergency out-of-band vulnerabilities. Various parts of the organization must be involved in understanding the associated security, business, operational, and compliance risks.”
Cappi said that significant discrepancies often exist between corporate policy and actual execution at the site level, and even between operating units within the same facility. “However, consistency in following vendor-approved patches is common. If the ICS vendor approves a patch, owner/operators usually deploy it. This approach has pros and cons. The primary benefit is that it relieves owner/operators from prioritizing patches, shifting the focus to implementing vendor recommendations.”
Also, he mentioned that many owners/operators prioritize patching according to vendor recommendations. “The drawback is that this approach doesn’t manage risk effectively, leading to extensive work with a low probability of risk reduction. Real risks may not be properly evaluated in the specific context of the customer’s environment, as the focus is on adhering to ICS vendor bulletins.”
Blomgren said that understanding the systems and risk is a cornerstone of a healthy patch management strategy. “This will allow companies to determine which patches to apply, and what type of priority, or urgency, comes into play. Be ready to roll back if possible or necessary.”
Simplifying patch management in ICS environments
The executives explore prevalent challenges associated with implementing effective patch management in ICS environments and discuss strategies that can be utilized to address these challenges.
Kaun observed that challenges include fear and avoidance of downtime, lack of patch support (vendors test and verify patches with a potential impact on their product – all other published patches are still in scope for end users but without the safety net of vendor testing/approval), lack of downtime to patch, test and recover, lack of security knowledge within OT, etc. “Patching is a necessity, and its timing is dependent upon multiple operational factors,” he added.
“Common challenges in implementing patch management in ICS environments include system downtime, compatibility and testing issues, resource constraints, regulatory and compliance requirements, vendor coordination, and the risk of introducing new vulnerabilities,” Valentine said. “The strategic way to overcome these challenges is with an agreed-upon patch management policy that is comprehensive and officially signed off by organizational leaders from different parts of the business, and then ensuring ongoing collaboration. Automation and orchestration tools also play a part in being able to keep up with all the vulnerabilities and complexity of analyzing the criticality for your specific operation.”
Cappi noted that the primary function of any security program is to mitigate risk to an acceptable level. “Patch management is one tool among many, including asset visibility, vulnerability management, configuration management, obsolescence management, and backups. What’s needed is a recurring process to identify, evaluate, and prioritize risks in ICS environments. With a prioritized view of risks, you can then work from the top and choose the appropriate tool to address each risk.”
He added that if patching is the best tool for addressing certain risks, then consider the timing of deployment in the context of the environment. “For example, applying all vendor-approved patches makes sense during a planned outage or turnaround.”
Blomgren expressed that the highly specialized equipment found in ICS environments can be sensitive to changes and operational uptime is critical to business success. “Therefore, downtime for maintenance is minimal and must be well-planned to ensure success. I recommend organizations take a risk-based approach that prioritizes systems; this will help maximize what can be achieved in a maintenance window and can even help justify a longer maintenance window if the risk of leaving systems unpatched is too high for the business to accept.”
Balancing uptime and patching across ICS environments
The executives look into how organizations can balance the essential task of patching with the operational uptime requirements of ICS environments, while also identifying measures to reduce disruptions during patching activities.
“Make sure your data and inputs to analyze risk span across multiple assets and sites, are brought to a special team of IT and OT to assess and plan for the entirety of the organization,” Kaun disclosed. “Automate as much of the data collection and remediation actions as possible, continually update all inbound data to show before, after, and new/emerging threats for a ‘think global, act local’ approach to risk reduction. It is proven to be the most effective in focusing scope but also in minimizing manual effort and tracking.”
Valentine said that striking a balance between essential patching tasks and maintaining operational uptime in ICS environments requires strategic planning, effective communication, and leveraging technology. “Ensuring critical systems have redundant backups and performing full system backups before patching are essential for continuity. These measures allow operations to continue even if a system needs to be taken offline for patching and ensure systems can be quickly restored if any issues arise,” he added.
“Risk-based patch prioritization is another crucial aspect of effective patch management. This approach focuses on applying patches that address the most severe vulnerabilities and pose the highest risk to the system first,” according to Valentine. “It considers threat intelligence to understand which vulnerabilities are being actively exploited and assesses the context and purpose of each asset within the operational environment. By understanding the role and importance of each asset, organizations can prioritize patches for systems that are critical to maintaining operational continuity and safety.”
He added that effective coordination among teams involves establishing clear roles and responsibilities, forming a cross-functional patch management team responsible for overseeing the process, and ensuring collaboration among departments. “Developing a communication plan with regular meetings to discuss patch management activities, progress, and potential issues is crucial for effective coordination. Regular communication keeps all stakeholders informed and allows for the timely resolution of any challenges that arise.”
Cappi said that operational uptime is crucial for owner/operators and patching remains a necessary task for addressing certain risks in these environments. “A balance must be found between outages, efforts, and expenditures to ensure profitable, sustainable, safe, and secure operations. Achieving this balance requires shifting from simply task execution to identifying, evaluating, and prioritizing risks.”
He added that by “applying the basic risk equation, Risk = Likelihood x Consequence, to your environment, you will likely discover that we are expending too much energy on tasks that do not significantly reduce risk, and too little on activities that would have a meaningful impact.”
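A small worked example may make the risk equation concrete. The sketch below is illustrative only: the `Finding` class, the 1-to-5 scales, and every value in it are assumptions chosen for this example, not data from any real site. It also reflects Cappi’s earlier point that CVSS scores alone don’t represent risk, by ranking the same findings both ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical vulnerability finding on one asset."""
    name: str
    cvss: float        # published severity score, 0.0-10.0
    likelihood: int    # site-specific estimate, 1 (rare) to 5 (expected)
    consequence: int   # site-specific estimate, 1 (minor) to 5 (severe)

    @property
    def risk(self) -> int:
        # Risk = Likelihood x Consequence
        return self.likelihood * self.consequence


# Illustrative values only, not measurements from any real site.
findings = [
    Finding("isolated lab PC", cvss=9.8, likelihood=1, consequence=2),
    Finding("remote-access jump host", cvss=7.5, likelihood=4, consequence=4),
    Finding("primary HMI", cvss=6.5, likelihood=2, consequence=5),
]

by_cvss = [f.name for f in sorted(findings, key=lambda f: f.cvss, reverse=True)]
by_risk = [f.name for f in sorted(findings, key=lambda f: f.risk, reverse=True)]

print(by_cvss)
print(by_risk)
```

With these made-up numbers, the isolated lab PC tops the CVSS ranking but falls to the bottom once likelihood and consequence at the site are considered, which is the kind of reordering a risk-based view is meant to surface.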
Blomgren pointed out that good network design can have a positive impact on an organization’s patch management strategy. “If risk is a component, and the network design includes segmentation, zero trust, and other best practices, this reduces the risk and can help reduce the scope of patches that might need to be applied. Last, good communication is key. When teams work together to define and agree on acceptable levels of risk, or their risk tolerance, having an open discussion about the need for system patching given this risk framework becomes much easier for all,” he concluded.