Watch this educational SANS webinar led by Justin Searle, Director of ICS Security at InGuardians and a senior SANS instructor, and Phil Neray, VP of Industrial Cybersecurity at CyberX, to learn about:
- Technical architecture of the TRITON malware — including how the attackers cleverly inserted a backdoor into the firmware memory region of the safety controller without interrupting its normal operation or being detected
- Threat models showing how the attackers may have compromised an engineering workstation to deploy malware that communicates with the safety controller using its native protocol
- How to defend against similar attacks in the future via a multi-layered active defense model incorporating continuous monitoring, vulnerability management, threat intelligence, and automated threat modeling
An industry game-changer, the TRITON ICS cyberattack exhibited an entirely new level of Stuxnet-like sophistication. In particular, the attackers exploited a zero-day in the PLC firmware in order to inject a Remote Access Trojan (RAT) with escalated privileges into the controller itself.
Moreover, the attackers cleverly inserted the backdoor into the controller’s firmware memory region without interrupting its normal operation and without being detected.
TRITON exposed yet another breed of ICS systems that attackers can now target to compromise industrial operations, the physical safety control systems or Safety Instrumented Systems (SIS) that provide automatic emergency shutdown of plant processes, such as an oil refinery process that exceeds safe temperatures or pressures.
The likely intent of such an approach would be to disable the safety system in order to lay the groundwork for a second cyberattack that would cause catastrophic damage to the facility itself, potentially causing large-scale environmental damage and loss of human life.
Although TRITON was a targeted attack specifically designed to compromise a particular model and firmware revision level of SIS devices manufactured by Schneider Electric, the tradecraft exhibited by the attackers is now available to other adversaries who can quickly learn from it to design similar malware attacking a broader range of environments and controller types.
In this educational SANS webinar led by Justin Searle, Director of ICS Security at InGuardians and a senior SANS instructor since 2011, and Phil Neray, VP of Industrial Cybersecurity at CyberX, the ICS security company founded by military cyber experts with nation-state expertise defending critical infrastructure, you’ll learn about:
- The technical architecture of the TRITON malware
- Threat models showing how the attackers could have compromised the engineering workstation
- How to implement a multi-layered active defense to defend against similar attacks in the future
Mr. Searle is Director of Industrial Control Systems (ICS) Security at InGuardians, an independent information security consulting company providing high-value services including penetration testing, security assessments, threat hunting, and incident response. He is also a Senior Instructor for the SANS Institute, having taught core ICS security courses including “ICS/SCADA Security Essentials” and “Assessing and Exploiting Control Systems.” Justin led the Smart Grid Security Architecture group in the creation of NIST Interagency Report 7628 and played key roles in the Advanced Security Acceleration Project for the Smart Grid (ASAP-SG). He currently leads the testing group at the National Electric Sector Cybersecurity Organization Resources (NESCOR).
Phil is VP of Industrial Cybersecurity for CyberX, a Boston-based OT cybersecurity company founded in 2013 by military cyber experts with nation-state experience defending critical national infrastructure. CyberX is the only OT security firm selected for the SINET Innovator Award sponsored by the US DHS and DoD; the only one recognized by the International Society of Automation (ISA); and the only one selected by the Israeli national consortium providing critical infrastructure protection for the Tokyo 2020 Olympics. Prior to CyberX, Phil held executive roles at enterprise security leaders including IBM Security/Q1 Labs, Guardium, Veracode, and Symantec. Phil began his career as a Schlumberger engineer on oil rigs in South America and as an engineer with Hydro-Quebec. He has a BSEE from McGill University, is certified in cloud security (CCSK), and has a 1st Degree Black Belt in American Jiu Jitsu.
Hello everyone and welcome to today’s SANS webcast, Anatomy of the TRITON ICS Cyberattack, sponsored by CyberX. My name is Carol Auth of the SANS Institute, and I will be moderating today’s webcast. Today’s featured speakers are Justin Searle, SANS Senior Instructor, and Phil Neray, VP of Industrial Cybersecurity for CyberX. If during the webcast you have any questions for our presenters, please enter them into the questions window located on the GoToWebinar interface at any time. Please note that this webcast is being recorded, and a copy of the slides and recording of this webcast will be available for viewing later today, and can be found on the SANS registration page. And with that, I’d like to hand the webcast over to Phil.
Thank you Carol, and welcome everyone. Happy Easter. Happy Passover. And thanks for spending some time with us this afternoon. As usual we have a content-filled hour here. I’m going to start off by talking about the TRITON cyberattack. I’m going to give some of the results of the reverse engineering that our threat intelligence team did on the TRITON malware, put it into sort of a global context of other activities that we’ve been seeing, and then hand it over to Justin, who’s going to talk specifically about how to secure safety systems as well as how to implement an active cyber defense strategy.
Before I start, I need to start with this slide because I do feel if you’re in the cyber security community, and especially if you’re in the ICS security community, you may be feeling like this. There’s news, and events, and new information almost on a weekly basis that really cause us to raise our eyebrows, and I’ll give you some examples here. So this is an article that showed up about two weeks ago. It was about the TRITON cyberattack, it had some new information that had not previously been revealed, and we’re going to talk about that in a second. And one of the interesting quotes here was something that had been mentioned by us and by others in the early days of the TRITON news being released, which was December.
But really, this was not the entire attack. What the attackers really intended to do was to disable the safety systems so they could cause a lot more damage than they did, which was simply to shut down the plant, which I’m sure was still quite disruptive for the asset owner. Let’s look at a couple of other things that have happened in the last couple of months, and again I’ve shown this slide before. I do feel like sometimes we feel that we’re out in Wonderland and we’re not sure what we’re seeing, and what it means. I’ll give you some examples here. So if you look at the first couple of months of the year, TRITON was talked about for the first time in December, but then at the S4x18 conference, Dale Peterson’s conference in Florida, we had some new information from Schneider and from others about what had happened.
I’m sticking in here some other activities that have happened that are kind of part of the overall cyber context. The Department of Justice’s indictment of Russians, and the Internet Research Agency. It’s sort of part of this overall context on a global basis. But then a few weeks ago there was an FBI/DHS alert confirming for the first time that Russian threat actors had successfully compromised our critical infrastructure and accessed a human-machine interface, for example, in one of our energy facilities. There was the Times article that talked about two things. One, that TRITON might be linked to a string of cyberattacks on other petrochemical facilities. And then comments from Symantec suggesting that Iran perhaps helped some other sophisticated adversaries that had executed the attack.
Then there was the announcement about Guccifer 2.0, again not an ICS issue but it is part of the overall global context, the persona that supposedly had breached the Democratic Party and had pretended to be a Romanian hacker. It was revealed that he really is part of Russian military intelligence, and that came about because of a mistake in forgetting to use a VPN to log into various social media. And then last Friday the Department of Justice indicted nine Iranians for hacking. Most of the news focused on their hacking of university libraries and stealing intellectual property, which is pretty bad.
There didn’t seem to be an ICS security component to it until I dug into the indictment itself, which maybe I’m strange. I just enjoy reading these things. They’re very legalistic. There’s not a lot of adjectives and hyperbole, but what I found was that one of the victims of the Iranian hackers indicted last week was an industrial machinery company. Not named. Could have been any of the industrial automation vendors that are widely used. But if I were a hacker trying to get into critical infrastructure and understand how the devices work in those facilities, this would be one way for me to do it, would be to hack the mail box at one of these companies and just look for information that I might not otherwise be able to obtain about the memory structure for example of some of these devices which we’re going to see play a key role in the TRITON cyberattack.
If you go back to two years ago, the Department of Justice indicted another group of Iranians, amongst other things, for compromising the SCADA system in the Bowman Dam in New York. Now it didn’t cause any damage. It’s no big deal. Our infrastructure was not at risk. But in my mind this might have been part of the training exercise that Iranian cyber attackers were doing to learn how to compromise ICS and SCADA systems. And then this article showed up based on a very detailed report by the Carnegie Institute about Iran. Most of it was about their hacking of dissenting people in Iran and other diplomats, but there was a section in this article in the New York Times where they talked about how hard it is to hire hackers in Iran and that they were asking hackers during the interview, the job interview, “Do you have any experience working with SCADA?” So yet another data point that might point to the fact that they have a concerted campaign to go after our ICS and SCADA systems.
So let’s go to TRITON specifically. So the first we heard about TRITON was in December. It was actually the day before our previous SANS webinar that we did, and it was a very well written blog post by a group of well-known researchers at FireEye Mandiant about TRITON. They made a few statements like, “Their long term objective was actually to cause physical damage. They compromised the safety system. And then really the goal was to disable the safety system in order to cause much more damage.” So as you know, the safety system, these safety controllers are used to shut down the plant when dangerous thresholds are reached such as pressure or temperature that rises to a dangerous level in an oil storage tank for example, and to prevent the whole thing from blowing up. By disabling the safety system, they would be able to then launch a second cyberattack. That is a hypothesis that we have of the purpose of this attack.
And then a month later at the ICS Cybersecurity Conference, S4x18, we saw presentations by two folks from Schneider Electric. They went into a great level of detail about the malware. They said that the reason the attackers were able to insert a backdoor into the safety controller, the PLC, was because there was a zero-day that they were able to exploit. That was the first time we heard it was a zero-day. We just know that for whatever reason, perhaps due to the design of the controller and the threat models that were in place at the time it was designed, which was years ago, it might have been possible for an attacker with access to the ladder logic portion of the controller to then write into the firmware memory region of the controller. We don’t know really how they did it, but I’m going to show you specifically what they did in a few minutes.
Shortly after the conference, there was a Wall Street Journal article. This is the first time we heard officially that it was a petrochemical plant in Saudi Arabia. And then the article that I mentioned before from a few weeks ago that linked TRITON to a series of cyberattacks on other petrochemical plants in Saudi Arabia. These were not ICS related. They were kind of like Shamoon in terms of being destructive, but on the corporate IT side rather than the OT side, and they gave two examples. A company called Tasnee, and another company, Sadara Chemical, which is a joint venture between Saudi Aramco and Dow Chemical. That triggered for me an interesting connection to an article that showed up in January by Chris Bing in CyberScoop where he said Saudi Aramco helped investigate TRITON and it wasn’t directed at one of their facilities, but somehow they were related to the victim’s business. So that would seem to tie into the first bullet up there about Sadara.
The article also claimed that the attack was quite sophisticated. We’re going to see that when we look at the malware. But that Iran, if they were the adversary, could have improved its capabilities by working with another country like Russia or North Korea. The article went on to repeat this hypothesis that it was really intended to cause an explosion that might have killed people. There was an interesting statistic that the same controllers are used in 18,000 plants around the world, not just in petrochemical but in lots of other industries. And of course what other folks have said, including ourselves, is that the same technique and tradecraft could be used against other ICS systems from other OT vendors, and that the attack is being investigated by a host of really smart folks from the US government.
Before showing the scenario that the adversaries used, I want to talk about this report that came out in the summer from Cisco Talos, and was echoed in the DHS/FBI alert a few weeks ago about Russians compromising our energy sector, in which Cisco Talos talked about phishing attacks that used a resume claiming to be from a control systems engineer with experience in what you see there. CMEN, SCADA…. And what the attachment did was unusual, something we had not seen before. Instead of relying on macros, malicious macros in the document itself, what the document did was attempt to connect to an external SMB server that was controlled by the attackers. And so if you have your firewall permitting outbound SMB, which you should not anymore, certainly after seeing this, the attackers would then download a file to the machine through this outbound connection, and that file would then be used to harvest credentials of a control engineer inside the company, or perhaps one working for a third party contractor or maintenance supplier for the OT network, which would then give them access to the OT network.
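The advice above about not permitting outbound SMB can be turned into a simple check on firewall or flow logs. What follows is a minimal sketch, not a description of any product: the `Flow` record, the list of internal networks, and the sample addresses are all assumptions made for illustration. It flags any connection from an internal host to an SMB port on an address outside the internal ranges.

```python
import ipaddress
from dataclasses import dataclass

# Ports commonly used by SMB: NetBIOS session service and SMB over TCP.
SMB_PORTS = {139, 445}

# Assumption: the organization's internal address space (RFC 1918 ranges here).
INTERNAL_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record: who connected to whom, and on which port."""
    src: str
    dst: str
    dst_port: int


def is_internal(addr: str) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in INTERNAL_NETS)


def is_outbound_smb(flow: Flow) -> bool:
    """True when an internal host opens an SMB connection to an external address."""
    return (
        is_internal(flow.src)
        and not is_internal(flow.dst)
        and flow.dst_port in SMB_PORTS
    )


# Illustrative data only; the external addresses are documentation ranges.
flows = [
    Flow("10.1.4.22", "10.1.9.5", 445),      # internal file share, expected
    Flow("10.1.4.22", "203.0.113.50", 445),  # SMB leaving the network, suspicious
    Flow("10.1.4.30", "198.51.100.7", 443),  # ordinary outbound HTTPS
]
suspicious = [f for f in flows if is_outbound_smb(f)]
for f in suspicious:
    print(f"outbound SMB: {f.src} -> {f.dst}:{f.dst_port}")
```

In practice the same rule is better enforced at the firewall by blocking these ports outbound, with a check like this serving as a way to confirm that the block is actually in place.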
And we’ve seen this time and time again, that the easiest way for an adversary to get into our OT network is by stealing privileged credentials and then using those to get remote access to the OT network. That bypasses any network level perimeter security you might have. Even any segmentation you might have. So it’s a great way to get right into the OT network, and since most of the devices in these networks are very insecure, having very little protection in terms of authentication, once an attacker gets in they can use standard remote access protocols to get to them and manipulate and compromise them.
So I show this to set up this situation here, which is a hypothesis about how the attackers might have gotten to the safety controller in this TRITON cyberattack. We don’t really know, but we could expect that they compromised somebody on the IT side to steal their credentials, and then used those credentials to go past the DMZ directly into the OT network. Of course there are other hypotheses. It could have been a website, a drive-by attack that also stole their credentials. Could have been an insider with a USB drive or an infected laptop. But we’re going to go with the phishing attack. Seems to be the favorite. So after they compromised the OT network, they were able to get access to an engineering workstation. Many of these are older Windows boxes. So they have lots of vulnerabilities. They may have weak authentication. And there they installed PC-based components of the malware. As we’ll see, it was Python-based components that they installed on this PC, from which they were then able to get access to the safety controller using the Triconex native ICS protocols.
They were smart enough to build a piece of malware that could speak directly to the controller in its native protocol, and they used that to install the remote access Trojan, or the backdoor, on the device. The last two pieces are known and have been validated. The first piece about how they got in is a hypothesis. And then this piece is also a hypothesis. What was the purpose of installing the backdoor on the PLC? We assert that it was so they could establish a permanent tunnel to the PLC regardless of whether the switch was in the run position, which prevents any uploads to the controller, or whether it was in the program position, which is considered a bad practice, but in reality turns out to be in that position more often than not for convenience. And from there we surmise that they were planning to disable the safety controller and then launch a second cyberattack that would cause massive damage and potentially loss of human life, and certainly potentially some environmental damage as well.
So let’s get into the specifics of the malware itself. The content I’m about to present really is due to some of my colleagues: David Atch, the VP of Research at CyberX. He was formerly the team lead in the Incident Response Center for the Israel Defense Forces. And George Lashenko, who served in the IDF intelligence forces. These are the guys who actually went through and reverse engineered not only the Python code, which is relatively simple, but the PowerPC code that was actually downloaded to the controller. And they couldn’t be here today, but I’m presenting on their behalf, and the full details of what I’m presenting can be seen in the blog post URL at the bottom there.
So in summary, and I’m going to show you some examples, what the adversary did was pretty smart. From the Python-based malware on the PC workstation, they looked at the ladder logic code on the controller and built a linked list of those programs so they could safely append their own programs, notably the one that installed the backdoor, without overwriting any of the original ladder logic code in the safety controller. So there’s a way to basically be on the controller and let it continue doing whatever it was supposed to be doing. Then they figured out the memory layout and the offsets in the controller firmware so that they were able to insert the backdoor into the firmware part of the controller without interrupting its normal operation. As I said, that really required some detailed knowledge which they could only have obtained through extensive cyber reconnaissance, perhaps through an insider, but probably a combination of the two. But at least the cyber reconnaissance.
And as I said before, we believe the remote access Trojan provides consistent access to the controller even when the switch is in run mode, which would prevent any updates to the ladder logic code. And then I’m going to show you this pretty cool custom protocol they designed to communicate with the RAT. It uses an official Triconex network call, but modifies it in some way to fit their purposes and enables them to then perform read, write and execute operations on the controller. So let’s look at some of these. So just a couple of components. Trilog.exe is the py2exe-compiled executable. py2exe is a component that takes ordinary Python and converts it to a Windows executable. So that’s how they were able to run Python on a Windows box that may not have had a Python infrastructure. They built, as I said, libraries that could allow them to communicate using the standard TriStation communication libraries. They had a component called inject.bin which places the backdoor in the right place.
And the Python part. Given an IP address, which is the safety controller, they were able to install a first payload, which I’ll show you was kind of a test to make sure they had access to the controller, then their main payload, and then they sort of cleaned up their tracks. Okay. Here we go. And as I said before, they constructed a linked list of the ladder logic programs already on the controller so that they were then able to append their program to the last item on the list without interrupting any of the normal flow that the ladder logic may have had. And we also found that whenever they updated their own ladder logic program, they overwrote their previous one, so you didn’t see a bunch of messy programs in there, just the most recent one being theirs.
So one of the things they do is they check to see if they have read/write access to controller memory. What they do is they upload a magic number using again a little used network call called GetCPStatus, and then they look to make sure that that magic number shows up in the right place, and that verifies for them that they have read/write access to the controller memory. This is pretty interesting. The protocol that they used. So they used a different network call called GetMPStatus, and it has the structure shown there with the colors. So it’s some standard packet headers, an opcode, a special identifier that the hackers put in there, and some data.
The special identifier is here. This FF buried in the packet structure. And so what they do is they check to make sure that that special identifier is in the GetMPStatus packet. Afterwards they look for a specific opcode shown there, the 17, and that tells them what operation is going to be executed. Is it read? Is it write? Is it execute? In this case, it’s 17 so this is a read. And if there is no identifier they know that it’s a standard, legitimate GetMPStatus call and they just go through and pass it to the normal handler in the safety controller.
I want to show you some context here which I think is important with respect to other nation-state attacks and the global environment that we live in today from the point of view of critical infrastructure. And I’m going to show you a picture in a second that has a couple of flags. I just want to remind you what some of these flags are. The vertical lines at the top are Russia. The DPRK, North Korea, is the one with the star. And then Iran is the one with the green. One way to think about it. And I’m going to show you this chart that I adapted from Sean McBride, who wrote a blog post on the publication War on the Rocks. It was really excellent. What I’ve done is I’ve split this up into an animation as well as added some more recent events that were not part of the original blog post.
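From a monitoring point of view, the dispatch logic just described suggests a simple way to tell a legitimate status request from one carrying the attackers' marker. The sketch below is illustrative only. The talk gives the order of the fields (headers, an opcode, the special identifier, data) but not their sizes, so the 4-byte header and single-byte fields are assumptions, as is reading the "17" on the slide as the hex byte 0x17. Only the read opcode is mentioned in the talk, so any other opcode is reported as unknown rather than guessed.

```python
# Assumed layout (hypothetical): [header: 4 bytes][opcode: 1][identifier: 1][data...]
HEADER_LEN = 4
SPECIAL_ID = 0xFF  # the "FF" identifier described in the talk

# Only "17 = read" is stated in the talk; 0x17 is an assumed hex interpretation.
OPCODE_NAMES = {0x17: "read"}


def classify(payload: bytes) -> str:
    """Classify a GetMPStatus-style payload as standard or carrying the marker."""
    if len(payload) < HEADER_LEN + 2:
        return "standard"  # too short to hold an opcode and identifier
    opcode = payload[HEADER_LEN]
    identifier = payload[HEADER_LEN + 1]
    if identifier != SPECIAL_ID:
        return "standard"  # no marker: handled as a normal request
    return "backdoor:" + OPCODE_NAMES.get(opcode, "unknown")


# Illustrative payloads, not captured traffic.
samples = [
    bytes([0, 0, 0, 0, 0x17, 0xFF, 0x01, 0x02]),  # marker present, read opcode
    bytes([0, 0, 0, 0, 0x17, 0x00]),              # no marker
]
for sample in samples:
    print(sample.hex(), "->", classify(sample))
```

The identifier is checked before the opcode, mirroring the order described above: without the marker, the opcode byte is never interpreted.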
But if you go back and you look at a history of nation-state cyberattacks, I’m talking here not about breaches or data gaps. I’m talking about an attack that caused some type of destruction. Either DDoS … Not destruction. But some kind of real impact that was either DDoS, Wiper which wipes the hard drives, or physical destruction. And those are the three categories that Sean shows. And we’ll go back to 2007. The Estonians were trying to remove a statue of a Russian soldier. The Russians weren’t happy about it, and the website of the president and several other government websites were DDoSed. That’s going back to 2007. Might be the first time cyber was used in a geopolitical context. Then a year later Georgia. This was a few weeks before an actual military incursion in the country where there was a DDoS attack on several other government websites.
2009 of course we saw Stuxnet, which most folks consider the first known cyberattack on ICS. And we also have something called Dozer, which was North Korea getting into the picture with DDoS attacks both on South Korean targets, but also on the White House and the Pentagon. That was 2009. 2011 we saw an attack called Koredos. Again it was North Korea attacking South Korean sites. In this case the occasion was the anniversary of the Korean War. 2012 we saw the first couple of attacks from Iranian hackers: Shamoon, which was very destructive on hundreds of thousands of PCs in Saudi Aramco. So that’s the Wiper part. And then Ababil, which was an attack on US financial institutions, and in that March 16th Department of Justice indictment, the March 2016 Department of Justice indictment I mentioned at the beginning, that was the big thrust of their indictment, which was to get those guys … Name and shame the guys who did those attacks on US financial institutions.
Okay. I’m just waiting for the screen to catch up here.
Click on it one more time Phil.
Okay. Thank you Carol. Okay here we go. 2013, some more attacks from the North Koreans. I won’t spend a whole lot of time on those, but in 2014 we had some interesting attacks. The attack on Sony, both destructive and a data breach. Caused lots of problems for Sony. The attack on Ukraine in advance of their elections by a Russian-linked group that tried to influence the Ukrainian election, which might have been the first time we saw the Russians trying to insert themselves into the electoral process of another country. And then the Sands attack by Iranians who weren’t happy with the owner of the Sands Hotel and Casino in Las Vegas, and who destroyed a lot of computers there.
In 2015 we saw an attack on TV5, which is a French television channel. They actually destroyed their equipment. We’re really not sure what that was about. Some people theorize it was a practice run. But we also saw the first Ukrainian grid attack, by a group most people think is Sandworm, linked to Russia, where they took down the power grid by stealing credentials, getting remote access to the HMI and then turning off the breakers. And they did a bunch of other things at the same time. In 2016, we saw the second Ukrainian grid attack. Much more sophisticated, also known as Industroyer, or CrashOverride, in which they built custom malware that knew how to talk to the controllers in the environment using their native protocol, and caused lots of damage. And also made it very hard for the folks to recover.
And then 2017, as I said, we’ve seen a busy year here. We’ve got TRITON up at the top. We’ve got WannaCry, which is now attributed to North Korea, and NotPetya, which is now attributed to Russia. We saw two days ago that Boeing … There were reports of Boeing facilities having been impacted by WannaCry. We know that both WannaCry and NotPetya caused a lot of damage and actually shut down production facilities in many plants around the world, causing hundreds of millions of dollars … Actually over a billion dollars in damage. A sort of impact from the point of view of lost production and cleanup cost. Merck Pharmaceuticals being one of the most prominent in the US that was hit with these attacks. Shamoon 2.0, which was an attack on Saudi Aramco, again by Iran. And then this was … Not too many people picked up on this. At the end of the year, the NSA went after North Korea with a DDoS attack, which forced them to open up a second internet connection via Russia to get a little more bandwidth in case of future similar attacks.
So we’ve talked about Iran, talked about Russia. Just want to mention North Korea. This came out last fall about North Korea targeting the power grid. Not unlike the Russians targeting the power grid. And then talking about Russians, this is a quote by a gentleman in Mr. Putin’s cabinet late last year in which they were asserting their power in the cyber domain. And it’s not surprising that we’re seeing Russia assert itself in the cyber domain. This is a newspaper article that was written a couple years ago by a general who is now part of Mr. Putin’s cabinet in which he clearly stated a strategy and philosophy of asymmetrical war, or nonlinear war, or hybrid war, which combines cyber and physical as a way to assert power in other places. And then this excellent article, if you haven’t read it yet, came out last fall in Wired and it was a really great analysis of many of these events.
And this quote here I think is very relevant today if you think about everything that’s going on in the world. Not just in the cyber domain, but in other areas in which geopolitics are being invoked, which is that our adversaries are testing to see what our red lines are and how we’re going to react. And I think we saw this with the Ukraine 2.0 and we’ve seen it in some other recent events.
So quickly now, how would you protect yourself against these kinds of attacks? Justin’s going to talk about the SANS Active Cyber Defense model, which is a multi-layered model. This is a similar model from Gartner called the Adaptive Security Architecture. Came out a couple of years ago. Wasn’t ICS specific, but I’m going to talk about how it does apply to ICS environments. At the time Gartner said that IT security folks are spending too much time on blocking and prevention, and not enough time on detection and response. Carol … Yeah. Thank you. And that attackers were easily bypassing traditional firewalls and signature-based prevention mechanisms, and that defenders instead needed to focus on continuous monitoring and other technologies that you see there.
I would say that the same is very true today when you look at ICS security. ICS security for years depended on either a perception that there was an air gap between ICS networks and the rest of the world, or on simple perimeter level security. As we saw in TRITON and in the Ukrainian grid attack and many others, attackers can now bypass these perimeter level detections pretty easily. Especially in a targeted attack. But even in an untargeted attack, like NotPetya and WannaCry where they used the SMB protocol to cross over from IT to OT very quickly. So I’m going to talk about each of these quadrants now. So first of all, talk about detection. I think the biggest question today if you are responsible for ICS security in your organization is, “How do I detect, how do I even know, if an attacker is in my network?” Right? And I’m going to show you some screenshots from our technology just to give you an idea of how it might work.
This is an alert that would pop up on your console. This would be another alert. You can see here this firmware update might be legitimate, so in your SOC, when this alert shows up in their SIEM, we would advise you to have a workflow that has SOC analysts know who they’re supposed to call to find out if this was a legitimate firmware update by a control engineer, or something potentially malicious. As we saw in the TRITON attack, this would have helped tip people off that something was going on that shouldn’t be going on. Second aspect is response. This is an example of an event timeline from our platform. It shows different alerts. Some are serious, some are more informational. But it would be a way to go back and do some forensics and say, “What else happened just before and just after this alert showed up on my screen?” And then there’s some extensive data mining tools that allow you to dig down deeper, down to the packet level if required, to understand from a forensic point of view what happened.
The next quadrant is about prediction and the idea here is, “How do I predict breaches? How do I model risk with the goal of being better at vulnerability management?” And an interesting quote here. We know that ICS environments are notoriously insecure. They’re very difficult to patch. They have Windows boxes that can’t even be patched any more. They have PLCs that almost never get patched. How do you deal with such a massive challenge? What Gartner says is, “Take a risk based approach, prioritize what you need to patch. If you can’t patch, find other mitigating or compensating controls such as monitoring so you can quickly be alerted if someone’s trying to exploit a vulnerability that you were unable to patch.” And so for this we’ve developed something we call Automated ICS Threat Modeling. It starts with this screen which says, you know, pick the asset that you consider your most critical asset, your target asset, your crown jewel, and we’re going to show you all the possible attack paths, with attack vectors ranked by risk, that an adversary could use to compromise this device.
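The workflow suggested above, where a SOC analyst has to find out whether a firmware update was legitimate, can be partly automated by comparing the alert against approved change windows. This is a minimal sketch under stated assumptions: the `ChangeWindow` and `FirmwareAlert` records, the device name, and the engineer name are hypothetical and do not reflect any particular SIEM or change-management system.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ChangeWindow:
    """Hypothetical record of an approved maintenance window for one device."""
    device: str
    engineer: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class FirmwareAlert:
    """Hypothetical alert raised when a firmware update is observed."""
    device: str
    timestamp: datetime


def triage(alert: FirmwareAlert, windows: list[ChangeWindow]) -> str:
    """Tell the analyst whom to call, or escalate if no approved change matches."""
    for window in windows:
        in_window = window.start <= alert.timestamp <= window.end
        if window.device == alert.device and in_window:
            return f"confirm with {window.engineer}"
    return "escalate: no approved change window"


# Illustrative data only.
windows = [
    ChangeWindow("sis-controller-1", "J. Doe",
                 datetime(2018, 3, 1, 8, 0), datetime(2018, 3, 1, 12, 0)),
]
planned = FirmwareAlert("sis-controller-1", datetime(2018, 3, 1, 9, 30))
unplanned = FirmwareAlert("sis-controller-1", datetime(2018, 3, 4, 2, 15))

print(triage(planned, windows))    # falls inside the approved window
print(triage(unplanned, windows))  # nothing scheduled at that time
```

Even when a matching window exists, the sketch returns a person to confirm with rather than closing the alert, since a scheduled window alone does not prove the update came from the engineer.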
And so you look at the top there. There’s the top rated attack vectors. It starts with Control Center number one, and then it draws a picture and it says, “Well it turns out that Control Center number one, an engineering workstation, is actually connected to the internet. You may or may not have known this. But this would be the way the hacker would get in. They would then exploit a series of Windows vulnerabilities on different devices that you see in the diagram, and finally get to PLC 11, at which point they would exploit another vulnerability. This is now a PLC vulnerability known as cross URL which, it turns out, our threat intelligence team reported to the CERT a couple years ago and it has since been patched. But you may not have patched it.” And what this allows you to do is then go and simulate how you might mitigate this attack path.
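To make the idea of ranked attack paths concrete, here is a minimal sketch, not the vendor's algorithm. It assumes the network is a small directed graph whose edges carry a made-up likelihood that an attacker can move from one asset to the next; the asset names and numbers are hypothetical, and the path score is simply the product of the edge likelihoods.

```python
from typing import Dict, List, Tuple

# Hypothetical network: (source, destination) -> assumed likelihood (0..1)
# that an attacker can move across that link, e.g. via an unpatched flaw.
EDGES: Dict[Tuple[str, str], float] = {
    ("internet", "control_center_1"): 0.6,
    ("control_center_1", "hmi_3"): 0.5,
    ("hmi_3", "plc_11"): 0.4,
    ("control_center_1", "plc_11"): 0.1,
}


def attack_paths(edges: Dict[Tuple[str, str], float],
                 source: str, target: str) -> List[Tuple[float, List[str]]]:
    """Enumerate every loop-free path to the target, highest score first."""
    neighbors: Dict[str, List[Tuple[str, float]]] = {}
    for (src, dst), likelihood in edges.items():
        neighbors.setdefault(src, []).append((dst, likelihood))

    results: List[Tuple[float, List[str]]] = []

    def walk(node: str, path: List[str], score: float) -> None:
        if node == target:
            results.append((score, path))
            return
        for nxt, likelihood in neighbors.get(node, []):
            if nxt not in path:  # never revisit an asset on the same path
                walk(nxt, path + [nxt], score * likelihood)

    walk(source, [source], 1.0)
    return sorted(results, key=lambda item: item[0], reverse=True)


def without_edge(edges: Dict[Tuple[str, str], float],
                 src: str, dst: str) -> Dict[Tuple[str, str], float]:
    """Simulate a mitigation (patch or segmentation) by removing one link."""
    return {key: value for key, value in edges.items() if key != (src, dst)}
```

Calling `attack_paths(EDGES, "internet", "plc_11")` lists the route through `hmi_3` first, and rerunning it on `without_edge(EDGES, "internet", "control_center_1")` shows which paths remain after that internet connection is closed.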
You might choose to do a better job of segmentation. You might choose to only implement the patches that are shown here in this attack path. And then you could see what other attack vectors show up and decide if you’re willing to live with that risk or whether you need to keep going. And that’s what we mean by Automated ICS Threat Modeling. And then finally the prevention quadrant up at the top right. Certainly blocking and firewalls are essential, but there are other aspects to prevention which are tech-driven, like hardening. So first of all we have a vulnerability assessment report that provides an overall objective risk score with recommendations about how to improve the score over time. It provides an asset map view showing the connectivity between devices. So for many organizations this may be the first time they actually have an accurate inventory of all their ICS devices. It shows detailed information about each device. What type of device it is. Who’s the manufacturer. What ports are open, and what CVEs are associated with the device. And then finally, as we’ve been talking about, remote access.
CyberX has been integrated with CyberArk, which is the leading remote access privileged session monitoring solution on the market, so that we can check to make sure that when someone comes in over a remote access session, SSH or VNC, it’s an authorized session. That they came in through the privileged session manager and that it’s an authorized session. You could even see if the session is being recorded, which is important for audit purposes. And then finally I’ll just wrap up with this question of who owns OT security, which was the last seminar that we did with SANS in December. It generated a lot of interest in my conversations with security professionals in the ICS community. This turns out to be a big one because what we’re finding is that the CISOs want to unify IT and OT security. They don’t want to have to build a separate SOC. They don’t want to have to rebuild all of the processes and workflows that they’ve spent years building for the IT SOC. But they need visibility into the OT environment that they’ve never had before.
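The remote access check described above boils down to a simple rule: any SSH or VNC session that did not originate from the session manager is suspect. The sketch below illustrates only that rule. It does not use any CyberX or CyberArk interface; the `Flow` record, the jump-host address, and the use of the default ports 22 and 5900 are assumptions for the example.

```python
from dataclasses import dataclass
from typing import List

# Assumption: every approved remote session is proxied through this jump host.
SESSION_MANAGER_IPS = {"10.0.5.10"}

# Default ports for the two protocols mentioned in the talk.
REMOTE_ACCESS_PORTS = {22: "SSH", 5900: "VNC"}


@dataclass
class Flow:
    """Hypothetical summary of one observed network connection."""
    src_ip: str
    dst_ip: str
    dst_port: int


def unauthorized_sessions(flows: List[Flow]) -> List[Flow]:
    """Return remote access flows that bypassed the session manager."""
    return [
        flow for flow in flows
        if flow.dst_port in REMOTE_ACCESS_PORTS
        and flow.src_ip not in SESSION_MANAGER_IPS
    ]
```

A VNC connection from an arbitrary workstation to a controller would be returned by `unauthorized_sessions`, while the same connection from the jump host would not.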
And so what we’ve found works for creating a unified strategy around IT and OT security is, number one, it needs top down attention. We’re all in this together. When a plant gets shut down it affects everyone. Everyone’s careers. Everyone’s stock options. Everyone’s growth path is affected by shutdowns in the production infrastructure. Number two, you need to do some cross training because IT security folks don’t understand how OT systems are different. They don’t know about the different approaches to patching them and configuring them. And similarly OT folks, by being put into the corporate SOC, can help analysts there understand what the right workflow would be when an alert shows up in the corporate SOC related to an ICS device. And obviously establishing two-way lines of communication to the OT engineers who really know what’s going on in that environment is essential.
And then finally a platform like ours can provide deep and granular visibility into the OT environment to the corporate SOC for the first time. What assets are there? How’s the network set up? What protocols are being used? What vulnerabilities exist? Is there any malware? And what that does is allow the SOC analysts to use their existing workflows and processes, but with the added visibility they’ve never had before in the OT environment. And as part of that we built a native app for QRadar. The first ICS Threat Monitoring App for the QRadar environment that provides a much richer interface from the SIEM than your standard syslog interface which is fairly bare bones and has limi
With that I invite you to check out our ICS and IoT security knowledge base on the website where you’ll find transcripts from past SANS webinars if you don’t want to sit through the whole video. You’ll be able to get free downloads from ICS Hacking Exposed. By the way our threat intelligence is featured in chapter seven of that book. You’ll get something called the Global ICS & IIoT Risk Report where we analyze vulnerability data from 375 production ICS networks worldwide and show you what the prevalence of different endpoint and network vulnerabilities was so you can compare yourselves to what you see there. You’ll see recordings that our threat researchers have made at sessions at Black Hat and S4x18. And then a couple of other events happening down there at the bottom, including the next one in London where we’re going to have the CISO from Teva Pharmaceuticals and the CISO from Scotia Gas Networks, who will be talking about how they are addressing ICS security. And with that, I’d like to hand it over to Justin for the second part of the presentation. Justin.
Yep. So I want to really wrap up this session by really talking about what are some of the overall strategies or overall concepts that we need to be concerned with as we’re going into and trying to defend against some of these attacks, especially in light of some of the newer attacks that are there. So if we go ahead and we look at the vulnerabilities that Phil presented, and the malware that Phil presented over time, we actually definitely see an increase in the frequency of these different types of compromises. Now one thing that we need to understand, any time that we have a single compromise, there’s going to be impacts to that compromise, and we’re going to experience whatever those impacts are be it the loss of a system, loss of connectivity, or any other type of disruption to our processes, we’re going to continue to experience that impact throughout the time period that we’re trying to remediate it.
So really the greater the duration that it takes for us to try to remediate our vulnerabilities, the greater the damage we’re going to have into our infrastructures, our processes, or our data sets themselves. And this can definitely mean both physical and digital loss. So while often inside of the ICS world we are very hyper focused on physical because of course loss of a process can actually lead to life and limb … Threat to life and limb as well as damage to the environment, but there’s also that point of digital loss as well. For instance, pharmaceutical companies losing their historian data with all their recipes for the different drugs that they’re actually doing. Or oil and gas companies losing information to foreign nations or other competitors about how much oil and gas they’re pulling out at specific spots.
All very, very confidential, crown-jewel type of information to these organizations. So really until that incident is addressed, these losses will continue to increase as we go along. Now specifically, we’re really talking today about safety and some of these protection systems we have. Now if you’re not familiar with safety systems inside of the ICS world, generally the concept is this. If you have some type of a critical process that can have some catastrophic outcomes if we lose that process, or lose control of that process, often what we’ll do is we’ll have a second controller or series of controllers, with their own connectivity, their own valves, their own controls, their own actuators, and often their own sensors, trying to monitor the health of this infrastructure.
So if you see in this little diagram on the left hand side, those are going to be two different controllers controlling some type of a process on a pipeline, and you can see we have a smaller controller kind of off to the right hand side that has its own connectivity down to its own valve. And so the concept here is that if anything goes astray, so there’s some type of a large weather event, or some type of a malfunction or a miscommunication, or a mistake on an operator’s part, the idea of the safety system is to immediately step in and try to mitigate the worst of the risk that could ever occur. Right? So we could definitely experience degradation of our processes, loss of material, loss of time, loss of revenues.
But the worst case scenarios of explosions and loss of life, loss of limb, damage to the environment, should be mostly mitigated by this safety instrumented system. And so we often build our risk models inside of ICS around this assumption that that safety instrumented system is going to be there to try to mitigate that last mile or the greatest issues that are there for us. Now when we have these safety systems, there’s also very strong international standards as well as certification of these safety systems trying to measure how effective they are in mitigating those worst case scenarios. Right? So we end up having TUV that can go through and do our certification. We can actually assign different safety integrity levels or SILs, rated from one to four to determine how important that system is. And that all allows us to go through and have some type of a risk measurement, a risk assessment process to help us understand where are the most important risks and the least important risks inside of our systems.
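As a small aside on those SIL ratings: under IEC 61508, for a safety function operating in low-demand mode, each level corresponds to a band of average probability of failure on demand (PFDavg), so a higher SIL means the safety function is required to deliver more risk reduction. The lookup below is a minimal sketch of that table; the function name and the choice to return `None` outside the bands are assumptions for the example, and it is no substitute for a real SIL determination.

```python
from typing import Optional

# Low-demand-mode bands: (SIL, lower bound inclusive, upper bound exclusive)
# for the average probability of failure on demand (PFDavg).
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_for_pfd(pfd_avg: float) -> Optional[int]:
    """Return the SIL whose band contains pfd_avg, or None if none does."""
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return None
```

For instance, a safety function expected to fail on demand about five times in a thousand (`sil_for_pfd(5e-3)`) falls in the SIL 2 band.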
Now when we look at our traditional systems, if you look on this map on the left hand side you kind of see more of a traditional safety system. So we have our safety technology kind of on the far left hand side where it has its own sensors and actuators and it’s evaluating the current health of the process, and if it identifies some type of an issue it can immediately respond to try to resolve that issue. Just to the right of that, you’ll see that automation box. That automation box represents the process itself that is being monitored. Right? So our traditional safety systems would be something that’s completely islanded, that’s independently monitoring that process and can stop it at any time to try to save it, often with no way for a manual influence to affect that.
Well over time we started trying to realize that we can gain some benefits if we have a closer communications between our safety systems and the controllers themselves. So kind of the next phase, or the generation two of our safety systems, started having wiring and communications between our safety systems and our automation systems to be able to facilitate a lot of these different benefits for having these devices communicate with each other and be able to understand what each side is doing. And if you look at our third generation, or our latest type of safety systems, we now have vendors selling different types of controllers with safety technology integrated into the controller itself. So maybe it’s going to be a separate card inside of the same chassis, or maybe separate internal functionality in the controller that is dedicated for that safety itself.
Now there’s a lot of benefits to having that integration. By having it integrated there’s no additional hardware, we don’t have to go through and run multiple lines of wire and cabling. We can have shorter response times by having that integration. And it also makes it very easy for us to go through and create that proof of safety. The one biggest problem though is that while in all areas, minus cyber security, there’s a lot of benefits, when it comes down to cyber security that’s something we’re always afraid of. The more connectivity, the closer these technologies are, often the easier it is to try to bridge some of those technological defenses or separation areas inside some of these systems. Now these safety systems can be attacked too, and that’s really what the whole presentation that we were talking about here is, with the TRITON vulnerabilities and malware that’s actually come out. And if you haven’t had a chance to learn about TRITON, there’s a lot of really great reports that are out there.
Go check out the FireEye report; they did a great writeup on it. CyberX, as Phil mentioned, actually had a great blog post that goes through and talks about some of their findings, a lot of which Phil actually just presented to us. And then also go back to Schneider, the ones that actually own the Triconex systems, to understand and read some of their security vulnerability bulletin reports and some of the articles from them. Then of course no matter where you are in the world, whichever government organization or CERT team you are associated with, they also will have a lot of good information out there for you as well.
I think one of the biggest things that we have coming out of our safety systems is kind of a list of strategies, or what are some things that we can do just from a defensive perspective specifically around the safety systems themselves. So we need to remember, specifically for the cyberattack surface for our safety systems, the more integration we have generally means the more attack surface we have. The closer that’s together, and the more communication going between the safety and the controller, the greater risk that we’re going to have there. And we can try to mitigate that risk by trying to do … Limiting the communications, try using something like a data diode so the information can only go one way. Try using other types of security defenses, like Schneider has actually put in their Triconex systems internally to devices, to try to limit some of the control and the flow of information internally.
But regardless, once again, on the most critical systems themselves we just need to understand the more connectivity there is there, the more risk that’s actually going to be there as well. And part of that risk is not going to be able to be mitigated 100%. So when we go through and do a traditional SIS connected to a primary controller, that is something that presents, or creates, some of those risks in the communication channels going back and forth that we can go ahead and look at, try to address, and consider to see if there’s some type of defensive control that we could place there. And then of course remote access to our traditional SIS models or any type of a controller that has an integrated safety system is something that we also want to be very careful of. That remote access is something we should definitely try to avoid if at all possible to the safety instrumented systems.
Of course some of our business requirements might mandate, or might require that. So even having some type of a physical mechanism, very similar to the little switch, the key switch that Phil was showing on the programming mode and the run mode on the Triconex systems themselves for remote access, we could do something very similar as well so that an engineer on site will be able to actually flip a switch to provide some level of remote access or some other type of a mechanism where we can control exactly when that remote access occurs and does not occur. And of course monitoring that is going to be something that’s very, very important.
So ultimately the biggest recommendation is try to isolate safety networks as much as you possibly can. The more isolation you have, the more layers of defense, and the more your teams have to do to actually get to the controllers themselves, the more the attacker’s going to have to go through to be able to get to those systems, hopefully providing us many ways to identify that the attacker is there. Now one thing that we have in our ICS curriculum here at SANS is we talk about different types of models that we can use to be able to try to organize what we’re doing in a defensive nature. One of those is having our sliding scale. We’ll call it the sliding scale of cybersecurity. Different areas that we can go ahead and look at to try to identify where we can be effective and where a lot of effort’s spent. Now if we see on the left hand side, architecture, this is the architecture that we’re actually building, both our IT and OT systems, as well as all the security that we’re actually placing in them.
So basically architecting, connecting them appropriately and maintaining those systems by effective patching. Then you have the next two, passive defense and active defense. Now if you’re not familiar with this model, if you’re more used to a traditional IT model, it’s very easy to misunderstand what we mean by passive defense and active defense here. So passive defense and active defense in this specific model is actually talking more about how much manual effort is involved. Passive defense is going to be something like a firewall that we go ahead and we set protections on and that maintains those protections with very little maintenance by any of our cyber security staff. Whereas active defense is going to be the types of analysis that we do from our different monitoring solutions and how we’re actually trying to identify those breaches or trying to actively change and modify our defenses during an incident itself.
And then intelligence, intelligence is how we gather information not just from external entities for cyber intelligence, but especially from inside of our infrastructure inside of our systems themselves and having that feedback into our active and passive defense. And then finally offensive. Different types of techniques that’s usually very minor what we’re doing, but more the legal actions we can perform as well as some of the more self-defense countermeasures that we can perform inside of our infrastructures.
Now of these, that active cyber defense cycle is where we actually have more humans involved and more of a manual effort. That’s something that always needs a little bit more explanation and this is kind of the graph, or the zoom in of that active cyber defense. So what are we actually doing with this? So really when we talk about active cyber defense, these are staff in our environments that are actually performing different types of active defenses, trying to identify and stop these attackers. That’s all based around threat intelligence. We need to be able to identify what intelligence from our systems we have, identify the indicators of compromise, and then try to proactively go after these attackers that have breached our infrastructure. We want to go ahead and leverage our network security monitoring systems, trying to pull information and detect exactly where those attackers are. We need to be able to provide an organized, systematic incident response so that we can systematically go through and try to contain those attackers and eradicate those attackers, and do it in a very graceful way where we can recover our systems with as little interruption as possible.
And then take whatever information we’re gathering along the process with threat and environment manipulation, modifying our environment to try to make it more difficult for that attacker to spread or to go to further locations. So kind of more in the traditional idea of a containment process in our incident response. Now one thing we can actually do to try to have a systematic program to do this is follow some type of a cyber security framework. Now of course one of the most popular ones inside of the ICS world is IEC62443. But that’s one that’s not always … It’s not fully ratified. There’s aspects of it that are sometimes very expensive to try to be able to purchase. Some of the documents that are there as well as the detailed guidance is sometimes a little bit of a challenge to try to go through and identify and follow.
One of the options or one of the solutions you could look at that we’ve had a lot of success here in North America, is actually using the NIST Cyber Security Framework, which is CSF. The Cyber Security Framework from NIST basically goes through and provides kind of a nice summarize … Think of it as a summary or a simplification of the IEC62443, providing different major categories of identify, protect, detect, respond, and recover of our overall systems, and then they will break them down into subcategories, and subcategories below that mostly integrated around different questions and questions we can ask ourselves about what our maturity level is and what we’ve actually done inside of our infrastructure, or what types of possible locations we may be missing defenses.
One thing that’s very interesting about the NIST CSF is they are building different profiles for the CSF for different parts of critical infrastructure. And here’s an example of one they just did last fall, the manufacturing profile for CSF, that goes through and tries to adapt this CSF model back down into something very specific to OT environments.
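To illustrate how those five CSF functions can drive a self-assessment, here is a minimal sketch. It is not part of the NIST framework or any profile: the yes/no question format, the function names in the code, and the simple "fraction answered yes" score are all assumptions made for the example.

```python
from typing import Dict, List

# The five top-level functions of the NIST Cyber Security Framework.
CSF_FUNCTIONS = ["Identify", "Protect", "Detect", "Respond", "Recover"]


def coverage_by_function(answers: Dict[str, List[bool]]) -> Dict[str, float]:
    """Fraction of self-assessment questions answered 'yes' per function."""
    result: Dict[str, float] = {}
    for function in CSF_FUNCTIONS:
        replies = answers.get(function, [])
        # A function with no answers at all counts as zero coverage.
        result[function] = sum(replies) / len(replies) if replies else 0.0
    return result


def weakest_functions(answers: Dict[str, List[bool]]) -> List[str]:
    """Return the function(s) with the lowest coverage, in CSF order."""
    coverage = coverage_by_function(answers)
    lowest = min(coverage.values())
    return [fn for fn in CSF_FUNCTIONS if coverage[fn] == lowest]
```

An organization that answers "no" to every Detect question would see `weakest_functions` return `["Detect"]`, which is one simple way to spot where defenses may be missing.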
Now in summary, the last thing I wanted to mention was some of you on this call may have taken the ICS410. That’s our security essentials for ICS and SCADA systems. We’ve been running that course here at SANS for the last five years and that’s the course that prepares us for the GICSP certification, and we update that course twice every year. So I’m the author of that course, and we usually have one major update and one minor update every year that we have that course. But one thing, since it’s been out for five years, we did do a very major overhaul of this course just recently. In fact it’s still going through the quality controls right now. But it’s estimated to be launched in June of 2018, so just a few months away. And we wanted to bring that up to any of you that are interested in taking this course or for any that you have taken this course, if you may have other individuals inside of your organizations that might be interested in it.
So we literally have about 30% new content we’ve added to the course, and we’ve also taken another 33% of the remaining content and did major overhauls and major upgrading of that content itself. We really tried to focus on reordering and having a very systematic method and approach to presenting these ideas, improved over the last one to really try to facilitate learning inside of the course. In fact we’ve taken every single section of the course and actually mapped each section back to the standards, IEC62443, NIST CSF, as well as other standards like ISO27001 and COBIT. So we have a lot of different options there depending on what types of standards your organization is following. Other major sections that you might be interested in that we’ve included: we now have some strong discussions and even an exercise around fieldbus protocols and how we defend these traditional serial fieldbus protocols. We have exercises on that fieldbus. We also end up talking more about recommendations for how to specifically build ICS programs.
So staffing, organization reporting infrastructures, as well as frameworks, and in this section we also will break down and are now discussing much more in depth the NIST Cyber Security Framework, breaking that down and providing that as a recommendation alongside IEC62443 as something that we can do and be very effective inside of our environment. And with that, we’re going to go ahead and conclude the WebEx and go ahead and pass that back over to Phil to see if there’s any other questions.
Hey Justin. Thank you very much Justin, that was excellent. I had a comment from one of the participants emphasizing a point that you made, that safety systems should be logically and physically separated from automation systems, and that’s really something that you did emphasize. We’re right at the top of the hour. I think I’ll answer one or two questions about our monitoring platform. “Does it impact the OT network?” The answer is no. It uses passive monitoring, connects up through a network switch, a SPAN port, and has zero impact on the OT network because it’s looking at a copy of the traffic, it’s not in line and it doesn’t require or use any agents on the devices. And then another question is “How many resources are required to run the system, how easy is it to deploy?” And because we use a self-learning engine that is specifically designed for ICS environments, it has behavioral analytics that were specifically designed to detect anomalous behavior in an ICS network.
It also has a deep embedded knowledge of ICS devices and protocols. Within an hour of connecting our system to your ICS network, you’ll have insights about the assets on the network, vulnerabilities in the network, how the assets are connected, and whether there’s any threats or malware in the network as well. So wrapping up today’s webinar. Thank you again for your time. I hope you found it useful. Feel free to reach out to me directly if you have any additional questions. Thank you Justin for your help, and thank you Carol at SANS for making all the logistics work for us. Happy Easter. Happy Passover.
All right. Well thank you so much Phil and Justin for your great presentation, and to CyberX for sponsoring this webcast which helps bring this content to the SANS community. To our audience we greatly appreciate you listening in. For a schedule of all upcoming and archived SANS webcasts, including this one, please visit SANS.org/webcasts. Until next time, take care and we hope to have you back again in the next SANS webcast.