This SANS webinar describes a new approach to securing critical industrial infrastructure called consequence-driven cyber-informed engineering (CCE). Andy Bochman, Senior Grid Strategist for National & Homeland Security at the Idaho National Laboratory (INL), presents the four key steps in CCE:
1. Identify Your “Crown Jewel” Processes — critical functions or processes whose failure would be so damaging that it would threaten your company’s very survival.
2. Map the Digital Terrain — all the digital pathways that could be exploited by adversaries to compromise your “must not fail” processes.
3. Illuminate the Likely Attack Paths — identify the most likely paths attackers would take to reach their targets.
4. Generate Options for Mitigation and Protection — identify and prioritize options for engineering-out highest-consequence cyber risks.
Phil Neray, CyberX’s VP of Industrial Cybersecurity, also discusses how a modern OT cybersecurity platform can provide new visibility into your digital terrain, prediction of the most likely attack vectors, and a spectrum of mitigation and protection options for reducing key risks to your company’s most critical functions.
About Andy Bochman
Senior Grid Strategist for National & Homeland Security at the INL (Twitter: @andybochman)
Andy is the senior grid strategist at the Idaho National Lab. He provides strategic guidance on topics at the intersection of critical infrastructure security and resilience to senior U.S. and international government and industry leaders. His career began with the U.S. Air Force, and before joining INL he worked at several cybersecurity startups, was global energy & utilities security lead at IBM, and was a senior advisor at the Chertoff Group in Washington, D.C. A member of the global advisory board for the Control Systems Cyber Security Association International (CS2AI), Mr. Bochman is on the advisory committee to the SANS security training institute and is a cybersecurity subject matter expert listed with the U.S. State Department Speakers Bureau. His 2018 publications include “The Missing Chief Security Officer” (CXO), “Internet Insecurity: The Brutal Truth” (HBR), and “Supply Chain in the Software Era” (Atlantic Council).
Hello, everyone, and welcome to today’s SANS webcast: CCE – INL’s New Approach to Securing Critical Industrial Infrastructure, sponsored by CyberX. My name is Carol Auth of the SANS Institute. Today’s featured speakers are Andy Bochman, Senior Security Strategist, National and Homeland Security, and Phil Neray, VP of Industrial Cybersecurity from CyberX, who will also be moderating today’s webcast. If during the webcast you have any questions for our presenters, please enter them into the questions window located on the GoToWebinar interface at any time. Please note that this webcast is being recorded, and a copy of the slides and recording of this webcast will be available for viewing later today, and can be found on the SANS registration page. With that, I’ll turn the webcast over to Phil.
Thanks, Carol, and good afternoon, everyone. Welcome to another SANS webcast. We’re thrilled today to have as our guest Andy Bochman from Idaho National Labs. He’s going to be kicking off the session with a description of a new approach for securing critical infrastructure, and then I’m going to be following that with some updates on recent cyberattacks that are relevant to some of the things Andy’s going to be talking about, and then how continuous monitoring and automated threat modeling can help implement the methodology that Andy’s going to talk about. So now, let me hand it over to Andy. Andy, start off please by telling us a bit about your background.
Okay. Hi everybody. Thanks, Phil. Thanks, Carol, and I sure will. Let me just make sure I can get the slides working. Okay, so I’ll skip really quickly through the origin story. I started off in the Air Force, doing comms and computer work. A little bit in the US, a little bit in Europe. I was in multiple cybersecurity startups, which is an exciting opportunity, as Phil can probably attest, and anybody else should try it if they haven’t already. Was acquired into IBM, and for four years, I parlayed my experience in those cybersecurity startups (I had also been blogging on energy subjects at night, moonlighting) into a really great position at IBM, helping herd our security cats on energy sector projects around the world. Became an independent consultant for a year, a year and a half. Worked in the Chertoff Group in Washington, and finally was presented with a job description from the Idaho National Lab that was too good to turn down.
It was such a perfect fit and opportunity, so I said yes, and I just had my four year anniversary in July of this year. That’s how I got there. I’ve been in the DoE and National Lab Complex long enough so that I’m not a complete neophyte, constantly tripping over myself and our policies, but not so long that I feel super well versed. I still have a lot to learn, as largely an outsider to government. So, that’s the background. The title of the slide is at least my attempt to begin to suggest pivoting from … I’m going to use a term now that rankles some people. I’m going to bring up cyber hygiene. To that, I’ll also add cyber best practices, which in some ways is a synonym and in some ways an extension, as outlined by the SANS Institute and others, that describe all the things that a company or a government organization needs to be doing if it wants to make itself as secure as possible. Nobody ever stays perfectly secure, not nearly so, but if they want to do all the things that experts recommend they do, there are the SANS top 20. And thanks again to SANS for supporting this presentation today. The title suggests that even if you were to do all of those things perfectly, comprehensively, and to never stop doing them, to always keep them completely up to date, even in a large enterprise, that still will not guarantee success. I think I will get to it a little bit later on, but I won’t be talking too long, I promise. Phil has some good stuff to share, as well. I’m saying, when we see successful, targeted attacks time and time again, there is a reason to not trust solely in doing cyber hygiene and best practices for cyber security. There are a couple things being developed that include that work but go beyond it, and are specially targeted for things that simply cannot be allowed to be reached by digital means by attackers.
Let me see if I can advance the slide. A quick update on Idaho, for people that aren’t familiar with the Idaho National Lab. It’s one of 17 or so DoE National Labs in the complex. We do work for energy, and energy sector utilities. Certainly for the Department of Homeland Security and Department of Defense, and we work with suppliers closely as well. Suppliers of energy infrastructure equipment, as well as security suppliers on the products and services side. One way to think about these types of places, these national labs, is that part in the middle of the slide there that says we do what universities and industries can’t, won’t or shouldn’t. That may mean there’s not an economic reason for doing a particular type of research, it’s too far in the future, it can’t be monetized yet, or it’s too dangerous because we deal with things that blow up or radiate, or we deal with national security and classified types of information.
Just things that are beyond the means of a standard university or company to be able to grapple with. That’s in the southeast corner of Idaho, which is a beautiful state. If you’ve never had a chance to visit yourself, I recommend it. Okay, why Idaho? Maybe some people who aren’t familiar will say, “Why did this expertise emerge in this far-flung state, in a remote part of the country?” The short story, at least my version of it, is that this place has been dealing with dangerous nuclear processes for quite a long time. It’s built 52 nuclear test reactors since the ’50s, and it’s pretty soon going to be building its 53rd in the small modular reactor. When you’re dealing with materials and processes that are this dangerous, this hazardous to human health, imagine the early days, well before digitization and automation. It still behooved the engineers to monitor and control these processes from a safe distance, as far away as they could get, to be honest.
As the computer world emerged, they saw themselves using some of that technology to be able to do an even better job working from a distance, again, monitoring and controlling these dangerous processes. Once Clifford Stoll’s book came out, The Cuckoo’s Egg, and everyone started to realize that these systems could be used for nefarious purposes, Idaho, I think, was among the very first places to say, “Hey, industrial control of dangerous processes by digital means, there is a way that could be usurped by adversaries, bad guys, or people just playing around, and bypass all the safety that we’ve built into these systems. We better get expert on that real quick.” That’s precisely what started happening a few decades ago, and now continues apace at the Idaho National Lab. I have a couple heads to show you here, because I think these guys provoke the right type of thinking for what follows in the rest of my half-hour presentation.
This is Dan Geer, of the unorthodox sideburns, perhaps, but a completely brilliant person if you happen to have a chance to read his work or hear him speak. About once a year, he produces a keynote address, which is just stellar. This particular piece, the reference here is to about a 12- or 13-page short paper he wrote for the Hoover Institution called A Rubicon. Out of all of it, it’s just fantastic and pretty scary. It all boils down to this one quote from him. “All of risk, the source, the wellspring of risk is dependence.” So, you build yourselves a process, your company runs on critical processes, and they’re automated, and you’re getting even more dependent on automation, and perhaps machine learning and AI. That’s all fine from a risk point of view, as long as you have a plan B, and maybe a plan C.
If you’re able to switch over into something that’s perhaps a little bit less efficient, a little slower, a little less perfect, but you can have high confidence that it will be there if a black swan or dark, cloudy day comes your way for your software-based systems, then that’s fine. It’s only if we make ourselves 100% completely dependent on these systems that we are hanging over a cliff, either a little bit or a lot. We need to do something about it. It shouldn’t be just accepting it as the only way to play in 2018. That’s something to think about from Mr. Geer, Dr. Geer I believe.
Now here’s another one for you. The very provocative, at the bottom of the slide, Bruce Schneier with his new book, Click Here to Kill Everybody. He himself admits that that’s overkill of sorts, but it certainly will get attention. It actually is related to the theme of his new book, that as we put computers in everything, everything becomes a computer. It becomes a network computer, and we are piling on the risk again, and making ourselves dependent on technology that’s so complex that we can’t understand it. If we can’t understand it, we struggle to do a good job knowing how to secure it and keep it secure. His point is in the quote up top, isn’t to put everybody into the fetal position. However, he is saying in ways similar to Rob Lee, who says defense is doable. He’s saying these challenges of securing these systems, and thinking about how to design them for the future is super hard, but it’s of a doable type hard, like man on the moon, not impossible hard like faster than the speed of light travel.
So, there’s Schneier for you. I’m about halfway through the book, by the way, and I do recommend it. He’s a decent writer and a good thinker. Okay, now we’re going to start to pivot into, I think you take over at the CCE, which stands for consequence driven cyber informed engineering. I’ll get to that more in a second. But in a sense, this is stuff that comes from the INL playbook. Our place, our organization is populated by engineers, and by cybersecurity experts and safety systems experts and human behavioral experts. Many of them have been on the offensive side of cyber operations in the past, so they know how to think like an adversary. Here we go, a couple easy steps. This is the part that causes some problems for some folks, that perhaps I overdo it.
If you are a critical infrastructure provider, you will be a target of adversaries. Nation state adversaries, terrorist adversaries, criminal syndicates potentially, but certainly on the nation state level. If you run critical infrastructure, you are a target. Now, that doesn’t mean that you will necessarily be exploited, here’s my little caveat, but you should definitely be well aware that you are a ripe target for an adversary and should be preparing accordingly. The second bullet here, current understanding of cyber risk, which usually means, it’s usually informed by having a chief security officer or a CISO reporting to a CIO in an electric utility. In some industries, the chief, the most senior person with the word “security” in their name reports higher up than the CIO. They’ll look straight to the chief executive officer, or the COO, or the general counsel, or Chief Risk Officer.
But the way governance is run today, and the way we do cyber hygiene and best practices, from what we can see, people don’t fully understand the amount of risk that they are carrying. Even as they attempt to do the best things that they know how to do, hire and train the best people, and increase their budget every year. The third bullet then follows. If you’re to focus, if you’re in critical infrastructure, and you focus solely on cyber hygiene/best practices, you need to do that. Sorry, I mean you need to focus on those things, but for things that must not fail, that’s not going to be a guarantee that those things won’t be compromised in some way, and that the worst-case scenarios will come to pass. The very bottom part of this slide is referring to the pivot into engineering practices. Not just looking at going to the Moscone Center and buying defensive security products, the best, the new ones that come out every year, but actually bringing some good old-fashioned proven engineering practices to bear, the combination of which can actually make us much more defensible.
All right, so here it is, then. You don’t need to pay too much attention to the colorful charts on the right hand side, but I’ll walk you through the phases briefly. The first step of this four step or phase methodology is called consequence prioritization. It simply means, say you have a large enterprise with thousands, or probably more likely millions of end points, and you have a security team in the dozens, or maybe hundreds, depending on how you’re funded. You do your best to do those best practices all across the enterprise waterfront. It’s a good idea to try to keep everything secure, because you can get in, in one place, an adversary can, and navigate across a network, and reach a target. But what we try to say right up front in this process, by sitting down with the CEO and his or her first-ranked senior officers, by chatting with the board, and then by working our way down a little bit into the mid-rank operational and engineering type folks is, try to get us the comparative handful of processes and functions that simply must not fail.
The reason we use this term is, we’re in the realm now of corporate viability or strategic business risk. You know, there’s things that can hurt a company, and then there’s events of such a nature that they can end a company. We try to get as quickly as possible, and I’ll show you how we weight those in a second. We try to get as quickly as possible to a handful of scenarios that we can work with the customer on, and help illuminate the true level of risk that they have there, and then begin to develop some practical mitigations for those things. Consequence prioritization is admitting you have a huge bunch of stuff to protect, but identifying the subset of it all that no one in leadership would tolerate any type of exposure to, because they couldn’t imagine losing those things and keeping the company afloat.
The second step is system-of-systems analysis, phase two here in the middle of the chart. That means, for that handful of most critical processes or functions, building, in a way that’s more detailed than probably most, or maybe any, companies have really put themselves through yet, a full-blown inventory of all the hardware and software, and networking and people and processes that support those most critical functions. We’re talking, for example, at the level of a particular computer. Let’s say there’s an operating system on it and some applications. For that operating system, we’d want to know who made it. If it’s Microsoft, for example, then we can say which version is it, or which kind. Is it NT, or XP, or 2013 Server, things like that. Then we’d want to know what version it was on.
We’d want to know what patch level it’s at, and then we’d go even further, right? We’d look to see what kind of vulnerabilities are known to exist for that particular version of that operating system. Then we fan out into the applications, and do something very similar for them. Then we look at device drivers and DLLs, and that level of detail, it sounds agonizingly intensive. The reason why we recommend performing that and capturing that information and keeping it up to date, that’s the same type of information that adversaries go after and use to their advantage. After they’ve spent a significant amount of time resident in target systems and networks, they come up with that information, and use it to their advantage to craft their attacks. You need to know the same thing, too.
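As a rough illustration of the level of inventory detail described here, the following is a minimal Python sketch of one way such a record could be structured. It is not INL's tooling or data model; the class names, field names, the example asset, and the placeholder vulnerability identifier are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SoftwareComponent:
    """One piece of software on an asset: OS, application, driver, or DLL."""
    vendor: str
    product: str
    version: str
    patch_level: str
    # Identifiers of known vulnerabilities for this exact version (placeholders here).
    known_vulns: List[str] = field(default_factory=list)


@dataclass
class Asset:
    """A machine that supports one of the 'must not fail' functions."""
    name: str
    supports_function: str
    operating_system: SoftwareComponent
    applications: List[SoftwareComponent] = field(default_factory=list)

    def components(self) -> List[SoftwareComponent]:
        # The OS first, then everything that fans out from it.
        return [self.operating_system, *self.applications]

    def vulnerable_components(self) -> List[SoftwareComponent]:
        # Components with at least one recorded known vulnerability.
        return [c for c in self.components() if c.known_vulns]


# Hypothetical example asset; every value below is invented for illustration.
hmi = Asset(
    name="hmi-01",
    supports_function="turbine control",
    operating_system=SoftwareComponent(
        vendor="Microsoft", product="Windows XP", version="SP3",
        patch_level="unpatched", known_vulns=["EXAMPLE-VULN-001"],
    ),
    applications=[
        SoftwareComponent(vendor="ExampleVendor", product="HMI Viewer",
                          version="4.2", patch_level="4.2.1"),
    ],
)

for component in hmi.vulnerable_components():
    print(component.product, component.known_vulns)
```

Keeping the operating system and the applications in the same component type is a design choice for this sketch: it lets the same vendor, version, patch level, and vulnerability questions be asked at every layer, which mirrors the fan-out from operating system to applications described in the talk.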
We put that to use in phase three, consequence-based targeting, and that’s where we turn the tables and say, “All right, now that we’ve created that whole map of the real estate that the adversary would have to navigate in order to do their business, we’re going to find the ways through that maze that the adversaries would take.” And we prioritize them, ultimately, from the easiest paths through with the highest confidence, to ways that are tougher for them or would take longer and they’d be less confident about. That is informed by the latest intelligence on the systems in question, because INL is part of the intelligence community, and we’re able to present to the customer, in a really visceral level of detail, the ways through this maze that an adversary would take to turn off, or damage, or destroy equipment that supports your most critical functions.
The last step is phase four, mitigations and tripwires. Here now, once we’ve shown that there are ways in as described, we turn to the engineers, both our own, but particularly the ones of the customers, and say, “What can we do here to make it so that even if the adversary were to get into the network, were to reach certain target systems, they nevertheless wouldn’t be able to create the damage and the destruction that they were seeking to create?” Here, we turn to good old fashioned engineering first principles. This can be things like an analog trip. Think of Stuxnet as a public example, where centrifuges were told to spin way out of tolerance, in ways that would destroy them. You can make it so that if a particular high value, long lead time to replace piece of equipment is told to kill itself by digital command, the trip just says, “Hey, go to sleep, turn off gracefully. We’ll do forensics later on and figure out what happened.”
By relying on analog means, which are generally not visible to the adversary coming from a digital pathway, we are able to protect this equipment, and learn, and live to fight another day. A tripwire is something that I will refer to in a subsequent slide, but essentially it just means the reduction, as much as possible, of the number of pathways to a particular target. Say you had 20 different network paths that an adversary would have the opportunity to navigate to reach a goal, and there are business reasons to have those, or at least there were when they first were built. You reduce the number of pathways as much as possible, because it’s so much easier to monitor fewer of them than it is to try to keep a constant eagle eye watch on all of them. In many cases, the company isn’t even aware of all the different paths into a particular target system.
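To make the idea of ranking routes through the mapped terrain concrete, here is a minimal Python sketch that enumerates every loop-free path from an entry point to a target and sorts the paths by estimated attacker effort. It is only an illustration of the general technique (depth-first path enumeration on a weighted graph), not the method INL uses; the node names and effort numbers are invented, and a real assessment would draw on intelligence and engineering judgment rather than a single number per hop.

```python
def simple_paths(graph, start, target, path=None):
    """Yield every loop-free path from start to target (depth-first)."""
    path = (path or []) + [start]
    if start == target:
        yield path
        return
    for nxt in graph.get(start, {}):
        if nxt not in path:  # never revisit a node already on this path
            yield from simple_paths(graph, nxt, target, path)


def path_effort(graph, path):
    """Sum the assumed attacker effort of each hop along the path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def rank_paths(graph, start, target):
    """Return all paths, easiest (lowest total effort) first."""
    return sorted(simple_paths(graph, start, target),
                  key=lambda p: path_effort(graph, p))


# Hypothetical digital terrain: graph[a][b] is an assumed effort score
# for an adversary to move from a to b (lower means easier).
terrain = {
    "internet": {"vpn_gateway": 3, "vendor_laptop": 2},
    "vpn_gateway": {"it_network": 2},
    "vendor_laptop": {"engineering_ws": 1},
    "it_network": {"historian": 2, "engineering_ws": 4},
    "historian": {"hmi": 3},
    "engineering_ws": {"plc": 1},
    "hmi": {"plc": 2},
}

ranked = rank_paths(terrain, "internet", "plc")
for p in ranked:
    print(path_effort(terrain, p), " -> ".join(p))
```

In this made-up terrain the route through the vendor laptop comes out easiest, which is the kind of result that would push that pathway to the top of the list for mitigation.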
We help illuminate those, and then talk through with them, which ones could they start to shed without really having a substantial or unacceptable impact on the way they do business? That’s it. That’s it in a nutshell. It looks like an assessment. It is an assessment, but it’s really intended to be an awakening. To be introduced to a new way of thinking about cyber risk, and new ways, once it’s fully comprehended, of mitigating those risks, while still allowing companies, and the military, of course, to fully embrace technology. Use the latest and greatest in differentiating AI, and automation, and IOT, IIOT types of technologies, knowing that the things they really rely on most aren’t going to be able to be taken out by an adversary.
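The pathway-reduction exercise described above can be pictured as a simple review of every digital pathway into a target system, flagging those for which nobody can state a current business reason. The Python sketch below is a hypothetical illustration of that bookkeeping only; the record fields, system names, and reasons are assumptions, and deciding what can actually be shed remains a conversation with the people who run the process.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pathway:
    """One digital pathway from a source system into a target system."""
    source: str
    target: str
    protocol: str
    # None means nobody could state a current business reason for it.
    business_reason: Optional[str]


def pathways_into(pathways: List[Pathway], target: str) -> List[Pathway]:
    """Every recorded pathway that ends at the given target."""
    return [p for p in pathways if p.target == target]


def shed_candidates(pathways: List[Pathway], target: str) -> List[Pathway]:
    """Pathways into the target with no stated reason: first candidates to remove."""
    return [p for p in pathways_into(pathways, target) if not p.business_reason]


# Hypothetical inventory of pathways; all names are invented for illustration.
inventory = [
    Pathway("engineering_ws", "turbine_controller", "vendor protocol", "daily operations"),
    Pathway("historian", "turbine_controller", "OPC", "performance reporting"),
    Pathway("vendor_modem", "turbine_controller", "dial-up", None),
    Pathway("old_test_server", "turbine_controller", "telnet", None),
    Pathway("it_network", "historian", "SQL", "business analytics"),
]

target = "turbine_controller"
print(len(pathways_into(inventory, target)), "pathways into", target)
for p in shed_candidates(inventory, target):
    print("candidate to shed:", p.source, "via", p.protocol)
```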
I mentioned that INL is full of folks who were once on the offensive side of operations, and so this slide here shows you something we help the client see, is what they look like from the outside. How inviting a target they are, even though it feels like by doing comprehensive hygiene and best practices, that they are doing a great job, and they’re very busy, and it costs a lot. That’s all true. There are still going to be ways that talented, well resourced, adaptive adversaries can find to navigate through the hygiene, okay? When we try to do that prioritization in the very first step of CCE, here’s some of the factors that we use for calculating a high consequence event score, right? It’s really hard to ultimately rack and stack, and choose the highest priority scenarios without having some type of numerical basis for it.
We ultimately convert, in these different categories, starting at the top: the area impacted, how much it will cost to recover, potential safety impacts downstream for the public, system integrity aspects, how broad the attack is going to be, how long the damage would last, and how long it would take to get back on your feet. All these things can factor into ultimately coming up with a score for a particular scenario, and we use that to help pick the scenarios that we are going to then advance into phase two, where we do the system-of-systems breakdown, okay? That’s a little bit more detail on that first phase of the methodology. Let me see if I can move the slide here. Let me go back. So, the previous slide was, “Think like an adversary,” and this slide is, “Act like an engineer.” The best definition I ever heard for engineers was, their job is to solve problems.
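The phase-one scoring factors listed above lend themselves to a simple weighted sum. The Python sketch below shows one way such a high consequence event score could be computed and used to rank scenarios. The talk does not give the actual formula, so the weights, the 0 to 5 rating scale, and both example scenarios are assumptions chosen purely for illustration.

```python
# Assumed relative weights for each factor named in the talk (not INL's values).
WEIGHTS = {
    "area_impacted": 3,
    "recovery_cost": 2,
    "public_safety": 5,
    "system_integrity": 3,
    "attack_breadth": 2,
    "duration": 3,
}


def hce_score(ratings):
    """Weighted sum of 0-5 severity ratings, one rating per factor."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Hypothetical scenarios with invented ratings on a 0 (none) to 5 (severe) scale.
scenarios = {
    "billing system outage": {
        "area_impacted": 1, "recovery_cost": 2, "public_safety": 0,
        "system_integrity": 1, "attack_breadth": 1, "duration": 1,
    },
    "destruction of a large transformer": {
        "area_impacted": 4, "recovery_cost": 5, "public_safety": 3,
        "system_integrity": 5, "attack_breadth": 2, "duration": 5,
    },
}

# Highest score first: these are the scenarios that would advance to phase two.
ranked_scenarios = sorted(scenarios, key=lambda s: hce_score(scenarios[s]), reverse=True)
for name in ranked_scenarios:
    print(hce_score(scenarios[name]), name)
```

Putting a number on each scenario is what makes it possible to rack and stack them; the specific weights matter less than applying the same ones to every scenario.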
So this is a problem. If the best cyber defenses can ultimately be defeated by certain adversaries, what do you want to do about it? So here we bring that type of engineering, problem solving mindset to bear, and bring in, not just digital technology, but whatever we have in our toolkit from the past, from the present. These include some analog controls, like the analog trip I described. Potentially introducing trusted humans back into the loop where they were removed, because of the desire to be more efficient. Several different types of things. A device called an attack surface disruptor that’s under development by Tim Roxy and Mike Wallace. There’s a number of different options you have here, and many of them are not very expensive for engineering out the cyber risk from things that must not fail. Ultimately, it’s not just for the operators, and engineers, and the technicians in the company. It’s for everybody, because you don’t want to keep deploying insecure systems for the processes that must not fail.
You want to ultimately bake some of this thinking into HR when you’re hiring people, and bake some of this thinking into procurement when you are buying new stock, or designing a new system. Obviously most people on this call probably have heard, you don’t want to bolt security on. You want to have it there from the very beginning, and CCE and another concept, a subset of it, cyber informed engineering, are the ways that we try to inculcate this type of thinking into the companies that we end up working with. Okay, I think this is my last piece here before we segue over to Phil. This is for those absent an actual house call from INL, because INL is not a very big organization, and we are still piloting these things and learning how to perfect the delivery of a CCE engagement in the electric sector, for the military, and over time in other sectors. At the end of the Harvard Business Review whitepaper on CCE that some of you have probably seen, and that we can probably get to those of you who are interested in reading it, here’s some of the basic concepts that you could apply now if you are trying to achieve some of these benefits. The first one is to make sure that the message gets through. CCE is not a call to stop doing the absolute best you can on cyber hygiene and best practices, cybersecurity and defense. You’ve got to keep doing that, otherwise you will be constantly nipped at the heels, and slowed down by the WannaCry and NotPetya and ransomwares of the day. So, please keep doing that. Don’t take that other message away. We fully support all of the folks that are trying to make better products, and deliver better services to help companies out in these areas.
The second part is, obviously you can see, if you have been paying attention related to the first step of the CCE methodology: think broadly about protecting everything, but try to identify to the extent you can, systems, and processes, and functions that you just cannot tolerate an adversary being able to reach and do something to, or at least have the effect they are trying to achieve that would be a disaster for the company.
Start asking yourselves, and asking around what those things might be inside your organization. It’s not always known to the people at the very top. Sometimes you have to go down a couple layers to fully understand that. This is supposed to be a number three, I think, at the top of this middle box here, but I think I mentioned this: to reduce the number of digital pathways into and out of a particular process function to the absolute minimum that you can still work comfortably with, because it would be a lot easier to monitor them. Number four is related to really in the SANS, number one and two of the top 20, know what you have. Know what you have, in exquisite detail, especially for the things that must not fail. Because the other organizations that likely will have that information are not your friends. If you are going to be able to combat them and defend against them successfully, you are going to have to know a similar level of detail that they have.
And then lastly, as I mentioned a couple of these throughout the talk, if you do find systems that you think are targeted, potentially susceptible to a cyberattack, that you simply cannot allow to run to destruction, there’s some options around backstops, which we can describe in more detail some other time. Again, having an analog trip, perhaps, that will help that system shut down gracefully when it’s given the signal to act way out of tolerance, and backups. Nonidentical backups. A plan B that is not exactly like plan A, so that the adversary has not been able to go to school on it yet. And even if it’s not perfect, it can keep you alive long enough that you can do forensics, push that adversary out of your systems and live to fight another day. That is the sum total of my presentation. I am saying thank you to you here, and we are segueing over now to Phil Neray.
Thank you Andy, that was awesome. Okay, let’s keep going here. Whoops. I’m going to talk about a couple of things. NotPetya, of course, happened over a year ago, but there was recently an article in Wired Magazine by Andy Greenberg that revealed some very interesting aspects. I’m going to talk about that. I’m going to talk about VPNFilter, which we covered in our last SANS webinar, but there’s some new information that we’ve gotten through Cisco Talos. A company put an ICS honeypot on the web. I know it was a bit controversial, but I think there’s some things we can learn from that, and then finally how all these things relate to what Andy has been talking about, which is consequence driven cybersecurity. So just quickly, let’s remember, why does all this matter?
Well, if you’re a security professional, or an OT person, your number one responsibility is towards your organization, and there is a business need for digitalization that we’re not going to be able to do anything about. I’m not sure what happened there, let’s go up. Carol, if you could just full-screen that, please. There we go. So, number one, we need to support digitalization as a way to make our processes more efficient and more effective. Number two, there are many adversaries out there, nation states being among them, but there are cyber criminals, as well, and hacktivists, as Andy mentioned. We need to think about them. Finally, there’s what Dale Peterson termed many years ago “insecure by design.” Many of the networks that run our factories and our critical infrastructure were designed many years ago when security was not a primary consideration.
So, they’re missing many of the things that we now take for granted, like authentication and segmentation. How do you prepare for that? Sorry about that, you guys. Okay, let’s talk about this NotPetya article. Andy Greenberg is probably one of the best investigative reporters in security today, and he wrote a detailed article on this. He’s also writing a book on Sandworm, the Russian threat actor group that did many things in the past, including crashing the Ukrainian grid twice. But some interesting quotes here on NotPetya from Thomas Rid about how this is not necessarily a one-time event. It could happen again, because of the interconnectedness of our networks, and the complexity. Some of the things Andy was just talking about. Then Cisco saying this was not accidental, it was very deliberate. Unlike WannaCry, just as a reminder, NotPetya spread through intranets.
It used EternalBlue, an exploit for an SMB vulnerability that was stolen from the NSA. That was similar to WannaCry, which also used an SMB vulnerability. But unlike WannaCry, NotPetya also used Mimikatz to grab credentials from memory, and then used those credentials to spread internally to other machines that share the same credentials. That’s why it spread so quickly, and that’s why it was very hard for anybody to stop it once it got into the network. There’s some examples in the article, including a large pharmaceutical manufacturer that lost $870 million from production that was down, and from cleanup costs. So, it was very, very widespread. You might ask yourself, “Wow, this thing spread very quickly. It was a sophisticated attack by a nation state. It was a destructive worm, it wasn’t even a targeted attack. How would I protect against it?”
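To see why shared credentials let something spread this fast, and why unique credentials on critical systems limit the damage, here is a small defensive modeling sketch in Python. It is a toy reachability model, not a description of how NotPetya was implemented: the host names, credential names, and the two dictionaries are invented, and the model simply assumes that any credential cached on a compromised host becomes usable against every host that accepts it.

```python
def spread(cached, accepts, patient_zero):
    """Return the set of hosts reachable through reused credentials.

    cached[host]  -> credentials that can be harvested from that host's memory
    accepts[host] -> credentials that grant administrative access to that host
    """
    compromised = {patient_zero}
    harvested = set(cached.get(patient_zero, set()))
    changed = True
    while changed:  # repeat until a full pass adds no new host
        changed = False
        for host, accepted in accepts.items():
            if host not in compromised and harvested & accepted:
                compromised.add(host)
                harvested |= cached.get(host, set())
                changed = True
    return compromised


# Hypothetical network; every name below is an assumption for illustration.
cached = {
    "ws1": {"helpdesk_admin"},
    "ws2": {"helpdesk_admin"},
    "file_srv": {"helpdesk_admin", "domain_admin"},
    "dc": {"domain_admin"},
    "hmi": {"local_ot_admin"},
}
accepts = {
    "ws1": {"helpdesk_admin"},
    "ws2": {"helpdesk_admin"},
    "file_srv": {"helpdesk_admin"},
    "dc": {"domain_admin"},
    "hmi": {"local_ot_admin"},  # unique credential, not cached anywhere else
}

reached = spread(cached, accepts, "ws1")
print(sorted(reached))
```

In this invented network one workstation is enough to reach the domain controller, because a more privileged credential was cached on the file server, while the HMI stays out of reach because its credential is not shared with any other machine.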
I think some of the concepts that Andy talked about would help here. We’re going to talk about those when we get to the end of the presentation. We talked about VPNFilter last quarter. Believed to be by Fancy Bear, a Russian GRU organization according to the FBI. As a recap, it’s multistage malware that affects routers. Tens, many different types of routers. What we already knew last quarter was it has a packet sniffer for Modbus, which got everyone’s attention in the ICS security community. It can wipe the firmware of the devices, and so cause a lot of chaos in your environment, and it shares code with BlackEnergy, the same malware used in the first Ukrainian grid attack.
In this most recent update from Talos, we learned a couple of things. There are seven new modules that are part of this malware, so someone took a lot of time and effort to build it. Number one, there is an endpoint exploitation tool that looks at the traffic going through the router, inspects it and redirects it. It specifically looks for Windows executables, and it could be used to patch Windows executables as they fly by, so it’s an endpoint exploitation tool built into the router malware. Number two, port scanning and network mapping. We’re going to see that this is one of the very first things bad guys do: they look around the network to see what’s available, what ports are open, how they’re connected. It’s part of consequence-driven cyber-informed engineering to look at the ways attackers would try to move through your network.
There’s an interesting denial of service on specific forms of encrypted communication like WhatsApp. The authors of the blog post theorized that’s to direct folks to use other forms like, excuse me, I can’t remember right now. The one that Russians typically use. Then there are a number of new ways that the bad guys are using to obfuscate and encrypt their traffic, and build a distributed proxy network that would hide the source of command and control. So, some very sophisticated malware here. And again, I’m mentioning it because it has an ICS focus, and it’s very possible that many of these routers, even the SOHO-style ones, are used in ICS facilities because somebody at the plant decided that was the easiest way to solve a problem they were trying to solve. Then finally, this honeypot experiment. I know it got some flak from some folks, actually including from Andy, who were wondering about details that were missing, but it’s interesting that what this cybersecurity company did was simulate an ICS environment with a honeypot.
The environment had an IT network, or more probably a DMZ-type zone, it sounds like, and then an OT network with an HMI, and a firewall in between, which is how most of our networks are configured. They had three Internet-facing servers with RDP ports open, and we know that RDP has become a favorite mechanism for threat actors to get into our networks, specifically for ransomware, but I’m sure they are using it for all kinds of other reasons. Those ports had weak passwords, and then they registered the DNS names of those servers, and used internal names that resembled what they called a well-known electric utility. So, all of the things to make bad guys interested. Within two days, these servers were compromised by a tool called xDedic. If you look up at the top, that’s what I saw when I googled it. It’s a tool that’s sold on Russian forums, and what it allows you to do is let the asset owner continue to use RDP to get into the system, while you can also get into the system at the same time, using RDP to do bad things.
In 10 days, the access to that backdoor was sold to a new owner, presumably via the black market. Now somebody had bought access to an ICS-related asset, and began to do multi-point network reconnaissance to look for ways into the OT environment. The blog post doesn’t make it sound like they actually got in, but we know that this is a very common way for folks to get in: first get into the IT environment, either using one of these open ports or using compromised credentials, and then look for ways to get from IT to OT. Many times, there are ways from IT to OT that the IT and the OT organizations aren’t even aware of. Again, doors between IT and OT that were opened by folks just trying to do their jobs, but not thinking about the consequences. There is a link down there to the original blog post, and also a great interview with the research team that you will find on the CyberWire.
So, in Andy’s Harvard Business Review article, he talked about four steps in this methodology. He touched on them in his presentation. The first one is, identify your crown jewel processes, or your “must not fail” processes. Depending on the organization, that could be different things. It could be the production line that your company depends on for the bulk of its revenue. It could be an ICS asset that, if compromised and exploded, would lead to environmental damage, and safety issues, and lawsuits. It could be anything related to your brand reputation, or one here that most people don’t think of, theft of intellectual property. A lot of interesting information about your proprietary manufacturing processes is stored on the OT side, not just on the IT side. And then there are compliance violations, the example being a petrochemical facility that gets blown up, or a pharmaceutical facility that’s using dangerous chemicals that then get leaked into the environment.
So obviously, identifying these processes is something that will require conversations with the business, and with the OT team and the IT team. Some examples are here at the bottom, including safety systems. The second part of the methodology is, map your digital terrain. Here, CyberX can help with an automated discovery of all your assets, and automated mapping of your network topology. That’s what’s showing up at the right. At the bottom left, you will see information about all the devices we have discovered, such as open ports, and then what we often find with our clients is they discover that there are all kinds of ways into this environment that they didn’t know about. Wireless access points, VPNs that were set up. In the middle there, there’s a picture of a cable modem. One of our clients actually found the cable modem that had been installed in the environment to facilitate remote management and maintenance by the vendor of certain equipment.
So there are all kinds of ways into these environments, and in the past, nobody really worried about them, because you assumed that your suppliers and your contractors were trusted. But we now know that malicious threat actors will compromise third parties like your maintenance vendors, and steal their credentials so they can get into your environment. They appear to be trusted employees, but they are really someone who stole trusted credentials.
The third step in the methodology is entitled, “Illuminate your most likely attack paths.” I found this interesting tweet here from The Grugq, one of my favorite guys on Twitter. Of course, not everybody has the same threat models, but there are many ways to do this. I’m sure most of you are already doing it with tabletop exercises and pen testers. Then there is what we introduced about a year and a half ago, which we call automated ICS threat modeling, in which we not only map the topology, as I have shown, and identify vulnerabilities such as weak credentials, unauthorized connections, and connections to the Internet, but also use that information with analytics to calculate the most likely attack paths to your critical assets. This is an example here from our console, where the person using the product has chosen a PLC down at level one, right-clicked, and then pressed the button called “Simulate attack vectors.”
Then the system goes through and says, “What are all the paths an attacker would use to get to this critical asset? Let’s rank them by risk.” That’s what we’re going to show you on this slide, where you pick your most critical asset, either the way I showed you, or in this view. The system comes back and says, “I found three attack vectors, ranked by risk.” Then it draws a visualization of that attack path. In this case, one of the subnets had a connection exposed to the Internet, which allowed the attacker to get in, then exploit a number of Windows vulnerabilities in the middle, and then finally exploit a vulnerability in the PLC itself to compromise it. Although we now know that most PLCs have vulnerabilities that rarely get patched, and many of them have weak authentication, so it wouldn’t necessarily require exploiting a vulnerability.
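To make the idea of ranking attack paths concrete, here is a minimal sketch in Python. It is not how the CyberX product works. The topology, the node names, and the per-hop success probabilities are all made-up assumptions, and scoring a path by multiplying hop probabilities is just one simple way to rank.

```python
from math import prod

# Hypothetical topology (an assumption for illustration only): each edge is
# (next_hop, assumed probability that an attacker succeeds at that hop).
EDGES = {
    "internet": [("dmz_server", 0.6), ("vendor_vpn", 0.3)],
    "dmz_server": [("eng_workstation", 0.5), ("historian", 0.4)],
    "vendor_vpn": [("eng_workstation", 0.7)],
    "historian": [("eng_workstation", 0.5)],
    "eng_workstation": [("plc", 0.8)],
    "plc": [],
}


def attack_paths(graph, source, target):
    """Return (score, path) for every simple path, highest score first."""
    results = []

    def walk(node, path, probs):
        if node == target:
            # Score a path as the product of its per-hop probabilities.
            results.append((prod(probs), path))
            return
        for nxt, p in graph.get(node, []):
            if nxt not in path:  # never revisit a node, so no cycles
                walk(nxt, path + [nxt], probs + [p])

    walk(source, [source], [])
    return sorted(results, key=lambda r: r[0], reverse=True)


ranked = attack_paths(EDGES, "internet", "plc")
for score, path in ranked:
    print(f"{score:.3f}  " + " -> ".join(path))
```

With these assumed numbers, the sketch finds three paths to the PLC, and the direct route through the DMZ server and the engineering workstation ranks highest.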
You can then go back and simulate what-if scenarios for mitigating or remediating this attack vector. That might involve better zoning or segmentation. If you can patch some of these intermediate systems, you would certainly want to do that, and you can then look and see whether this eliminates that attack vector. You can see which attack vectors then remain, and then decide if that’s a level of risk you are willing to live with. That’s what we call automated threat modeling. The fourth and final aspect of the methodology that Andy describes in his Harvard Business Review article is called, “Find options for mitigation and protection.” The first thing he talks about is reducing the number of digital pathways to a minimum. Now, we’ve talked about some of those: unauthorized connections, segmentation, people with remote access who shouldn’t have it, or remote access that’s not being properly managed. There are a number of very well-known and very mature privileged identity management and secure remote access solutions out there. One of them, for example, is CyberArk, and we’ve integrated the CyberX platform with the CyberArk platform so that we can immediately identify both authorized and unauthorized remote access sessions.
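The what-if simulation mentioned above can be sketched in a few lines of Python: count the paths from the Internet to a critical asset, remove the links a mitigation would close, and count again. This is only an illustration under stated assumptions. The link list and node names are hypothetical, and a real tool would weigh vulnerabilities and credentials, not just connectivity.

```python
# Hypothetical directed connections (source, destination); an assumption for
# illustration, not taken from any real network.
LINKS = {
    ("internet", "dmz_server"),
    ("internet", "cable_modem"),
    ("dmz_server", "eng_workstation"),
    ("cable_modem", "hmi"),
    ("eng_workstation", "plc"),
    ("hmi", "plc"),
}


def count_paths(links, source, target, seen=frozenset()):
    """Count simple (cycle-free) paths from source to target."""
    if source == target:
        return 1
    seen = seen | {source}
    return sum(
        count_paths(links, dst, target, seen)
        for src, dst in links
        if src == source and dst not in seen
    )


def what_if(links, removed):
    """Return (paths before, paths after) removing the given links."""
    before = count_paths(links, "internet", "plc")
    after = count_paths(links - set(removed), "internet", "plc")
    return before, after


# Mitigation 1: disconnect the unmanaged cable modem from the Internet.
print(what_if(LINKS, [("internet", "cable_modem")]))
# Mitigation 2: also segment the engineering workstation from the DMZ.
print(what_if(LINKS, [("internet", "cable_modem"),
                      ("dmz_server", "eng_workstation")]))
```

In this toy network the first mitigation leaves one path to the PLC, and the second removes the last one. Whether the remaining paths are an acceptable level of risk is the judgment call described in the talk.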
These platforms do a lot more than just act as jump servers. They can also make sure you’re not reusing passwords. They can record the session, so you can look at what your contractors did when they were actually connected to your OT network. Obviously there’s an audit trail, and various other things that are really important for securing remote access. Obviously, one of the ways of reducing the number of digital pathways is addressing vulnerabilities. Then finally, implementing compensating controls. We know you can’t patch everything, so you try to minimize the number of digital pathways to your most critical assets. There will still be some that remain, and a determined attacker will eventually find his or her way to those pathways, so you want to put in compensating controls such as continuous monitoring to immediately detect when they have compromised your infrastructure.
I’m going to show you an example of that in a few slides, when I talk about the TRITON attack. Then the other thing that we’ve done is integrate with firewall infrastructures, so you can rapidly block sources of malicious traffic. And so, here are some examples of alerts. You can see up at the top left, scanned device was detected. As we saw in the honeypot example, it’s the first thing the bad guys do. They start scanning the network, looking for information about what devices are there, how are they configured? What ports do they have available that they can use to compromise them? We’ll see in a few slides how the TRITON attack relied on an update of PLC code, ladder logic code, to insert the backdoor into the PLC. That would also be detected.
Of course, it could be a legitimate update, but these things rarely happen, and you have to have a workflow so that when this type of update to the PLC code is detected, you can check with the appropriate OT engineer to make sure it was a legitimate update. Some other things up there. If someone is sending stop requests to your PLC, that should rarely happen. Again, something that you should have a workflow for in your SOC, to be able to detect. Then threat hunting is all about going back in time. This is the timeline view of alerts and other notifications that you would see in our platform, so you can go back and see what else happened around the time an incident happened. This will help you investigate, and it will also help you do threat hunting. Of course, there is a full data mining interface, so you can query past traffic for very specific events or incidents based on MAC address, IP address, or commands that were used, so that you can go back and investigate the incident.
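As a rough illustration of the scanning alert described above, the sketch below flags any source that touches many distinct destination ports within a short window. It is a deliberately simple heuristic, not the behavioral analytics of any product. The flow records, the 60-second fixed window, and the threshold of 10 are all assumptions.

```python
from collections import defaultdict


def detect_scanners(flows, window_s=60, threshold=10):
    """Flag sources that hit at least `threshold` distinct (host, port)
    targets inside one fixed, non-overlapping time window.

    flows: iterable of (timestamp_s, src_ip, dst_ip, dst_port).
    """
    targets = defaultdict(set)
    for ts, src, dst, port in flows:
        bucket = int(ts // window_s)  # which window this flow falls in
        targets[(src, bucket)].add((dst, port))
    return {src for (src, _), seen in targets.items() if len(seen) >= threshold}


# Hypothetical traffic: one host probing 12 ports on a PLC in 12 seconds,
# and an HMI polling the same PLC on Modbus/TCP port 502 every 5 seconds.
flows = [(t, "10.0.0.5", "192.168.1.10", 20 + t) for t in range(12)]
flows += [(t, "192.168.1.20", "192.168.1.10", 502) for t in range(0, 60, 5)]

print(detect_scanners(flows))
```

The HMI generates plenty of traffic but only ever talks to one port, so only the probing host is flagged. An alert like this would still feed a workflow, since a scan could also come from an authorized assessment.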
Let me tell you a bit about the firewall integration that we’ve done with Palo Alto Networks. This is part of our shipping product today, and it’s being used by real-world customers. Obviously, no one wants to automatically change a firewall rule without having a human in the loop. The issue becomes, how do you quickly change a firewall policy with a human in the loop, when malicious traffic is detected? So the way it works in the workflow we’ve designed, working with Palo Alto Networks, is that CyberX sends an alert to the SIEM, of course, and at the same time creates a new policy for the Palo Alto next-generation firewall. So, this policy has all of the relevant information about the IP address of the host that we need to block, or the port, or the protocol that is being used. The human, the firewall administrator, then approves that policy and pushes it, either directly to firewalls or, in the case of Palo Alto Networks, to Panorama, their centralized management interface. It’s the enterprise interface to all of the firewalls.
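The human-in-the-loop workflow just described can be sketched as a small state machine: an alert produces a proposed policy, an administrator approves it, and only then can it be pushed. Everything here is hypothetical. The class, the function names, and the in-memory rule list are assumptions for illustration, and nothing below uses or represents the real Palo Alto Networks or CyberX APIs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    PUSHED = "pushed"


@dataclass
class BlockPolicy:
    """A proposed firewall rule built from an alert (hypothetical shape)."""
    source_ip: str
    port: int
    protocol: str
    status: Status = Status.PROPOSED
    approved_by: Optional[str] = None


def propose_from_alert(alert):
    """Pre-fill a policy from the alert so the admin only has to review it."""
    return BlockPolicy(alert["src_ip"], alert["port"], alert["protocol"])


def approve(policy, admin):
    """Record the human decision."""
    policy.status = Status.APPROVED
    policy.approved_by = admin


def push(policy, firewall_rules):
    """Refuse to change the firewall unless a human approved the policy."""
    if policy.status is not Status.APPROVED:
        raise PermissionError("policy must be approved before it is pushed")
    firewall_rules.append((policy.source_ip, policy.port, policy.protocol))
    policy.status = Status.PUSHED


rules = []  # stands in for the firewall or its management console
policy = propose_from_alert({"src_ip": "10.0.0.5", "port": 502, "protocol": "tcp"})
approve(policy, admin="fw-admin")
push(policy, rules)
print(rules)
```

The design point is that the slow part, gathering the IP address, port, and protocol, is automated, while the decision itself stays with a person.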
That’s the way to block the traffic very rapidly. Let me just wrap up with a quick summary of CyberX. CyberX was founded by military Blue Team cyber experts who defended critical infrastructure in Israel, and have that expertise defending against nation state threats. Our headquarters are in Boston. Our R&D and threat intelligence teams are in Israel. I’m in Boston today, as I speak to you. There are three key use cases for the platform, which was purpose-built for OT: asset management; vulnerability and risk management, which includes the threat vectors I was talking about before; and continuous monitoring. It’s noninvasive, it’s agentless. It uses patented behavioral analytics and self-learning to very quickly learn your environment without requiring any configuration of rules or signatures, and it integrates with your SOC workflows and your security stack.
Because as we’ve seen, attackers will often go into IT and then move to OT, so you need unified monitoring across IT and OT. We do that by integrating with your existing SIEMs and other workflows. We’ve partnered with all of the major security companies and MSSPs worldwide to simplify your workflows, but also to offer MSSP services to supplement your own internal teams, if that’s required. Our threat intelligence team has discovered a number of zero-day vulnerabilities in ICS devices over the years, going back to, you can see here, 2015. We work very closely with the automation vendors to validate the vulnerability and make sure they have developed a patch that they can then issue to their customers before the vulnerability gets released on the ICS-CERT website. We’ve worked with all of the major OT automation manufacturers for that.
We offer a multi-tier architecture with a centralized management interface to give you a unified risk view across your entire environment, to all of your facilities worldwide. As I explained, it integrates with the systems you already have in your corporate SOC. I’ve talked about the SIEMs, and we also integrate with ServiceNow, for example, for ticketing. I just want to wrap up with this TRITON example. TRITON was discovered by Mandiant, a unit of FireEye, they’re very well known, when they responded to one of their clients in the Middle East. We later found out it was in Saudi Arabia, and earlier this year, there was a New York Times article that revealed some new information about this. It made it sound very much, although it was circumstantial, like the victim was a joint venture between Dow Chemical, a US company, and Saudi Aramco, the largest oil company in the world.
We don’t really know what the goal was. We believe it was to disable the plant safety systems, that it was connected to the Shamoon attacks, which were launched by Iranians on Saudi Arabia in the past, and that even though it was Iran, it displayed a high level of sophistication. Some experts theorized that they got some help from Russia or North Korea. If you look at the way the attack unfolded, looking here at the Purdue model view, we don’t know exactly how they got in, originally. We’re going to guess that they stole OT credentials, either from a contractor or from an employee, and then used those credentials to move through the firewall, into the OT environment, where they deployed malware on one of the Windows-based machines. We’re guessing here it was an engineering workstation, but it could have been an HMI or some other Windows-based machine.
It was Python-based malware that was compiled to run as a Windows executable. This part, we know for sure. Our threat intelligence team has reverse engineered that malware, and what we’ve found is that it used the native protocol of the safety PLC, called TriStation, to upload new ladder logic code into that device, and then insert a backdoor into the PLC which resided in the firmware of the PLC. So, very sophisticated. They had to know the memory layout of that device, they had to know the exact firmware revision number, and they had to use the TriStation protocol in a way that would allow them to communicate with their backdoor, but without breaking anything that was going on in the environment. We theorized, as I said, that their goal was to actually disable the safety PLC, then launch another attack that would cause temperature or pressure to rise above normal thresholds, which would then lead to massive safety and environmental damage.
So, here’s the way continuous monitoring would help against this type of attack. Number one, we detect that remote access connection. Again, it could be legitimate, but there would be a workflow in place to determine whether it is legitimate or not. Number two, we detect the scanning that inevitably occurred once they established a foothold on that Windows-based machine, to look around and see what devices were installed in the environment. Number three, we would detect the update of the PLC ladder logic code that contained the payload to install the backdoor in the memory of the firmware. Then finally, they used the actual TriStation protocol, but with parameters that were undefined by the vendor. So, these were parameters that allowed them to communicate to the backdoor whether they wanted to do a read or a write, for example, but they are parameters that are really illegal according to the way the vendor defined the protocol.
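The last detection, catching protocol messages that use values the vendor never defined, comes down to an allowlist check. The sketch below shows the idea on an imaginary protocol. The function codes and message layout are invented for illustration and are NOT real TriStation values, and a real monitor would parse the full protocol rather than one byte.

```python
# Hypothetical function codes for an imaginary controller protocol.
# These are assumptions for illustration, NOT real TriStation values.
DEFINED_FUNCTION_CODES = {
    0x01: "read_status",
    0x02: "download_program",
    0x03: "run",
    0x04: "halt",
}


def classify(message: bytes) -> str:
    """Check the first byte of a message against the vendor-defined codes."""
    if not message:
        return "alert: empty message"
    code = message[0]
    name = DEFINED_FUNCTION_CODES.get(code)
    if name is None:
        # Well-formed traffic on the right protocol, but a code the vendor
        # never defined: exactly the kind of anomaly described above.
        return f"alert: undefined function code 0x{code:02x}"
    return f"ok: {name}"


print(classify(bytes([0x02, 0x10])))
print(classify(bytes([0x7F, 0x00])))
```

A defined but rare operation, such as a program download, would pass this check and would need its own alert and review workflow, as discussed earlier for PLC code updates.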
We would detect that, as well as pop up that little button at the bottom called “block source” that would automatically create a firewall policy to block the source of that traffic. Wrapping it all up, I want to direct you to our knowledge base, where you’ll find information on our vulnerability research, transcripts from past webinars, and the global ICS and IoT risk report, and to encourage you to come visit us at some of these upcoming events. There are two that I want to point out. One is the Palo Alto Networks Ignite Europe conference, happening next week in Amsterdam. It’s going to feature a joint session with one of our customers, the CSO of a leading manufacturer who has implemented our integration with Palo Alto Networks, as well as with a number of SIEMs including QRadar. Then there’s the ICS Cybersecurity Conference in Atlanta later in the month, where CyberX and Palo Alto Networks are sponsoring a free half-day, hands-on workshop on ICS security on Monday, the first day of the conference.
We’re also doing a joint session with Emerson Automation Solutions that will go through the process by which we worked with them collaboratively to disclose a vulnerability in one of their devices. We’re calling it building mutual trust. I want to thank you for your time today. I’m going to take a quick look at the questions, we only have a few minutes left. Okay, Andy Ginter has a question. Hi, Andy. He says, “Analog backstops may prevent some kinds of equipment damage,” because Andy has talked about using analog backstops like real switches or humans. “What do we do when a single unplanned shutdown is consequential enough that a business needs to eliminate the possibility of the event?” Andy, I’m not sure if you got that question. Do you want to take a shot at it?
I heard you, but I’m not sure I understand it. I’m familiar and I’m a fan of Andy Ginter, but would you repeat it again one more time?
Well, maybe rather than try to decipher the question, talk about analog backstops, and your thoughts on analog backstops, because I know you’ve generated a bit of controversy around those.
Yeah, just the idea that … My dog is beginning to protest here. The idea is that if you can have analog in the mix, at some point it’s going to be not visible to the adversary. They won’t know that that was something that they had to go to school on, that’s something resident in or near their target. It can be a very nice, quiet way. In most cases, as I think I said, it doesn’t have to be expensive. Some of the mitigations that were put in place in our first pilot were surprisingly inexpensive, and didn’t affect operations at all. I think maybe Mr. Ginter and I will hash that out and make sure I completely understand, more clearly, more fully understand his question offline.
Okay, thanks. We’ve got one minute left, but I’m dying to ask you a question about, I think you made the point about cyber hygiene. I think some people might have misinterpreted some of your comments earlier this year as saying cyber hygiene is not important. Do you want to just repeat your thoughts on why cyber hygiene is simply insufficient on its own?
Sure, sure. It is so hard to critique it. How can you critique something without some people taking it to the next level, taking it to the logical extreme and saying, “Andy says that’s stupid, that’s unnecessary, it’s a waste of time”? That’s not it at all. The key word is what comes after that phrase: it’s a critique of cyber hygiene-only strategies for defending the parts of an operation that cannot be allowed to fail. You cannot allow those to be touched by an external actor, but you can imagine what would happen, what would befall the large, mid-sized and small companies, and their government counterparts in the US and elsewhere if they were to simply say, “You know what? Since somebody, some expert said they can get through my hygiene defenses, I’m just not even going to bother worrying about it at all.”
Can you imagine how quickly all your systems in all of your departments would stop functioning in a semi-productive way, both on the IT side and on the OT side? I think I’ll just have to use the power of repetition, some mea culpas for when I’m not communicating clearly, or I’m forgetting to reemphasize that point. You’ve got to have the best hygiene you can afford, the best hygiene that you can educate your folks on producing for your organization, day in and day out. I think products like CyberX, the new cohort of interesting OT security companies are all pushing the bar forward, and I’m excited watching them, and partnering, and rooting for them, and everybody else that’s trying to keep themselves secure, all at the same time trying to advance the whole community, if we can.
That’s great. Andy, thank you so much for your time today. Thank you to all the attendees. The slides will be available, the archived recording will be available both on the SANS website and on our website. One of the other things we do is provide a transcript of the presentation, so if you would rather just scan the transcript than listen to the whole audio, you can do that. Once again, Carol, thanks for your help. Andy, thanks for your help. Have a great day, everyone. Take care.
Thanks, Carol. Thanks, Phil. Thanks, everybody, for listening.
All right, and thank you so much, Andy and Phil, for your great presentation, and to CyberX for sponsoring this webcast, which helps bring this content to the SANS community. To our audience, we greatly appreciate you listening in. For a schedule of all upcoming and archived SANS webcasts, including this one, please visit sans.org/webcasts. Until next time, take care, and we hope to have you back again for the next SANS webcast.