The Future of SecOps Is Automation and AI

In this episode of Petri Dish, Petri’s Editorial Director talks to Ram Vaidyanathan, Product Marketer and security expert at ManageEngine, about automation in security operations (SecOps), including machine learning and AI, and how they will affect the security landscape and the products and services companies use to protect data and IT systems.

🔗 Links and resources

ManageEngine Log360
ManageEngine Expert docs

Transcript

Hello everybody and welcome to this edition of Petri Dish. And I’m happy to say that we’ve got Ram Vaidyanathan with us today, who’s a Product Marketer at ManageEngine. And we’re going to be talking to all you security managers out there, people working in security operations as well, about the future of security operations and automations, all those good things connected to machine learning, artificial intelligence, and how they’re going to be affecting the security operations landscape. And of course the products and services that are built for those functions.

Thank you, Ram, for joining us today.

Thank you so much for having me here. Thank you.

Thank you. So let’s get straight into it. When we talk about security operations and automation, what exactly do you mean by automation?

Yeah, definitely. I mean, the security team, the SOC team, they are the nerve center of any organization. And of course, in this day and age, attackers are getting more and more sophisticated. So it takes a lot out of the security team. It’s pretty much a 24/7 job. And a lot of the things that they do day to day are not really automated. There’s a lot of manual work involved. There’s a lot of repetition involved. So wherever there is scope to automate, wherever there is scope to use machine learning or what they call security orchestration, automation and response, we should be able to do that. So anything that makes a security analyst’s job easier when it comes to detection, investigation or response, automation has a part to play there.

This is really analyzing huge sets of data. Is that what we’re talking about, trying to find patterns and detect anomalies in that data? Is that what you mean?

So, yes, that’s part of it. It all kind of boils down to analyzing data when it comes to security, because there are a lot of events that happen in any network. And at the end of the day, you want to make sense of it all. If you’re able to automate that using machine learning, for example, that’s what you are alluding to when you talk about looking at different kinds of patterns. So you can actually have a machine learning system look at what is happening in your network over time.

So it takes about two weeks for any machine learning system to learn what’s happening, set a baseline for every user and device in the network, and figure out what is normal and what’s not normal for every user and device.

And once it has that baseline set, it will actually look for any deviations from the normal. So in case you have a user who usually does not perform a particular set of actions and suddenly you have that same user performing that set of actions, that’s an anomaly. So that’s, again, automation, right? Because here you have the system using machine learning to figure out that that is not normal. And it is able to alert the security analysts that this is going on. The security analysts can then come in, take a look and see if this is okay or not. So this is part of it. Yes.

Okay. So some kind of behavior analytics, essentially.

That is absolutely right. And like I was saying, in terms of behavior analytics, you’ve got to have a baseline down. So you need to know what is normal for a particular user or device. And you should be able to continuously track what that user or device is doing over time. And if an anomaly in the behavior is noticed, then you’d want to highlight that to the security analyst. You can also bring in risk scoring here. So depending upon the number of deviations that a user or, for that matter, a device exhibits, you can give a risk score to the user or device.

So obviously we have things like security information and event management systems, the big famous ones out there like, I don’t know, Splunk and Sentinel and many others, I guess, that viewers are probably going to be familiar with. But I’m hearing a lot recently about SOAR. So that’s S-O-A-R: security orchestration, automation and response.

So how does that fit into these, if you can call them legacy, SIEM systems that we might already have in place?

Yes. So SIEMs have been around since 2005. So that’s close to 18 years ago that they were first seen. Nowadays, what we have is SIEM, UEBA (that is, user and entity behavior analytics, or anomaly detection) and SOAR all coming together. There was a time, maybe six or seven years ago, when these three kinds of solutions operated separately. So you would have the traditional SIEM solutions, which would help with log management.

And then you’d have UEBA, which did your anomaly detection. And then you’d have SOAR, which would automate wherever possible. Now, a new-age SIEM or a next-gen SIEM would have to bring all three together. So when it comes to detection, you’d necessarily have to have a system in place that incorporates the capabilities of a SIEM, UEBA, as well as SOAR. That is what is going to make life a lot simpler for security analysts.

So for somebody like me, who’s a bit of a layman in this area, what does the landscape look like at the moment? You said that these three pillars need to be brought together, essentially. Is that currently the case with the more well-known SIEM solutions, or not really at this stage?

That is absolutely right. So a lot of SIEM solutions do that, including ManageEngine’s. We make a SIEM solution called Log360, which does exactly that. So we have the traditional SIEM, which is the log management part, where you’re able to archive logs and manage your logs for everything that happens in your network. You’re able to investigate, you’re able to search through logs, you’re able to do all of that.

But apart from that, like we were talking about earlier, you have the UEBA bit, that is user and entity behavior analytics or anomaly detection, which is able to set up a baseline and look for anomalies. And then of course you have SOAR, that is security orchestration, automation, and response, which is able to automate your threat detection, investigation, and response wherever possible. So with SOAR, for example, you’re looking at more and more integrations.

So, you have your SIEM solution like Log360, which can integrate with your threat intelligence feeds, which could come from a third-party provider. So you bring all of that threat intelligence information in, and suddenly you have more context as a security analyst. You’ll be able to put two and two together and see the bigger picture: these are the threats that are out there, as identified by your threat intelligence feed, and this is what is happening within your network, and you can put the two together.
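To make the idea of putting feed data and network data together more concrete, here is a minimal sketch in Python. It is not how Log360 or any specific product does it. The `LogEvent` type, the feed names, and the IP addresses are all hypothetical, and real feeds carry far more than a list of IPs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """A simplified network log record (hypothetical schema)."""
    source_ip: str
    destination_ip: str
    action: str


def merge_feeds(feeds):
    """Combine several feeds into one map: IP -> set of feed names listing it.

    `feeds` is a dict of {feed_name: iterable of known-bad IPs}.
    """
    merged = {}
    for name, bad_ips in feeds.items():
        for ip in bad_ips:
            merged.setdefault(ip, set()).add(name)
    return merged


def enrich(events, indicators):
    """Return (event, feed_names) pairs for events that touch a listed IP."""
    hits = []
    for event in events:
        feeds = (indicators.get(event.source_ip, set())
                 | indicators.get(event.destination_ip, set()))
        if feeds:
            hits.append((event, sorted(feeds)))
    return hits


# Hypothetical data: two feeds and two events seen in the network.
feeds = {
    "feed_a": ["203.0.113.7"],
    "feed_b": ["203.0.113.7", "198.51.100.9"],
}
events = [
    LogEvent("10.0.0.5", "203.0.113.7", "outbound connection"),
    LogEvent("10.0.0.6", "10.0.0.7", "file share access"),
]
for event, feed_names in enrich(events, merge_feeds(feeds)):
    print(event.action, "matched feeds:", feed_names)
```

Only the first event matches, and the analyst sees which feeds flagged the address, which is the extra context described above.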

Likewise, you can also integrate your SIEM solution with third-party ticketing tools or help desk solutions. Let’s say that an incident is recognized by your SIEM: you will be able to send that information over to your ticketing tool, and you can take it to its natural conclusion or closure through your ticketing tool. So that’s how the integration with the ticketing tool would work. And of course, there’s one more thing which is very, very important.

It’s about how you respond to something. So let’s say a threat has been identified, how do you respond to it? And of course, parts of it can be automated as well. So let’s say that there’s a particular threat, and you want to take a machine that is associated with that threat off the network. You can do that automatically, because if you think about it, you don’t want to be waiting until you receive an alert, and then you come back, you investigate, and then you take corrective action. As soon as something wrong is noticed, you want to take action immediately, and you want to rely on automation to do that.

So, you can set up workflow rules, where you just say or you program your SIEM solution to look for certain conditions. And as soon as those conditions are met, it will go ahead and do certain actions in terms of response. So you can take out the user, you know, right off the network, or you can take a machine off the network, you can disable the user, and so on.
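A workflow rule of this kind can be pictured as a condition paired with a list of response actions. The sketch below is only an illustration in Python. The rule name, the alert fields, and the `disable_user` and `isolate_host` functions are hypothetical placeholders that return a description instead of touching a real directory service or network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WorkflowRule:
    """A condition to look for and the actions to run when it is met."""
    name: str
    condition: Callable[[Dict], bool]
    actions: List[Callable[[Dict], str]] = field(default_factory=list)


def disable_user(alert):
    # Placeholder: a real action would call a directory service API.
    return "disable user " + alert["user"]


def isolate_host(alert):
    # Placeholder: a real action would call a firewall or endpoint agent.
    return "isolate host " + alert["host"]


def run_rules(alert, rules):
    """Run every action of every rule whose condition matches the alert."""
    performed = []
    for rule in rules:
        if rule.condition(alert):
            performed.extend(action(alert) for action in rule.actions)
    return performed


rules = [
    WorkflowRule(
        name="high-risk ransomware alert",
        condition=lambda a: a["category"] == "ransomware" and a["risk"] >= 80,
        actions=[disable_user, isolate_host],
    ),
]

alert = {"category": "ransomware", "risk": 90, "user": "user_a", "host": "pc-17"}
print(run_rules(alert, rules))
```

Keeping conditions and actions as plain functions means a new response can be added without changing the rule engine itself.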

Okay, so how does that work exactly? Is there some kind of an agent that you install on the endpoints that’s required to provide that kind of isolation? Or how is that set up exactly?

So there are different ways of achieving this, actually. You have the agent-based way or the agentless way, and which one you use depends on what you really want to achieve. But usually, there are also APIs that are used. So let’s say that your SIEM solution integrates with a third-party solution in order to help you with an automated response. You can actually make use of APIs, which give you two-way communication.

So let’s say, for example, you want to change a firewall rule as soon as particular conditions are satisfied within your network. You can get that done as well. You can go ahead and program your SIEM solution, and whenever a particular condition is satisfied, it will go ahead and change that firewall rule.

Right. Got it. So I mean, if I’m right in thinking this, there’s a lot of this kind of machine learning, AI stuff that’s now coming into security operations. A lot of this is being enabled by the cloud, I suppose, and being able to process bigger and bigger amounts of data, and the extra processing power required to do the machine learning aspect of it. Am I correct in thinking that? Is a solution like Log360, for instance, dependent on the cloud, or can it run just on premises, or can it be a bit of both? How would that be set up?

Yeah, yeah, that’s a great question. Now, when it comes to the cloud, there are two aspects to it. One is that a lot of organizations are moving to the cloud. So the first thing is your SIEM solution should be able to monitor things that are happening in the cloud.

And secondly, another trend that we are noticing is that organizations also want their SIEM solutions to be on the cloud. That is, they want a SaaS-based SIEM solution. So Log360 is available in two versions: you have the on-prem version as well as the cloud version, which is called Log360 Cloud. And with Log360 Cloud, you’d be able to do pretty much everything that you do on-premises, and you’d also be able to monitor the cloud itself. Like I was saying earlier, that becomes more and more important.

Right. Okay. Can you tell us a little bit more, I think you touched on it, about risk scoring? What exactly does that mean?

Right. So I was saying that in the context of anomaly detection. So let’s say that you have a user, user A, and user A is known to do a certain set of activities. What is going to happen is that maybe within two or three weeks, the UEBA solution or the UEBA component of a SIEM solution will be able to set a baseline for user A. So let’s say three weeks have elapsed. Now user A performs a set of activities. A comparison is going to be made: is this set of activities okay? Is it actually expected from user A? In case it’s not, the risk score of user A is going to be increased.

I’ll give you an example. Let’s say that a user is known to log on between 9 a.m. and 10 a.m. every day. And this has been noticed over several weeks, so this is set as the user’s baseline. Now one fine day, this same user account shows a logon at 10 p.m. Right? So that’s kind of late at night. So this seems to be an anomaly.

And your SIEM solution powered by UEBA will be able to recognize this and adjust the risk score of this user upward. So that’s how risk scoring works. What I was telling you about right now is what is called a time-based anomaly, because we are actually looking at times, logon times, for example.
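The logon example can be sketched in a few lines of Python. This is a deliberately simple stand-in for what a UEBA engine does, not a real product algorithm: the baseline is just the range of hours seen in past logons plus a margin, and the risk score step of 10 and cap of 100 are made-up values.

```python
from datetime import datetime


def learn_logon_window(past_logons, margin_hours=1):
    """Derive an allowed hour range from observed logon times."""
    hours = [t.hour for t in past_logons]
    return min(hours) - margin_hours, max(hours) + margin_hours


def is_time_anomaly(logon, window):
    """True if the logon hour falls outside the learned window."""
    low, high = window
    return not (low <= logon.hour <= high)


def update_risk(score, anomalous, step=10, cap=100):
    """Raise the risk score by `step` for an anomaly, up to `cap`."""
    return min(cap, score + step) if anomalous else score


# Hypothetical baseline: logons between 9 a.m. and 10 a.m.
baseline = [
    datetime(2023, 5, 1, 9, 5),
    datetime(2023, 5, 2, 9, 30),
    datetime(2023, 5, 3, 9, 50),
]
window = learn_logon_window(baseline)

late_logon = datetime(2023, 5, 4, 22, 0)  # 10 p.m.
risk = update_risk(0, is_time_anomaly(late_logon, window))
print("risk score after late logon:", risk)
```

A production system would track far more than the hour of day, but the shape is the same: learn what is normal, compare each new event, and move the score when the event does not fit.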

Just like time-based anomalies, you can also have count-based anomalies and pattern-based anomalies.

So with count, what happens is the number of activities that are done on a particular server, for example, is looked at. So between 11 a.m. and 12 p.m., maybe 100 activities are being done on a particular file server. And that is absolutely normal. But let’s say that on a particular day, you have a thousand activities being performed on that same file server. Now that is abnormal. That is a count anomaly. Then of course you have patterns, where we look at sequences, and we see if that particular sequence is okay or not, if it goes with the established profile or not. That becomes a pattern anomaly.

So can you set those thresholds that might trigger an alert? Are they adaptive? Or how do you set those thresholds? Or is that automatically determined by the machine learning system?

So there are two ways of doing it again. With machine learning, you want to reduce the amount of human intervention. So what you would ideally want to do is let the machine learn on its own.

So basically, it’s dynamic in nature. What would happen is it would look at what is normal for a particular user. Over time, with experience, it will learn what the threshold should be for that particular user. Of course, based on the algorithm used, there could be a little bit of give here and there. I mean, you don’t want to set it exactly as a fixed threshold, because then it is not machine learning anymore. So it would be adaptive, of course. So let’s say that over time, a particular threshold has been established. And then let’s say that suddenly, over the next two weeks, the same user is exhibiting certain kinds of behavior that go against that baseline. Over time, the system will learn that, okay, now we have to adjust this user’s baseline. So it learns with time and the threshold keeps getting revised.
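One simple way to picture a threshold that keeps getting revised is a rolling window: the limit is the recent average plus a few standard deviations, and it moves as new counts arrive. The Python sketch below is an assumption made for illustration (the class name, the 14-sample window, and the factor of 3 are all invented), and real UEBA products may use quite different models.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveCountBaseline:
    """Flags counts far above the recent norm and keeps learning."""

    def __init__(self, window=14, k=3.0, min_samples=5):
        self.history = deque(maxlen=window)  # only the most recent counts
        self.k = k
        self.min_samples = min_samples

    def threshold(self):
        """Current limit, or None while there is too little history."""
        if len(self.history) < self.min_samples:
            return None
        return mean(self.history) + self.k * pstdev(self.history)

    def observe(self, count):
        """Return True if `count` is anomalous, then add it to the history."""
        limit = self.threshold()
        anomalous = limit is not None and count > limit
        # Anomalous counts are kept too, so sustained new behavior
        # gradually becomes the new baseline.
        self.history.append(count)
        return anomalous


baseline = AdaptiveCountBaseline()
for hourly_count in [100, 95, 105, 98, 102]:
    baseline.observe(hourly_count)

print("1000 activities anomalous?", baseline.observe(1000))
```

Because old samples fall out of the window, a user whose activity level changes for good stops being flagged after a while, which mirrors the baseline adjustment described above. The trade-off is that an attacker who ramps up slowly can also shift the baseline, so the window size and factor need care.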

Got it. So obviously, we’ve been focused on a particular user or entity and how changes in patterns or behaviors might change its particular risk score. But I assume that we could also get an overall picture of the organization. And that’s something that we could report to the chief information security officer, so that we get a more high-level overview of the security posture and risk. Would I be right in thinking that as well?

Right. This is something that a lot of CISOs have told me, that risk is the biggest metric, or rather the most important metric, for them. So at the end of every quarter, let’s say the CISO goes to the board. The board is always going to ask them about the risk posture, or some way to quantify the risk. So it would always be good for the CISO, and also the security team as a whole, to look at the risk of each user. And over time, they want to use that as a measure to look at the overall risk of the company. Because the users make up the company, that is a very good way of doing it. And of course, you can also use certain kinds of frameworks to go ahead and reduce the risk. One of the best frameworks used to reduce your cyber risk is the NIST framework.

Okay. You talked about SOAR and the importance of integrations with ticketing software and that kind of thing. Could you give me a few examples of the kind of integrations that Log360 provides with other commonly used pieces of software to enable some of this automation and response to happen outside of the SIEM system itself?

Right. Yeah. So I was telling you earlier about threat intelligence and how it’s so very important for threat detection. So Log360, for example, integrates with Webroot’s BrightCloud Threat Intelligence, where we are able to bring in all of that information and then put it alongside the information we collect in the network itself. The two can come together, and the security analyst gets more context to make a proper decision on whether a particular activity is okay or not. So this is one sort of integration, the integration with threat intelligence feed providers.

You get to see things like the reputation score. You get to see things like the category of threat, the source IP address, the destination IP address, and so on. So that’s a very, very important integration to have. And this is a point that I make to a lot of people. I tell them that whenever you want to integrate with threat intelligence feeds, go with at least three different threat feeds, because what one threat feed gives you, the other will not. The rule of thumb usually is three different threat feeds that you want to integrate with.

The other integration that’s very, very important is the one with your ticketing tool, your help desk software. Log360, for example, integrates with BMC Remedy and Atlassian Jira. Of course, ManageEngine has its own help desk ticketing solution called ServiceDesk Plus that we integrate with as well. Kayako is another one. So these are ticketing tools that we integrate with. And what happens here is that, let’s say a threat is noticed in the network, you’re able to send it automatically to your ticketing tool. So the alert gets received in your ticketing tool, and your technician who works on the ticketing tool can take a look at the alert, work on the ticket from there itself, and take it to its logical conclusion. So that’s the integration there.

Yeah. Okay, well, that sounds really interesting. There’s a great best practice there, I think, with using three different sources of information for threat intelligence. That’s not something that I’ve heard before, but it sounds like really sound advice. So Ram, what would you say to a security manager today who is not enabling this kind of machine learning, AI, and all of this stuff connected to SOAR as part of their own security operations? Because it seems to me that, especially with the advances that we’ve made in the past year with AI, we’re all going to have to be moving faster and in a more intelligent way to stay one step ahead of the bad guys.

Okay, so there are two types of organizations, right? There are organizations that have been hit by a cyber attack. And of course, there are organizations that don’t know yet that they have been hit by a cyber attack. So, of course, cyber attacks are aplenty right now, and attackers are getting more and more sophisticated. Like I was saying earlier, it all starts with your cyber risk. That is something that every SOC manager has to look at. What is the risk exposure of the company? And is there a way to quantify that? If they can quantify that, that’s amazing, that’s something really tangible, and they can work towards reducing it. And if you’re not able to quantify your security risk, that’s okay. But at least try and do it in a qualitative way.

That’s fine. And you can have your own measures to do that. But once you have it down, once you have a measure of your security risk, go ahead and look at what gaps you have. What can you actually implement to reduce that risk over time? And is there a dollar value that you can put on filling that gap? So now you have a dollar value to fill the gap, and you also have a dollar value to quantify your risk. Now you can look at your ROI. So in case you adopt a particular security solution, what is the benefit in dollar terms that it’ll give you? If you’re able to quantify everything, that’s exactly what you need, and then you can make a very strong case for implementing a security solution. And of course, the point that you made about AI and ML, that is imperative, that is absolutely necessary.
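One common way to turn those two dollar values into an ROI figure is to compare the risk reduction a solution buys against what it costs. The Python sketch below assumes that formula, and the dollar amounts are invented purely for illustration. Neither comes from the conversation.

```python
def security_roi(risk_before, risk_after, solution_cost):
    """Return ROI as a ratio: (risk reduction - cost) / cost.

    All three arguments are annual dollar estimates (assumed inputs).
    """
    if solution_cost <= 0:
        raise ValueError("solution_cost must be positive")
    risk_reduction = risk_before - risk_after
    return (risk_reduction - solution_cost) / solution_cost


# Hypothetical figures: expected annual loss drops from $500,000 to
# $200,000 after adopting a solution that costs $100,000 a year.
roi = security_roi(500_000, 200_000, 100_000)
print("ROI: {:.0%}".format(roi))
```

A positive result means the expected reduction in losses outweighs the spend. The hard part in practice is estimating the risk figures, not the arithmetic.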

Gone are the days when you could just rely on signature-based threat detection. That is not going to cut it anymore. You have to have AI and ML in place, because attackers are always evolving; they’re always looking for new ways to get into networks. And therefore, you’ve got to have AI and ML that will empower you, do a lot of the heavy lifting for you, and detect these threats. The other thing that you could do, with whatever SIEM solution you’re using, is leverage the MITRE ATT&CK framework, if your SIEM solution is able to bring in all the information that the MITRE ATT&CK framework gives you. MITRE ATT&CK basically lists out 14 different tactics that adversaries can use against an organization. And if you’re actively looking for signs that those tactics have been used within your network, then you’re keeping yourself proactively secure. A good SIEM solution should be able to help you do that too.

Right. Yeah, I mean, I think putting a dollar value on something and then establishing what the return on investment might be is critical for getting the right decisions made in most organizations. So that’s a really good place to start, I think, with all of this. Okay, so thank you very much, Ram. That was really interesting. Where can people find out more about Log360?

Oh, definitely. So please go to www.manageengine.com/log-management. That’s log, hyphen, management. And you can find out more about us. I would also highly recommend that people go on Google and just search for ManageEngine Expert Docs. There’s a whole lot of articles and blogs that we are working on. It’s called ManageEngine Expert Docs. And you can learn more about us. And you can also learn about what’s happening in the world of cybersecurity.

Great. And we’ll put both of those links in the description for this video below. So do go and check that out. Okay, thank you very much, Ram, for joining us today. It was really interesting.

Thank you so much for having me. It was a pleasure. Thank you so much.

Okay. And that’s it from us for today. And we’ll see everybody on Petri Dish next time.