I just read this post by @hexacorn about threat hunting and how impractical it may be to implement because of the vast amount of data generated on a corporate network. He states that this data is good for post-mortem analysis or during an investigation, but that using it for hunting is largely ineffective. First, I would like to say that I agree with @hexacorn in the sense that people generally look at threat hunting as a search for singular events, which can produce huge amounts of data to sift through and may be largely impractical. I think it’s natural for people who are new to this area of work to view it this way and eventually want to give up because it hasn’t proven to be effective.
I like to look at hunting in a different way. How can I apply the knowledge I have about actors and their behaviors to hunting so that I can get closer to the events that need to be investigated? By focusing on the behaviors of attackers rather than on singular indicators, I can vastly reduce the amount of data and get closer to the anomalous events that may need attention.
In the blog post, @hexacorn described hackers using ipconfig.exe, netstat.exe, net.exe, cscript.exe, etc., but said that these tools are used so often that the results are impossible to sift through. I consider the above, as well as many others, to be lateral movement tools, and they are often used in conjunction with each other. A good way to look at this differently would be to look for combinations of these tools being executed within 10 minutes of each other, from a single process name, by a single user, on a single host. This vastly reduces the amount of data being returned and surfaces a behavior that attackers commonly exhibit. That is far different from surfacing a single tool that attackers commonly use but that is also used legitimately thousands of times a day.
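To make the idea concrete, here is a minimal sketch in Python. It assumes process-creation events have already been exported as dictionaries with hypothetical `host`, `user`, `parent`, `image` and `time` fields, and it reads "a single process name" as the parent process. The threshold of three distinct tools is my own assumption, not a rule, so tune it for your environment.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Tools named in the post; extend this with your own list.
WATCHED = {"ipconfig.exe", "netstat.exe", "net.exe", "cscript.exe"}
WINDOW = timedelta(minutes=10)
MIN_DISTINCT = 3  # assumed threshold, not a recommendation


def find_tool_bursts(events, window=WINDOW, min_distinct=MIN_DISTINCT):
    """Return {(host, user, parent): [tools]} for bursts of distinct watched tools."""
    by_key = defaultdict(list)
    for e in events:
        image = e["image"].lower()
        if image in WATCHED:
            key = (e["host"], e["user"], e["parent"].lower())
            by_key[key].append((e["time"], image))

    hits = {}
    for key, rows in by_key.items():
        rows.sort()  # order by time
        start = 0
        for end in range(len(rows)):
            # Slide the left edge until the span fits inside the window.
            while rows[end][0] - rows[start][0] > window:
                start += 1
            tools = {image for _, image in rows[start:end + 1]}
            if len(tools) >= min_distinct:
                hits[key] = sorted(tools)
                break
    return hits
```

The same user running ipconfig.exe once an hour never trips this, while three different tools inside ten minutes from one parent does, which is the behavior I actually care about.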
In the same sentence, psexec was mentioned. It’s true that this tool is often used by administrators, and it can be difficult to look at a single execution and say with any confidence that it is bad. If I step back and think about this differently, I can say that when attackers move through my network they often use the same credentials and tools to perform lateral movement. Applying this thought to the problem, I can say that it’s odd for a single user to be identified on multiple machines, executing psexec against different hosts. Again, this reduces the amount of data that needs to be looked at while surfacing what may be anomalous behavior.
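A rough sketch of that psexec idea, again in Python and again under assumptions: the events are process-creation records with hypothetical `user`, `host`, `image` and `command_line` fields, the target is taken to be the first `\\host` token on the command line, and the thresholds of two source machines and two targets are placeholders.

```python
import re
from collections import defaultdict

# First \\host token on the command line, e.g. "psexec.exe \\srv01 cmd".
TARGET_RE = re.compile(r"\\\\([\w.\-]+)")
PSEXEC_NAMES = ("psexec.exe", "psexec64.exe")


def psexec_spread(events, min_sources=2, min_targets=2):
    """Return users who ran psexec from several machines against several hosts."""
    sources = defaultdict(set)
    targets = defaultdict(set)
    for e in events:
        if e["image"].lower() not in PSEXEC_NAMES:
            continue
        user = e["user"].lower()
        sources[user].add(e["host"].lower())
        match = TARGET_RE.search(e["command_line"])
        if match:
            targets[user].add(match.group(1).lower())

    return {
        user: {"sources": sorted(sources[user]), "targets": sorted(targets[user])}
        for user in sources
        if len(sources[user]) >= min_sources and len(targets[user]) >= min_targets
    }
```

An administrator who always works from one jump box will not show up here. An account that appears on several workstations and reaches out to different hosts from each will.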
For identifying odd processes, I agree that it’s difficult to look at these en masse and pick out the bad ones. I like to look at this in a couple of different ways, both of which start from the parent process. The first is, knowing that cmd.exe is probably the most commonly used executable during any intrusion, to ask which processes are parents of cmd.exe and how many times each one executed it, broken down by the number of users (a single user executing cmd.exe on multiple machines from a rare process may be anomalous). The second is to think about how commands can be executed, with tools such as powershell, wmic, at.exe and schtasks.exe, and to ask which users and parent processes are executing those (would it be odd for a local admin or service account to be scheduling a task, or for the owner of an IIS process to be executing powershell?). I would also suggest that if you want to find odd occurrences of svchost.exe, which is mentioned in the post, you look at the parent processes of svchost and not necessarily the location on disk.
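The first of those two views can be sketched as a simple stacking exercise. The Python below assumes the same kind of process-creation records as before, with hypothetical `image`, `parent`, `user` and `host` fields. It only counts and sorts; deciding what counts as "rare" is left to whoever reads the output.

```python
from collections import defaultdict


def stack_parents(events, child="cmd.exe"):
    """Return (parent, executions, distinct users, distinct hosts), rarest parent first."""
    stats = defaultdict(lambda: {"count": 0, "users": set(), "hosts": set()})
    for e in events:
        if e["image"].lower() != child:
            continue
        entry = stats[e["parent"].lower()]
        entry["count"] += 1
        entry["users"].add(e["user"])
        entry["hosts"].add(e["host"])

    rows = [
        (parent, s["count"], len(s["users"]), len(s["hosts"]))
        for parent, s in stats.items()
    ]
    # Sort by execution count so the uncommon parents float to the top.
    return sorted(rows, key=lambda row: (row[1], row[0]))
```

Passing `child="powershell.exe"` or `child="schtasks.exe"` gives the second view, and `child="svchost.exe"` gives the parent-process check on svchost.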
I truly feel that if you want to be successful at hunting you need to look at the problem differently. In an IDS/IPS world we deal in singular events that match a specific signature. We need to throw this model out for hunting because it simply doesn’t work. As @hexacorn described in his post, it produces huge amounts of data that are simply impractical to review effectively.
Another issue I see, one that wasn’t talked about in the post, is knowledge transfer. I may think about hunting in the way I described above, but when I transition something to a different team, will they view the data in the way that was intended, or will they even understand the data that is being given to them? If they don’t understand the data, their confidence in the severity of the event may be low. An example would be looking for anomalies in MSRPC pipes. Host or user enumeration of Windows machines likely happens over SMB, and the communication between machines during these events will often occur over named pipes. If I generate an alert for multiple samr pipes being created with multiple remote machines in the UNC path over a given period of time, what will the analyst do with this? Don’t assume that an analyst, regardless of experience, will know what your thought process was when you created this hunt. Give the analyst context, because these alerts can be just as difficult to understand as an IP address that alerted with no explanation as to why. If the analyst doesn’t react to the alert with the sense of urgency you feel it deserves, it may not be a failure on their part, but a failure on your part for not giving them the proper knowledge transfer and ensuring they understand what they are looking at.
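One way to build that context into the hunt itself is to ship the reasoning with the alert. The Python below is only a sketch under stated assumptions: the pipe events are dictionaries with hypothetical `host`, `user` and `pipe` fields, the pipe is recorded as a UNC path of the form `\\MACHINE\pipe\samr`, the events passed in already cover the period of interest, and the threshold of three remote machines is arbitrary.

```python
import re
from collections import defaultdict

# Assumed UNC form: \\MACHINE\pipe\samr
SAMR_RE = re.compile(r"^\\\\([^\\]+)\\pipe\\samr$", re.IGNORECASE)

# The thought process behind the hunt, written for the analyst who receives it.
CONTEXT = (
    "Host and user enumeration of Windows machines likely happens over SMB, "
    "and that traffic often uses named pipes such as samr. One user on one "
    "host opening samr pipes on several remote machines in a short period "
    "may be an account enumerating the network."
)


def samr_alerts(events, min_remote=3):
    """Return alerts that carry both the evidence and the reason for the hunt."""
    remotes = defaultdict(set)
    for e in events:
        match = SAMR_RE.match(e["pipe"])
        if match:
            remotes[(e["host"], e["user"])].add(match.group(1).lower())

    return [
        {
            "host": host,
            "user": user,
            "remote_machines": sorted(machines),
            "why": CONTEXT,
        }
        for (host, user), machines in sorted(remotes.items())
        if len(machines) >= min_remote
    ]
```

The `why` field is the point of the exercise: the analyst sees the list of remote machines and the reasoning that makes the list interesting, instead of a bare pipe name.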
My hope is that @hexacorn doesn’t take offense at my rebuttal to his post. I think about hunting as being creative in the ways that I look for threats. This is not an extension of an IDS/IPS, but an entirely different method and mindset for finding bad.