Saturday, September 8, 2018

Thoughts After the SANS 2018 Threat Hunting Summit

Over the past few days I've had the pleasure of attending the SANS Threat Hunting Summit.  Not only was this a terrific event, but it also gave me the opportunity to see how others in our community are tackling the problems we are all dealing with.  I was able to look at what I am doing and see whether there are ways I can improve or things I can incorporate into my current processes.  The summit speakers and attendees also sparked new ideas as well as things I would like to dig into more.

One of the thoughts I had during the summit came while Alex Pinto (@alexcpsec) and Rob Lee (@RobertMLee) were discussing machine learning.  I believe ML may be hard to implement in a detection strategy unless it’s for a very narrow and specific use case.  As the scope widens, the accuracy of your results may suffer.  What would happen, though, if we started building models with a wider scope, but built them in a way that allowed them to cluster with other models?  Would we be able to cluster the results of these different models in a way that highlights an attacker performing different actions during an intrusion?  I’m spitballing here, but as an example: 
  1. A model looking at all flow data for anomalous network patterns between machines.
  2. A model that is looking for anomalous authentication patterns.  
Can the results of these two models then be clustered by src IP or dest IP (or some other attribute), so that the cluster becomes a higher-fidelity event than the results of each individual model?  I’m not sure, as I don’t have a lot of experience with ML, so I'm just throwing that out there.
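The clustering step above can be sketched in a few lines. This is a minimal illustration, not a real detection pipeline: the model names, IPs, and scores are all made up, and each "model" is just a list of (ip, score) pairs it flagged as anomalous.

```python
from collections import defaultdict

# Hypothetical output of two independent anomaly models: each yields
# (ip, score) pairs for hosts it considers anomalous.
flow_anomalies = [("10.0.0.5", 0.91), ("10.0.0.9", 0.70)]
auth_anomalies = [("10.0.0.5", 0.84), ("10.0.0.12", 0.66)]

def cluster_by_ip(*model_results):
    """Group anomaly hits from multiple models by IP address."""
    clusters = defaultdict(list)
    for model_name, results in model_results:
        for ip, score in results:
            clusters[ip].append((model_name, score))
    return clusters

clusters = cluster_by_ip(("flow", flow_anomalies), ("auth", auth_anomalies))

# An IP flagged by more than one model is a higher-fidelity event
# than either model's output on its own.
high_fidelity = {ip: hits for ip, hits in clusters.items() if len(hits) > 1}
print(high_fidelity)  # {'10.0.0.5': [('flow', 0.91), ('auth', 0.84)]}
```

In practice the join key could be any shared attribute (user, hostname, process), and the scores could be combined rather than just counted.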

Rick McElroy (@InfoSecRick) was also talking about something similar during his keynote.  Analysts need context when they are looking at events, as it’s often very hard to classify something as malicious until you have additional supporting evidence (I'm summarizing).  I believe we can often build multiple points of context into our alerting, though.  By building visibility around triggers (actions), regardless of how noisy they may be individually, we can then generate alerts only where there are multiple data points, and therefore produce higher-fidelity alerts while reducing the overall number.  An example may be: 
  1. PowerShell initiating an HTTP request.
  2. A first-seen non-alphanumeric character pattern.
  3. Multiple flags where the flag is 3 characters or less.
By being able to generate an alert on any of the 3 characteristics, but not doing so until a threshold of 2 or more is met, I have dramatically increased the fidelity of the alert.  Or we could generate a suspicious PowerShell event based on any 1 of the 3 occurring and send an alert when an additional suspicious action on the host has been identified within a certain time frame.  An executable being written to a Temp directory may be an example (or any other detection you may have that will map to the host).  The cool thing about this is that you can start to dynamically see behaviors vs. singular events.
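The threshold idea above can be sketched as a small scoring function. This is a hedged illustration, assuming the three characteristics have already been extracted upstream as simple string labels (the label names are my own, not from any product):

```python
# Hypothetical trigger labels for one PowerShell execution, extracted upstream.
TRIGGERS = {
    "http_request",       # PowerShell initiating an HTTP request
    "rare_char_pattern",  # first-seen non-alphanumeric character pattern
    "short_flags",        # multiple flags of 3 characters or less
}

def should_alert(observed, threshold=2):
    """Alert only when the number of matched triggers meets the threshold.

    Any single trigger may be too noisy on its own; requiring two or
    more raises the fidelity of the resulting alert.
    """
    matched = observed & TRIGGERS
    return len(matched) >= threshold

print(should_alert({"http_request"}))                 # False: only 1 trigger
print(should_alert({"http_request", "short_flags"}))  # True: 2 triggers
```

The alternate approach from the paragraph, firing on any one trigger but only alerting after a second suspicious host event within a time window, is the same idea with the threshold applied across detections instead of within one.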

ATT&CK was discussed quite a bit throughout the summit (@likethecoins and @its_a_feature_).  This is such a cool framework.  Analysts can wrap their heads around the things they can (and should) be hunting for.  I’m curious how many companies have adopted this framework and are using it to build and validate their detections.  If you start building visibility around the types of things listed in ATT&CK, can you then start clustering the events generated and mapping them through the framework?  If more data points map, does that raise the confidence in the behavior, the machine in question, or the user associated with the events?
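The mapping-and-clustering question above could be prototyped along these lines. A minimal sketch, assuming internal detection names (which are invented here) have been mapped by hand to real ATT&CK technique IDs; the more distinct techniques that cluster on one host, the more confidence the behavior warrants.

```python
from collections import defaultdict

# Hypothetical mapping of internal detection names to ATT&CK technique IDs.
DETECTION_TO_TECHNIQUE = {
    "powershell_http": "T1059.001",  # Command and Scripting Interpreter: PowerShell
    "exe_in_temp": "T1204",          # User Execution
    "anomalous_logon": "T1078",      # Valid Accounts
}

def techniques_per_host(events):
    """Cluster observed detections by host and map them onto ATT&CK."""
    coverage = defaultdict(set)
    for host, detection in events:
        technique = DETECTION_TO_TECHNIQUE.get(detection)
        if technique:
            coverage[host].add(technique)
    return coverage

events = [
    ("host-a", "powershell_http"),
    ("host-a", "exe_in_temp"),
    ("host-b", "anomalous_logon"),
]
coverage = techniques_per_host(events)

# Rank hosts by how many distinct techniques their events map to.
ranked = sorted(coverage, key=lambda h: len(coverage[h]), reverse=True)
print(ranked)  # ['host-a', 'host-b']
```

The same grouping could be keyed on user instead of host, matching the question of whether confidence rises for the machine or the user associated with the events.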

My flight was delayed today, so I’ve been sitting at the airport for the last several hours.  This is a quick post, but I wanted to get these thoughts jotted down while I had some time. 


  1. Jack,

    Thanks for sharing your thoughts...very helpful for those of us who were not able to attend.

    I agree with your thoughts regarding the ATT&CK framework, and I think I know why analysts are able to wrap their heads around it. Going back to the DoD CyberCrime Conference in 2012, I remember sitting in a presentation when lateral movement was mentioned by the speaker, and an attendee asked, "...what does that look like..?" Couple that with another comment between two attendees at the same conference, where one lamented to the other that there were six presentations that included "APT" in the title, but none included anything actionable by boots-on-the-ground folks. Those comments stuck with me, and I started to realize that when someone mentions "lateral movement", I visualize a bunch of different "what it looks like" in my head, and key off on those things that are shared, or of which I wasn't aware.

    I think that the point is that MITRE's framework, to a degree, illustrates to folks what something "looks like", to the point that detections can be written. Even if someone talks about the framework, but doesn't care what something looks like, they can point to a line and ask, "...does your product/process detect this, or that?" and get a nod. Or not.

  2. WRT MITRE's ATT&CK, an additional benefit that I see is that there can be some type of measurement across vendors in the EDR space using a standard that most are familiar with. This not only helps to inform the prospective buyer, but also allows them to measure their current capabilities and what positive impact a new product may have.

  3. Agreed. Some have already taken up the mantle of illustrating where they fall out against that standard.

    The logical follow-on would be an independent party doing their own tests against those vendor claims.

  4. I have been reading many articles on ML deployment in security products, and I see there are narrow differences between every product. No two are the same, and that's the beauty of it: detection criteria for specific threats are drilled down to the finest levels using ML.