Thursday, March 29, 2018

C2 Hunting

For an adversary to be successful in your environment they will need a way to enter and leave your network.  This can obviously happen in many different ways.  One way may be an attacker utilizing 3rd party access, another possibly gaining access through an externally facing device, but more often than not, this is facilitated by a backdoor being placed on a machine within your network, or at least the initial stages are.  Going with this assumption, it then makes sense that we spend a large amount of time and effort trying to identify indications of backdoors.

So when you sit down and think about the problem, ask yourself, what does a backdoor look like?  What does it look like when it’s initially placed on the machine?  What does it look like when it starts?  What does it look like when it beacons?  What does it look like when it’s actively being used?  For this post I will be focusing on beacon behaviors, but remember that there are many other opportunities to hunt for and identify these.

When we investigate IDS alerts that are related to C2 activity, what are some of the indications we look for that may help tip the scale toward calling the alert a true positive?  Or to put it another way, what are some of the things that may be common about C2?

  • User-Agent is rare
  • User-Agent is new
  • Domain is rare
  • Domain is new
  • High frequency of HTTP connections
  • URI is same
  • URI varies but length is constant
  • Domain varies but length is constant
  • Missing referrer
  • Missing or same referrer to multiple URIs on single dest.

Not all of the above will be true of every beacon, but in the vast majority of instances, more than one statement will be true.  If I look for multiples of the above, by source and destination pairs, I believe that I will have a higher chance of identifying malicious beacon traffic than by analyzing each individually.

Next we need to generate some traffic so that we can validate our theories.  If you are wondering about a list of backdoors that would be good to test, have a look at attack.mitre.org and the backdoors that have been used by the various actors that are tracked.  I also can’t emphasize enough the importance of having an environment that you can use for testing out theories.  Being able to perform and log the actions that you want to find can often lead to new ideas when you see the actual data that is generated.  You also need to know that your queries will really find what you are looking for.  For this testing I set up 3 VMs, which are listed below.

Machine 1 
  • Ubuntu 16.04
  • InetSim
  • Bro
  • Splunkforwarder

Machine 2
  • Ubuntu 16.04
  • Free Splunk

Machine 3
  • Windows 7
  • Default route and DNS is set to the IP address of Machine 1.

Flow
  • Obviously the malware will be executed on Machine 3.  For backdoors that communicate with a domain based C2, a DNS lookup will occur and the DNS name will resolve to Machine 1.  For IP based C2, the traffic will follow the default route on Machine 3 and Machine 1 will respond (using an iptables redirect and NAT rule).
  • InetSim will respond to the C2 communication.
  • Bro will log the HTTP traffic and forward logs to the Splunk server.
  • Scheduled queries will run within the Splunk environment to identify C2 behaviors that we define.
  • Results of queries will be logged to separate index within Splunk.
  • Scheduled search will run against this new index in an attempt to identify multiple behaviors on either a host or destination.
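
The last step of the flow, looking for more than one behavior on the same host or destination, can be illustrated with a small Python sketch.  This is not the Splunk search used for this post.  The function name, the behavior labels and the `(behavior, src, dest)` tuple layout are all assumptions made for the example, standing in for the rows written to the results index.

```python
from collections import defaultdict


def correlate(hits, min_behaviors=2):
    """Group behavior names by source and by destination.

    hits: iterable of (behavior, src, dest) tuples (hypothetical layout).
    src or dest may be None when a behavior is keyed on only one of them.
    Returns {("src" | "dest", value): sorted behavior names} for every
    host or destination that tripped at least `min_behaviors` behaviors.
    """
    by_entity = defaultdict(set)
    for behavior, src, dest in hits:
        if src is not None:
            by_entity[("src", src)].add(behavior)
        if dest is not None:
            by_entity[("dest", dest)].add(behavior)

    return {
        entity: sorted(names)
        for entity, names in by_entity.items()
        if len(names) >= min_behaviors
    }
```

A destination that shows up under several behavior names is the kind of result worth looking at first; a single behavior on its own is usually much noisier.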


I used the file for this blog post from the link below.  It’s named Cobaltstrike.exe, but I don’t believe it’s a Cobaltstrike backdoor.  I believe it serves the purpose for this post though.  How can we go about finding unknown backdoors, or backdoors that we don’t have signatures for?


https://www.hybrid-analysis.com/sample/5b16d3c8451a1ea7633aae14c28f30c2d5c9b925d9f607938828bf543db9c582?environmentId=100

The result of executing this particular backdoor can be seen in the screenshot of correlated events below.  To get a better understanding of how this correlation occurred I'll go over the queries that got us here.




When an HTTP based backdoor communicates, it will reach out to a URI.  The URI or the URI structure is typically coded into the backdoor.  If the backdoor beacons to multiple URIs on the same C2 host, these URIs are very often the same character length.  This query looks for source/destination pairs with greater than 6 connections to multiple URIs, all of which are the same length.
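
As a rough illustration of the logic described above (a Python sketch, not the Splunk query itself), the check could be written as follows.  It assumes each HTTP log record has already been parsed into a dict with hypothetical `src`, `dest` and `uri` keys.

```python
from collections import defaultdict


def constant_length_uris(events, min_connections=6, min_distinct_uris=2):
    """Return (src, dest) pairs whose URIs vary but all share one length.

    events: iterable of dicts with "src", "dest" and "uri" keys
    (hypothetical field names for parsed HTTP log records).
    """
    uris_by_pair = defaultdict(list)
    for event in events:
        uris_by_pair[(event["src"], event["dest"])].append(event["uri"])

    hits = []
    for pair, uris in uris_by_pair.items():
        lengths = {len(uri) for uri in uris}
        if (len(uris) > min_connections                  # more than 6 connections
                and len(set(uris)) >= min_distinct_uris  # multiple distinct URIs
                and len(lengths) == 1):                  # all the same length
            hits.append(pair)
    return sorted(hits)
```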


Just as the URI or URI structure is often coded into a backdoor, a User-Agent string is as well.  These User-Agents are very often unique due to misspellings, version mismatches or simply random naming.  By stacking User-Agents you will find rare ones, but very often, after investigating these, they will wind up being legitimate traffic.  By combining rare UAs with additional C2 behavior you can quickly focus on the connections you should be looking at.  This query looks for fewer than 10 source hosts, all using a single UA, communicating to the same destination.
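
A minimal Python sketch of one reading of that description is below: for each destination, only one User-Agent is ever seen, and fewer than 10 distinct sources talk to it.  The `src`, `dest` and `user_agent` keys are hypothetical names for parsed log fields, and the grouping by destination is my interpretation rather than a copy of the Splunk query.

```python
from collections import defaultdict


def single_ua_destinations(events, max_sources=10):
    """Return destinations reached by fewer than `max_sources` hosts,
    where every request carried the same User-Agent string.

    events: iterable of dicts with "src", "dest" and "user_agent" keys
    (hypothetical field names).
    """
    agents = defaultdict(set)
    sources = defaultdict(set)
    for event in events:
        agents[event["dest"]].add(event["user_agent"])
        sources[event["dest"]].add(event["src"])

    return sorted(
        dest for dest in agents
        if len(agents[dest]) == 1 and len(sources[dest]) < max_sources
    )
```

On its own this will match plenty of legitimate destinations, which is why it is only useful when combined with the other behaviors.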


When you want to identify how a host wound up visiting a specific URL you would typically look at the referrer field.  Very often the referrer is left blank with C2 traffic or can be hardcoded with a single referrer for every beacon.  It can be odd to see the same referrer for multiple URIs, all on the same destination host.  This query identifies a single referrer listed for multiple URIs on a single destination.
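
The same idea can be sketched in Python, again as an illustration rather than the actual query.  The `dest`, `uri` and `referrer` keys are hypothetical, and a missing referrer is treated as one constant value (Bro writes `-` for an unset field), so a blank referrer across several URIs counts as a match too.

```python
from collections import defaultdict

UNSET = "-"  # placeholder Bro writes for an unset field


def single_referrer_destinations(events, min_distinct_uris=2):
    """Return destinations where multiple URIs all share one referrer
    (or all have no referrer at all).

    events: iterable of dicts with "dest", "uri" and optional "referrer"
    keys (hypothetical field names).
    """
    referrers = defaultdict(set)
    uris = defaultdict(set)
    for event in events:
        dest = event["dest"]
        # None, "" and "-" are all normalized to the same "unset" value.
        referrers[dest].add(event.get("referrer") or UNSET)
        uris[dest].add(event["uri"])

    return sorted(
        dest for dest in uris
        if len(referrers[dest]) == 1 and len(uris[dest]) >= min_distinct_uris
    )
```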


This query simply looks at volume of traffic between a source and a destination.  When combined with additional behaviors, this can be a good indicator of malicious traffic.
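
A volume count is the simplest of these to sketch.  The version below assumes parsed records with hypothetical `src` and `dest` keys; the threshold of 100 is an arbitrary placeholder, not a value from the query, and would need tuning against your own traffic.

```python
from collections import Counter


def high_volume_pairs(events, min_connections=100):
    """Count connections per (src, dest) pair and return the busy ones.

    events: iterable of dicts with "src" and "dest" keys (hypothetical).
    min_connections: placeholder threshold, chosen only for illustration.
    Returns a list of ((src, dest), count), busiest first.
    """
    counts = Counter((event["src"], event["dest"]) for event in events)
    return [
        (pair, count)
        for pair, count in counts.most_common()
        if count >= min_connections
    ]
```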


There are many additional signs of malicious beacon traffic.  By spending time identifying these behaviors and incorporating them into some type of detection workflow, your chances of spotting malicious traffic over benign become much greater.  By applying this methodology you gain additional coverage over signature based detection, or new capability where you currently don't have detection but do have the data (e.g. proxy logs).

All questions and comments are welcome.  Feel free to reach out on twitter @jackcr.



5 comments:

  1. "This can obviously happen in many different ways."

    From a host-based perspective, this trips up many hunters and responders alike. Yes, there are different ways, but start by finding and following the evidence.

  2. Another great insightful post. Thanks, Jackcr

  3. Is the last image, uri_length.png, the correct one? It doesn't do what the text describes and looks like a earlier version of ulength.png
