Sunday, May 17, 2020

It's all in the numbers


In my last few posts I talked about hunting for anomalies in network data.  I wanted to expand on that a bit and specifically talk about a way we can create metadata around detectable events and use those additional data points for hunting or anomaly detection.  The hope is that this metadata will point us to areas of investigation that we might not otherwise pursue.

For this post I'm again using the BOTS data from Splunk, and I've created several saved searches based on behaviors we may see during an intrusion.  Once the saved searches run, their results are logged to a summary index.  More on that topic can be found here: http://findingbad.blogspot.com/2017/02/hunting-for-chains.html.  The goal is to get all of our detection data into a single queryable location, in a form we can count and aggregate.

For our saved searches we want to ensure the following.

Create detections based on behaviors:
  • Focus on accuracy regardless of fidelity.
  • A field that will signify an intrusion phase where this detection would normally be seen.
  • A field where a weight can be assigned based on criticality.
  • A common field that can be found in each detection output that will identify the asset or user (src_ip, hostname, username...).
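As a rough sketch, a saved search built along these lines might look like the following (the index names, detection logic, and weight/phase values are all hypothetical; Splunk's collect command writes the results to the summary index):

index=sysmon EventCode=1 ParentImage="*\\winword.exe" Image="*\\powershell.exe"
|eval hostname=host, source="office_spawning_powershell", phase="execution", weight=3
|table _time, hostname, user, source, phase, weight
|collect index=summary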
Once the output of our saved searches begins to populate the summary index we would like to have results similar to the screenshot below:

[screenshot: summary index search results]

The following is the definition of the fields:
(Note: the events in the screenshot have been deduped.  All calculations have taken place, but I am limiting the number of rows.  Much of what is identified in the output is data from the last detection before the dedup occurred.)
  • hostname: Self-explanatory, though I also use the src_ip where the hostname can't be determined.
  • source: The name of the saved search.
  • weight: Number assigned that represents criticality of event.
  • phase: Identifier assigned for phase of intrusion.
  • tweight: The sum weight of all detected events.
  • dscount: The distinct count of unique detection names (the source field).
  • pcount: The number of unique phases identified.
  • scount: Total number of detections identified.
  • phasemult: An additional value given for number of unique phases identified where that number is > 1.
  • sourcemult: An additional value given for number of unique sources identified where that number is > 1.
  • weighted: The sum score of all values from above.
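As a sketch, the calculated fields above could be produced from the summary index with something like the following (the multipliers of 25 are hypothetical placeholders for the "additional value" described above):

index=summary
|eventstats sum(weight) as tweight dc(source) as dscount dc(phase) as pcount count as scount by hostname
|eval phasemult=if(pcount>1, pcount*25, 0)
|eval sourcemult=if(dscount>1, dscount*25, 0)
|eval weighted=tweight+phasemult+sourcemult
|dedup hostname
|sort weighted desc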
There are a few points that I want to discuss around the additional fields that I've assigned and the reasons behind them.
  • Phases (phase,pcount,phasemult): Actors or insiders will need to step through multiple phases of activity before data theft occurs.  Identifying multiple phases in a given period of time may be an indicator of malicious activity.
  • Sources (source,scount,dscount,sourcemult): A large number of detections may be less concerning if all of them are finding the same activity over and over.  Actors or insiders need to perform multiple steps before data theft occurs, and therefore a smaller number of detections that span different actions is more concerning.
  • Weight: Weight is based on criticality.  If I see a large weight with few detections, I can assume the behavior has a higher likelihood of being malicious.
  • Weighted: High scores tend to reflect hosts where more behaviors were identified and those behaviors span multiple phases.
Now that we've performed all of these calculations and have a good understanding of what they are, we can run k-means and cluster the results.  I downloaded a csv of the Splunk output and named it cluster.csv.  Using the code below, you can see I chose 3 clusters using the tweight, phasemult and scount fields.  I believe the combination of these fields can be a good representation of anomalous behavior (I could also plug in other combinations and potentially surface other behaviors).
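A minimal sketch of that clustering step (a reconstruction, assuming the same pandas and scikit-learn approach as the beacon hunting code further down; the hostname and weighted columns come from the field list above):

import pandas as pd
from sklearn.cluster import KMeans

# Load the csv exported from the Splunk output
df = pd.read_csv("cluster.csv")

# Cluster on the combination of fields discussed above
X = df[['tweight', 'phasemult', 'scount']].apply(pd.to_numeric, errors='coerce').fillna(0)
km = KMeans(n_clusters=3)
df['cluster'] = km.fit_predict(X)

# Review the members of each cluster
for label, rows in df.groupby('cluster'):
    print("Cluster %d (%d hosts)" % (label, len(rows)))
    print(rows[['hostname', 'tweight', 'phasemult', 'scount', 'weighted']])
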
The following are the contents of those clusters.
[screenshot: contents of the three clusters]

Based on the output, the machine in cluster 1 should definitely be investigated, and I would investigate the machines in cluster 2 as well.

Granted, this is a fairly small data set, but it's a great representation of what can be done in much larger environments.  This method could also be scheduled and automated, with the results actioned, correlated, alerted on, etc.

Again I would like to thank the Splunk team for producing and releasing BOTS.  It's a great set of data to test with and learn from.




Thursday, May 7, 2020

Hunting for Beacons Part 2


In my last post I talked about a method of hunting for beacons using a combination of Splunk and K-Means to identify outliers in network flow data.  I wanted to write a quick update to that post so that I can expand on a few things.

In that blog post I gave several points that help define the general parameters I can begin to craft a search around.  This clarifies what it is I'm trying to look for and, in a way, builds a framework I can follow as I begin looking for this behavior.

  1. Beacons generally create uniform byte patterns
  2. Active C2 generates non-uniform byte patterns
  3. There are far more flows that are uniform than non-uniform
  4. Active C2 happens in spurts
  5. These patterns will be anomalous when compared to normal traffic

Using the definition above, I went out and built a method to identify anomalous traffic patterns that may indicate malicious beaconing.  It worked well for the sample data I was using, but when I implemented this method on a much larger dataset I ran into problems.  There were far more anomalous data points, and therefore the fidelity of the data I was highlighting was much lower (if you haven't read my last post I would recommend it).  The other issue was that it took much longer to pivot into the data of interest and then try to understand why a pattern was identified as an outlier.  I then decided to see if I could take what I was trying to do with k-means and build it into my Splunk query.  Here is what I came up with.

The entire search looks like this:


index=botsv3 earliest=0 (sourcetype="stream:tcp" OR sourcetype="stream:ip") (dest_port=443 OR dest_port=80)
|stats count(bytes_out) as "beacon_count" values(bytes_in) as bytes_in by src_ip,dest_ip,bytes_out
|eventstats sum(beacon_count) as total_count dc(bytes_out) as unique_count by src_ip,dest_ip
|eval beacon_avg=('beacon_count' / 'total_count')
|stats values(beacon_count) as beacon_count values(unique_count) as unique_count values(beacon_avg) as beacon_avg values(total_count) as total_count values(bytes_in) as bytes_in by src_ip,dest_ip,bytes_out
|eval incount=mvcount(bytes_in)
|join dest_ip
[|search index=botsv3 earliest=0 (sourcetype="stream:tcp" OR sourcetype="stream:ip")
|stats values(login) as login by dest_ip
|eval login_count=mvcount(login)]
|eventstats avg(beacon_count) as overall_average
|eval beacon_percentage=('beacon_count' / 'overall_average')
|table src_ip,dest_ip,bytes_out,beacon_count,beacon_avg,beacon_percentage,overall_average,unique_count,total_count,incount,login_count
|sort beacon_percentage desc

Breaking it down:


Collect the data that will be parsed:
  • index=botsv3 earliest=0 (sourcetype="stream:tcp" OR sourcetype="stream:ip") (dest_port=443 OR dest_port=80)


Count the number of times each unique byte size occurs between a src and dst:
  • |stats count(bytes_out) as "beacon_count" values(bytes_in) as bytes_in by src_ip,dest_ip,bytes_out


Count the total number of times all byte sizes occur regardless of size, and the distinct number of unique byte sizes:
  • |eventstats sum(beacon_count) as total_count dc(bytes_out) as unique_count by src_ip,dest_ip


Calculate the volume of each src,dst,byte-size combination as a fraction of all traffic between that src and dst (for example, if one bytes_out size accounts for 950 of 1,000 flows between a pair, beacon_avg = 0.95):
  • |eval beacon_avg=('beacon_count' / 'total_count')


Define fields that may be manipulated, tabled, counted:
  • |stats values(beacon_count) as beacon_count values(unique_count) as unique_count values(beacon_avg) as beacon_avg values(total_count) as total_count values(bytes_in) as bytes_in by src_ip,dest_ip,bytes_out


Count the number of unique bytes_in sizes between src,dst,bytes_out.  Can be used to further define parameters with respect to beacon behavior:
  • |eval incount=mvcount(bytes_in)


*** The portion below was added to the original query ***

Generally there will be a limited number of users beaconing to a single destination.  If this query is looking at an authenticated proxy, this will count the total number of users communicating with the destination (it can also add a lot of overhead to your query):
  • |join dest_ip [|search index=botsv3 earliest=0 (sourcetype="stream:tcp" OR sourcetype="stream:ip") |stats values(login) as login by dest_ip |eval login_count=mvcount(login)]


Calculate the average number of counts between all src,dst,bytes_out:
  • |eventstats avg(beacon_count) as overall_average


Calculate the volume percentage by src,dst,bytes_out based off the overall_average:
  • |eval beacon_percentage=('beacon_count' / 'overall_average')


And the output from the Splunk botsv3 data:
[screenshot: search output]

You can see from the output above that the first 2 machines were ones identified as compromised.  The volume of their beacons was 1600 and 400 times the average volume of traffic between src,dst,bytes_out.  By adding the bottom portion of the search I've essentially built the outlier detection into the query.  You could even add a parameter to the end of the search like "|where beacon_percentage > 500" to surface only anomalous traffic.  Also, by adjusting the numbers in these fields you can turn the levers and tune the query to different environments.

(beacon_count,beacon_avg,beacon_percentage,overall_average,unique_count,total_count,incount,login_count)
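For example, appending a filter that combines a couple of these fields (the thresholds are arbitrary and would need tuning per environment):

|where beacon_percentage > 500 AND login_count < 3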

If you were to apply this to proxy data you could also run multiple queries based on category.  This may increase the speed and take some of the load off Splunk.

I've also not given up on K-Means.  I just pivoted to using a different method for this.

*** Adding an update to include a Splunk search with a risk scoring function ***

index=someindex sourcetype=somesourcetype earliest=-1d 
|stats count(bytes_out) as "i_bytecount" values(bytes_in) as bytes_in by src_ip,dest_ip,bytes_out 
|eventstats sum(i_bytecount) as t_bytecount dc(bytes_out) as distinct_byte_count by src_ip,dest_ip 
|eval avgcount=('i_bytecount' / 't_bytecount') 
|stats values(i_bytecount) as i_bytecount values(distinct_byte_count) as distinct_byte_count values(avgcount) as avgcount values(t_bytecount) as t_bytecount values(bytes_in) as bytes_in by src_ip,dest_ip,bytes_out |eval incount=mvcount(bytes_in) 
|join dest_ip 
[|search index=someindex sourcetype=somesourcetype earliest=-1d
|bucket _time span=1h 
|stats values(user) as user values(_time) as _time dc(url) as distinct_url_count count as distinct_event_count by dest_ip,dest 
|eval time_count=mvcount(_time) 
|eval login_count=mvcount(user)] 
|table dest,src_ip,dest_ip,bytes_out,distinct_url_count,distinct_event_count,i_bytecount,distinct_byte_count,avgcount,t_bytecount,incount,login_count,user,time_count 
|search t_bytecount > 1 login_count < 3 
|eventstats avg(i_bytecount) as o_average 
|eval above=('i_bytecount' / 'o_average') 
|eval avgurl=(distinct_url_count / distinct_event_count) 
|eval usermult=case(login_count=1, 100, login_count=2, 50, login_count>2, 0) 
|eval evtmult=case(distinct_event_count>300, 100, distinct_event_count>60, 50, true(), 0)
|eval beaconmult=case(above>100, 200, above>5, 100, true(), 0)
|eval urlmult=case(avgurl>.95, 100, avgurl<.05, 100, true(), 0)
|eval timemult=case(time_count > 7, 100, time_count<=7, 0) 
|eval addedweight = (evtmult+usermult+beaconmult+urlmult+timemult) 
|dedup dest 
|search addedweight > 250

Friday, May 1, 2020

Hunting for Beacons


A few years ago I wrote a post about ways that you can correlate different characteristics of backdoor beaconing.  By identifying and combining these different characteristics you may be able to identify unknown backdoors and possibly generate higher fidelity alerting.  The blog can be found here: http://findingbad.blogspot.com/2018/03/c2-hunting.html

What I didn't talk about was utilizing flow data to identify C2.  With SSL or otherwise encrypted traffic you may lack the data required to correlate different characteristics and need to rely on other sources of information.  So how do we go hunting for C2 in network flows?  First we need to define what that may look like.

  1. Beacons generally create uniform byte patterns
  2. Active C2 generates non-uniform byte patterns
  3. There are far more flows that are uniform than non-uniform
  4. Active C2 happens in spurts
  5. These patterns will be anomalous when compared to normal traffic

I've said for a long time that one way to find malicious beaconing in network flow data is to look for patterns of beacons (uniform byte patterns) and alert when the patterns drastically change (non-uniform byte patterns).  The problem I had was figuring out how to do just that with the tools I had.  I think we (or maybe just me) often get stuck on a single idea.  When we hit a roadblock we lose momentum and can eventually let the idea go, though it may remain in the back of our heads.

Last week I downloaded the latest Splunk BOTS data source and loaded it into a Splunk instance I have running on a local VM.  I wanted to use this to explore some ideas I had using Jupyter Notebook.  That's when the light went off.  Below is what I came up with.
[screenshot: the Splunk search]

This Splunk search performs the following:

  1. Collects all flows that are greater than 0 bytes
  2. Counts the number of flows by each unique byte count by src_ip, dest_ip, and dest_port (i_bytecount)
  3. Counts the total number of flows between src_ip, dest_ip (t_bytecount)
  4. Counts the unique number of byte counts by src_ip, dest_ip (distinct_byte_count)
  5. Generates a percentage of traffic by unique byte count between src_ip, dest_ip (avgcount)

The thought is that a beacon will account for a high percentage of the overall traffic between 2 endpoints.  Active C2 will be variable in byte counts, which is represented by distinct_byte_count.
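The search itself was shown as a screenshot; a sketch that performs those five steps (reconstructed from the step list above, with field names matching the Python code below) might look like:

index=botsv3 earliest=0 (sourcetype="stream:tcp" OR sourcetype="stream:ip") bytes_out>0
|stats count(bytes_out) as i_bytecount by src_ip,dest_ip,dest_port,bytes_out
|eventstats sum(i_bytecount) as t_bytecount dc(bytes_out) as distinct_byte_count by src_ip,dest_ip
|eval avgcount=('i_bytecount' / 't_bytecount')
|table src_ip,dest_ip,dest_port,bytes_out,i_bytecount,t_bytecount,distinct_byte_count,avgcount
|sort avgcount desc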

I then wanted to identify anomalous patterns (if any) within this data.  For this I used K-Means clustering, as I wanted to see if there were patterns outside of the norm, using the following Python code:

import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection
from sklearn.cluster import KMeans

# Load the search results exported from Splunk and coerce the fields to numeric
df = pd.read_csv("ByteAvgs1.csv")
for col in ['t_bytecount', 'i_bytecount', 'avgcount', 'distinct_byte_count', 'bytes_out']:
    df[col] = pd.to_numeric(df[col], errors='coerce')

# Cluster on beacon percentage, total flow count, and distinct byte counts
X = df[['avgcount', 't_bytecount', 'distinct_byte_count']].fillna(0)
X = X.reset_index(drop=True)
km = KMeans(n_clusters=2)
km.fit(X)
labels = km.labels_

# Plot the three fields in 3d, colored by cluster label
fig = plt.figure(1, figsize=(7, 7))
ax = fig.add_subplot(projection='3d')
ax.view_init(elev=48, azim=134)
ax.scatter(X.iloc[:, 0], X.iloc[:, 1], X.iloc[:, 2],
           c=labels.astype(float), edgecolor="k")
ax.set_xlabel("Beacon Percentage")
ax.set_ylabel("Total Count")
ax.set_zlabel("Unique")
plt.title("K Means", fontsize=14)
plt.show()


I was able to visualize the following clusters:
[screenshot: k-means cluster plot]

While the majority of the traffic looks normal, there are definitely a few outliers.  The biggest outlier based on the Beacon Percentage and Total Count is:
[screenshot: the outlier flow record]

There were 3865 flows, with 97% of them being the same byte count.  There were also 19 unique byte counts between these 2 IPs.
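As a follow-on sketch (not part of the original code, and assuming the exported csv also carries the src_ip and dest_ip columns), the minority cluster can be pulled out directly rather than read off the plot:

# With n_clusters=2, the smaller cluster holds the outliers
counts = pd.Series(labels).value_counts()
outlier_label = counts.idxmin()
outliers = df[labels == outlier_label]
print(outliers[['src_ip', 'dest_ip', 'bytes_out', 'i_bytecount', 't_bytecount', 'avgcount', 'distinct_byte_count']])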

Taking a quick look into the IP, we can assume this machine was compromised based on the command for the netcat relay (it will take more analysis to confirm):
[screenshot: process output showing the netcat relay command]

Obviously this is a quick look into a limited data set and needs more runtime to prove it out.  Though it does speak to exploring new ideas and new methods (or in this case, old ideas and new methods).  You never know what you may surface.

I'd also like to thank the Splunk team for making the data available to everyone.  If you would like to download it, you can find it here: https://www.splunk.com/en_us/blog/security/botsv3-dataset-released.html.