I'm always looking for different ways to look at data: things that will give me insight into how users act as they go about their daily activities. Having done this for a while now, I can say that people typically like routines, but how they go about their work is largely dictated by their job role. Examples might be the developer who often creates archive files or the salesperson who authenticates from multiple locations per day.
Operating under the assumption that people will generally look the same from day to day, it makes sense to understand why someone may deviate from their normal routine, especially in a security context. One way we can do this is to measure each user's own, unique behavior and apply some anomaly detection to highlight days of interest.
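To make the idea of measuring a user against their own baseline concrete, here is a minimal sketch. It is not the code from my notebook: the function name `flag_anomalous_days`, the z-score approach, and the threshold of 2.0 are all assumptions picked for illustration, and any per-user anomaly method could stand in for it.

```python
from statistics import mean, stdev

def flag_anomalous_days(daily_counts, threshold=2.0):
    """Return the days on which one user's count strays from that user's own baseline.

    daily_counts: dict mapping a day label to a count for a single user.
    threshold: z-score cutoff (an arbitrary illustrative value, tune for your data).
    """
    values = list(daily_counts.values())
    if len(values) < 2:
        return []  # not enough history to build a baseline
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # the user did exactly the same thing every day
    return [day for day, count in daily_counts.items()
            if abs(count - mu) / sigma > threshold]

# Hypothetical data: nine ordinary days and one very busy one.
history = {f"day{i}": c for i, c in enumerate([2, 3, 2, 3, 2, 3, 2, 3, 2, 40], start=1)}
print(flag_anomalous_days(history))
```

Because the baseline is built from the user's own history, a count that is normal for a developer can still stand out for someone in sales. One caveat with this simple version is that the unusual day is included in its own baseline, so with short histories a very high threshold may never trigger.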
For the example I'm going to show, I wanted to focus on insider threat from a data theft standpoint, but this method can be applied to many different scenarios as long as you can define the points of interest to observe. The hypothesis that I'm using for insider data theft is:
1. Users who knowingly steal data will often use deceptive actions.
2. They will perform actions that are new or rare for them.
3. They will use uncommon exfiltration paths.
4. Rare actions across multiple phases can be identified.
The phases (categories) I'm using here are deception, data staging, and exfiltration.
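The fourth point of the hypothesis, rare actions across multiple phases, can be sketched as follows. This is an illustration only: the `(user, day, phase)` tuple layout, the function names, and the `min_phases` cutoff are assumptions rather than what my notebook actually does.

```python
from collections import defaultdict

# Phase labels are hypothetical short names for the three categories above.
PHASES = ("deception", "staging", "exfiltration")

def phases_hit(rare_events):
    """Group rare events by (user, day) and collect the distinct phases seen.

    rare_events: iterable of (user, day, phase) tuples, one per rare action.
    """
    hits = defaultdict(set)
    for user, day, phase in rare_events:
        if phase in PHASES:
            hits[(user, day)].add(phase)
    return hits

def days_of_interest(rare_events, min_phases=2):
    """Return sorted (user, day) pairs with rare actions in at least min_phases phases."""
    return sorted(key for key, phases in phases_hit(rare_events).items()
                  if len(phases) >= min_phases)

# Hypothetical events: alice has rare actions in two phases on the same day.
events = [
    ("alice", "2024-01-02", "staging"),
    ("alice", "2024-01-02", "exfiltration"),
    ("bob", "2024-01-02", "staging"),
]
print(days_of_interest(events))
```

A single rare action is common and usually benign. Requiring rare actions in more than one phase on the same day is what narrows the list down to days worth a closer look.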
Below I'm pasting screenshots of the relevant portion of the Jupyter notebook I'm using. The cool thing about using Jupyter is that it's independent of the log source: I can use this as long as the data can be exported or accessed via an API. The code blocks below are commented to describe what each one does.
Archive_Count: The number of archives created by day