Foreword
Many Information and Cyber Security professionals have overlapping responsibilities regardless of title. It's important to break down the one skill that every technical security professional should have, which is the art of High Impact Security Analysis and Communication.
Intro: Improvise, Adapt, Overcome
"Once you know how to do something it's easy, but the moment something changes in how you do it, it becomes hard."
Think of the skills you have, including those you take for granted (e.g. walking, cooking, driving, communicating). Did you learn them overnight? Of course not! You spent time and energy mastering them in a way that was natural to you before they became second nature.
Now what happens when you fundamentally change your ability to perform these actions:
- Walking: Imagine you break your ankle and cannot walk, so you need to use crutches.
- Cooking: Imagine you have no power available or your stove/oven breaks.
- Driving: Imagine you jump in someone's manual vehicle when you have only driven an automatic car.
- Communicating: Imagine you need to resort to sign language or another form of communication because you lose your voice or are trying to communicate with someone in another language.
Whilst still possible, without some extra training or tools to get you started the task becomes far more difficult; the same applies to being a security analyst.
Defining an Analyst
The definition of an analyst is someone who performs analysis, and the definition of analysis according to the Cambridge dictionary is:
"The act of studying or examining something in detail, in order to discover or understand more about it, or your opinion and judgment after doing this"
Because security of a system, identity, or an organisation's information relates to everything that poses a risk to it, the idea of a security analyst is akin to that of a miracle worker, where the scope of what you are responsible for and trying to protect can be as small or as large as someone makes it.
With a seemingly endless amount of software, systems, and threats posed to these systems, it's no wonder that security analysts can experience burnout and not understand exactly what they need to know to be successful in their role. Some other roles with unique titles have spawned from this ambiguity, but at the end of the day they simply indicate a niche area in which someone is to focus their analysis.
Some examples are:
- Vulnerability Analyst
- Malware Analyst
- SOC Analyst
- EDR/MDR Analyst
- Intelligence Analyst
- Threat Analyst
- Forensic Analyst
- Information Security Analyst
So, what generally happens in these roles? Their required skill sets become blurred, and the roles generally wind up requiring similar forms of training and certifications.
On the Job Training and Certifications
Often when beginning in a new role you receive some on the job training that may teach you about workflows, tools available, and how to retrieve the information required to perform in your role. Sometimes a business may also put forward money to get you industry recognised certifications, and although this may teach you some technical skills and tools that can be used to solve certain problems it doesn't necessarily teach you how to perform analysis on something you have never seen before.
Coming back to the definition of analysis, an analyst's job is to study or examine something in detail to understand more about it and make a judgement on it. If you have seen something before then your unique experience means you will rapidly be able to make a decision and know what analysis needs to be done to come to an outcome; however, if you haven't seen it before then how do you perform effective analysis?
Fast, Good, and Cheap - Pick Any 2
- Image from: Pinterest - Matt Crawford
In project management there's a concept known as the iron triangle or project management triangle. A common derivative of this which is used in many businesses today is simply that you cannot have something that is fast, good, and cheap; everything comes with a trade-off. This same premise can easily be applied to working as a security analyst, but it comes with some caveats we'll touch on in just a second.
From here on out I'm going to use the example of a Security Analyst who works in a Security Operations Centre (SOC) or a managed EDR/MDR function and use these terms interchangeably. At the time of writing I have spent the majority of my career in Senior, Principal, and Manager roles in this space, and something I frequently get asked is "what skills do I need to excel in this field", so I figure this is a good opportunity to shed some light.
Imagine you're a business and pay for a new SOC analyst, let's call him Jack, and there's already a senior SOC analyst, let's call her Sally. Both analysts are working a shift together and see an alert come through from a server:
EDR Alert: Mimikatz was detected - A command commonly associated with Mimikatz was run on the system
Process Executable: C:\Users\Gavin\Music\1\mimikatz.exe
Process Command Line: C:\Users\Gavin\Music\1\mimikatz.exe privilege::debug
Signature: Unsigned
User: Gavin (SID: S-1-5-21-1001356378-1477238915-642007331-1000)
Logon Type: 10
Host Services: RDP (3389), HTTPS (8080)
In this instance Jack picks up the alert and begins to perform analysis on what he is seeing.
- He first takes a look at what the executable hash is on VirusTotal, but it hasn't been seen before.
- He begins to google Mimikatz, as this isn't something he is familiar with, and begins reading up on it.
- Whilst Jack is reading into this, Sally notices the alert and immediately isolates the server from other systems in the environment, locks Gavin's user account, and logs him off of the system.
- Sally begins working with Jack to explain what this is and write up a report on what they are seeing and further actions that can be taken by the impacted organisation.
Why did Sally take an action to lock the system down when Jack didn't? Fast, Good, and Cheap - Pick Any 2.
In this instance Sally had previous experience that allowed her to confidently take an action to respond Fast and Good, but her salary as a senior means that it won't be as cheap for the business to do so, and the aggressive response actions taken mean it may not be cheap for the impacted business given a server has now been taken offline. Inversely, Jack would have been more thorough and have had a good and cheap outcome, but it wouldn't have been fast, which would have been detrimental to the impacted business.
This analogy does break down in that many analysts don't come to a good outcome even after considerable time is put in, and as we know, time is money, so you can't have a good and cheap analyst because they would be slower and so fundamentally not cheap. As an analyst, though, we can have a good outcome for cheap; it just takes time to learn and get exposure to different technologies.
It's also unlikely that you'll have all the information required to make a decision in this single pane of glass, but for the purpose of simplification we're assuming all of these details are in the alert.
Time is not Experience, but to Have Experience You Need Time
"Time doesn't equal experience, but you can't have experience without time."
I've interviewed anywhere from 50-100 candidates in my career at the time of writing and have gone through hundreds of CVs. One thing that remains true is that the time spent in information security does not determine your experience in a given role; it's all about your exposure to various technologies, tools, and tradecraft. Not only this, but experience can come from your own time spent learning about a concept. No, seriously, this can make you a much better analyst than someone who has done the bare minimum in a working role for a year or 2.
The primary reason time can go in without a good outcome is because an analyst doesn't have enough foundational knowledge to build off of based on what they are looking at, or they haven't learnt how to perform the art of High Impact Security Analysis and Communication.
In the previous example with Jack and Sally there are multiple approaches Jack can now take to move forward:
- A: Jack could let imposter syndrome kick in for not knowing what he was looking at and sheepishly continue trying to work.
- B: Jack could make a note to read more into Mimikatz and understand it when he gets some free time after work.
- C: Jack could get a mentoring session from Sally to learn from her experience.
- D: Jack could read Sally's report sent to the impacted organisation to learn from it.
Only one of these options doesn't result in growth (Option A); the other 3 options all result in growth, and doing all of them often results in the most growth (but takes the most time). This is where, as analysts, regardless of the time we have available in a working day, if we want to grow then time needs to be spent on self-learning and understanding.
Taking hypotheticals away, the way I grew as an analyst was often to take approaches B and D. I would read, experiment, read some more, and write to confirm my understanding of a particular concept. As an analyst working in a fast-paced environment you will often not be granted the time to do this during work, so study in your free time becomes increasingly important.
Pattern Recognition and the Stool
In the above example Sally recognised a number of patterns she had seen before based on her unique knowledge of techniques known to be used by threat actors:
- Gavin was running an executable from the Music directory
- Gavin had a logon type of 10 which meant it was via Remote Desktop Protocol (RDP)
- A command known to be used by the Mimikatz executable, privilege::debug, was seen in the command line
- Mimikatz was the name of a tool publicly available that is used for stealing plaintext passwords from system memory
- The executable was unsigned so not verified by a trusted software vendor
- Threat actors often use Mimikatz to obtain passwords to move laterally
- The server was exposing RDP to the internet which is commonly brute forced by threat actors
- The account identifier for Gavin ended in 1000, which indicates it is likely a legitimate user in the domain rather than a newly created one, as user domain identifiers begin at 1000.
As an analyst we can use the premise of a stool: if there are 3 or more unique, clear-cut attributes (legs) to stand on, then this generally indicates something may be malicious based on deviations from what you see as normal.
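To make the stool premise concrete, here is a minimal sketch in Python that counts the "legs" present in the Mimikatz alert from earlier. The Alert class, the field names, and the individual checks are my own illustrative assumptions rather than any EDR product's schema, and a real environment would need checks tuned to what is normal there.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record; field names are illustrative only."""
    executable_path: str
    command_line: str
    signed: bool
    logon_type: int
    exposed_services: tuple


def suspicious_legs(alert):
    """Return the suspicious attributes ("legs") found in an alert."""
    legs = []
    # Executables rarely run legitimately from a user's Music directory.
    if "\\music\\" in alert.executable_path.lower():
        legs.append("executable in unusual directory")
    # privilege::debug is a command associated with Mimikatz.
    if "privilege::debug" in alert.command_line.lower():
        legs.append("command line associated with Mimikatz")
    if not alert.signed:
        legs.append("unsigned executable")
    # Logon type 10 is a Remote Desktop logon.
    if alert.logon_type == 10 and "RDP" in alert.exposed_services:
        legs.append("RDP logon on a host exposing RDP")
    return legs


alert = Alert(
    executable_path=r"C:\Users\Gavin\Music\1\mimikatz.exe",
    command_line=r"C:\Users\Gavin\Music\1\mimikatz.exe privilege::debug",
    signed=False,
    logon_type=10,
    exposed_services=("RDP", "HTTPS"),
)
legs = suspicious_legs(alert)
print(f"{len(legs)} legs found; stool stands: {len(legs) >= 3}")
```

The count is only a prompt for judgement. Three legs do not prove malice, they tell you the alert deserves a proper look.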
You could spend thousands of dollars gaining industry recognised certifications that teach you technical concepts, where to look for information on a system, what tools to use and so forth, but if you don't know what data you have available then you'll be paralysed trying to think of what to do with a detection or situation you're unfamiliar with.
Regardless of Recognition and Data, Context is Key
Every single decision and data point you have as an analyst should guide you towards establishing context. Too often analysts will see a detection they're unfamiliar with, and if they can't determine it to be malicious by looking at the hash of the executable or some other basic checks, then they'll close it out as a false positive simply because nothing says it is malicious. THIS IS NOT HOW WE PERFORM HIGH IMPACT SECURITY ANALYSIS.
High Impact Security Analysis is all about having enough foundational knowledge to be able to establish context in unfamiliar situations. Take for example another alert below, this time on a user workstation:
AV Alert: Malware:CS:Hueristic
Process Executable: C:\Windows\System32\svchost.exe
Process Command Line: C:\Windows\System32\svchost.exe -k UnistackSvcGroup -s WpnUserService
Signature: Signed (Microsoft)
User: Amanda (SID: S-1-5-21-1001356378-1477238915-642007331-500)
Logon Type: 2
Host Services: N/A
In this instance Sally picks up the alert and begins investigating it. She sees a user who is interactively logged onto a system with a logon type of 2, and there is an Antivirus alert which is pointing to an executable signed by Microsoft, svchost.exe. She also sees this is the local administrator account on the system by its SID.
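The SID and logon type reasoning can be sketched in a few lines of Python. The lookup tables below cover only the values that appear in this post plus a couple of other well-known ones, and the function names are my own. Treat it as a starting point rather than a complete reference.

```python
# Well-known relative identifiers (the last component of a SID).
WELL_KNOWN_RIDS = {
    500: "built-in Administrator",
    501: "built-in Guest",
}

# A small subset of Windows logon types.
LOGON_TYPES = {
    2: "Interactive",
    3: "Network",
    10: "RemoteInteractive (RDP)",
}


def rid_from_sid(sid):
    """Return the relative identifier (RID), the final part of a SID string."""
    parts = sid.split("-")
    if len(parts) < 4 or parts[0] != "S":
        raise ValueError(f"not a SID string: {sid!r}")
    return int(parts[-1])


def describe_account(sid):
    """Give a rough description of an account based on its RID."""
    rid = rid_from_sid(sid)
    if rid in WELL_KNOWN_RIDS:
        return WELL_KNOWN_RIDS[rid]
    if rid >= 1000:
        return "regular account (RID 1000 or above)"
    return "other built-in or reserved RID"


amanda = "S-1-5-21-1001356378-1477238915-642007331-500"
gavin = "S-1-5-21-1001356378-1477238915-642007331-1000"
print(describe_account(amanda), "/", LOGON_TYPES[2])
print(describe_account(gavin), "/", LOGON_TYPES[10])
```

Knowing that a RID of 500 is the built-in Administrator is exactly the kind of foundational detail that turns a string of numbers into context.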
- Sally looks at the executable hash on VirusTotal and finds it has been seen a lot of times with no AV vendors marking it as malicious; this looks to be a legitimately signed svchost.exe executable
- Sally looks back on their case management system and finds other analysts have been closing similar signals off as a false positive for a few days now
Antivirus products have been known for a long time to have some false positives, but Sally can't see exactly why the AV product has raised the alert, so Sally begins to think this could be a false positive based on the analysis of others; however, unlike many junior analysts before her, Sally is going to investigate further because contextually this is the only system they're seeing the alert on, and the Antivirus product is on many systems.
- Sally begins to formulate potential hypotheses on what she is looking at:
- A: The Antivirus product is causing a false positive
- B: The legitimate svchost process is in some way running malicious code
At this point Sally is in an unfamiliar situation; however, she gathers enough foundational knowledge through online searches to kickstart her analysis.
- First Sally knows that a process is made up of a number of threads, and any one of those threads could be performing malicious actions whilst the others perform legitimate actions
- Sally finds out that svchost.exe is an executable which runs service DLLs
- Sally finds out that the service specified in the command line (WpnUserService) loads a service DLL from the Windows Registry at HKLM\SYSTEM\CurrentControlSet\Services\WpnUserService\Parameters
- Sally finds the relevant value in this key: ServiceDll - REG_EXPAND_SZ - %SystemRoot%\System32\WpnUserService.dll
- Sally understands that system variables are indicated by % and that the default variable here means the DLL is located at: C:\Windows\System32\WpnUserService.dll
- Sally retrieves the DLL from disk and runs a hash search only to find it is known, signed, and a valid Microsoft DLL
- At this point Sally takes a look at active network connections on the system and finds that the svchost process has an active network connection to an IP address
- Examining the IP address on VirusTotal reveals it is an IP tied to VULTR, a provider of Virtual Private Servers which is commonly abused by threat actors
- Sally retrieves a memory dump of this process for analysis
- Looking at the strings in this memory dump and examining in WinDbg, Sally identifies the string MZARUH followed by what looks to be a PE file
- A search for this reveals that MZAR is the default magic bytes for a 64-bit Cobalt Strike beacon
- Suddenly the assumption begins to shift, and maybe Malware:CS:Hueristic indicates a likely Cobalt Strike beacon being identified in memory
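The byte pattern search Sally performed by eye can be sketched in Python. This is a simplified illustration that scans an in-memory bytes object; the function name and the synthetic sample data are my own, and a real memory dump would normally be read from disk in chunks with purpose-built tooling.

```python
def find_pattern_offsets(data, pattern):
    """Return every offset at which pattern occurs within data."""
    offsets = []
    start = 0
    while True:
        index = data.find(pattern, start)
        if index == -1:
            break
        offsets.append(index)
        # Move one byte forward so overlapping matches are also found.
        start = index + 1
    return offsets


# Synthetic stand-in for a process memory dump (not real beacon data).
sample_dump = b"\x00" * 16 + b"MZARUH" + b"\x90" * 8 + b"MZARUH"

for offset in find_pattern_offsets(sample_dump, b"MZARUH"):
    print(f"Pattern found at offset {offset:#x}")
```

A hit on its own is not a verdict. It is one more leg for the stool, to be weighed alongside the network connection and the AV alert.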
There's a lot that goes into the above, and without enough foundational knowledge to build upon it's unlikely Sally would have come to the correct conclusion as to why this AV product was raising an alert.
Even though a lot of analysis has been done at this stage to confirm Sally's possible hypothesis, the cause of the Cobalt Strike beacon in memory is still unknown and it's likely to have taken a fair amount of time to get to this point given Sally was essentially learning on the fly. When you consider that most analysts are aiming to resolve an alert in 30-60 minutes it's no wonder that many turn to the fallacy of believing if they can't confirm something is malicious that it must be a false positive rather than gathering context and validating findings.
If you fail to identify malice where malice is present then you at best didn't have the skills, time, tools, experience, or knowledge in identifying it, and at worst you didn't do your due diligence and were negligent. The Cambridge definition of incompetent is as follows:
"Lacking the skills or knowledge to do a job or perform an action correctly or to a satisfactory standard"
I don't know of any analyst that would like to be incompetent or negligent in their analysis, and yet when we only go as far as to do basic checks without context, we wind up falling into this bucket.
Even though analysis can become faster with experience, your process and technology are equally important in reducing the time spent on triage and analysis and in coming to the right outcome.
People, Process, Technology
Without people, process, and technology your security operations will suffer, and security analysts will have a hard time getting to the right decision when faced with unknown alerts or situations. In the above example Sally was able to use deductive reasoning and context to infer that the svchost process had been injected into and was running a Cobalt Strike beacon. What if there was more context, would this have changed the time required to get to the same outcome?
Take for example the below alert instead:
EDR Alert: Possible Cobalt Strike Bytes Reflectively Loaded in Memory
Process Executable: C:\Windows\System32\svchost.exe
Process Command Line: C:\Windows\System32\svchost.exe -k UnistackSvcGroup -s WpnUserService
Signature: Signed (Microsoft)
Source Executable Command Line: C:\Program Files\Intel\Chipset.exe
Signature: Unsigned
Source Executable Parent Command Line: C:\Windows\System32\svchost.exe -k netsvcs -p -s Schedule
User: Amanda
LogonType: 2
Host Services: N/A
Byte Pattern: MZARUH....<snip>
Suddenly Sally starts this investigation with far more context than she had previously. Not only does the EDR Alert contain more clarifying information (and ideally an investigation guide), but it also paints a picture of where she should begin her investigation. In the above example it's clear a process has been injected into and the source of this injection appears to be C:\Program Files\Intel\Chipset.exe which is unsigned. This seems to be run from a scheduled task as indicated by the Schedule service.
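Reading the service name out of a svchost command line is something you can script once and reuse. Below is a minimal sketch, assuming the simple "-k group -s service" layout seen in the alerts above and a path with no spaces; the function name is my own.

```python
def parse_svchost_args(command_line):
    """Extract the service group (-k) and service name (-s) from a svchost command line."""
    # Naive whitespace split: fine for svchost.exe in System32, not for paths with spaces.
    tokens = command_line.split()
    result = {"group": None, "service": None}
    for position, token in enumerate(tokens[:-1]):
        flag = token.lower()
        if flag == "-k":
            result["group"] = tokens[position + 1]
        elif flag == "-s":
            result["service"] = tokens[position + 1]
    return result


parent = parse_svchost_args(r"C:\Windows\System32\svchost.exe -k netsvcs -p -s Schedule")
target = parse_svchost_args(r"C:\Windows\System32\svchost.exe -k UnistackSvcGroup -s WpnUserService")

print(parent)
print(target)
if parent["service"] == "Schedule":
    print("Parent is the Task Scheduler service; review scheduled tasks on this host")
```

Here the parent's service name is what points Sally towards scheduled tasks as the likely launch mechanism for Chipset.exe.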
The sheer fact that the technology gives more context means that the analyst spends less time gathering the necessary context.
Now with all of the scenarios mentioned above, how could you possibly be expected to perform High Impact Security Analysis without having at least some knowledge of the following concepts, or being able to find it out quickly:
- Processes
- Threads
- Scheduled Tasks
- Command Line Arguments
- DLL and Executable Metadata
- Memory
- Logon Types
- SIDs/RIDs
- Reflective Loading and Injection
- Signed and Unsigned Executables
- RDP
- Mimikatz and LSASS
- Svchost and Service DLLs
This is where documenting enough of your investigation pays dividends over time. In the initial instance Sally had to do a lot of investigation; however, this is now knowledge she has which can be shared with others, and further to this the report she sent the impacted business can be examined in the future to understand what investigation Sally had performed based on her investigation notes. This creates a cycle of continuous improvement and knowledge sharing, but it takes time in the initial instance, after which it becomes much swifter if the same activity is seen again.
Without a process to follow it can be hard to perform enough analysis to determine if something is malicious or not, and even harder to know when to stop going down the analysis rabbit hole.
With the above, let's look at a generic process that would speed up analysis:
Step 1: Get context on the alert and form a hypothesis
An alert has gone off; how can you begin your investigation if you don't know what the alert is supposed to be for? Every professional who creates detections has an obligation to guide an analyst in what they should look for and why the detection exists.
In the first example there was only ambiguous information on why the detection was raised, and not enough context on what caused it to be raised. In the second example there was far more context given. Try to gather as much context as you can on why an alert may have been raised and what it is trying to detect, and form some different possible hypotheses on why the alert may have been raised. This doesn't need to be formally documented every step of the way like a research PhD would be; it's simply keeping in the back of your mind that a number of possible causes could be present.
Step 2: Have we seen this before?
You have a certain amount of data at your fingertips, so use it to see if you've seen anything similar before in your case management system by pivoting on an indicator. Anything is fair game here, but for some inspiration try searching for key terms based on:
- Alert Name/ID
- Command Line
- IP address
- User
- Signing Certificate
- Byte pattern
- DLL/Executable metadata
- Domain/URL
If you don't have the data then you need to acknowledge where your data limits are and work from what you have or can get (this is where having the ability to analyse different data sets is important). As you see more as an analyst and learn what is normal, you can begin to identify when something deviates from the normal and becomes a little bit unusual. This is where you're able to use pattern recognition to make swift decisions with high accuracy. At the end of the day multiple analysts looking at a single alert will present different viewpoints which may be enough to comprehensively infer what has happened on a system.
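Pivoting on indicators against past cases can be sketched as below. The case records, field names, and the IP address (taken from a documentation-only range) are all invented for illustration; your case management system will have its own schema and search API.

```python
def find_related_cases(cases, indicators):
    """Return (case_id, matched_fields) for past cases sharing any indicator value."""
    matches = []
    for case in cases:
        shared = sorted(
            field
            for field, value in indicators.items()
            # Skip empty indicators and compare case-insensitively.
            if value and str(case.get(field, "")).lower() == str(value).lower()
        )
        if shared:
            matches.append((case["case_id"], shared))
    return matches


# Hypothetical past cases, as they might be exported from a case management system.
past_cases = [
    {"case_id": "CASE-001", "alert_name": "Malware:CS:Hueristic",
     "user": "Amanda", "verdict": "false positive"},
    {"case_id": "CASE-002", "alert_name": "Mimikatz was detected",
     "user": "Gavin", "verdict": "malicious"},
]

current_indicators = {
    "alert_name": "Malware:CS:Hueristic",
    "user": "Amanda",
    "ip_address": "203.0.113.10",
}

related = find_related_cases(past_cases, current_indicators)
print(related)
```

A previous verdict is a data point rather than an answer. As the svchost example showed, earlier analysts closing something as a false positive does not make it one.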
Step 3: Get context on what is involved in the alert
If you've got no analysis that has taken place previously that helps infer what you're looking at, this is when you really need to get your hands dirty.
Starting with the data you have available you may wish to answer the following questions as a starting point:
- What user is involved, and how are they logged onto the system?
- Is the system a workstation or a server? Does it have any services or applications exposed to the internet that could have been exploited?
- What processes are involved? Are any of them suspicious? Are any of them known to be abused by other processes?
- What activity occurred in the lead-up to, and after the alert? Are there any unusual processes which ran?
- Look at processes that ran anywhere from 5 minutes to even days before the alert occurred if required. You should be able to identify what likely caused the alert, and what other actions have been taken.
- Do any of the executables or DLLs involved show suspicious characteristics?
- When it comes to Microsoft executables, Matt Graeber, a very smart professional in this field, posed a number of questions back in 2018 which are a very valuable read.
- When it comes to executables in suspicious locations, the premise of DLL Search Order Hijacking/Sideloading is very important.
- What persistence has been established on the system?
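Looking at what ran in the lead-up to and after an alert is essentially a time window filter. Here is a minimal sketch in Python; the event records and timestamps are invented for illustration, and real process telemetry would come from your EDR or event logs.

```python
from datetime import datetime, timedelta


def events_around(events, alert_time,
                  before=timedelta(minutes=5), after=timedelta(minutes=5)):
    """Return events inside the window around alert_time, oldest first."""
    start = alert_time - before
    end = alert_time + after
    window = [event for event in events if start <= event["time"] <= end]
    return sorted(window, key=lambda event: event["time"])


# Hypothetical process start events for one host.
alert_time = datetime(2024, 1, 10, 14, 30)
events = [
    {"time": datetime(2024, 1, 10, 14, 27), "process": r"C:\Program Files\Intel\Chipset.exe"},
    {"time": datetime(2024, 1, 10, 9, 0), "process": r"C:\Windows\explorer.exe"},
    {"time": datetime(2024, 1, 10, 14, 31), "process": r"C:\Windows\System32\svchost.exe"},
]

nearby = events_around(events, alert_time)
for event in nearby:
    print(event["time"].isoformat(), event["process"])

# Widen the window to a full day before the alert if 5 minutes shows nothing useful.
wider = events_around(events, alert_time, before=timedelta(days=1))
```

Starting narrow and widening the window keeps the first pass quick while still letting you go back days if the cause is not obvious.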
How are you supposed to do any of these things properly and swiftly without having broad knowledge on the type of alert you're dealing with, possible threats to systems, and the tools or data required to perform a level of analysis? You can't, and this is part of the reason it all comes back to Fast, Good, and Cheap - Pick Any 2, or in some places, you get what you pay for.
Step 4: Confirm Hypothesis and Take Action
What needs to be done to stop the activity and restore normal operations? If it's malicious then clear actions will need to be taken to respond, some of which may be to involve a dedicated DF/IR firm. If it's a false positive then working with the vendor who created the alert or adding exclusions may be required.
Where do I Stop?
I'm commonly asked by analysts how much analysis is enough, and when to stop. Quite simply it depends on your scope, what you're hoping to achieve from your analysis, and what outcomes you're going to deliver.
For example, in the above scenarios where Sally identified a Cobalt Strike beacon in memory, she could have continued her analysis. Using the byte pattern which has been matched, Sally may also be able to extract the configuration of the Cobalt Strike beacon using free, publicly available tools. This is important because it can not only give more context on what IP address it uses for command-and-control, but also provides a watermark (license) that can be used to track unique licenses assigned to Cobalt Strike, and even information about what process will be injected into as part of normal operations.
So, should she get this information or not? Once again, it depends. The command-and-control IP may be useful for blocking across an environment, finding other infections, or maybe the license key will come in handy to track future infections of Cobalt Strike and threat actors.
When in doubt, your priority is always Containment, Eradication, and Recovery, which is the standard process set out in NIST SP 800-61r2, the National Institute of Standards and Technology guide to handling computer security incidents.
In the above scenario containing the situation may involve:
- Isolating the system
- Taking memory dumps and killing the identified malicious processes in memory
- Identifying and removing persistence on the system
- Restarting the system to clear everything from memory
Eradication may involve:
- Removing all malware still present
- Disabling user accounts involved and changing credentials
- Identifying other compromised systems and identities and containing those entities
Recovery may involve:
- Confirming no malware is still present
- Restoring from backups
- Hardening systems and educating users to prevent the root cause from occurring again
- Installing patches
- Implementing security tools and firewalls etc
In short, for Sally to initially contain the situation, she doesn't need to extract the configuration of the Cobalt Strike beacon, and she already has an IP address the process connected back to. To properly eradicate and recover though, extracting the Cobalt Strike beacon configuration may be useful in pinpointing other infected systems, gaining threat intelligence, and identifying other C2 infrastructure she may have missed.
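Using a known command-and-control IP to pinpoint other infected systems can be sketched as a simple sweep over connection records. The hostnames, record layout, and IP addresses (from documentation-only ranges) below are invented for illustration.

```python
from collections import defaultdict


def hosts_contacting(connections, c2_ip):
    """Map each host that connected to c2_ip to the processes that did so."""
    hits = defaultdict(set)
    for connection in connections:
        if connection["remote_ip"] == c2_ip:
            hits[connection["host"]].add(connection["process"])
    return {host: sorted(processes) for host, processes in hits.items()}


# Hypothetical network connection records gathered across the environment.
connections = [
    {"host": "WS-01", "process": "svchost.exe", "remote_ip": "203.0.113.10"},
    {"host": "WS-02", "process": "chrome.exe", "remote_ip": "198.51.100.7"},
    {"host": "SRV-03", "process": "rundll32.exe", "remote_ip": "203.0.113.10"},
]

affected = hosts_contacting(connections, "203.0.113.10")
print(affected)
```

Every host in the result becomes a candidate for containment and its own round of analysis.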
Taking ownership of an alert and performing the best analysis you can from start to finish will always benefit you, other analysts, and any organisation impacted by the alert. The sad reality is that much of the information that was shown above may not be available without looking at forensic artifacts, taking actions on an endpoint, or consolidating OSINT and other information sources, and so the balance comes with how quickly you can use the data at your fingertips to gain the necessary context required.
Communication is Essential
Now that we know the art of High Impact Security Analysis, the most essential part of it comes into play: Communication.
It doesn't matter how smart and technical you are if you aren't able to convey risk or benefit to others. The problem is that in a managed SOC/MDR capability you often won't know who you're communicating with, or what their technical knowledge is. Some ways you may be able to get this context are to look them up on LinkedIn or talk with someone that knows them, but this isn't always practical or possible, and that's why your communication should be distilled down into 3 main sections:
- What happened
- What we did about it
- What you need to do about it or follow up on.
If you can break down all of your analysis into these sections, as actions that others can relate to, then you'll be able to adequately communicate to a wide range of stakeholders. Sometimes the individual involved may want more information, and that's okay because your notes associated with an investigation should help to convey the technical details of what analysis you have performed.
It's hard being an analyst as everything comes from experience or continuous education. I can tell you now that I've met a lot of very intelligent analysts in my time, and I've met a lot of lacking or inexperienced analysts in my time, but every single person had something unique to offer and had knowledge that others didn't. Because analysts often don't understand the value of High Impact Security Analysis, are unable to communicate their findings, or just don't undertake the due diligence required based on their skills and time available, this can often lead to an outside perception that being an analyst means you are inexperienced in the field and in an entry level role, which is simply not the case.
No matter your experience or role, you need to know when to be humble, and how to communicate in a way that's not assuming others don't know what they're talking about or invalidating their ideas. This in itself can be a challenging concept to come to terms with for many technical professionals, and something better reserved for its own blog.
A Drop in the Water
In the above scenarios we touched on only a couple of alert types that could come through and what they could indicate, but this is continuously evolving as threat actor tradecraft evolves or changes and you come across different threats during your analysis.
Some other areas that would be good to brush up on are the following:
- Web Shells (Not only what they are, but what technologies they can exist on e.g. .NET/ASPX, PHP etc)
- Tunnels (Not only what they are used for, but what common ones are used by threat actors and what protocols do they often tunnel?)
- Information Stealers (Not only what they are used for, but what extra actions should be taken if one were to successfully be run in an environment)
- Ransomware Threat Actors (Not only what ransomware there is in the wild, but what is commonly done BEFORE ransomware is run in an environment? How can you detect this?)
- DLL Sideloading/Search Order Hijacking (Not only what it is, but what it's commonly used for, how do you identify when it may be occurring?).
- Rogue RMM Tools (Not only what ones are commonly used, but what artifacts show when they were installed, who is taking an action on a system, and from what IP)
- MITRE ATT&CK Tactics (Not only what they mean, but what is commonly used during these phases? How would you spot them?)
- Commonly abused externally facing services (Not only what they are e.g. SQL, RDP, SMB, IIS/Web Applications etc, but how do you identify exploitation of these services? What artifacts do you have available?)
- Networking and System Internals (This is something often skipped over and can become incredibly complicated the more you know, but if you don't understand concepts such as packets and their types, processes, scripts, executables, threads, drivers, trust levels, event logs (ETW), the registry, LOLBAS, COM objects, WMI, AMSI, UAC, SIDs, DNS, IP addresses, ARP, file hashes, and signing certificates it will lead to gaps in your knowledge that need filling during analysis)
- Tools for accomplishing tasks (Not just what tools are available, but when to use them and what information they may provide)
The secret to being a well-rounded analyst is to try to gain a basic to intermediate understanding of as many technologies and systems as you can, and tie this back to known threat actor tradecraft.
In Summary
- Look Out: When you get an alert don't reinvent the wheel if it isn't required; use existing knowledge to respond swiftly.
- Look In: Gather context of what is involved in the alert and what activity has happened around that time to try and determine what may have caused it.
- Look Out: Using newly found intelligence and indicators look at other systems and binaries involved to try and get a feeling of the scope of the incident, any persistence, or otherwise.
- Look In: Document your learning outcomes and analysis process, this will help everyone in the long run
In closing, mistakes and malicious activity will be missed due to brief oversights, individual bias, individual experience, and competing priorities to be fast and meet SLAs/SLOs, so remain humble, help each other improve every single day, and continue to increase your skills as a security analyst.