Detection Engineering for SOC
Sunil Kumar BV | Sr Security Engineer | Rakuten
“Everything published, presented, or discussed in this conference talk is solely my personal point of view
and does not represent my current or past employers.”
Next 40 Mins!
SOC Detections Concept
IDS and WAF Workflow example
Stage 1: Parse the HTTP(S) packet from the client
(HTTP request and response logs)
Stage 2: Choose a rule set depending on the type of incoming parameters
Packet decode, HTTP fields
Stage 3: Normalize data
Packet grouping
Stage 4: Apply detection logic
Regular-expression based (signatures/rules/patterns)
Stage 5: Make the detection decision
An alert/offence is triggered based on true/false/score
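The five stages above can be sketched in a few lines of Python. This is only an illustration: the `RULES` table, its single pattern, and all function names are assumptions made for the example, not the internals of any real IDS/WAF.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical rule sets keyed by field name (illustrative pattern only).
RULES = {
    "http-req-uri": [("sqli-union", re.compile(r"union\s+select"))],
}

def parse(raw: str) -> dict:
    """Stage 1: split a raw HTTP request into a few named fields."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, uri, _version = request_line.split(" ", 2)
    fields = {"http-req-method": method, "http-req-uri": uri}
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.lower() == "user-agent":
            fields["http-req-user-agent-header"] = value.strip()
    return fields

def select_rules(fields: dict) -> dict:
    """Stage 2: pick only the rule sets that apply to the fields present."""
    return {name: RULES[name] for name in fields if name in RULES}

def normalize(value: str) -> str:
    """Stage 3: URL-decode and lowercase so encodings cannot hide a match."""
    return unquote_plus(value).lower()

def detect(raw: str) -> dict:
    """Stages 4 and 5: apply the patterns and return a decision."""
    fields = parse(raw)
    hits = []
    for field, rules in select_rules(fields).items():
        value = normalize(fields[field])
        hits += [rule_name for rule_name, pattern in rules if pattern.search(value)]
    return {"alert": bool(hits), "matched": hits}
```

Usage: `detect("GET /item?id=1%20UNION%20SELECT%20pw HTTP/1.1\r\nHost: example.test\r\n\r\n")` raises an alert because normalization decodes `%20` before the pattern is applied.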
HTTP Header (under L7 data)
Layer 7 (or the application layer) is the highest layer in the OSI model of network communication. It's responsible for providing network services to application processes running on a host like web browsers, email clients and file-sharing programs.
Most user-facing protocols and applications like HTTP, FTP and SMTP operate on layer 7.
These are not the only fields: IDS/WAF products expose many more,
and vendors apply a number of different approaches (scoring, etc.).
HTTP Related Fields | Files related | Email Related | TCP/UDP DOS related |
http-req-cookie | file-data | pop3-req-protocol-payload | tcp-context-free |
http-req-headers | file-elf-body | pop3-rsp-protocol-payload | udp-context-free |
http-req-message-body | file-flv-body | imap-req-cmd-line | |
http-req-host-ipv4-address-found | file-html-body | imap-req-first-param | unknown-req-tcp-payload |
http-req-host-ipv6-address-found | file-java-body | imap-req-params-after-first-param | unknown-rsp-tcp-payload |
http-req-host-header | file-mov-body | imap-req-protocol-payload | |
http-req-mime-form-data | file-office-content | imap-rsp-protocol-payload | unknown-req-udp-payload |
http-req-ms-subdomain | file-pdf-body | email-headers | unknown-rsp-udp-payload |
http-req-origin-headers | file-riff-body | | |
http-req-params | file-swf-body | | |
http-req-uri | file-tiff-body | | |
http-req-uri-path | file-unknown-body | | |
http-req-user-agent-header | ftp-req-params | | |
http-rsp-headers | ftp-req-protocol-payload | | |
http-rsp-non-2xx-r | ftp-rsp-protocol-payload | | |
http-rsp-reason | ftp-rsp-banner | | |
http-req-method | ftp-rsp-message | | |
Regular expression…
…is a sequence of characters that defines a search pattern.
Most regexes in IDS/WAF products are written as signature sets for injection attacks such as SQLi, LDAP, header, code, and OS command injection, and XSS (cross-site scripting) (Nuclei scan), etc.
Attackers can find ways to bypass an IDS/WAF by exploiting bugs or "weak places" in its regular expressions:
https://github.com/attackercan/regexp-security-cheatsheet and https://www.slideshare.net/slideshow/lie-tomephd2013/21958607#35
Mitigation: Precompile regex patterns where possible to improve performance
Use a dynamic analysis tool (DAT) to check and fuzz the regex patterns you create. This helps verify input validation, input limits, regex timeouts, and resource limits.
Ex: https://redosdetector.com/
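A minimal Python sketch of the mitigations above: compile once, cap input length, and time the pattern against fuzz probes. `MAX_INPUT_LEN`, the sample pattern, and the function names are assumptions for illustration. Python's standard `re` module has no built-in timeout, so this sketch measures time rather than enforcing a limit.

```python
import re
import time

MAX_INPUT_LEN = 2048  # assumed cap; tune per field

# Compiled once at import time instead of on every request.
SQLI_UNION = re.compile(r"union\s+select", re.IGNORECASE)

def check(pattern: re.Pattern, text: str, max_len: int = MAX_INPUT_LEN) -> str:
    """Return 'oversize', 'match', or 'no-match' for one field value."""
    if len(text) > max_len:
        # Report oversized input as its own verdict rather than skipping it.
        return "oversize"
    return "match" if pattern.search(text) else "no-match"

def worst_case_seconds(pattern: re.Pattern, probes: list) -> float:
    """Run the pattern over fuzz probes and return the slowest search time."""
    worst = 0.0
    for probe in probes:
        start = time.perf_counter()
        pattern.search(probe)
        worst = max(worst, time.perf_counter() - start)
    return worst
```

A pattern with nested quantifiers, such as `(a+)+$`, would show sharply growing times on probes like a long run of `a` followed by `!`, which is the kind of weakness a ReDoS checker looks for.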
How can these be used in our Environment:
- Understanding what to detect?
- Understanding how to detect?
- Peer review (continuous testing and Validation)
- Submit detection into the pipeline (towards Production)
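The peer-review and validation step can be sketched as a small harness that scores a candidate detection against labelled samples before it enters the pipeline. The detector, the samples, and the zero-miss gate below are hypothetical and only show the shape of such a check.

```python
import re

# Hypothetical candidate detection: a naive script-tag signature.
XSS_SCRIPT_TAG = re.compile(r"<script\b", re.IGNORECASE)

def candidate(text: str) -> bool:
    return XSS_SCRIPT_TAG.search(text) is not None

def evaluate(detector, samples) -> dict:
    """Count true/false positives and negatives over (text, is_malicious) pairs."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for text, malicious in samples:
        if detector(text):
            counts["tp" if malicious else "fp"] += 1
        else:
            counts["fn" if malicious else "tn"] += 1
    return counts

def ready_for_pipeline(counts: dict) -> bool:
    """Assumed gate: no misses and no false positives on the review set."""
    return counts["fn"] == 0 and counts["fp"] == 0

# Hand-written review set (illustrative only).
SAMPLES = [
    ("<script>alert(1)</script>", True),
    ("<SCRIPT src=x>", True),
    ("<img src=x onerror=alert(1)>", True),   # missed by the naive signature
    ("hello world", False),
    ("a description of a script tag", False),
]
```

Here `evaluate(candidate, SAMPLES)` reports one false negative, so the rule goes back for another round of review instead of being submitted.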
Example of writing these patterns/signatures for detection:
Details of a recent attack class: https://blog.orange.tw/posts/2024-08-confusion-attacks-en/
Confusion Attacks:
Exploiting Hidden Semantic Ambiguity in Apache HTTP Server!
CVE-2023-38709 - Apache HTTP Server: HTTP response splitting
https://bugzilla.redhat.com/show_bug.cgi?id=2273491
http-req-cookie contains "\r\nContent-Length:" <case-insensitive> AND http-req-URL contains "cURL" <case-insensitive>
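A minimal Python sketch of the rule above, assuming each event arrives as a dict keyed by the field names used on this slide (`http-req-cookie`, `http-req-URL`). The function name is hypothetical.

```python
def matches_response_splitting_rule(fields: dict) -> bool:
    """Case-insensitive version of the two 'contains' checks, joined with AND."""
    cookie = fields.get("http-req-cookie", "").lower()
    url = fields.get("http-req-URL", "").lower()
    return "\r\ncontent-length:" in cookie and "curl" in url
```

Lowercasing both sides is one simple way to express `<case-insensitive>`. Missing fields are treated as empty strings, so an event without a cookie never matches.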
Log4Shell (CVE-2021-44228) is an RCE vulnerability in Apache Log4j 2.0 through 2.14.1.
It is exploited by submitting an exploit string in HTTP headers sent to a vulnerable server. The server then requests a malicious payload from an attacker-controlled server through the Java Naming and Directory Interface (JNDI) over one of several services, such as Lightweight Directory Access Protocol (LDAP).
POC: https://www.trendmicro.com/ja_jp/devops/22/a/detect-log4j-vulnerabilities.html
For threat classification, see:
http://projects.webappsec.org/w/page/13246978/Threat%20Classification
http-req-header == "($?) $JNDI:" OR http-req-header == "($?) $JNDI:LDAP" OR http-req-header contains "($?) $JNDI:LDAP" AND "\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b"
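A minimal Python sketch of the header check above. It assumes the literal `${jndi:` lookup prefix as the string to match and reuses the domain pattern from the rule. The pattern and function names are hypothetical, and obfuscated variants such as nested `${lower:...}` lookups are not covered.

```python
import re

# Assumed signatures: the plain lookup prefix, and the LDAP form of it.
JNDI_LOOKUP = re.compile(r"\$\{jndi:", re.IGNORECASE)
JNDI_LDAP = re.compile(r"\$\{jndi:ldaps?://", re.IGNORECASE)
# Domain-like token, as used in the rule on the slide.
DOMAIN = re.compile(r"\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b")

def classify_header(value: str):
    """Return a label for a suspicious header value, or None if it looks clean."""
    if JNDI_LDAP.search(value) and DOMAIN.search(value):
        return "jndi-ldap-with-domain"
    if JNDI_LOOKUP.search(value):
        return "jndi-lookup"
    return None
```

Checking the more specific LDAP-plus-domain case first lets the alert carry a higher-confidence label, while the bare lookup prefix still triggers a lower-confidence one.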
Detection Logic in EDR
- IOCs change easily.
- We should concentrate on Tactics, Techniques and Procedures (TTPs)
TTPs describe how the adversary goes about accomplishing their mission, from reconnaissance all the way through data exfiltration and at every step in between.
What exactly to look at:
Detection Focus (based on Kill chain)
Reconnaissance: attackers scan the environment. At most we can block an IP or segment, but they can change it quickly before the attack.
Weaponization: we cannot catch attackers here, as they build their payload in their own environment, where we have no access or logs.
Delivery/Exploitation: these stages are addressed by vendor products such as firewalls and email gateways.
Installation and C2: this is where detection engineering applies. Once files (macros, PowerShell, etc.) arrive from any process or object, we can check for suspicious activity in the environment.
Action on Objectives: fine-tune the detections/rule set based on organizational requirements to reduce false-positive (FP) fatigue.
Impact: the goal is to catch the adversary before potential impact.
Data collections:
- Type of Data collected?
- Where is it stored?
- Is it ingested to SIEM, EDR or not?
- Prioritizing data sources based on expenses.
- Gap analysis on Data sources and ingested data.
MITRE Framework
- Then look for particular items under the selection controls: threat group, data sources
- Select unannotated (technique/task not applicable) and
- Then toggle the state and hide (eyeball) the rest.
Adversaries leverage scripts
OS binaries are native to their operating system, but cybercriminals and crime groups have used and exploited them to camouflage malicious activity.
We can look at the catalogued categories of OS binaries:
2. macOS built-in binaries - Living Off the Orchard - https://www.loobins.io/binaries/ PyLOOBins is a Python SDK and command-line utility for programmatically interacting with LOOBins - https://www.loobins.io/docs/api/pyloobins/
ATT&CK technique count per data source
We can see that Command Execution and Process Creation are the data sources that cover the most techniques.
DeTT&CT - a framework for mapping data sources and detections to MITRE ATT&CK
Based on this, we can prioritize our data collection to cover ATT&CK techniques and sub-techniques, and collect Sysmon, Linux, server, and other logs accordingly.
EDR Fields
These are some of the most used data fields associated with events. Fields that begin with lowercase letters are present in all events.
The agents that collect the data provide many more fields than are listed here.
timestamp | FilePath |
_time | GrandParentImageFilePath |
HostID | ParentImageFilePath |
event_platform | FileWrittenFlags |
event_Name | |
ComputerName | DetectId |
 | DetectName |
OriginalFilename | DetectDescription |
FileName | |
ImageFileName | CommandLine |
GrandParentFileName | ParentCommandLine |
BaseFileName | GrandparentCommandLine |
TargetFileName | |
ContextBaseFileName | RegType |
 | RegistryPath |
SHA256HashData | RegNumericValue |
DomainName | RegStringValue |
HostURL | RegBinaryValue |
Ex: Detecting malicious PowerShell command execution
PowerShell is a built-in command-line tool that can download and execute code from another system, and it provides unprecedented access on Windows computers.
Its malicious use is often not stopped or detected by traditional endpoint defenses, as files and commands are not written to disk. This means fewer artifacts to recover for forensic analysis.
Several offensive tools are built on or use PowerShell, including Invoke-Mimikatz.
POC: https://book.hacktricks.xyz/windows-hardening/basic-powershell-for-pentesters
- Further tuning is usually needed based on which process executed the command (the parent process) and whether that parent is legitimate or unknown.
Condition 1: ((CommandLine contains "powershell.exe -exec" AND CommandLine contains "bypass") AND (CommandLine contains "IEX (" OR CommandLine contains "Invoke-Expression") AND
CommandLine contains ".DownloadString" AND CommandLine contains "\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b" AND reffererURL contains "\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b")
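A minimal Python sketch of Condition 1, assuming each EDR event is a dict that uses the field names from the slide (`CommandLine`, `reffererURL`). The function name is hypothetical, and matching is done case-insensitively as an added assumption.

```python
import re

# Domain-like token, as used in Condition 1.
DOMAIN = re.compile(r"\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b")

def is_suspicious_powershell(event: dict) -> bool:
    """True when every clause of Condition 1 holds for the event."""
    cmd = event.get("CommandLine", "")
    low = cmd.lower()
    referrer = event.get("reffererURL", "")
    return (
        "powershell.exe -exec" in low
        and "bypass" in low
        and ("iex (" in low or "invoke-expression" in low)
        and ".downloadstring" in low
        and DOMAIN.search(cmd) is not None
        and DOMAIN.search(referrer) is not None
    )
```

One tuning note from this sketch: the domain pattern also matches tokens such as `powershell.exe`, so the `CommandLine` domain clause is always true once the first clause holds. A stricter pattern, for example one anchored on `http://` or `https://`, may be worth considering.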
Processing Directions:
Step 1: Collect commands/scripts and their variable details:
Step 2:
In the same way, we can cover other commands and processes, such as:
Fine-tuning practices
- Continuous improvement and development
  Use version control to keep and monitor changes.
- Inefficient alerts need to be fixed
  Update the logic
  Add new functionality such as enrichment or correlation
References:
Any questions?