1 of 20

Adversarially Robust Assembly Language Model for Packed Executables Detection

CCS '25

2 of 20


Motivation

  • Malware loves packing 🡪 attackers use packing to conceal malicious code

  • With packing, malware can effectively bypass static analysis

3 of 20


What is packing

  • Packing
 🡪 Compress / encrypt instructions to conceal the malicious behavior
 🡪 Usually creates a new entry point, which jumps to an unpacking routine
 🡪 The unpacking routine restores the original code at runtime
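As a toy illustration (not any real packer), the sketch below shows the idea in Python: the original code bytes are turned into a high-entropy blob, and a small stub routine is what later restores them before execution. The byte string and the use of zlib as the compressor are assumptions for illustration only.

```python
import zlib

# Hypothetical original code bytes (stand-in for a program's .text section).
original_code = bytes.fromhex("5589e583ec10c745fc0100000089ec5dc3")

# "Packing": compress (or encrypt) the code; the result looks like opaque data.
packed_payload = zlib.compress(original_code)

def unpacking_stub(payload: bytes) -> bytes:
    """Stand-in for the unpacking routine a packer prepends:
    it restores the original code, which the real stub would then jump to."""
    return zlib.decompress(payload)

assert unpacking_stub(packed_payload) == original_code
```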

4 of 20


How to detect packing

  1. Signature-based
 - Find byte sequences, PE headers, and strings from well-known packers

  2. Entropy-based
 - For a more generic approach, use the entropy of the program

5 of 20


How to detect packing

  1. Signature-based
 - Find byte sequences, PE headers, and strings from well-known packers
 🡺 Experts must maintain the rules
 🡺 Easily bypassed by custom or new packers

  2. Entropy-based
 - For a more generic approach, use the entropy of the program
 🡺 Relies on an empirical threshold
 🡺 Easily bypassed by low-entropy adversarial attacks
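For context, a minimal sketch of the entropy heuristic: Shannon entropy is computed over a section's bytes and compared to a cutoff. The 7.0 threshold below is an illustrative assumption, not a value from the paper; this kind of empirical choice is exactly what low-entropy attacks exploit.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for constant data, up to 8.0 for uniformly random bytes)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())

def looks_packed(section: bytes, threshold: float = 7.0) -> bool:
    # Compressed/encrypted sections tend toward ~8 bits per byte. The cutoff is
    # an empirical guess, which is exactly what low-entropy packing bypasses.
    return shannon_entropy(section) >= threshold
```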

6 of 20


Pack-ALM idea

Idea from natural language

🡪 Humans can distinguish a “real word” from a “pseudo word”

🡪 Likewise, a language model over assembly can distinguish real instructions from pseudo instructions produced by disassembling packed or native data

7 of 20


Pack-ALM Architecture

  1. Preprocessing
 : Convert raw byte code into normalized assembly

  2. Pre-training
 : Pre-train with a non-packed (benign) dataset

  3. Fine-tuning
 : Fine-tune with packed programs and classify each token into three classes (instruction, native data, packed data)

8 of 20

8

Pack-ALM Architecture

  1. Preprocessing
 : Convert raw byte code into normalized assembly

  2. Pre-training
 : Pre-train with a non-packed (benign) dataset

  3. Fine-tuning
 : Fine-tune with packed programs and classify each token into three classes (instruction, native data, packed data)

9 of 20


Step 1 - Preprocessing

  1. Linearly disassemble the byte code

  2. Normalization
 : To prevent overfitting, normalize instruction operands (into mutually exclusive categories)

For example,
 mov eax, 0x1234 🡺 mov eax, [const]
 jmp 0x12345112  # invalid address 🡺 jmp [mem_abnormal]
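A rough sketch of what such normalization might look like, assuming a simple regex-based operand rewrite. The placeholder tokens ([const], [mem], [mem_abnormal]) follow the slide's example; the image-range check and the mnemonic list are hypothetical stand-ins for the paper's actual rules.

```python
import re

# Hypothetical image bounds used to decide whether a jump target is plausible;
# a real tool would take these from the PE/ELF section layout.
IMAGE_START, IMAGE_END = 0x400000, 0x500000

HEX_IMM = re.compile(r"0x[0-9a-fA-F]+")

def normalize(insn: str) -> str:
    """Rewrite concrete operands into mutually exclusive placeholder tokens."""
    mnemonic, _, operands = insn.partition(" ")

    def repl(m: re.Match) -> str:
        value = int(m.group(0), 16)
        if mnemonic in ("jmp", "call", "je", "jne"):   # control flow: check target validity
            return "[mem]" if IMAGE_START <= value < IMAGE_END else "[mem_abnormal]"
        return "[const]"                               # data operand: abstract the literal

    return f"{mnemonic} {HEX_IMM.sub(repl, operands)}".strip()

print(normalize("mov eax, 0x1234"))    # mov eax, [const]
print(normalize("jmp 0x12345112"))     # jmp [mem_abnormal]
```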

10 of 20


Step 2 - pre-training

  • Use a masked language model (RoBERTa) for pre-training
 🡺 the model learns structural relationships between instructions (see the sketch below)

  • Use cross-entropy loss
  • Dataset: non-packed programs
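A minimal sketch of this pre-training step, assuming a HuggingFace-style RoBERTa. The roberta-base tokenizer, masking rate, and model sizes are illustrative stand-ins, not the paper's settings; only the GeLU activation and the 512-token input come from the slides.

```python
import torch
from transformers import RobertaConfig, RobertaForMaskedLM, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")   # stand-in vocabulary

config = RobertaConfig(
    vocab_size=tokenizer.vocab_size,
    max_position_embeddings=514,      # 512 input tokens + special positions
    hidden_act="gelu",                # GeLU, as in the setup slide
)
model = RobertaForMaskedLM(config)

asm = "push ebp mov ebp, esp mov eax, [const] pop ebp ret"
enc = tokenizer(asm, return_tensors="pt", truncation=True, max_length=512)

labels = enc["input_ids"].clone()
mask = torch.rand(labels.shape) < 0.15                 # mask ~15% of tokens
enc["input_ids"][mask] = tokenizer.mask_token_id       # (a real pipeline would skip special tokens)
labels[~mask] = -100                                   # cross-entropy only on masked positions

out = model(**enc, labels=labels)
out.loss.backward()
```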

11 of 20


Step 3 – Fine-tuning

  • Fine-tune the model with packed programs
 : Use a softmax layer to classify each token

 - Real Instruction: an original instruction
 - Pseudo Instruction:
   - Packed Data: data produced by the packer
   - Native Data: original data that was wrongly disassembled as code
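A minimal sketch of this stage framed as token classification with three labels. The label names follow the slide; the roberta-base checkpoint stands in for the Step-2 pre-trained model, and the toy labels are illustrative.

```python
import torch
from transformers import RobertaForTokenClassification, RobertaTokenizerFast

LABELS = {0: "instruction", 1: "native_data", 2: "packed_data"}    # per the slide

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")   # stand-in vocabulary
# In practice, this would load the checkpoint produced by the pre-training step.
model = RobertaForTokenClassification.from_pretrained("roberta-base", num_labels=len(LABELS))

enc = tokenizer("mov eax, [const] add byte ptr [eax], al", return_tensors="pt")
gold = torch.zeros_like(enc["input_ids"])      # toy per-token labels (all "instruction")

out = model(**enc, labels=gold)                # softmax + cross-entropy over the 3 classes
pred = out.logits.argmax(dim=-1)               # predicted class per token
```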

12 of 20


Experiment setup

  • PyTorch 2.1.2
  • CUDA 12.1
  • CUDNN 8.8.1
  • GeLU activation
  • Input length: 512 tokens

  • Dataset
 : Pre-training 🡪 4,207 programs from SourceForge, Linux platform programs, and some benchmarks
 🡺 235,954,417 instructions in total

 : Fine-tuning 🡪 2,388 non-packed programs (not present in pre-training) and 1,990 packed programs
 🡺 Packed programs use ten packers, including a ‘low-entropy’ adversarial attack
 🡺 About 40M instructions

13 of 20


Evaluation

  • RQ1: How does Pack-ALM compare to the state-of-the-art models in packed data detection?

  • RQ2: Can Pack-ALM generalize to detect model-unseen or adversarial packers?

  • RQ3: How do Pack-ALM’s components contribute to the effectiveness?

  • RQ4: How effective is Pack-ALM in detecting non-packed and packed programs in real-world scenarios?

14 of 20


Evaluation

  • RQ1: How does Pack-ALM compare to the state-of-the-art models in packed data detection?

Task A: Is each token a real or a pseudo instruction?

Task B: Among pseudo instructions, is it native data or packed data?
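The two tasks can be read as a hierarchy over the three classes from fine-tuning; a tiny sketch, assuming the label ids used in the earlier fine-tuning sketch:

```python
def task_a(label: int) -> str:
    """Task A: is the token a real or a pseudo instruction?"""
    return "real" if label == 0 else "pseudo"

def task_b(label: int) -> str:
    """Task B: within pseudo instructions, native data or packed data?"""
    assert label in (1, 2), "Task B only applies to pseudo instructions"
    return "native" if label == 1 else "packed"
```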

15 of 20


Evaluation

  • RQ2: Can Pack-ALM generalize to detect model-unseen or adversarial packers?

16 of 20


Evaluation

  • RQ2: Can Pack-ALM generalize to detect model-unseen or adversarial packers?

17 of 20


Evaluation

  • RQ3: How do Pack-ALM’s components contribute to the effectiveness?

18 of 20


Evaluation

  • RQ4: How effective is Pack-ALM in detecting non-packed and packed programs in real-world scenarios?

19 of 20


My thoughts

Pros
 : Novel idea for detecting packing under adversarial attacks
 : Shows strong performance compared with state-of-the-art models
 : Considers the data-leakage problem and prevents it well

Cons

 : It would be better to use a more selective way of providing context
 : It would be better if Pack-ALM also classified which packer was used

20 of 20

Thank you