

Cross-Organizational Continual Learning of Cyber Threat Models

Chanel Cheng, Shanchieh Jay Yang

Introduction

  • Intrusion detection systems are developed to help detect cyber threats across networks.
  • Yet, cyber threats evolve over time and follow different patterns across organizations.
    • Continuous detection over changing data is difficult by traditional means [1].

Consider an incoming stream of network traffic from two different organizations:

  • Similar attack types have different patterns across organizations.
  • New attack types are also present in each organization.
  • The stream encounters both gradual and sudden changes in attack patterns.

Figure 1. Example data stream for task-agnostic continual learning on network traffic flows from CIC-IDS-2018 [8] and USB-IDS-2021 [9].

Related Works

PNNs / EWC / SI / iCaRL / GEM

    • PNNs [2] - construct new networks as novel tasks occur, resulting in a linearly increasing memory requirement.
    • EWC & SI [3,4] - extend the loss function with a term that consolidates selected network weights, but require explicit task boundaries.
    • iCaRL [5] - combines replay and distillation but still requires explicit task boundaries.
    • GEM [6] - builds optimization constraints using old data but is less effective across shifting domains.


Experience Replay (ER)

    • ER [7] eliminates the need for task boundaries and a test-time oracle, and enforces a constant memory footprint.
    • Potentially better suited to real-world scenarios with gradual and sudden shifts in data (a minimal update step is sketched below).
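A minimal sketch of one ER update step, written here in PyTorch; `model`, `optimizer`, and the batch variables are illustrative assumptions, not the authors' implementation:

    import torch
    import torch.nn.functional as F

    def er_update(model, optimizer, stream_x, stream_y, buffer_x, buffer_y):
        """One gradient step on incoming samples mixed with replayed ones.

        No task identity, boundary, or test-time oracle is used: the
        incoming mini-batch is simply concatenated with a mini-batch
        drawn from the fixed-size replay buffer.
        """
        x = torch.cat([stream_x, buffer_x])
        y = torch.cat([stream_y, buffer_y])
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
        return loss.item()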

Table 1. Side-by-side comparison of continual learning strategies [6].

Methodology

  • Two datasets were converted into a single data stream for continual learning without task boundaries (see the sketch below).
    • Benign and malicious traffic are present together in the stream, with benign traffic forming the majority.
    • The order and source of the data do not matter to our continual model when learning the attack types.
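One simple way to realize such a boundary-free stream is to interleave the two datasets by time; this sketch assumes each record is a (timestamp, features, label) tuple already in time order, and is an illustration rather than the authors' exact pipeline:

    import heapq

    def merge_streams(org_a, org_b):
        """Interleave two organizations' flow records into one stream.

        Assumes each dataset is an iterable of (timestamp, features, label)
        records already sorted by time; the merged stream preserves gradual
        and sudden shifts, and carries no task boundaries or source tags.
        """
        yield from heapq.merge(org_a, org_b, key=lambda record: record[0])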


  • Replay buffer of fixed size, with older samples replaced as new samples are selected into the buffer (replacement policy sketched below).
  • Only expert-labeled samples saved to the buffer are used to train and update the model.

Figure 2. Flow diagram for the continual learning strategy.
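The exact replacement policy is not spelled out on the poster; reservoir sampling is a common fit for a fixed-size, task-free buffer and is sketched here with hypothetical names:

    import random

    class ReplayBuffer:
        """Fixed-capacity buffer: memory footprint stays constant, and
        older samples are displaced as new expert-labeled ones arrive."""

        def __init__(self, capacity, seed=0):
            self.capacity = capacity
            self.data = []        # (features, expert_label) pairs
            self.seen = 0         # labeled samples offered so far
            self.rng = random.Random(seed)

        def offer(self, x, y):
            """Reservoir sampling: every offered sample gets an equal
            chance of being retained in the buffer."""
            self.seen += 1
            if len(self.data) < self.capacity:
                self.data.append((x, y))
            else:
                j = self.rng.randrange(self.seen)
                if j < self.capacity:
                    self.data[j] = (x, y)   # displace an older sample

        def sample(self, k):
            """Draw a replay mini-batch for the next model update."""
            return self.rng.sample(self.data, min(k, len(self.data)))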

Port service    Port #
DNS             53
http            80, 8080
https           443, 8443
wbt             3389
smb             445, 139, 137
ftp             20, 21
ssh             22
llmnr           5355
other           (unassigned port #'s)

Table 2. Aggregate port mapping and one-hot encoding: the most commonly used port numbers are mapped to their corresponding port services and one-hot encoded as features for the model to learn from.
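A minimal sketch of the Table 2 encoding, with the port-to-service mapping transcribed from the table; `encode_port` and the surrounding names are hypothetical:

    # Port-to-service mapping transcribed from Table 2; any port not
    # listed falls into the "other" bucket.
    PORT_SERVICES = {
        53: "DNS",
        80: "http", 8080: "http",
        443: "https", 8443: "https",
        3389: "wbt",
        445: "smb", 139: "smb", 137: "smb",
        20: "ftp", 21: "ftp",
        22: "ssh",
        5355: "llmnr",
    }
    SERVICES = ["DNS", "http", "https", "wbt", "smb", "ftp", "ssh", "llmnr", "other"]

    def encode_port(port):
        """Map a port number to its aggregate service and one-hot encode it."""
        service = PORT_SERVICES.get(port, "other")
        return [1 if s == service else 0 for s in SERVICES]

    # e.g., encode_port(8080) -> [0, 1, 0, 0, 0, 0, 0, 0, 0]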

Experiment & Results

  • By the 10th iteration, no more samples need to be labeled, while most of the F1-scores reach above 0.9, except DoS-GoldenEye (∼0.88) and DoS-Slowloris (∼0.8), both of which have a much smaller sample size.
  • This learning strategy greatly reduces the amount of labeled data needed and quickly reaches good prediction performance.

Figure 3. (a) Average ratio of samples saved to the buffer and (b) F1-score for network traffic classification as new DoS classes were introduced over 14 iterations.

Observations & Future Works

Future Works: 1) Expand the experiment to include more attack types, 2) further investigate the class imbalance issue, and 3) optimize the sampling strategy for the replay buffer.

References

[1] D. Silver, Q. Yang, and L. Li. 2013. Lifelong machine learning systems: Beyond learning algorithms. In 2013 AAAI Spring Symposium Series.

[2] A. Rusu, N. Rabinowitz, G. Desjardins, H. Soyer, J. Kirkpatrick, K. Kavukcuoglu, R. Pascanu, and R. Hadsell. 2016. Progressive neural networks. arXiv preprint arXiv:1606.04671 (2016).

[3] J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, et al. 2017. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences 114, 13 (2017), 3521–3526.

[4] F. Zenke, B. Poole, and S. Ganguli. 2017. Continual learning through synaptic intelligence. In International Conference on Machine Learning. PMLR, 3987–3995.

[5] S.-A. Rebuffi, A. Kolesnikov, G. Sperl, and C. Lampert. 2017. iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2001–2010.

[6] D. Lopez-Paz and M. Ranzato. 2017. Gradient episodic memory for continual learning. Advances in Neural Information Processing Systems 30 (2017).

[7] P. Buzzega, M. Boschini, A. Porrello, D. Abati, and S. Calderara. 2020. Dark experience for general continual learning: a strong, simple baseline. Advances in Neural Information Processing Systems 33 (2020), 15920–15930.

[8] “CIC-IDS-2018 on AWS.” https://www.unb.ca/cic/datasets/ids-2018.html (accessed Jul. 18, 2022).

[9] “USB-IDS-1.” http://idsdata.ding.unisannio.it/datasets.html (accessed Jul. 18, 2022).