
AI FOR CYBERSECURITY

Dr. Charles Kamhoua

Senior Electronics Engineer

Network Security Branch

UNCLASSIFIED

U.S. ARMY COMBAT CAPABILITIES DEVELOPMENT COMMAND – ARMY RESEARCH LABORATORY


CONTENTS

  • What is AI?

  • Introduction to Cybersecurity

  • Machine Learning

  • Adversarial Machine Learning

  • Future Research

  • Conclusion


WHAT IS AI?

  • Artificial intelligence (AI) is a wide-ranging branch of computer science concerned with building smart machines capable of performing tasks that typically require human intelligence.


APPLICATIONS


ADVERSARIAL MACHINE LEARNING

  • Adversarial machine learning is a technique that attempts to fool models by supplying deceptive input.

  • The most common reason is to cause a malfunction in a machine learning model.

  • It studies a class of attacks that aim to deteriorate the performance of classifiers on specific tasks.

  • Adversarial attacks are classified as (see the sketch after this list):

    • Poisoning attacks.
      • The attacker influences the training data.
    • Evasion attacks.
      • The attacker manipulates data during deployment to deceive a previously trained classifier.
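A minimal sketch of the two attack classes, assuming hypothetical toy data and a scikit-learn logistic-regression classifier (all names and numbers here are illustrative, not from the slides):

```python
# Sketch: label-flipping poisoning vs. test-time evasion.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

# Poisoning attack: the attacker flips a fraction of training labels.
poisoned = y_train.copy()
flip = rng.choice(len(poisoned), size=20, replace=False)
poisoned[flip] = 1 - poisoned[flip]
clf = LogisticRegression().fit(X_train, poisoned)

# Evasion attack: the attacker nudges a deployment-time input across
# the learned decision boundary (here, along the weight vector).
x = np.array([[0.2, 0.1]])
w = clf.coef_[0]
x_adv = x - 0.5 * w / np.linalg.norm(w)    # small step against the weights
print(clf.predict(x), clf.predict(x_adv))  # the two labels may now differ
```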


ADVERSARIAL MACHINE LEARNING

  • Evasion attacks are the most common and practical type of attack.

  • They are usually used in intrusion and malware scenarios.

  • Adversarial ML is not to be confused with GANs:

  • In a GAN, the adversary's goal is to make the model stronger! (a non-cooperative game).

  • In adversarial ML, the adversary's goal is to fool the model!


ADVERSARIAL MACHINE LEARNING

Adding a tiny amount of noise makes a huge difference to the neural network classifier.

This attack can be untargeted 🡪 misclassify the panda as any incorrect class.

In the case of a targeted attack 🡪 the goal is to classify the panda as another specific class!
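The panda example comes from the fast gradient sign method (FGSM). A minimal untargeted sketch in PyTorch, assuming a trained classifier `model` and a correctly classified input batch `(x, y)`:

```python
# Untargeted FGSM sketch (PyTorch): step *up* the gradient of the loss
# at the true label, so the input drifts away from the correct class.
import torch
import torch.nn.functional as F

def fgsm_untargeted(model, x, y, eps=0.007):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # loss at the true label
    loss.backward()
    # A tiny, near-imperceptible perturbation in the sign direction.
    return (x + eps * x.grad.sign()).detach()
```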


ADVERSARIAL MACHINE LEARNING

  • The goal of the adversary is to move the classifier away from the correct label!

  • In a targeted attack, the adversary moves the classifier away from the correct label and toward the targeted class/label.

  • We have two types of attacks:

          • White-box attack: the adversary has access to the model parameters.

          • Black-box attack: the adversary does not have access to the model parameters.
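For the targeted case, the same FGSM sketch flips the sign and uses the attacker-chosen label. Note that both variants need gradients, i.e. white-box access; a black-box attacker must estimate gradients from queries or rely on transferability:

```python
# Targeted FGSM sketch: pull the input toward a chosen target class.
import torch
import torch.nn.functional as F

def fgsm_targeted(model, x, y_target, eps=0.007):
    # White-box: computing x.grad requires access to model parameters.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y_target)  # loss at the *target* label
    loss.backward()
    # Step *down* the target-class loss, toward the attacker-chosen label.
    return (x - eps * x.grad.sign()).detach()
```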


GENERATIVE ADVERSARIAL NETWORKS

  • GANs can:
          • Generate new data, such as face images of people who do not exist (try: www.thispersondoesnotexist.com).
          • Restore new video from old recordings.

  • A GAN has two networks that work against each other (adversarial).

  • It generates data (generative) in a self-supervised learning setting.

  • Generative models can generate new data instances.
    • Given a set of data instances X and a set of labels Y:
      • Generative models capture the joint probability p(X, Y).
  • Discriminative models discriminate between different kinds of data instances.
    • Given a set of data instances X and a set of labels Y:
      • Discriminative models capture the conditional probability p(Y | X).
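A minimal illustration of the distinction on hypothetical data, assuming a recent scikit-learn: Gaussian Naive Bayes fits class-conditional densities (a generative model of p(X, Y) = p(X | Y) p(Y)) and can therefore synthesize instances, while logistic regression fits p(Y | X) directly and can only classify:

```python
# Generative vs. discriminative on the same (hypothetical) data.
import numpy as np
from sklearn.naive_bayes import GaussianNB            # models p(X | Y) p(Y)
from sklearn.linear_model import LogisticRegression   # models p(Y | X)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(+1, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

gen = GaussianNB().fit(X, y)
# The generative model can sample a brand-new class-1 instance:
x_new = rng.normal(gen.theta_[1], np.sqrt(gen.var_[1]))
disc = LogisticRegression().fit(X, y)   # can classify, but not generate
print(gen.predict([x_new]), disc.predict([x_new]))
```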


GAN’S STRUCTURE

  • Both the generator and the discriminator are neural networks.
  • The generator output is connected directly to the discriminator input.
  • Through backpropagation, the discriminator's classification provides a signal that the generator uses to update its weights.
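A minimal PyTorch sketch of this wiring, assuming toy 1-D data (the network sizes are arbitrary): the generator's output tensor feeds straight into the discriminator, and one generator step backpropagates the discriminator's judgment through both networks.

```python
# Minimal GAN wiring sketch (PyTorch): G's output feeds D's input, and
# G is trained from the gradient of D's classification signal.
import torch
import torch.nn as nn
import torch.nn.functional as F

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)

z = torch.randn(64, 8)               # random noise in
fake = G(z)                          # generator output ...
score = D(fake)                      # ... connected directly to D's input
# Generator loss: be judged "real" (label 1) by the discriminator.
loss_g = F.binary_cross_entropy_with_logits(score, torch.ones_like(score))
opt_g.zero_grad()
loss_g.backward()                    # signal flows back through D into G
opt_g.step()                         # only G's weights are updated here
```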


GAN SCENARIO

  • When training begins, the generator produces obviously fake data, and the discriminator quickly learns to tell that it's fake:


GAN SCENARIO

  • Finally, if generator training goes well, the discriminator gets worse at telling the difference between real and fake. It starts to classify fake data as real, and its accuracy decreases.

  • During discriminator training:
  • The discriminator classifies both real data and fake data from the generator.
  • The discriminator loss penalizes the discriminator for misclassifying a real instance as fake or a fake instance as real.
  • The discriminator updates its weights through backpropagation from the discriminator loss through the discriminator network.
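One discriminator step implementing exactly that loss, reusing the hypothetical G and D networks from the wiring sketch above; `real` stands in for a batch of real training data:

```python
# One discriminator training step for the sketch above.
import torch
import torch.nn.functional as F

opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
real = torch.randn(64, 1) + 3.0           # stand-in for real data
fake = G(torch.randn(64, 8)).detach()     # detach: update D only, not G

# Penalize D for calling real data fake or fake data real.
loss_d = (F.binary_cross_entropy_with_logits(D(real), torch.ones(64, 1)) +
          F.binary_cross_entropy_with_logits(D(fake), torch.zeros(64, 1)))
opt_d.zero_grad()
loss_d.backward()                         # backprop through D's network only
opt_d.step()
```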


BEYOND ML SECURITY

  • RL algorithms learn by interaction:

  • Reinforcement learning is learning what to do in a given situation (state).

  • “The learner is not told which actions to take, but instead must discover which actions yield the most reward by trying them.” —Sutton and Barto, Reinforcement Learning: An Introduction

  • Reinforcement learning is one of three broad categories of machine learning.

Source: Reinforcement Learning with MATLAB


REINFORCEMENT LEARNING

  • RL works with data from a dynamic environment.

  • The goal is not to cluster data or label data, but to find the best sequence of actions that will generate the optimal outcome.

  • The way reinforcement learning solves this problem is by allowing a piece of software called an agent to explore, interact with, and learn from the environment.
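A minimal tabular Q-learning sketch of that explore-interact-learn loop, on a hypothetical 5-state chain environment (the environment, rewards, and hyperparameters are assumed for illustration, not from the slides):

```python
# Tabular Q-learning sketch: an agent explores a tiny 5-state chain,
# moving left/right, rewarded only for reaching the last state.
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:
        if rng.random() < eps or not Q[s].any():
            a = int(rng.integers(n_actions))   # explore / break ties randomly
        else:
            a = int(Q[s].argmax())             # exploit what was learned
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q update: move toward reward plus discounted best future value.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q[:-1].argmax(axis=1))  # learned policy: go right in every state
```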

Source: Reinforcement Learning with MATLAB


FROM RL TO GAME THEORY

  • When the environment is under attack, there is a new agent (the attacker) acting with an adversarial goal: to minimize the reward of the RL agent.

  • This is formulated as a dynamic game between the two players.
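A minimal sketch of the simplest such interaction, a one-shot zero-sum matrix game with hypothetical payoffs (the full problem is a dynamic game played over many stages):

```python
# Maximin sketch for a zero-sum defender-vs-attacker matrix game.
# Rows: defender actions; columns: attacker actions (payoffs invented).
import numpy as np

R = np.array([[3.0, -1.0],     # defender reward for each action pair
              [0.0,  2.0]])

# The attacker minimizes the defender's reward; the defender maximizes
# the worst case over attacker responses (pure-strategy maximin).
worst_case = R.min(axis=1)            # attacker's best reply to each row
best_row = int(worst_case.argmax())
print("defender plays row", best_row, "guaranteeing", worst_case[best_row])
```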

Source: Reinforcement Learning with MATLAB


GAME THEORY FOR CYBERSECURITY

  • Recently, algorithmic game theory has been extensively used to study security problems in both the cyber and physical domains.

  • The application of game theory in cybersecurity can generally be divided into two categories:
        • Cyber attack-defense analysis.
          • Predicts the actions of cyber attackers by modeling attack and defense behaviors as games.
        • Cybersecurity assessment.
          • The analysis of the equilibrium of the cyber attack-defense game, and the prediction of attack and defense strategies, can also serve as the basis of cybersecurity and reliability assessment.


GAME THEORY FOR CYBERSECURITY

Yuan Wang, Yongjun Wang, Jing Liu, Zhijian Huang, and Peidai Xie, "A Survey of Game Theoretic Methods for Cyber Security," 2016.


MORE-GENERAL MODEL POSG

  • Partially observable stochastic games generalize stochastic games by introducing imperfect information.
  • Each player observes only its own private observations and recalls the actions it has taken.
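In one common formalization (standard notation, assumed here rather than quoted from the slide), a POSG is the tuple:

```latex
% A partially observable stochastic game (standard formalization):
% N: players;  S: states;  A_i: actions of player i;
% O_i: private observations of player i;
% T(s' \mid s, a): stochastic state transitions;
% Z_i(o_i \mid s', a): observation function;  R_i(s, a): reward of player i.
\[
G = \bigl\langle N,\ S,\ \{A_i\}_{i \in N},\ \{O_i\}_{i \in N},\ T,\ \{Z_i\}_{i \in N},\ \{R_i\}_{i \in N} \bigr\rangle
\]
% Each player i sees only its own o_i \in O_i and its own action history.
```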


INTRUSION DETECTION USING ML

  • IDSs have been in use for a number of years.
  • Their objective is to scan network traffic and identify any malicious activities or threats in real time.

  • As a security system, like a firewall, an IDS has the security purpose of protecting confidentiality, integrity, and availability, which are the main targets a potential attacker tries to break.

  • An IDS uses two kinds of attack detection methods:
        • Anomaly-based detection
          • Compares behavioral changes of the system in order to discover abnormal actions or activities.
        • Signature-based detection (or misuse detection)
          • Fingerprints or signatures from previous attacks are stored, and rules are written within the IDS.
          • These rules are compared against new packets that enter the network.
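A minimal sketch contrasting the two methods on hypothetical packet payloads; the signatures, baseline, and threshold are all invented for illustration:

```python
# Sketch of the two IDS detection methods on hypothetical packets.
SIGNATURES = [b"\x90\x90\x90\x90", b"' OR 1=1 --"]  # fingerprints of past attacks

def signature_detect(payload: bytes) -> bool:
    # Misuse detection: compare the packet against stored attack rules.
    return any(sig in payload for sig in SIGNATURES)

def anomaly_detect(payload: bytes, baseline_len=200.0, tolerance=150.0) -> bool:
    # Anomaly detection: flag behavior far from a learned baseline
    # (here just payload length; real systems use many features).
    return abs(len(payload) - baseline_len) > tolerance

pkt = b"GET /index.html " + b"A" * 500   # oversized, odd-looking request
print(signature_detect(pkt), anomaly_detect(pkt))   # False, True
```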


EXAMPLE: SPAM FILTER

  • Spam filtering is used here as an example of pattern recognition:
          • There is a vast number of features, or characteristics, that can be associated with the spam label.

  • The amount of spam mail received by users has been vastly reduced since the introduction of ML into spam detection.

  • Google claimed to block 99.9% of spam mail after introducing artificial neural networks (ANNs) into its spam filtering (2015).
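A toy sketch of such an ML spam filter in scikit-learn; the four-message corpus is invented, and Naive Bayes over bag-of-words counts stands in for a production-scale ANN:

```python
# Toy ML spam filter sketch: bag-of-words features + Naive Bayes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

mails = ["win a free prize now", "meeting moved to 3pm",
         "free money click here", "lunch tomorrow?"]
labels = [1, 0, 1, 0]                 # 1 = spam, 0 = ham

filt = make_pipeline(CountVectorizer(), MultinomialNB())
filt.fit(mails, labels)
print(filt.predict(["claim your free prize"]))   # likely flagged as spam
```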


GAME THEORY FOR RESOURCE ALLOCATION

  • Stackelberg security games have been used for secure resource allocation and have been deployed in many applications.

  • Infrastructure security:
        • ARMOR, deployed at Los Angeles International Airport (LAX) in 2007 to randomize checkpoints on the roadways entering the airport.
        • IRIS, a game-theoretic scheduler for randomized deployment of the Air Marshal Service, in use since 2009.
        • PROTECT, deployed to generate randomized patrol schedules for the Coast Guard.
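A minimal Stackelberg security game sketch with two targets and hypothetical payoffs: the defender (leader) commits to randomized coverage first, the attacker (follower) observes the commitment and best-responds, and the defender searches for the commitment that maximizes its own utility given that response:

```python
# Stackelberg security game sketch: leader commits, follower best-responds.
import numpy as np

# Hypothetical payoffs per target (index 0 and 1):
U_d_cov = np.array([1.0, 1.0])    # defender utility if attacked target is covered
U_d_unc = np.array([-5.0, -2.0])  # defender utility if it is uncovered
U_a_cov = np.array([-1.0, -1.0])  # attacker utility if caught (covered)
U_a_unc = np.array([4.0, 2.0])    # attacker utility if successful (uncovered)

best_c, best_u = 0.0, -np.inf
for c in np.linspace(0, 1, 101):      # leader commits: coverage prob on target 0
    cov = np.array([c, 1.0 - c])      # one resource, split randomly
    u_att = cov * U_a_cov + (1 - cov) * U_a_unc
    t = int(u_att.argmax())           # follower attacks its best target
    u_def = cov[t] * U_d_cov[t] + (1 - cov[t]) * U_d_unc[t]
    if u_def > best_u:
        best_c, best_u = c, u_def
print(f"cover target 0 with prob {best_c:.2f}; defender utility {best_u:.2f}")
```

The grid search stands in for the exact linear-programming methods used in the deployed systems; the commitment advantage is what makes the randomized schedules above hard to exploit.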


GAME THEORY FOR RESOURCE ALLOCATION

  • Stackelberg security games have been used for secure resource allocation and have been deployed in many applications.

  • Environmental security:
        • Green security games
          • Focus on defending against environmental crimes. These problems exhibit spatial and temporal aspects that distinguish them from infrastructure security.

  • Opportunistic crime:
    • Refers to the problem of urban crime where criminals are not committed to detailed plans and are flexible in the execution of their plans, as opportunities arise.
    • Protecting against such urban crime has been studied as a Stackelberg game and evaluated for deterring fare evasion within the Los Angeles Metro System (TRUSTS) and for crime prevention at the University of Southern California.


OPEN PROBLEMS

  • Scalability
        • Remains an issue despite many existing approaches, especially in handling uncertainty.

  • Uncertainty
        • Many of the game/environment parameters may be unknown or only partially known to the agents.

  • Deception
        • A fundamental issue in any security situation is deception.
        • The mythical Trojan Horse is a classic example of deception.
        • Deception by the defender has been studied in the game theory literature, albeit in simple one-time interaction settings, using signaling enabled by the extra information available to the defender.


THANK YOU

  • Questions?

  • Contact:

charles.a.kamhoua.civ@army.mil
