1 of 22

SIEVA

SIEM Visibility assessment

Nil Ortiz - Senior R&D Cybersecurity Engineer

Albert Calvo - Research engineer | PhD Candidate

TNC23 — SIG-CISS meeting

5/06/2023

2 of 22

Who we are

Albert Calvo

Research engineer | PhD Candidate

Trust-aware systems for cybersecurity and utilities domain

MSc in Artificial Intelligence

albert.calvo@i2cat.net LinkedIn/in/albertcalvo/

Nil Ortiz

Senior R&D Cybersecurity engineer

Incident response and threat intelligence analysis

MSc in Cybersecurity

nil.ortiz@i2cat.net LinkedIn/in/nilortiz/

i2CAT in a nutshell

Never stop designing the digital future

3 of 22

Who we are

i2CAT is a research center focused on mission-driven projects that tackle the challenge of designing the digital society of the future through research and innovation in advanced digital technologies.


4 of 22

Research and Innovation areas

SMART NETWORKS AND SERVICES: 6G, 5G, IoT

IMMERSIVE & INTERACTIVE TECHNOLOGIES

CYBERSECURITY & BLOCKCHAIN

DISTRIBUTED ARTIFICIAL INTELLIGENCE

ARTIFICIAL INTELLIGENCE DRIVEN SYSTEMS

SPACE COMMUNICATIONS

DIGITAL SOCIAL TECHNOLOGIES

eHEALTH

CCAM (CONNECTED, COOPERATIVE AND AUTOMATED MOBILITY)

INDUSTRY 4.0

SMART CITIES

AGRICULTURE

PUBLIC STRATEGIES & POLICIES


5 of 22

Some of our initiatives


DetectUEBA (Threat-Centric ML for Detection capabilities)

  • Adoption of Explainability Proxies for threat detection
  • Align the output of an ML model to MITRE ATT&CK

openUEBA (User-Centric ML for prevention capabilities)

  • Determine behavioural patterns in historical user data
  • Compute the impact of a threat within an infrastructure.

SIEVA

  • Visibility assessment for SOC teams
  • Let us explain this tool in detail …

Limited disclosure, restricted to participants’ organizations.


6 of 22

SIEVA

Q&A


7 of 22

Context

  • Most organizations use SIEMs as the cornerstone of their security operations

  • SIEMs manage all security-related data and generate alerts based on rules

  • To develop rules you need to understand what information is required from each available data source


8 of 22

Problem

  • SIEMs are very complex tools that need to be properly configured and maintained

  • Security engineers spend a lot of time on low-value data engineering tasks

  • Organizations have a hard time understanding their own monitoring needs and capabilities


9 of 22

Solution

  • SIEVA is a tool that assesses the visibility a SIEM has over a network, measured against the MITRE ATT&CK framework

  • It does NOT do detection, analysis, prevention, risk management, or anything other than visibility assessment

  • Reduces the workload of security engineers who take care of data engineering tasks

  • Allows the C-suite to better define long-term strategies for their monitoring needs


10 of 22

MITRE ATT&CK Framework


The MITRE ATT&CK framework is a curated knowledge base that tracks the tactics and techniques used by threat actors across the entire attack lifecycle.


11 of 22

Architecture


12 of 22

Architecture


13 of 22

AI Engine - How does it work?

The AI engine is a three-step procedure:

  • Classification step: a supervised multiclass classification model determines the type of each log

  • Identification step: a custom NER model extracts entities from the logs (*)

  • Visibility heuristic step: the ATT&CK matrix is used to categorise logs according to their degree of visibility

(*) A dataset containing logs from different vendors is defined during steps 1 and 2


[Figure: AI Engine diagram with blocks Classification Step, Identification Step, Visibility Heuristic, Train Dataset and Validation Dataset]


14 of 22

AI Engine - How does it work?

Classification Step

- A classification model classifies the logs according to the Data Sources defined in MITRE ATT&CK

- We manually built a dataset using log samples from different vendors.
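
As a rough illustration of this step, the sketch below trains a tiny multinomial naive Bayes classifier that maps raw log lines to ATT&CK data source names. It is not SIEVA's model: the slides do not say which classifier is used, so naive Bayes is a stand-in, and the class name `NaiveBayesLogClassifier`, the tokenizer and the six training samples are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN = re.compile(r"[A-Za-z_]+")


def tokenize(log_line):
    """Lowercased alphabetic tokens; numbers and IP octets are dropped on purpose."""
    return [t.lower() for t in TOKEN.findall(log_line)]


class NaiveBayesLogClassifier:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, lines, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for line, label in zip(lines, labels):
            self.token_counts[label].update(tokenize(line))
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict(self, line):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for tok in tokenize(line):
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical training samples, labelled with ATT&CK data source names.
TRAIN = [
    ("action=allow src=10.0.0.5 dst=8.8.8.8 proto=udp dport=53", "Network Traffic"),
    ("action=deny src=10.0.0.9 dst=1.2.3.4 proto=tcp dport=445", "Network Traffic"),
    ("EventID=4688 NewProcessName=cmd.exe ParentProcessName=explorer.exe", "Process"),
    ("EventID=4688 NewProcessName=powershell.exe ParentProcessName=cmd.exe", "Process"),
    ("EventID=4624 LogonType=3 TargetUserName=alice", "Logon Session"),
    ("EventID=4624 LogonType=10 TargetUserName=bob", "Logon Session"),
]

clf = NaiveBayesLogClassifier().fit(*zip(*TRAIN))
predicted = clf.predict("action=allow src=192.168.1.7 dst=10.1.1.1 proto=tcp dport=443")
print(predicted)
```

With real data, the per-class accuracy of such a model on a validation set is the kind of figure the visibility heuristic later consumes.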


15 of 22

AI Engine - How does it work?

Identification Step

- Named-entity recognition (NER) techniques seek to locate the entities in a text and classify them into pre-defined categories.

- Text files → SIEM raw logs

- Categories → IP, domain, host, ...

[Figure: Common NER pretrained model and Custom NER model]


16 of 22

AI Engine - How does it work?

Identification Step (Custom NER training)

- Self-made dataset where we manually label the entities in the logs (using custom Regex for each vendor and category)

- A custom NER model is built using the spaCy library
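
The regex-based labelling described above could look roughly like the following sketch. It turns each regex match into a character-offset span `(start, end, label)`, the shape NER training pipelines such as spaCy's usually start from. The patterns, the `PATTERNS` table, the `label_entities` helper and the sample log line are hypothetical and cover a single imaginary vendor format; they are not taken from the SIEVA repository.

```python
import re

# Hypothetical per-category patterns for one vendor's key=value log format.
PATTERNS = {
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "DOMAIN": re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b"),
    "HOST": re.compile(r"(?<=host=)[\w-]+"),
}


def label_entities(log_line):
    """Return non-overlapping (start, end, label) spans found by the regexes."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(log_line):
            spans.append((match.start(), match.end(), label))
    # Sort by start offset, longest first, then drop overlapping spans,
    # since NER training data generally must not contain overlaps.
    spans.sort(key=lambda s: (s[0], -(s[1] - s[0])))
    kept, last_end = [], -1
    for start, end, label in spans:
        if start >= last_end:
            kept.append((start, end, label))
            last_end = end
    return kept


line = "host=fw-01 src=10.0.0.5 query=example.com action=allow"
annotation = (line, {"entities": label_entities(line)})

for start, end, label in annotation[1]["entities"]:
    print(label, line[start:end])
```

In a spaCy v3 workflow, offsets like these are typically converted to `Doc` objects with `Doc.char_span` and serialised with `DocBin` before training; that conversion is left out here to keep the sketch dependency-free.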


[Figure: custom NER workflow, Labeling and Train]


17 of 22

AI Engine - How does it work?

Visibility Heuristic

  • Putting it all together using the MITRE ATT&CK Matrix representation
  • A heuristic is established to associate (Data Sources, Fields) => TTP
    • Visibility heuristic:
      • No visibility → 0% predicted accuracy per class
      • Partial visibility → ≠ 0% predicted accuracy per class
      • Visibility → ≠ 0% predicted accuracy and > n categories defined
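
One possible reading of the three-level rule above, as a minimal sketch. It assumes that "predicted accuracy per class" is the classifier's accuracy for a data source and that "categories" are the NER categories found in that source's logs. The `TECHNIQUE_DATA_SOURCES` table is a hypothetical two-entry excerpt, and the threshold `n = 2` is an arbitrary choice for the example; none of these come from the slides.

```python
from enum import Enum


class Visibility(Enum):
    NONE = "no visibility"
    PARTIAL = "partial visibility"
    FULL = "visibility"


def visibility_level(predicted_accuracy, categories_found, n):
    """Apply the three-level rule to one data source."""
    if predicted_accuracy == 0:
        return Visibility.NONE
    if len(categories_found) > n:
        return Visibility.FULL
    return Visibility.PARTIAL


# Hypothetical excerpt: ATT&CK technique ID -> data source it depends on.
TECHNIQUE_DATA_SOURCES = {
    "T1071": "Network Traffic",
    "T1059": "Process",
}


def assess(per_source, n=2):
    """per_source maps a data source to (predicted accuracy, set of NER categories)."""
    report = {}
    for technique, source in TECHNIQUE_DATA_SOURCES.items():
        # A data source that was never observed counts as 0% accuracy.
        accuracy, categories = per_source.get(source, (0.0, set()))
        report[technique] = visibility_level(accuracy, categories, n)
    return report


observed = {"Network Traffic": (0.93, {"IP", "DOMAIN", "URL"})}
report = assess(observed)
for technique, level in report.items():
    print(technique, level.value)
```

In this example the SIEM receives well-classified network logs with three entity categories, so `T1071` is marked as visible, while `T1059` has no visibility because no process logs were seen.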


[Figure: Visibility, with URL and Domain Categories]


18 of 22

How does it look?


19 of 22

Integration into SOCTools


20 of 22

Summarising

How can SIEVA help you?

Reduce data engineering effort

Identify gaps in your visibility

Improve your data management & threat monitoring strategies


Open Source available

License AGPLv3

@ i2CAT’s GitHub SIEVA repository

https://github.com/Fundacio-i2CAT/SIEVA


21 of 22

Planned next steps

Next version :: Planned start Sept/23

  • Integration of vulnerabilities reports
  • Add more data source metrics
  • Add more information about DS / TTP relations
  • Add output REST API endpoint

SIG-CISS Requirements


22 of 22

Open Source available

@ i2CAT’s GitHub SIEVA repository

https://github.com/Fundacio-i2CAT/SIEVA

nil.ortiz@i2cat.net albert.calvo@i2cat.net

LinkedIn/in/nilortiz LinkedIn/in/albertcalvo

Q&A
