1 of 98

Project ALVI – Privacy-Driven Defenses: Federated Learning Security & Authentication

R24-053

2 of 98

Team members

Supervisor: Mr. Kanishka Yapa

Co-supervisor: Mr. Samadhi Rathnayake

External supervisor: Dr. Kasun Karunarathne

  • Seasoned expert with 20+ years in Network, Security, and PMO
  • Specializes in Technology Strategy and Cybersecurity
  • PhD from the University of Colombo
  • MSc from the University of Moratuwa
  • ISO 27701:2019 PIMS Lead Implementer

Members

Peiris B.L.H.D

J.P.A.S. Pathmendre

Athauda A.M.I.R.B

A.R.W.M.V. Hasaranga

3 of 98

01

Introduction

4 of 98

  • What is federated learning?
  • What security measures are implemented in federated learning?
  • Is security sufficient in federated learning?

5 of 98

02

Background

6 of 98


7 of 98

Existing security implementations in federated learning

  • Model aggregation performed at a single central server
  • Adding gradient noise to local updates (sketched in code below)
  • Authentication using basic methods
  • Nine state-of-the-art defenses
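A minimal sketch of the second measure above (gradient noise), assuming a DP-style clip-and-noise step on the client side; `clip_norm` and `noise_std` are illustrative values, not this project's settings:

```python
# Illustrative clip-and-noise step for a client's local update (not ALVI's code).
import torch

def noisy_local_update(model: torch.nn.Module, clip_norm: float = 1.0,
                       noise_std: float = 0.01) -> dict:
    """Clip local parameters and add Gaussian noise before sharing them."""
    update = {}
    for name, param in model.state_dict().items():
        tensor = param.float()
        norm = tensor.norm()
        if norm > clip_norm:                      # bound each client's influence
            tensor = tensor * (clip_norm / norm)
        update[name] = tensor + noise_std * torch.randn_like(tensor)
    return update
```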

8 of 98

Research problem

  • How can we detect and prevent attacks that happen on the global model?
  • How can we mitigate backdoor attacks on local models without accuracy loss?
  • How can we authenticate models in a secure manner?
  • How can we ensure the VFL system is secure?


9 of 98

Our objectives

Main objective

  • Implementing detective & preventive security measures within a system that operates on federated learning

Sub objectives

  • CodeNexa: Dynamic Watermarking Technique for Federated Learning
  • HydraGuard: Backdoor Immunity in FL Environments
  • SECUNID: Enhancing Security on the Global Model
  • S.H.I.E.L.D.: CoAE-SMC Enhanced VFL Security

10 of 98

System overview

01 – CodeNexa: Dynamic watermarking technique

02 – HydraGuard: Backdoor immunity

03 – SECUNID: Enhancing global model security

04 – S.H.I.E.L.D.: Security in VFL

11 of 98

System diagram

12 of 98

Peiris B.L.H.D – IT21110184

Cyber Security

13 of 98

Component 1

CODENEXA: DYNAMIC WATERMARKING TECHNIQUE FOR FEDERATED LEARNING

14 of 98

BACKGROUND


What existing efforts address the challenges of model ownership and intellectual property protection in federated learning?

15 of 98

OBJECTIVE

Developing a Dynamic Watermarking Technique for Federated Learning to improve model integrity.

Sub Objective 1

Design and Implement Temporal Variation Mechanism

Sub Objective 2

Integrate Dynamic Watermarking with Federated Learning System

Sub Objective 3

Evaluate and Optimize for Non-IID Data Scenarios
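As a hedged illustration of Sub Objective 1, the sketch below shows one way a temporal variation mechanism could work: the current round index seeds a reproducible trigger set, so the embedded watermark varies over time yet stays verifiable by anyone who knows the seed. The names (`make_trigger_batch`, `WM_LABEL`) and hyperparameters are assumptions, not CodeNexa's actual API.

```python
# Sketch of a round-seeded (temporally varying) trigger-set watermark.
import torch

WM_LABEL = 7                 # assumed fixed watermark target class
IMG_SHAPE = (3, 32, 32)      # assumed input shape (e.g., CIFAR-like)

def make_trigger_batch(round_idx: int, batch_size: int = 16):
    """Round-specific trigger set: the same seed reproduces the same triggers."""
    gen = torch.Generator().manual_seed(round_idx)       # temporal variation
    x = torch.rand((batch_size, *IMG_SHAPE), generator=gen)
    y = torch.full((batch_size,), WM_LABEL, dtype=torch.long)
    return x, y

def embed_watermark(model, optimizer, round_idx: int, steps: int = 5):
    """Briefly fine-tune on the trigger set so the model memorizes the mark."""
    loss_fn = torch.nn.CrossEntropyLoss()
    x, y = make_trigger_batch(round_idx)
    for _ in range(steps):
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
```

Verification would then simply measure the model's accuracy on the regenerated trigger sets for the claimed rounds.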

16 of 98

RESEARCH GAP


Comparison of Paper 1, Paper 2, Paper 3, Paper 4, and the Proposed Solution against four criteria:

  • Embedding watermarks with low computational overhead
  • Preventing watermark removal attacks
  • Performance impact
  • Adapting to non-IID data

17 of 98

METHODOLOGY

COMPONENT DIAGRAM


18 of 98

PROJECT COMPLETION

OVERALL PROGRESS


19 of 98

PROJECT COMPLETION

WORK DONE


Develop a dynamic watermarking technique

20 of 98

PROJECT COMPLETION

WORK DONE


Watermark generation.

21 of 98

PROJECT COMPLETION

WORK DONE

Creating a log for the watermark.


22 of 98

PROJECT COMPLETION

WORK DONE

Output

23 of 98

PROJECT COMPLETION

WORK DONE

Output

24 of 98

Evaluate the impact of the dynamic watermarking technique on model performance


FUTURE WORK

Analyze the watermark detection and authentication capabilities

Assess the technique's resilience against potential attacks or watermark removal attempts

Testing the implementation with different datasets and model architectures

25 of 98

TECHNOLOGIES

  • Python
  • PyTorch
  • PyTorch Distributed
  • GitHub

26 of 98

REFERENCES


Y. Li, X. Zhu, J. Lei, and F. Li, "Ensuring Federated Ownership Verification with FedBack: A Trigger- Based Watermarking Approach," in 2021 IEEE International Conference on Communications (ICC), pp. 1–6, 2021.

T. Li, Z. Zhou, M. Koushanfar, D. Boneh, and H. Shacham, "FedIPR: Ownership Verification for Federated Deep Neural Network Models," in Proceedings of the 2020 IEEE INFOCOM, pp. 1–9, 2020.

X. Zhang, M. He, L. Song, L. Zhu, W. Wang, W. Jiang, and R.C. Qiu, "Secure Federated Learning Model Verification: A Client-side Backdoor Triggered Watermarking Scheme," IEEE Trans. Dependable Secure Comput., vol. 20, no. 5, pp. 1802–1815, 2022.

A. N. Bhagoji, S. Chakraborty, P. Suresh, and D. Prehofer, "WAFFLE: Towards Practical Watermarking for Federated Learning," IEEE Trans. Mobile Comput., vol. 20, no. 2, pp. 333–346, 2021.

Y. Wu, X. Zhou, D. He, Z. Li, X. Wang, M. Li, and Y. Dai, "WMDefense: Using Watermark to Defense Byzantine Attacks in Federated Learning," IEEE Internet of Things J., vol. 10, no. 12, pp. 11093–11104, Dec. 2023.

27 of 98

Component 2 – HydraGuard: Backdoor Immunity in FL Environments

J.P.A.S.Pathmendre – IT21085376

28 of 98

BACKGROUND

01


29 of 98

BACKGROUND

  • What is the main attack that local models commonly face?

  • What are the main types of backdoor attacks on local models?

  • What are the main types of data poisoning attacks?


30 of 98

RESEARCH PROBLEM

02


31 of 98

RESEARCH PROBLEM


  • Continuous attacks are more aggressive than single-shot attacks.

  • Detecting and rejecting malicious weights leads to data loss and data breaches, and reduces model accuracy.

  • Existing defense mechanisms require heavy computational power and violate the essence of FL.

  • Unreliable predictions.

32 of 98

RESEARCH GAP

03


33 of 98

RESEARCH GAP

Zhang, K., Tao, G., Xu, Q., Cheng, S., An, S., Liu, Y., Feng, S., Shen, G., Chen, P.Y., Ma, S. and Zhang, X., 2022. FLIP: A provable defense framework for backdoor mitigation in federated learning. arXiv preprint arXiv:2210.12873.


34 of 98

RESEARCH GAP


35 of 98

OBJECTIVES

04


36 of 98

OBJECTIVE

Developing a robust preventive and detective mechanism against backdoor attacks in FL systems without accuracy loss or computational overhead.


37 of 98

SUB OBJECTIVE

  • Reducing the attack success rate (ASR) while maintaining accuracy (ACC) in FL local models under backdoor attacks, without computational overhead.

  • Outperform state-of-the-art (SOTA) defense mechanisms.

  • Trigger the attacks, record the implementation's accuracy levels, and analyze them.

  • Evaluate the performance of local models against backdoor attacks on different datasets.


38 of 98

Literature Review

05


39 of 98

Literature Review

  • How can we address continuous backdoor attacks in federated learning?

  • How can we reduce the computational overhead of defenses against backdoor attacks?

  • How can we prevent loss of accuracy on local models while defending against backdoor attacks?


40 of 98

Literature Review

  • How can we effectively reduce attack success rates of backdoor attacks in federated learning?


41 of 98

NOVELTY

06


42 of 98

NOVELTY

Component Diagram

Requirements

43 of 98

Component Diagram


44 of 98

REQUIREMENTS

07


45 of 98

REQUIREMENTS

Functional

  • Trigger inversion (see the sketch below)
  • Reinitializing the linear classifier
  • Measuring class distance
  • Datasets: CIFAR-10, MNIST, Fashion-MNIST

Non-Functional

  • Maintaining ACC while reducing ASR
  • Reduced computational overhead
  • Maintained model accuracy
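A minimal sketch of the trigger-inversion requirement, assuming a FLIP-style formulation: optimize a small mask and pattern that push clean inputs toward a target class; a trigger that converges with a tiny mask is evidence of a backdoor. Function and parameter names here are illustrative, not HydraGuard's implementation.

```python
# Sketch: invert a candidate backdoor trigger for one target class.
import torch

def invert_trigger(model, clean_loader, target_class: int,
                   img_shape=(3, 32, 32), steps: int = 100, lr: float = 0.1):
    mask = torch.zeros(img_shape, requires_grad=True)     # where the trigger sits
    pattern = torch.rand(img_shape, requires_grad=True)   # what the trigger looks like
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    ce = torch.nn.CrossEntropyLoss()
    for _, (x, _) in zip(range(steps), clean_loader):
        m = torch.sigmoid(mask)                           # keep mask in [0, 1]
        x_trig = (1 - m) * x + m * torch.sigmoid(pattern)
        y_tgt = torch.full((x.size(0),), target_class, dtype=torch.long)
        # Force the target label while penalizing large masks (small-trigger prior).
        loss = ce(model(x_trig), y_tgt) + 0.01 * m.sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask).detach(), torch.sigmoid(pattern).detach()
```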

46 of 98

Project Completion

08


47 of 98

Overall progress: 65%


48 of 98

Resource Collection


Connecting With Senior Researchers

Connecting With Industry Experts

49 of 98

Work Done Model


Model Training Phase

Getting Results

50 of 98

Work Done Model

When poisoning

When trigger inversion

51 of 98

Challenges

09


52 of 98

  • Lack of computational power (training the model on a single local dataset takes approximately 50 hours).


53 of 98

Future Work

05


54 of 98

  • Getting results using different datasets (CIFAR-10).

  • Fixing best-fit learning round limits.

  • Integrating linear classifier reinitialization (see the sketch after this list).

  • Tuning accuracy in the most suitable mode.
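For the reinitialization item above, a hedged sketch of the idea: keep the learned feature extractor and re-randomize only the final linear layer, where backdoor mappings tend to concentrate. The head attribute name `fc` is an assumption about the model class.

```python
# Sketch: reinitialize only the linear classification head of a trained model.
import torch.nn as nn

def reinit_linear_classifier(model: nn.Module, head_name: str = "fc") -> None:
    head = getattr(model, head_name)
    if isinstance(head, nn.Linear):
        nn.init.xavier_uniform_(head.weight)   # fresh random weights
        nn.init.zeros_(head.bias)              # neutral starting bias
```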


55 of 98

REFERENCES

07


56 of 98

REFERENCES

  1. Zhang, K., Tao, G., Xu, Q., Cheng, S., An, S., Liu, Y., Feng, S., Shen, G., Chen, P.Y., Ma, S. and Zhang, X., 2022. FLIP: A provable defense framework for backdoor mitigation in federated learning. arXiv preprint arXiv:2210.12873.

  2. Qin, Zeyu, et al. "Revisiting Personalized Federated Learning: Robustness Against Backdoor Attacks." arXiv preprint arXiv:2302.01677 (2023).

  3. T. Gu, K. Liu, B. Dolan-Gavitt and S. Garg, "BadNets: Evaluating Backdooring Attacks on Deep Neural Networks," IEEE Access, vol. 7, pp. 47230–47244, 2019, doi: 10.1109/ACCESS.2019.2909068.

  4. S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi and P. Frossard, "Universal Adversarial Perturbations," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pp. 86–94, doi: 10.1109/CVPR.2017.17.

  5. Mugunthan, Vaikkunth, Anton Peraire-Bueno, and Lalana Kagal. "PrivacyFL: A simulator for privacy-preserving and secure federated learning." Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020.

  6. Virat Shejwalkar, Amir Houmansadr, Peter Kairouz, and Daniel Ramage. 2022. Back to the drawing board: A critical evaluation of poisoning attacks on production federated learning. In 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 1354–1371.


57 of 98

ATHAUDA A.M.I.R.B – IT21049354 – CYBER SECURITY

58 of 98

COMPONENT 3

Athauda A.M.I.R.B – IT21049354

SECUNID:

Enhancing Global Model Security

R24-053

59 of 98

01

TABLE OF CONTENTS

02

03

04

Background

Research Problem

Research Gap

Objectives

05

Methodology

06

Evidence of Completion

07

References


60 of 98

BACKGROUND

01


61 of 98

  • A global model is built by a central server together with multiple clients. Each client stores its samples locally and shares only its model with other nodes, which protects the privacy of the raw data. The central server then aggregates all client models into a global model, which is sent back to the clients to improve performance (a minimal aggregation sketch follows this list).

  • Data poisoning and model poisoning are two significant threats to the global model.

  • Focused on three main attacks:
    1. Byzantine attacks
    2. Label-flipping attacks
    3. Partial-knowledge attacks
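A minimal sketch of the aggregation step described in the first bullet, assuming FedAvg-style equal-weight averaging of client state dicts:

```python
# Sketch: server-side FedAvg-style aggregation of client models.
import torch

def federated_average(client_states: list) -> dict:
    """Element-wise average of client parameters (equal client weights assumed)."""
    global_state = {}
    for name in client_states[0]:
        stacked = torch.stack([c[name].float() for c in client_states])
        global_state[name] = stacked.mean(dim=0)
    return global_state
```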


62 of 98

RESEARCH PROBLEM

02


63 of 98

  • Improving the performance of federated learning on non-IID data

  • Developing effective defense algorithms that are robust to various attacks without making unrealistic assumptions

  • Current defense mechanisms require heavy computational power

  • Current methods usually require some knowledge of the attacks:
    1. Malicious participant ratio
    2. Examining local datasets (compromising participant privacy)
    3. Assuming IID data


64 of 98

RESEARCH GAP

03


65 of 98


Comparison of Research A, Research B, Research C, and the Proposed Solution against:

  • Robust outlier detection for security
  • Efficient handling of non-IID data
  • Integrated approach for FL security & non-IID data
  • Scalability to large FL networks
  • Real-world applicability across diverse domains
  • Adherence to data privacy regulations
  • User-friendly system deployment
  • No special infrastructure requirements

66 of 98

OBJECTIVES

04


67 of 98

Main Objective

  • Enhancing global model security by detecting the outlier status of participants and preventing poisoning attacks without accuracy loss

Sub Objectives

  • Achieve better performance than the FedAvg mechanism
  • Examine the parameters received from participants in each iteration using a statistical outlier detection technique (see the sketch after this list)
  • Evaluate the model's performance against poisoning attacks on different datasets
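An illustrative sketch of the statistical outlier check in the second sub-objective (not SECUNID's actual algorithm): flatten each client's parameters, score each client by its distance from the cohort mean, and drop clients whose standardized score is extreme before aggregating. The z-score threshold is an assumed example value.

```python
# Sketch: z-score outlier filtering of client updates before aggregation.
import torch

def filter_outlier_clients(client_states: list, z_thresh: float = 2.0) -> list:
    flats = torch.stack([
        torch.cat([p.float().flatten() for p in s.values()]) for s in client_states
    ])
    dists = (flats - flats.mean(dim=0)).norm(dim=1)    # distance from mean update
    z = (dists - dists.mean()) / (dists.std() + 1e-8)  # standardized scores
    kept = [s for s, zi in zip(client_states, z) if zi.item() < z_thresh]
    return kept or client_states                       # never drop every client
```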


68 of 98

METHODOLOGY

05


69 of 98

SECUNID


70 of 98

EVIDENCE OF COMPLETION

06


71 of 98


72 of 98

GitHub code for the CIFAR dataset


73 of 98

Accuracy results for a Byzantine attack on the CIFAR dataset


74 of 98

Accuracy results for a Byzantine attack on the CIFAR dataset with the median method
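For context, the baseline "median method" compared here is commonly implemented as coordinate-wise median aggregation, which tolerates a bounded number of Byzantine clients; a minimal sketch, assuming equal-weight clients:

```python
# Sketch: coordinate-wise median aggregation (Byzantine-tolerant baseline).
import torch

def coordinate_wise_median(client_states: list) -> dict:
    return {
        name: torch.stack([c[name].float() for c in client_states])
                   .median(dim=0).values
        for name in client_states[0]
    }
```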


75 of 98

Accuracy results for a Byzantine attack on the CIFAR dataset after implementing SECUNID


76 of 98

REFERENCES

07


77 of 98

[1] E. Isik-Polat, G. Polat, and A. Kocyigit, "ARFED: Attack-Resistant Federated averaging based on outlier elimination," Future Generation Computer Systems, vol. 141, pp. 626–650, Apr. 2023, doi: https://doi.org/10.1016/j.future.2022.12.003.

[2] H. Zhang, Y. Zhang, X. Que, Y. Liang, and J. Crowcroft, "Efficient federated learning under non-IID conditions with attackers," Oct. 2022, doi: https://doi.org/10.1145/3556557.3557951.

[3] D. Panagoda, C. Malinda, C. Wijetunga, L. Rupasinghe, B. Bandara, and C. Liyanapathirana, "Application of Federated Learning in Health Care Sector for Malware Detection and Mitigation Using Software Defined Networking Approach," IEEE Xplore, Aug. 01, 2022. https://ieeexplore.ieee.org/document/9909488 (accessed Jun. 10, 2023).

[4] C. Zhou, Y. Sun, D. Wang, and Q. Gao, "Fed-Fi: Federated Learning Malicious Model Detection Method Based on Feature Importance," Security and Communication Networks, vol. 2022, pp. 1–11, May 2022, doi: https://doi.org/10.1155/2022/7268347.

[5] Z. Zhang, X. Cao, J. Jia, and N. Z. Gong, "FLDetector: Defending Federated Learning Against Model Poisoning Attacks via Detecting Malicious Clients," Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Aug. 2022, doi: https://doi.org/10.1145/3534678.3539231.


78 of 98


Thanks!

79 of 98

A.R.W.M.V. Hasaranga – IT21051548

80 of 98

Table of Contents

  • Introduction and Background: Understanding Vertical Federated Learning
  • Research Problems: Challenges in VFL Security
  • Research Gap: Existing Solutions and Their Limitations
  • Objectives: Main and Sub-objectives for Enhanced Security
  • Novelty of the Approach: Integrating CoAE and SMPC
  • System Design: Component Diagram and System Architecture
  • Requirements: Functional and Non-functional Specifications
  • Research Progress: Preliminary Results and Findings
  • References: Key Sources and Literature
  • Q&A: Open Discussion


81 of 98

Background

  • Definition of VFL: Vertical Federated Learning (VFL) involves multiple parties collaboratively learning a predictive model while keeping their training data local, particularly useful when datasets are split vertically among different entities.
  • Importance of Security in VFL: Security is paramount as VFL involves sensitive data across various domains, making it susceptible to cyber attacks such as data breaches, eavesdropping, and inference attacks.
  • Current Security Challenges: Despite advancements, VFL systems are vulnerable to sophisticated cyber threats, particularly label inference attacks, which aim to infer sensitive information from the model outputs.


82 of 98

Research Problems

  • Exposure of Sensitive Labels: Direct attacks can deduce sensitive labels, exposing private data to unauthorized parties (a sketch of the mechanism follows this list).
  • Indirect Privacy Leaks: Passive attacks may infer private information without directly compromising data, utilizing model outputs or metadata.
  • Manipulation Risks: Active attacks involve adversarial inputs to manipulate the learning process, potentially skewing the model's output and functionality.
  • Accumulated Vulnerabilities: Repeated attacks can progressively weaken security measures, leading to systemic vulnerabilities in VFL systems.
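A hedged sketch of why the direct attack in the first bullet works: with a cross-entropy loss, the gradient returned to a passive party at the logit layer equals softmax(z) - one-hot(y) (up to scaling), so its single negative coordinate reveals the label. This is the standard observation from the label-inference literature, not this component's code.

```python
# Sketch: recovering labels from logit-layer gradients in VFL.
import torch

def infer_labels_from_logit_grads(logit_grads: torch.Tensor) -> torch.Tensor:
    """logit_grads: (batch, num_classes) gradients w.r.t. the logits."""
    return logit_grads.argmin(dim=1)   # the most negative entry marks the true class

# Demo: the inferred labels match the real ones.
logits = torch.randn(4, 5, requires_grad=True)
labels = torch.tensor([0, 3, 2, 4])
torch.nn.functional.cross_entropy(logits, labels).backward()
assert torch.equal(infer_labels_from_logit_grads(logits.grad), labels)
```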


83 of 98

Research Gap

  • Limited Defense Mechanisms: Current VFL security protocols are not robust enough to fully prevent label inference attacks, leading to potential data breaches.
  • Insufficient Mitigation Strategies: Existing solutions may not effectively address all types of label inference attacks, especially sophisticated passive and active forms.
  • Compromised Data Integrity: The lack of comprehensive security measures can compromise the integrity of the model's output, affecting decision-making processes.
  • Need for Enhanced Solutions: There is a critical need to integrate advanced mechanisms such as CoAE and SMPC to bolster the resilience of VFL systems.


84 of 98

Research Gap

85 of 98

Novelty of the Approach

  • Innovative Integration: Integrating Confusional Auto-Encoder (CoAE) and Secure Multi-Party Computation (SMPC) with existing defense mechanisms (a CoAE sketch follows this list).
  • Targeting All Attack Forms: This integration is designed to mitigate direct, passive, and active label inference attacks, a method not previously implemented.
  • Enhanced Data Protection: Aims to significantly enhance data protection by keeping sensitive information encrypted and secure during all phases of computation.
  • Pioneering Security Solutions: This unique combination has not been previously attempted, offering potentially groundbreaking improvements in federated learning security.
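A minimal sketch of the CoAE idea (following the confusional auto-encoder literature, not this component's final design): an encoder maps true label distributions to high-entropy "confused" labels that are shared during training, while a decoder can still recover the real labels. The architecture and the loss weight `lam` are assumptions.

```python
# Sketch: confusional auto-encoder (CoAE) for label protection.
import torch
import torch.nn as nn

class CoAE(nn.Module):
    def __init__(self, num_classes: int, hidden: int = 32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(num_classes, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes), nn.Softmax(dim=1))
        self.dec = nn.Sequential(nn.Linear(num_classes, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes))

    def forward(self, y_onehot: torch.Tensor):
        y_confused = self.enc(y_onehot)      # what leaves the active party
        y_recovered = self.dec(y_confused)   # logits used to reconstruct labels
        return y_confused, y_recovered

def coae_loss(y_onehot, y_confused, y_recovered, lam: float = 1.0):
    recon = nn.functional.cross_entropy(y_recovered, y_onehot.argmax(dim=1))
    entropy = -(y_confused * (y_confused + 1e-8).log()).sum(dim=1).mean()
    return recon - lam * entropy             # recover labels, maximize confusion
```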


86 of 98

Component Diagram

87 of 98

Objectives and Sub-objectives

  • Main Objective: Develop and integrate advanced defense mechanisms to secure Vertical Federated Learning systems against all forms of label inference attacks.
  • Sub-objective 1: Implement CoAE techniques to enhance data privacy and model integrity in collaborative environments.
  • Sub-objective 2: Utilize Secure Multi-Party Computation (SMPC) to safeguard data during computation, preventing unauthorized inference (see the sketch after this list).
  • Sub-objective 3: Test and validate the effectiveness of integrated defense strategies on real-world datasets.
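For Sub-objective 2, a common SMPC building block is additive secret sharing; below is a minimal, self-contained sketch under that assumption (field size and API are illustrative):

```python
# Sketch: additive secret sharing, the basic SMPC primitive.
import random

PRIME = 2**61 - 1  # assumed field size

def share(value: int, n_parties: int) -> list:
    """Split a value into n shares; any subset short of all n reveals nothing."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list) -> int:
    return sum(shares) % PRIME

assert reconstruct(share(42, 3)) == 42   # only the full set of shares reconstructs
```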


88 of 98

Functional and Non-functional Requirements

  • Functional Requirements: The component should effectively enhance VFL security and maintain the performance of the VFL system.
  • Non-functional Requirements: The component should integrate CoAE and SMPC with existing defense methods and provide means for evaluating its effectiveness.
  • Security Specifics: Continuous data protection, robustness against cyber threats, and compliance with privacy regulations.
  • Performance Metrics: Expected system performance under normal and stress conditions, ensuring efficient processing and minimal downtime.


89 of 98

Research Progress

  • Experiment Setup: Trained a vertical federated learning system with 2 clients using the UCI Adult Income dataset (a split sketch follows this list).
  • Data Division: The dataset was split vertically among clients.
  • Attack Simulations: Simulated direct, passive, and active label inference attacks.
  • Impact Assessment: Evaluated the damage these attacks could inflict on the system.
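An illustrative sketch of the vertical split described above: both clients hold the same rows of the UCI Adult dataset but disjoint feature columns, and only the active party keeps the income labels. The exact column assignment is an assumption.

```python
# Sketch: vertical (feature-wise) split of one dataset across two VFL clients.
import pandas as pd

def vertical_split(df: pd.DataFrame):
    client_a = df[["age", "education-num", "hours-per-week"]]   # party A features
    client_b = df[["workclass", "occupation", "relationship"]]  # party B features
    labels = df["income"]               # held only by the active (label) party
    return client_a, client_b, labels
```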


90 of 98

Research Progress

Results shown for the trained model and for direct, passive, and active label inference attacks.

91 of 98

Future works

  • Integrate mitigation techniques:
    • Integrate CoAE
    • Integrate SMPC


92 of 98

References

  • [1] C. Fu, X. Zhang, S. Ji, J. Chen, J. Wu, S. Guo, J. Zhou, A. X. Liu, and T. Wang, "Label Inference Attacks Against Vertical Federated Learning," in 2021 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, May 2021, pp. 1–17, doi: 10.1109/SP40001.2021.00001.
  • [2] H. Shi, Y. Xu, Y. Jiang, H. Yu and L. Cui, "Efficient Asynchronous Multi-Participant Vertical Federated Learning," IEEE Transactions on Big Data, 2022, doi: 10.1109/TBDATA.2022.3201729.
  • [3] Y. Liu et al., "Batch Label Inference and Replacement Attacks in Black-Boxed Vertical Federated Learning," arXiv:2112.05409 [cs.LG], Feb. 2022. [Online]. Available: 1
  • [4] H. Shi et al., "MVFLS: Multi-participant Vertical Federated Learning based on Secret Sharing," in AAAI, vol. 35, no. 4, pp. 379–393, Feb. 2021, doi: 10.1609/aaai.v35i4.7010.


93 of 98

Questions & Discussion

  • Invitation to Discuss: Feel free to ask any questions or share your thoughts regarding the presentation.


94 of 98

Commercialization

  • Target users – Privacy-preserving organizations (banks, hospitals)
  • Marketing approach – Business conferences and awareness sessions

95 of 98

TECHNOLOGIES

Python

Docker

ML

PyTorch

GitHub

TensorFlow

Jupyter Notebook

96 of 98

WORK BREAKDOWN STRUCTURE

97 of 98

Gantt chart

98 of 98

Thank You