LFN AI Initiative
Four Key Areas in AI for Networking
1- Applications/AI Use Cases in Networking: The new functionality that is made available using AI
2a. AI Models (Domain Specific): The AI capabilities specific to networking and the domain
2b. AI Models (Generic): The AI capabilities, such as prediction, content generation, anomaly detection, etc.
3- Data and AI infrastructure (computing elements) (Sharing, Governance, Processing): How data is collected and stored, and the resources used for processing, running and training the models
4- Network Infrastructure (Open Source Projects + Vendor solutions) + Domain Data sets: The network itself, which provides the data and acts on the learnings from the layers above
Networking Use case group analysis - LFN Survey results
Network Operations & Optimization ranks higher than the other categories, although there is wide interest in all 6 categories
Focus Areas of LFN - Survey results
Q: What do you think are the 2-3 things existing LFN projects should focus on in order to accelerate adoption of AI in networking?
The keys for unleashing the power of AI for Networking
Open Source Collaboration is the only way to address these challenges
Telco Data Anonymization
The Anuket/Thoth project
The challenge of PII in Telco data sets
Good AI models require high quality network data
Training Telco AI models has to be performed on actual Telco data
Raw Telco data sets contain Personally Identifiable Information (PII)
What does the Anuket/Thoth project do?
Agree on what constitutes the ‘sensitive’ data, and on the problem set (the questions we want to answer).
Try available tools (libraries) and techniques (implementations) on the available datasets.
Find the gaps in datasets, tools and techniques.
Fill those gaps, considering the problem set.
Publish the results.
What are the techniques we are trying?
1. Classic Techniques: K-Anonymity, L-Diversity, T-Closeness, Differential Privacy
2. Natural Language Processing: NLP techniques for the logs
3. Autoencoders: Unsupervised techniques for anonymization
4. GANs: Synthetic data generation as a perfect anonymization solution
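To make the first of the classic techniques concrete, here is a minimal sketch of how K-Anonymity can be measured and improved by generalization. It is an illustration only, not the Thoth implementation: the record fields (cell_id, age, usage_mb), the choice of quasi-identifiers and the generalization rules are all assumptions made for this example.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values for all quasi-identifier columns."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def generalize(record):
    """Coarsen the quasi-identifiers (hypothetical rules for this example):
    keep only a 3-character cell ID prefix and bucket age into decades."""
    out = dict(record)
    out["cell_id"] = record["cell_id"][:3] + "*"
    out["age"] = f"{record['age'] // 10 * 10}s"
    return out

# Made-up sample records; the field names are assumptions, not a real schema.
records = [
    {"cell_id": "A1017", "age": 34, "usage_mb": 120},
    {"cell_id": "A1022", "age": 37, "usage_mb": 300},
    {"cell_id": "B2040", "age": 52, "usage_mb": 80},
    {"cell_id": "B2041", "age": 58, "usage_mb": 95},
]
quasi_ids = ["cell_id", "age"]

k_before = k_anonymity(records, quasi_ids)
k_after = k_anonymity([generalize(r) for r in records], quasi_ids)
print(k_before, k_after)
```

In the raw records every (cell_id, age) pair is unique, so k is 1. After generalization each record shares its quasi-identifiers with one other record, so k rises to 2, at the cost of coarser data.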
Questions the Anuket/Thoth project is answering
01 Do we have datasets that contain all the kinds of sensitive information?
02 What kind of sensitive information is well suited for each of the techniques?
03 Is there a single technique that is applicable to all kinds of sensitive information?
04 Can we build a tool that takes in a dataset and anonymizes it automatically using the best technique?
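One hypothetical shape such a tool could take is a dispatcher that applies a different treatment to each kind of column. The sketch below is an assumption made for illustration, not a Thoth result: the column kinds, the function names and the mapping from kind to treatment are invented here, and salted hashing is pseudonymization rather than full anonymization.

```python
import hashlib

def pseudonymize(value, salt="demo-salt"):
    """Replace an identifier with a truncated salted SHA-256 digest.
    The salt value is a placeholder for this example."""
    return hashlib.sha256((salt + str(value)).encode()).hexdigest()[:12]

def mask_ip(value):
    """Keep the first two octets of an IPv4 address and zero the rest."""
    parts = str(value).split(".")
    return ".".join(parts[:2] + ["0", "0"])

# Hypothetical mapping from column kind to treatment.
STRATEGY = {"identifier": pseudonymize, "ip_address": mask_ip}

def anonymize(rows, column_kinds):
    """Apply the treatment registered for each column's kind.
    Columns not listed in column_kinds pass through unchanged."""
    return [
        {
            col: STRATEGY[column_kinds[col]](val) if col in column_kinds else val
            for col, val in row.items()
        }
        for row in rows
    ]

# Made-up sample row; the field names are assumptions.
rows = [{"imsi": "001010123456789", "src_ip": "10.20.30.40", "bytes": 512}]
kinds = {"imsi": "identifier", "src_ip": "ip_address"}
result = anonymize(rows, kinds)
print(result)
```

Choosing the "best technique" per column automatically is exactly the open question above; in this sketch the choice is supplied by hand through the kinds mapping.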