Update on SLATE and FedOps
Rob Gardner for the SLATE Team
2021.01.06 OSG Area Coordinators Meeting
SLATE update
Quick Recap
SLATE numbers
Aside: Hosted CEs operated by OSG Ops
$ slate instance list --group osg-ops
Name Cluster ID
osg-hosted-ce-amnh-ares uchicago-river-v2 instance_AOUkOiliGjg
osg-hosted-ce-amnh-hel uchicago-river-v2 instance_-5tbF_slj3k
osg-hosted-ce-amnh-mendel uchicago-river-v2 instance_KlZKGY-i5hM
osg-hosted-ce-asu-dell-m240 uchicago-river-v2 instance_-T9qcccY3e0
osg-hosted-ce-clarkson-acres uchicago-prod instance_omQQTLH2-XU
osg-hosted-ce-computecanada-cedar uchicago-river-v2 instance_EjL5pbnc594
osg-hosted-ce-fsu-hnpgrid uchicago-river-v2 instance_j1D_dZ3_jrI
osg-hosted-ce-gsu-acore uchicago-river-v2 instance_qUO78rDyrSw
osg-hosted-ce-my-cluster osgcc instance_6YYWSOQ2Ahk
osg-hosted-ce-nd-caml-gpu uchicago-river-v2 instance_5OP48zv5rek
osg-hosted-ce-psc-bridges uchicago-river-v2 instance_SW6qKz9cFVA
osg-hosted-ce-sdsc-triton-stratus uchicago-river-v2 instance_OwaaDcR5wiU
osg-hosted-ce-sut-ozstar chtc-tiger instance_5qgv0f6m9oQ
osg-hosted-ce-tcnj-elsa uchicago-river-v2 instance_hLptxaFKTjI
osg-hosted-ce-tufts-cluster chtc-tiger instance_DNA8O0VQAIs
osg-hosted-ce-uci-gpatlas uchicago-river-v2 instance_twHw8lU6_Zg
osg-hosted-ce-uconn-xanadu uchicago-river-v2 instance_fI2zEGUbdWg
osg-hosted-ce-ucsd-comet uchicago-river-v2 instance_o6038H1CDIg
osg-hosted-ce-usf-sc uchicago-river-v2 instance_4riG7c9yTFA
osg-hosted-ce-uwm-nemo uchicago-river-v2 instance_Zrk8YgF3yK8
osg-hosted-ce-wsu-grid uchicago-river-v2 instance_oVjOC0nHnN8
$ slate instance list --group osg-covid19
Name Cluster ID
open-science-ce-ce1 uchicago-river-v2 instance_qPo-bcR3TqU
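The listings above show where the hosted CEs run. A minimal sketch of tallying instances per cluster, assuming the text of `slate instance list --group <group>` has been captured as a string with the three whitespace-separated columns shown (Name, Cluster, ID). The function name `count_by_cluster` and the `SAMPLE` excerpt are illustrative and not part of the SLATE tooling.

```python
from collections import Counter


def count_by_cluster(listing: str) -> Counter:
    """Tally instances per cluster from `slate instance list --group ...` text."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Keep only data rows: three columns, the last being an instance ID.
        # This skips blank lines, the echoed command, and the header row.
        if len(fields) != 3 or not fields[2].startswith("instance_"):
            continue
        counts[fields[1]] += 1
    return counts


# A short excerpt of the listing above, used here as sample input.
SAMPLE = """\
Name Cluster ID
osg-hosted-ce-amnh-ares uchicago-river-v2 instance_AOUkOiliGjg
osg-hosted-ce-clarkson-acres uchicago-prod instance_omQQTLH2-XU
osg-hosted-ce-sut-ozstar chtc-tiger instance_5qgv0f6m9oQ
osg-hosted-ce-tcnj-elsa uchicago-river-v2 instance_hLptxaFKTjI
"""

if __name__ == "__main__":
    for cluster, n in count_by_cluster(SAMPLE).most_common():
        print(f"{cluster}: {n}")
```

The parser assumes the `--group` form of the command; without `--group`, the output carries an extra group column and the field positions would need adjusting.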
Aside: Hosted CEs run by others
$ slate instance list | grep hosted-ce | grep -v osg-ops
osg-hosted-ce-aggiegrid nmsu nmsu instance_qeOWw6HnsiQ
osg-hosted-ce-blin-local-submit-attr chtc-osg osgcc instance_SB5CJyRBj8Y
osg-hosted-ce-discovery nmsu nmsu instance_qOZ2j80yaos
osg-hosted-ce-osg-gatech-dev gatech-dev osg-gatech-dev instance_V1DM5Z3YNyo
osg-hosted-ce-tacc-frontera osg-hepcloud-ops uchicago-river-v2 instance_ZKBN7nCW1hI
osg-hosted-ce-tacc-stampede2 osg-hepcloud-ops uchicago-river-v2 instance_MEsPx3_7fqQ
osg-hosted-ce-uiuc-htc mwt2 uchicago-prod instance_YEyAToBxjJA
slate-dev-osg-hosted-ce-uchicago-grid slate-dev uchicago-prod instance_Hzaq1kea1Ck
Aside: Various Squid Caches
$ slate instance list | grep squid
ndcms-osg-frontier-squid-global ndcms notredame instance_GwswqO_izOs
osg-frontier-squid uu-chpc-ops uutah-prod instance_rz4dkyGny_A
osg-frontier-squid ssl uchicago-river-dev instance_d2nV6MP2_4Y
osg-frontier-squid gpn-poc gpn-poc-onenet instance_y1J_s4VymXY
osg-frontier-squid gatech-dev osg-gatech-dev instance_OIZCS8jjKcc
osg-frontier-squid-global nmsu nmsu instance_ykKyfHfFapA
osg-frontier-squid-mwt2-iu mwt2 mwt2-iu instance_tVkVVIrXIIA
osg-frontier-squid-mwt2-uc mwt2 uchicago-prod instance_Mi9mzDKl3OI
osg-frontier-squid-mwt2-uiuc mwt2 mwt2-uiuc instance_9Uu0ma4pz7c
osg-frontier-squid-swt2-cpb swt2-cpb swt2-cpb instance_-9j5nN97ldc
slate-dev-osg-frontier-squid-cvmfs slate-dev koik8scluster instance_OxEZcuOm_H0
slate-dev-osg-frontier-squid-cvmfs slate-dev uchicago-prod instance_js3-usm2paY
slate-dev-osg-frontier-squid-global slate-dev uchicago-prod instance_vTb5dO1fuZA
slate-dev-osg-frontier-squid-global slate-dev umich-prod instance_QAcSmU3wq8o
spt-osg-frontier-squid-global spt spt-npx instance_5m107QSBV0U
ssl-osg-frontier-squid-cvmfs ssl uchicago-river-v2 instance_WkTVO4N_r8w
Various XCaches
All of the US sites:
AGLT2
BNL*
MWT2
NET2
SWT2_CPB
EU sites:
LRZ-LMU*
Prague
bold = FedOps
All of them are single node deployments.
Developing a chart that supports:
*same image but not deployed using SLATE
Federated Ops Security
SLATE Security Personnel
Information Security Officer
Tom Barton
Email: tbarton@uchicago.edu
Mobile Phone: 773-213-1096
Many thanks to Chris Weaver who has now moved on to a cloud engineering position on IceCube at Michigan State University
SLATE Security Staff
Mitchell Steinman
Email: mitchell.steinman@utah.edu
Office Phone: 208-721-2945
Mobile Phone: 208-721-2945
Muhammad Akhdhor
Email: muali@umich.edu
Office Phone: 734-936-3249
Mobile Phone: 813-406-1982
High Level Picture on Security
Roles in the SLATE Federation
Role | Description | Who does this |
Platform Administrator | Operates the central parts of the federation | SLATE Team |
Edge Administrator | Runs a cluster which participates in the federation | OSG Sites |
Application Administrator | Runs one or more services on one or more clusters | OSG operations |
Application Developer | Maintains an application for use on the platform | OSG Software Team |
Application Reviewer | Checks applications for consistency with policy | SLATE Team |
Private cloud operation - DevOps basics
Host w/ containers
Host w/ containers
Host w/ containers
Orchestrator
Container registry
Developer
Developer
Developer
Test, accredit
RHEL, SEL, …
Docker, Singularity
Kubernetes (+Helm)
Docker Swarm
AWS ECS
GitHub
...
Jenkins
Puppet
...
A single organization manages everything
This and the following slides are from Tom Barton, SIG-ISM/WISE Workshop, Oct 2020
SLATE: federated DevOps
Site N
Host w/ containers
Host w/ containers
Host w/ containers
Orchestrator
SLATE Platform
Container registry
Developer
Developer
Developer
SLATE API & user portal
Test, accredit
Site 2
Host w/ containers
Host w/ containers
Host w/ containers
Orchestrator
Site 1
Host w/ containers
Host w/ containers
Host w/ containers
Kubernetes + Helm
Roles in SLATE federated operations
Site N
Host w/ containers
Host w/ containers
Host w/ containers
Orchestrator
SLATE Platform
Container registry
Developer
Developer
Developer
SLATE API & user portal
Test, accredit
Project M
Project 2
Project 1
App Admin
App Dev
Site 2
Host w/ containers
Host w/ containers
Host w/ containers
Orchestrator
Site 1
Host w/ containers
Host w/ containers
Host w/ containers
Kubernetes + Helm
Reviewer
Platform Admin
Instruct SLATE to run containers on edge sites
Check conformance with criteria
Publish conformance report
Conform to criteria
Operate securely
Support other roles
Edge Admin
Configure SLATE namespace & policies
Permit/deny App Admin groups & containers
Trust issues for the site manager
How can I remain responsible for the security of my site if I permit others to run things in it?
A federated platform must further consider:
For any FedOps implementation:
Prospect of a platform gaining unauthorized privileges
What we've done in the SLATE context to address this
Extension of SCI v2 criteria to the federated operations context was accomplished through per-role Obligations documents
Container Security
Top container misconfiguration security risks*
RBAC; Secrets; Network policies; Privilege levels; Resource limits/requests; Read-only root file systems; Annotations, labels; Sensitive host mount and access; Image configuration, including provenance
We are currently determining additions to the per-role Obligations documents, application review criteria and procedures, and installation defaults, to address these concerns in the context of SLATE
An aspirational goal: to report each container’s adherence to application review criteria so that Edge Admins can better understand the risk
*State of Container and Kubernetes Security, Fall 2020, StackRox
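Some of the misconfiguration risks listed above (privilege levels, resource limits/requests, read-only root file systems) lend themselves to a mechanical check. The following is a minimal sketch, assuming a container spec is available as a Python dict using the standard Kubernetes field names `securityContext.privileged`, `securityContext.readOnlyRootFilesystem`, `resources.limits`, and `resources.requests`. The function `check_container` and its finding strings are hypothetical illustrations, not SLATE's application review criteria, which are still being determined.

```python
from typing import List


def check_container(container: dict) -> List[str]:
    """Return findings for a few common container misconfigurations.

    `container` is one entry of a pod spec's `containers` list, as a dict.
    """
    findings = []
    security = container.get("securityContext") or {}
    resources = container.get("resources") or {}

    # Privilege levels: privileged containers have broad access to the host.
    if security.get("privileged", False):
        findings.append("privileged container")
    # Read-only root file system: flagged unless explicitly enabled.
    if not security.get("readOnlyRootFilesystem", False):
        findings.append("root file system is writable")
    # Resource limits/requests: missing values allow unbounded consumption.
    if not resources.get("limits"):
        findings.append("no resource limits")
    if not resources.get("requests"):
        findings.append("no resource requests")
    return findings


if __name__ == "__main__":
    # Hypothetical container spec used only for illustration.
    example = {
        "name": "example",
        "securityContext": {"privileged": True},
        "resources": {"limits": {"cpu": "1", "memory": "1Gi"}},
    }
    for finding in check_container(example):
        print(finding)
```

A per-container list of findings like this is one possible shape for the adherence report described above; the other listed risks (RBAC, secrets, network policies, host mounts, image provenance) need cluster-level context and are not covered by this sketch.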
What SCI v2 did and didn’t do for SLATE security work
Did
Each of its specifications informed aspects of one or more of the various SLATE security documents
Extension to the federated operations context was pretty straightforward through use of per-role Obligations documents
Didn’t
Help address container security
Provide guidance on its use in a federated operations context
OS3 and OS4 (Operational Security directives) don’t really address the upstream DevOps technologies and processes that can have more impact on the resultant security of a running container than its host’s own security configuration
SLATE Policy Areas and Documents
Area | Planned Documents | Status |
Overview | Master Information Security Policy and Procedures | In progress |
Definition of Protected Environment (Network Security) | "Overview of SLATE Platform Internals and Security" | Done |
Risk Assessment | Asset Inventory | Done |
Acceptable Use | Acceptable Use Policy | Done |
User Data Handling | Privacy Policy | Done |
Incident Response | Incident Response Policy | Done |
Obligations for each Role | Edge Admin. Obligations, App. Admin. Obligations, App. Dev. Obligations, App. Reviewer Obligations | Done |
Application Review Process | Application Review Procedures | In progress |
Access Control | Access Control Policy | Pending |
Traceability | Traceability Policy | Pending |
Change Management | Change Management Policy | Done |
Some are SLATE platform specific; others can be used to guide other distributed platforms
Next Steps
Focus is on this set of documents:
Restart the Federated Operation Security Working group
security@slateci.io