WLCG IAM Token Issuers

  • INDIGO IAM deployed at CERN on High Availability (HA) Kubernetes
  • Experiment instances: ATLAS, ALICE, CMS, LHCb, FCC, ILC, AMBER, CALICE, COMPASS
  • Operations: DTEAM, OPS
  • Collaborating closely with the INFN CNAF team for prompt updates

Deployment Evolution

Milestones

  • (October) High Availability Kubernetes deployment completed
  • (October) “Cold Recovery” Business Continuity exercise completed successfully
  • (November?) Waiting for experiments and sites to fully support the new endpoints before turning off the OpenShift instances (ideally ASAP: we are a small team, and running both deployments in parallel is not ideal!)

A note on DBOD

Although a “High Availability” DBOD (Database On Demand) is available at CERN, we propose keeping the standard DBOD, for the following reasons:

  • HA DBOD may actually be less stable due to its complexity
  • The standard DBOD provides an excellent level of service, and we are not hitting its limits
  • A read-only replica or independent backup may be a reasonable way to provide redundancy and peace of mind; whether this is compatible with, and useful for, IAM remains to be investigated
  • Code improvements are scheduled to remove the slowness caused by storing access tokens

What are our technical limits?

  • Access token: [plot not recovered]
  • Refresh token (requesting AT+RT using token exchange): [plot not recovered]
  • Full workflow (AT -> AT+RT -> AT): [plots not recovered; panels: Token exchange, Refresh]
  • Full workflow (AT -> AT+RT -> AT), 100 Hz for 6 h 30 min: [plots not recovered; panels: Token exchange, Refresh]
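
The two steps of the full workflow (token exchange, then refresh) can be sketched as the form fields a client would POST to the token endpoint. This is a minimal illustration, not the load-test code used for the measurements. The grant and token-type identifiers come from RFC 8693 and RFC 6749. The function names, the placeholder token values and the use of the `offline_access` scope to request a refresh token are assumptions. The endpoint URL and client authentication are left out. The last helper works out the size of the long test, assuming that 100 Hz means 100 workflow iterations per second.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_form(subject_token, scope="openid offline_access"):
    """Step 1 (AT -> AT+RT): form fields for a token-exchange request.

    Assumption: asking for the offline_access scope is what makes the
    issuer return a refresh token alongside the new access token.
    """
    return {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "scope": scope,
    }


def refresh_form(refresh_token):
    """Step 2 (RT -> AT): form fields for a refresh request (RFC 6749, section 6)."""
    return {"grant_type": "refresh_token", "refresh_token": refresh_token}


def total_iterations(rate_hz, hours, minutes=0):
    """Number of iterations performed at a constant rate over the given duration."""
    return rate_hz * (hours * 3600 + minutes * 60)


if __name__ == "__main__":
    # Placeholder token values; a real client would POST these bodies,
    # with its client credentials, to the issuer's token endpoint.
    print(urlencode(token_exchange_form("<access-token>")))
    print(urlencode(refresh_form("<refresh-token>")))
    print(total_iterations(100, 6, 30))
```

Under that reading of the rate, the 6 h 30 min run corresponds to 100 × 23,400 s = 2,340,000 iterations of the workflow.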

INDIGO IAM Code Improvements

  • Frequent releases
  • Aug - 1.10
    • AUP reminders
    • Ability to move to HA
  • Aug - 1.10.1
    • Bug fixes for 1.10
  • Nov/Dec - 1.10.2 (waiting for bug fix)
  • Technical students will join the team in 2025 to work on WLCG/CERN-specific features in INDIGO IAM

Extra Slides

Recent issues

  • Issue: AUP signing link led to an error page
    • Workaround: email templates overwritten locally with a working link
  • Issue: HR DB affiliations do not appear to be consistent with the secretariats’ input
    • Ticket: SNOW ticket RQF2906708; related issue https://github.com/indigo-iam/iam/issues/863
  • Issue: Security required removing the local login form
    • Workaround: local login hidden from the internet for Kubernetes instances; work in progress for OpenShift

Long term staffing

  • This topic has been raised within CERN IT; support from stakeholders is always welcome
  • Currently running well due to excellent (temporary) personnel and very strong collaboration within the CERN IT department, with CNAF and with experiment representatives
  • Although VOMS was staffed with 0.1 FTE, WLCG IAM should be expected to require additional effort for many reasons:
    • Additional features (OAuth)
    • Additional load on the infrastructure (depending on the chosen workflows)
    • A quickly evolving codebase, which means frequent upgrades and improvements
    • Additional instances - CERN is running WLCG IAM for several experiments that previously used VOMS hosted at DESY (these also require a slightly different login workflow as not all users have CERN accounts)
    • Lack of trust in the reliability of the token issuers -> longer token lifetimes -> poorer security