1 of 10

ATLAS status report

L Rinaldi, A De Salvo

INFN Tier-1 CdG, September 28th 2018

2 of 10

ATLAS data taking

LHC data taking has just restarted. Integrated luminosity is already above the 2017 total (53.9 vs 50.2 fb⁻¹)

3 of 10

Global Processing activities

Average 320K jobs/day, peak 420K jobs/day

4 of 10

INFN-T1 processing activities (cumulative)

Average 25K jobs/day, peak 77K jobs/day

Mainly simulation activity (58%) and user analysis (23%)

5 of 10

Harvester and Panda unified queue (UCORE)

Shares: Analysis, Production, Evgen, Simul, Reproc, Group production

  • ATLAS wants to decide how many resources to assign to each share
  • This is impossible to control through independent SCORE and MCORE queues at each site
  • ⇒ Unified queues with a centrally controlled SCORE/MCORE ratio that depends on share status
  • For unified queues to work, sites need to:
    • Avoid static SCORE-MCORE partitions: all machines must be able to run both SCORE and MCORE jobs
    • Allow submission of the different job types to the same batch queue and honor the CPU and memory requirements in the JDL
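The site requirements above can be sketched in a few lines. This is a toy illustration only, not the Harvester/PanDA or batch-system implementation: the names `Job`, `Node`, `place` and `run_unified_queue`, the node sizes and the memory figures are all hypothetical. It shows one queue accepting both job types, with placement driven purely by the requested cores and memory rather than by a static partition.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int       # 1 for SCORE, 8 for MCORE in this sketch
    memory_mb: int   # memory requested in the job description


@dataclass
class Node:
    name: str
    free_cores: int
    free_memory_mb: int


def place(job, nodes):
    """Return the first node that satisfies the job's request, or None."""
    for node in nodes:
        if node.free_cores >= job.cores and node.free_memory_mb >= job.memory_mb:
            # Reserve the resources on the chosen node.
            node.free_cores -= job.cores
            node.free_memory_mb -= job.memory_mb
            return node.name
    return None


def run_unified_queue(queue, nodes):
    """Place SCORE and MCORE jobs from one queue onto the same pool of nodes."""
    return {job.name: place(job, nodes) for job in queue}


# Hypothetical pool: two identical 8-core nodes, no SCORE/MCORE partition.
nodes = [Node("wn1", 8, 16000), Node("wn2", 8, 16000)]
queue = [
    Job("simul-score", 1, 2000),
    Job("reproc-mcore", 8, 16000),
    Job("analysis-score", 1, 2000),
]
placement = run_unified_queue(queue, nodes)
print(placement)
```

In this sketch the 8-core job lands on the second node because the first one already holds a 1-core job, which hints at why mixing job widths on shared nodes needs care in the scheduler.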

...

SCORE

MCORE

6 of 10

Usage of UCORE (-->MCORE) at INFN-T1

CINECA

CNAF

Expected 1-core/8-core ratio: 10% (1-core used for MC simulation only)

Now the ratio is OK (before 9/15 mcore nodes switched off for intervention),

BUT in the present system resource allocation is faster for 1-core (8-core filling could be slower)

HTCondor would behave better, but it is still in a preliminary testing phase
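Why 1-core allocation is faster than 8-core filling can be seen with a small counting example. This is a sketch under stated assumptions, not a model of the actual INFN-T1 batch system: the function `startable_jobs` and the snapshot of free cores per node are hypothetical. A 1-core job can use any free core, while an 8-core job needs 8 free cores on the same node at the same time.

```python
def startable_jobs(free_cores_per_node, cores_per_job):
    """Count how many jobs of the given width could start right now."""
    return sum(free // cores_per_job for free in free_cores_per_node)


# Hypothetical snapshot: free cores on five worker nodes.
free_cores = [3, 5, 2, 8, 6]

score_startable = startable_jobs(free_cores, 1)  # every free core is usable
mcore_startable = startable_jobs(free_cores, 8)  # only full 8-core blocks count

print(score_startable, mcore_startable)
```

With 24 free cores in total, 24 1-core jobs could start but only one 8-core job, because the free cores are scattered across nodes.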

7 of 10

Tape Carousel (greater use of tape-based workflow)

10 of 10

The StoRMy summer

Frequent crashes of the ATLAS StoRM FE (details in Lucia’s slides),

mainly due to the high number of SRM requests (mostly from WNs)

Several fixes applied to StoRM (thanks to many); tests now show that the system can handle a high load (compatible with production load)