The reaper partitions its work differently from the other daemons (which rely on heartbeats)
Each reaper instance works on a fixed set of RSEs
If the RSE lists of two different reapers overlap, both reapers will process the same files
Some instances can end up with a lot of work while others stay idle → this limits the overall deletion rate
This static partitioning also makes deployment on a Kubernetes cluster difficult
The proposal is to implement the same heartbeat mechanism as the other daemons (see the sketch below)
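A minimal sketch of how such heartbeat coordination could work: each worker periodically records a beat, stale peers are discarded, and the worker derives its rank among the live workers. All names here (live, HEARTBEAT_TTL, the in-memory db dict) are illustrative assumptions, not the actual Rucio heartbeat API.

```python
# Illustrative heartbeat coordination sketch; names are assumptions,
# not Rucio's real API. A real implementation would use the shared
# heartbeats table in the database instead of an in-memory dict.
import time

HEARTBEAT_TTL = 600  # seconds of silence before a worker counts as dead


def live(db, executable, worker_id, now=None):
    """Record a heartbeat and return (my_rank, nr_live_workers)."""
    now = now or time.time()
    db[(executable, worker_id)] = now  # upsert this worker's beat
    cutoff = now - HEARTBEAT_TTL
    alive = sorted(w for (exe, w), ts in db.items()  # drop stale peers
                   if exe == executable and ts > cutoff)
    return alive.index(worker_id), len(alive)


# Usage: each reaper instance calls live() every cycle and derives its
# share of the work from (rank, total) instead of a static RSE list.
db = {}
rank, total = live(db, 'reaper', 'host-a:1234')
```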
Proposals
Partitioning based on DID hashes (see the first sketch after this list):
Should not be too difficult to implement
Potential issue: if millions of files from a single RSE sit at the top of the deletion queue, all the reapers will end up working on that same RSE
To address the issue above, first apply a reordering: take the x oldest replicas on RSE1 and put them in the deletion queue, then take the x oldest replicas on RSE2, and so on (see the second sketch after this list)
We can reuse the BEING_DELETED state
It would probably need a new daemon (reaper-preparer)
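First, a minimal sketch of the DID-hash partitioning idea, assuming each worker already knows its (rank, total) pair from the heartbeat mechanism above; owns_did and the candidate list are hypothetical names introduced for illustration.

```python
# DID-hash partitioning sketch: a worker only handles the replicas
# whose scope:name hashes into its slot, so overlap between workers
# is impossible by construction. Names here are illustrative.
import hashlib


def owns_did(scope, name, rank, total):
    """True if this worker (rank out of total) owns the given DID."""
    digest = hashlib.md5(f'{scope}:{name}'.encode()).hexdigest()
    return int(digest, 16) % total == rank


# Usage: filter the deletion candidates down to this worker's share.
candidates = [('cms', 'file_a'), ('cms', 'file_b'), ('atlas', 'file_c')]
mine = [did for did in candidates if owns_did(*did, rank=0, total=4)]
```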
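Second, a sketch of the reordering step a hypothetical reaper-preparer daemon could run in a loop; prepare_deletion_queue and the per-RSE inputs are illustrative assumptions. A real daemon would query the replica table and flip the selected rows to the BEING_DELETED state rather than build an in-memory list.

```python
# Reordering sketch for the proposed "reaper-preparer": interleave the
# x oldest candidates of each RSE so no single RSE monopolises the
# front of the deletion queue. Names are illustrative assumptions.
from itertools import islice


def prepare_deletion_queue(replicas_by_rse, x):
    """Take the x oldest replicas from each RSE, in RSE order.

    replicas_by_rse maps an RSE name to its deletion candidates,
    assumed to already be sorted oldest-first.
    """
    queue = []
    for rse, replicas in replicas_by_rse.items():
        queue.extend(islice(replicas, x))  # x oldest from this RSE
    return queue


# Usage: two RSEs, at most two replicas taken from each per pass.
queue = prepare_deletion_queue(
    {'RSE1': ['a1', 'a2', 'a3'], 'RSE2': ['b1', 'b2']}, x=2)
# -> ['a1', 'a2', 'b1', 'b2']
```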