The reaper partitions its work differently from the other daemons (which rely on heartbeats)
Each reaper instance works on a fixed set of RSEs
If the RSE lists of two different reapers overlap, both reapers will process the same files
Some instances can end up with a lot of work while others stay idle → this limits the overall deletion rate
This static partitioning also makes deployment on a Kubernetes cluster difficult
The proposal is to implement the same heartbeat mechanism as the other daemons (see the sketch below)
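A minimal sketch of how such heartbeat coordination could work: each worker periodically records a beat, stale peers are discarded, and the worker derives its rank among the live workers. All names here (live, HEARTBEAT_TTL, the in-memory db dict) are illustrative assumptions, not the actual Rucio heartbeat API.

```python
# Illustrative heartbeat coordination sketch; names are assumptions,
# not Rucio's real API. A real implementation would use the shared
# heartbeats table in the database instead of an in-memory dict.
import time

HEARTBEAT_TTL = 600  # seconds of silence before a worker counts as dead


def live(db, executable, worker_id, now=None):
    """Record a heartbeat and return (my_rank, nr_live_workers)."""
    now = now or time.time()
    db[(executable, worker_id)] = now  # upsert this worker's beat
    cutoff = now - HEARTBEAT_TTL
    alive = sorted(w for (exe, w), ts in db.items()  # drop stale peers
                   if exe == executable and ts > cutoff)
    return alive.index(worker_id), len(alive)


# Usage: each reaper instance calls live() every cycle and derives its
# share of the work from (rank, total) instead of a static RSE list.
db = {}
rank, total = live(db, 'reaper', 'host-a:1234')
```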
Proposals
Partitioning based on DID hashes (see the first sketch after this list):
Should not be too difficult to implement
Potential issue: if millions of files from a single RSE sit at the top of the deletion queue, all the reapers will end up working on that same RSE
To address the issue above, first apply a reordering: take the x oldest replicas on RSE1 and put them in the deletion queue, then take the x oldest replicas on RSE2, and so on (see the second sketch after this list)
We can reuse the BEING_DELETED state
It would probably need a new daemon (reaper-preparer)
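First, a minimal sketch of the DID-hash partitioning idea, assuming each worker already knows its (rank, total) pair from the heartbeat mechanism above; owns_did and the candidate list are hypothetical names introduced for illustration.

```python
# DID-hash partitioning sketch: a worker only handles the replicas
# whose scope:name hashes into its slot, so overlap between workers
# is impossible by construction. Names here are illustrative.
import hashlib


def owns_did(scope, name, rank, total):
    """True if this worker (rank out of total) owns the given DID."""
    digest = hashlib.md5(f'{scope}:{name}'.encode()).hexdigest()
    return int(digest, 16) % total == rank


# Usage: filter the deletion candidates down to this worker's share.
candidates = [('cms', 'file_a'), ('cms', 'file_b'), ('atlas', 'file_c')]
mine = [did for did in candidates if owns_did(*did, rank=0, total=4)]
```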
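Second, a sketch of the reordering step a hypothetical reaper-preparer daemon could run in a loop; prepare_deletion_queue and the per-RSE inputs are illustrative assumptions. A real daemon would query the replica table and flip the selected rows to the BEING_DELETED state rather than build an in-memory list.

```python
# Reordering sketch for the proposed "reaper-preparer": interleave the
# x oldest candidates of each RSE so no single RSE monopolises the
# front of the deletion queue. Names are illustrative assumptions.
from itertools import islice


def prepare_deletion_queue(replicas_by_rse, x):
    """Take the x oldest replicas from each RSE, in RSE order.

    replicas_by_rse maps an RSE name to its deletion candidates,
    assumed to already be sorted oldest-first.
    """
    queue = []
    for rse, replicas in replicas_by_rse.items():
        queue.extend(islice(replicas, x))  # x oldest from this RSE
    return queue


# Usage: two RSEs, at most two replicas taken from each per pass.
queue = prepare_deletion_queue(
    {'RSE1': ['a1', 'a2', 'a3'], 'RSE2': ['b1', 'b2']}, x=2)
# -> ['a1', 'a2', 'b1', 'b2']
```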