1 of 8

Next steps for the Cost and Performance Modelling WG

2 of 8

Objective

  • Collect and discuss ideas on the future direction
  • Some suggestions are proposed here
  • More to be added, hopefully today and in any case over the coming days and weeks

3 of 8

Aggregate results

  • Create a one-stop page for
    • Instructions to run the reference workloads
    • Systematic collection of performance data obtained with the tools we use
  • This should profit from the work of the benchmarking group

4 of 8

Understanding the use of data

  • Need an agreed and prioritized work plan of what we consider important to understand
  • Agree on a common way to present the results when we do studies for different experiments
  • Do we have all the data needed to do complete studies, or do we need data access measurements from the sites?
  • How to coordinate these studies with DOMA?

5 of 8

Site Cost Modeling

  • We have spreadsheets that can compute the TCO for HS06 and TB. Who should use the spreadsheets?
  • Maybe we should create reference spreadsheets with typical “reference” numbers, so that each interested site can plug in its local cost structure and compare it with the reference
  • How can the sheets be extended to also reflect future evolution (dCost/dt)?
  • Should they allow for server configurations with GPUs?
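
As a discussion aid, the kind of calculation such a reference sheet performs can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the content of the existing spreadsheets: the cost components (hardware amortisation, energy, yearly operations), the class and function names, and the constant-rate projection used as a stand-in for dCost/dt are all hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class ServerCost:
    """Hypothetical cost inputs for one server type (all values are placeholders)."""
    purchase_price: float      # currency units per server
    lifetime_years: float      # years the server stays in production
    power_kw: float            # average electrical draw in kW
    electricity_price: float   # currency units per kWh
    pue: float                 # power usage effectiveness of the data centre
    yearly_ops_cost: float     # staff, maintenance, space per server per year
    capacity: float            # HS06 for CPU servers, TB for storage servers


def tco_per_unit_year(s: ServerCost) -> float:
    """Yearly total cost of ownership per unit of capacity (HS06 or TB)."""
    hardware = s.purchase_price / s.lifetime_years
    energy = s.power_kw * s.pue * HOURS_PER_YEAR * s.electricity_price
    return (hardware + energy + s.yearly_ops_cost) / s.capacity


def relative_to_reference(site: ServerCost, reference: ServerCost) -> float:
    """Ratio of a site's unit cost to the reference unit cost (1.0 means equal)."""
    return tco_per_unit_year(site) / tco_per_unit_year(reference)


def project_unit_cost(cost_now: float, yearly_improvement: float, years: int) -> float:
    """Assumed constant fractional price decrease per year, a crude stand-in for dCost/dt."""
    return cost_now * (1.0 - yearly_improvement) ** years


# Placeholder numbers only; they do not describe any real site.
reference = ServerCost(5000, 5, 0.4, 0.1, 1.5, 200, 1000)
print(tco_per_unit_year(reference))
```

A site would replace the placeholder values with its local cost structure and compare the result against the reference with `relative_to_reference`. A GPU configuration could be handled the same way once an agreed capacity unit for accelerators exists.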

6 of 8

Resource Requirement Predictions and the common framework?

  • Should we stop dealing with this and rely on experiments providing input without allowing people to play with the values?
  • Something we are still missing is the calculation of network requirements. We should be able to calculate the required bandwidth at both the LAN and the WAN levels for the computing model being considered
  • The increasing role of GPUs and other accelerators will also have an impact on the resource estimation calculations. Initially this type of resource will probably be provided by specific facilities, but using GPUs at scale will considerably change the resource requirements
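
The missing network calculation mentioned above could start from something as simple as the following sketch. It is only an illustration under assumptions: the inputs (number of cores, average read rate per core, fraction of data read remotely, peak factor) and all function names are hypothetical and are not taken from the existing framework.

```python
def average_bandwidth_gbps(bytes_moved: float, seconds: float) -> float:
    """Average bandwidth in Gbit/s needed to move bytes_moved within the given time."""
    return bytes_moved * 8 / seconds / 1e9


def lan_bandwidth_gbps(cores: int, mb_per_s_per_core: float) -> float:
    """LAN bandwidth in Gbit/s if every core reads at mb_per_s_per_core (MB/s) from site storage."""
    return cores * mb_per_s_per_core * 1e6 * 8 / 1e9


def wan_bandwidth_gbps(lan_gbps: float, remote_fraction: float, peak_factor: float = 1.0) -> float:
    """WAN bandwidth in Gbit/s, assuming a fixed fraction of the input is read remotely.

    peak_factor scales the average up to an assumed peak load.
    """
    return lan_gbps * remote_fraction * peak_factor


# Placeholder inputs: 10000 cores, 5 MB/s per core, 25% remote reads, peak factor 2.
lan = lan_bandwidth_gbps(10000, 5.0)
wan = wan_bandwidth_gbps(lan, 0.25, peak_factor=2.0)
print(lan, wan)
```

The per-core read rate and the remote fraction are the quantities that would have to come from the experiments' computing models or from site measurements.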

7 of 8

Modeling site usage

  • The metrics work is needed as input. If we can capitalise on the work done by KIT, we will be much closer to the goal
    • https://indico.cern.ch/event/587955/contributions/2936249/

8 of 8

Miscellanea

  • Tape cost:
    • If not in absolute terms, then at least relative to disk
    • But take into account the complexity introduced by, e.g., Oracle dropping out
    • Correlate with the size of the tape installation
  • Metrics and performance measurement, common tools:
    • prmon usage by all experiments
  • Disk/tape: follow the DOMA QoS work to understand how to attach a cost to the various configurations being derived
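
If all experiments use prmon, a common way to summarise its output becomes possible. The sketch below assumes the monitoring time series is available as a tab-separated text table with a header row. The column names in the sample (`Time`, `wtime`, `pss`, `rss`) and the helper name `peak_per_metric` are illustrative assumptions and should be checked against the prmon documentation.

```python
import csv
import io


def peak_per_metric(text: str) -> dict:
    """Return the maximum of every numeric column in a tab-separated table with a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    peaks = {}
    for row in reader:
        for name, value in row.items():
            try:
                number = float(value)
            except (TypeError, ValueError):
                continue  # skip missing or non-numeric cells
            peaks[name] = max(peaks.get(name, number), number)
    return peaks


# Made-up sample data in the assumed layout.
sample = "Time\twtime\tpss\trss\n1\t0\t100\t120\n2\t10\t250\t260\n"
print(peak_per_metric(sample))
```

Agreeing on such a summary format across experiments would make the collected numbers directly comparable.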