First results from a combined analysis of CERN computing infrastructure metrics

Title: First results from a combined analysis of CERN computing infrastructure metrics
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Nieke, C., and D. Duellmann
Conference Name: 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP2016)
Date Published: 10/2016
Conference Location: San Francisco, USA
Abstract

The IT Analysis Working Group (AWG) has been formed at CERN across
individual computing units and the experiments to attempt a cross-cutting analysis of computing
infrastructure and application metrics. In this presentation we will describe the first results
obtained using medium- and long-term data (1 month to 1 year), correlating box-level metrics, job-level
metrics from LSF and HTCondor, I/O metrics from the physics analysis disk pools (EOS),
and networking and application-level metrics from the experiment dashboards. We will cover
in particular the measurement of hardware performance and prediction of job durations, the
latency sensitivity of different job types and a search for bottlenecks with the production job
mix in the current infrastructure. The presentation will conclude by proposing a small
set of metrics that simplifies drawing conclusions in the more constrained environment of public
cloud deployments as well.