6–8 May 2019
WCW Congress Centre
Europe/Amsterdam timezone

Boosting the CMS computing efficiency for data taking at higher luminosities

8 May 2019, 11:15
15m
Turing (WCW Congress Centre)

Science Park 123 1098 XG Amsterdam

Speaker

Leonardo Cristella (CERN)

Description

Thousands of physicists continuously analyze data collected by the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC), using the CMS Remote Analysis Builder (CRAB) and the CMS global pool to exploit the resources of the Worldwide LHC Computing Grid (WLCG). Efficient use of such an extensive and expensive system is crucial, and the previous design needs to be upgraded for the next data-taking period at unprecedented luminosities.

Supporting varied workflows while preserving efficient resource usage poses special challenges:
- scheduling jobs in a multicore/pilot model, where several single-core jobs with undefined runtimes run inside pilot jobs with a fixed lifetime;
- preventing too many concurrent reads from the same storage from pushing jobs into I/O wait, which leaves CPU cycles idle;
- monitoring user activity to detect low-efficiency workflows and automatically giving users advice, tailored to their use case, on smarter use of the resources.

In this talk we report on two novel complementary approaches adopted in CMS to improve the scheduling efficiency of user analysis jobs:
1. automatic job splitting;
2. automatic tuning of the estimated job running time.

Both aim at finding an appropriate value for the scheduling runtime. With automatic splitting, the job runtime is estimated upfront so that an appropriate scheduling runtime can be set. With automatic time tuning, the scheduling runtime is instead adjusted dynamically by analyzing the actual runtime of jobs after they finish.

We also report on how we used the flexibility of the global computing pool to tune the number, kind, and running locations of jobs that exploit remote access to the input data. We discuss the strategies, concepts, details, and operational experience, highlighting the pros and cons, and we show how these efforts have helped to improve the computing efficiency in CMS.
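The two approaches can be illustrated with a minimal sketch. This is not the CRAB implementation: the function names, the probe-run inputs, the 90th-percentile choice, and the 20% safety margin are all assumptions made for illustration only. The first function sizes jobs from a short probe measurement taken before submission; the second replaces a user-requested runtime with a value derived from jobs that have already finished.

```python
import math


def split_by_probe(events_total, probe_events, probe_seconds, target_seconds):
    """Upfront estimate (hypothetical): size jobs from a short probe run.

    Returns (events_per_job, number_of_jobs) so that each job is expected
    to run for about target_seconds.
    """
    # Events that fit into the target runtime at the probed processing rate.
    events_per_job = max(1, int(target_seconds * probe_events / probe_seconds))
    n_jobs = math.ceil(events_total / events_per_job)
    return events_per_job, n_jobs


def percentile(values, percent):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def tune_runtime(requested_seconds, finished_runtimes,
                 percent=90, margin_percent=120, min_samples=5):
    """After-the-fact tuning (hypothetical): derive the scheduling runtime
    from the measured runtimes of finished jobs of the same task.

    Falls back to the requested value until enough jobs have finished.
    """
    if len(finished_runtimes) < min_samples:
        return requested_seconds
    typical = percentile(finished_runtimes, percent)
    # Add a safety margin so that most jobs still fit in the estimate.
    return math.ceil(typical * margin_percent / 100)


if __name__ == "__main__":
    # Probe: 1000 events took 50 s; aim for 8-hour jobs over 1M events.
    print(split_by_probe(1_000_000, 1000, 50.0, 8 * 3600))
    # User asked for 20 h, but finished jobs ran for about 10 to 15 minutes.
    runtimes = [600, 700, 650, 800, 900, 750, 620, 680, 710, 3000]
    print(tune_runtime(72_000, runtimes))
```

A shorter, more realistic scheduling runtime makes it easier to fit single-core jobs into the remaining lifetime of a pilot, which is the motivation shared by both mechanisms described above.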
Type of abstract: Presentation
References: Computing efficiency, WLCG, High luminosity, CMS

Primary author

Leonardo Cristella (CERN)
