11–14 Apr 2011
Radisson Blu Hotel Lietuva, Vilnius
Europe/Vilnius timezone

Experiment Dashboard Providing Generic Functionality for Monitoring of the Distributed Infrastructure

13 Apr 2011, 16:00
1h 30m
Lambda (Radisson Blu Hotel Lietuva, Vilnius)

Workshop User Support Services - Infrastructure General Workshops

Overview

The Worldwide LHC Computing Grid (WLCG) delivered a scalable
infrastructure for the experiments of the Large Hadron Collider at
CERN, which started data taking this year. Reliable monitoring is
crucial for achieving the necessary robustness and efficiency of the
infrastructure and, to a large extent, determines the success of the
LHC computing activities. At the same time, monitoring the WLCG
infrastructure is a challenging task, since the infrastructure is huge
and heterogeneous: it comprises different middleware platforms (gLite,
ARC and OSG) and integrates more than 170 computing centers in 34
countries. To provide monitoring of the distributed sites and
services, the Experiment Dashboard system developed several generic
solutions, which are shared by the LHC experiments but can also be
used by other virtual organisations.

URL

http://dashboard.cern.ch

Conclusions

The talk will give an overview of the Dashboard applications for
infrastructure monitoring, highlighting the possibility of using these
applications outside the LHC domain.

Impact

The Dashboard applications for infrastructure monitoring are widely
used by the LHC virtual organisations for computing shifts and site
commissioning activities. During the first year of data taking, the
Site Usability Dashboard and the Site Status Board became essential
components of the LHC computing operations.

Description of the work

The Experiment Dashboard system provides the following applications
for monitoring the distributed sites and services: Site Usability
Dashboard, Site Status Board and SiteView.
Site Usability Dashboard evaluates site usability based on the SAM
tests that are specific to a given virtual organisation (VO).
Site Status Board provides a flexible framework which allows VOs to
construct customized monitoring views based on the monitoring metrics
that are considered critical for the various computing activities at
the sites.
SiteView provides a single entry point for site administrators to
understand how the site is used by the LHC VOs and to detect possible
problems preventing the site from performing effectively.
Though initially focused on the needs of the LHC VOs, all the
applications are generic and can be adapted to the needs of other
communities.
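
As a purely illustrative sketch of what evaluating site usability from VO-specific tests could look like, the snippet below marks a site as usable only if every test the VO considers critical has passed. This is not the Dashboard's actual algorithm or API: the `TestResult` type, the `usable_sites` function, and the site and test names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestResult:
    """One test outcome for one site (hypothetical structure)."""
    site: str
    test: str
    passed: bool


def usable_sites(results, critical_tests):
    """Map each site to True if all VO-critical tests passed.

    Assumption: a critical test with no reported result counts as failed.
    Sites that reported no critical test at all are left out of the result.
    """
    by_site = {}
    for r in results:
        if r.test in critical_tests:
            by_site.setdefault(r.site, {})[r.test] = r.passed
    return {
        site: all(tests.get(t, False) for t in critical_tests)
        for site, tests in by_site.items()
    }


# Hypothetical results; a different VO would pass its own critical set.
example_results = [
    TestResult("SITE_A", "job-submit", True),
    TestResult("SITE_A", "file-copy", True),
    TestResult("SITE_A", "optional-check", False),  # not critical, ignored
    TestResult("SITE_B", "job-submit", True),
    TestResult("SITE_B", "file-copy", False),
]
example_status = usable_sites(example_results, {"job-submit", "file-copy"})
```

Because the set of critical tests is an argument, the same results can be evaluated differently for each VO, which mirrors the VO-specific evaluation described above.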
