OTAG Meeting: Monitoring (12 Sep 2016)
Cyril L'Orphelin (NGI_FRANCE)
Ionut Vasile (NGI_RO)
Stephen Burke
Vincenzo Spinoso (EGI Operations)
Ian Neilson (STFC)
Miroslav Dobrucky (NGI_SK)
Emir Imamagic (SRCE/ARGO)
Sven Gabriel (Nikhef, EGI CSIRT)
David Meredith (STFC, GOCDB)
Anna Golik (NGI_PL)
Alessandro Paolini (EGI Operations)
David Crooks (NGI_UK)
Luis Alves (CSC/NGI_FI)
Christos (GRNET)
Peter Solagna (EGI Operations)
Presentation about the current ARGO status and development roadmap.
Questions and comments:
At the moment the central Nagios does not send email alerts to site administrators. This is because the centralized monitoring has two Nagios boxes, and enabling notifications would send too many notifications. Centralizing notifications at the computing engine level will allow for a reliable (high-availability) and controlled notification system.
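The duplicate-notification problem described above can be illustrated with a minimal Python sketch. This is not the ARGO implementation: the `Alert` type, the `deduplicate` function, the box labels, the host and service names, and the 300-second window are all hypothetical, chosen only to show how a single central component could collapse identical alerts reported by two monitoring boxes into one notification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; the fields are assumptions for illustration."""
    source: str      # which monitoring box reported the alert
    host: str
    service: str
    status: str      # e.g. "CRITICAL"
    timestamp: int   # seconds since epoch


def deduplicate(alerts, window=300):
    """Keep one alert per (host, service, status) within `window` seconds."""
    last_sent = {}
    unique = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.host, alert.service, alert.status)
        previous = last_sent.get(key)
        # Send only if this problem was not already notified recently.
        if previous is None or alert.timestamp - previous >= window:
            unique.append(alert)
            last_sent[key] = alert.timestamp
    return unique


# Illustrative data: both boxes report the same failure 30 seconds apart.
alerts = [
    Alert("box-a", "ce01.example.org", "compute", "CRITICAL", 1000),
    Alert("box-b", "ce01.example.org", "compute", "CRITICAL", 1030),
    Alert("box-a", "se01.example.org", "storage", "WARNING", 1100),
]
notifications = deduplicate(alerts)
```

With this sample input, the second report of the same failure is suppressed, so site administrators would receive two notifications instead of three.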
Presentation about the ARGO proposal to unify all the topology information for ARGO under GOCDB. Slides and document.
Comments:
VO information: The proposal does not restrict the number of endpoints per service, and VO information can be added. However, the proposal does not explicitly require VO information in GOCDB; endpoints marked as "monitored" will have to support the 'ops' VO.
The proposal will be compatible with other uses of GOCDB that require VO information (for example, added as a key-value pair on an endpoint), but at the moment this will not be mandatory.
Automatic update: at the moment the best option is to update GOCDB manually. Ideally, endpoints should not change often enough to require information to be fed to GOCDB automatically. From a functional point of view, GOCDB information should be relatively static and should be the reference source of information even if, say, services have errors in their local configurations.
Security monitoring: ideally the same topology information could be used for security monitoring, but secmon has additional requirements that need to be discussed in a separate meeting.
Information not in sync with BDII: manual insertion of information in GOCDB can generate mismatches with what is published in the BDII.
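The mismatch concern in the last comment can be sketched as a simple set comparison. The Python below is only an illustration under stated assumptions: it does not query GOCDB or the BDII, the endpoint lists are hard-coded sample data, and the `find_mismatches` function and the (hostname, service type) representation are hypothetical.

```python
def find_mismatches(gocdb_endpoints, bdii_endpoints):
    """Compare two collections of (hostname, service_type) pairs.

    Both inputs are assumed to be already extracted from their sources;
    this sketch does not contact GOCDB or the BDII.
    """
    gocdb = set(gocdb_endpoints)
    bdii = set(bdii_endpoints)
    return {
        "only_in_gocdb": sorted(gocdb - bdii),
        "only_in_bdii": sorted(bdii - gocdb),
    }


# Illustrative sample data (hypothetical hosts and service types).
gocdb_sample = [
    ("ce01.example.org", "compute"),
    ("se01.example.org", "storage"),
]
bdii_sample = [
    ("ce01.example.org", "compute"),
    ("se02.example.org", "storage"),
]
report = find_mismatches(gocdb_sample, bdii_sample)
```

A report of this kind could help site managers spot endpoints that were entered manually in GOCDB but are published differently, or not at all, in the BDII.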
Action: NGIs to disseminate this proposal among site managers and provide feedback. It will also be presented at the OMB meeting (Sep 15th).
If the proposal is approved, a timeline will be presented at the OMB (possibly October).
A number of requirements suggested by Ionut (NGI_RO) have been discussed.
Contextualization based on user membership of Sites or NGIs:
ARGO is being connected to the EGI AAI; contextualization is not achievable in the immediate future.
Other requirements may have a shorter timeline to implement. All the requirements will be evaluated by the ARGO team.
A dedicated queue for ARGO requirements is going to be implemented in RT to track all the requirements.
Given the limited number of requirements available so far, there is no immediate need for prioritization.
A summary will be presented at the OMB on September 15th.