EGI Monitoring is the key service for gaining insights into the Services that make up the EGI Infrastructure. It is based on the ARGO Monitoring Service, which provides a flexible and scalable framework for monitoring the status, availability and reliability of a wide range of services, and which can quickly detect, correlate and analyze data to identify errors. Service Providers can use the EGI Monitoring Service through various sources of truth (e.g. CMDB, EOSC Resource Catalogue) to receive notifications when a problem occurs, or to obtain ARGO reports with which they can confidently advertise the stability and reliability of their services. Similarly, Researchers and Research Communities can gain insights into the Services they want to use.
Two new functionalities will provide even better insights into Services: Service Trends and Status Pages. Because the services are monitored constantly, we can analyze service trends and provide insights such as lists of the top services with Critical, Warning or Unknown status, or the top services with authentication problems. Whether it is a server issue or a bug in production, the simple truth is that problems happen. The main idea of Status Pages is to build community trust by showing the real-time status of the services in one simple view.
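As a rough illustration of the kind of ranking that Service Trends implies, the sketch below counts how often each service reported a given status and lists the most affected ones. The sample data, the service names and the `top_services` helper are hypothetical and do not reflect the ARGO data model or API.

```python
from collections import Counter

# Hypothetical monitoring results as (service name, status) pairs.
# The flat tuple layout is an assumption made for illustration only.
RESULTS = [
    ("storage.example.org", "CRITICAL"),
    ("storage.example.org", "OK"),
    ("storage.example.org", "CRITICAL"),
    ("compute.example.org", "WARNING"),
    ("compute.example.org", "CRITICAL"),
    ("portal.example.org", "OK"),
    ("portal.example.org", "UNKNOWN"),
]


def top_services(results, status, limit=3):
    """Rank services by the number of results that carry the given status."""
    counts = Counter(service for service, s in results if s == status)
    return counts.most_common(limit)


print(top_services(RESULTS, "CRITICAL"))
# prints [('storage.example.org', 2), ('compute.example.org', 1)]
```

The same helper could be called with "WARNING" or "UNKNOWN" to produce the other lists mentioned above.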
We plan to streamline the process of registering new metrics and probes, allowing new metrics to be included in ARGO reports more quickly. We also provide a new all-inclusive report that covers all deployed metrics by default. Finally, EGI Monitoring can export monitoring results via API or ARGO Messaging to third-party dashboards and to EOSC Exchange Monitoring, further promoting the availability and reliability of the services that make up the EGI Service Portfolio.
|Topic||EOSC Compute Platform|