Attendance
Dimitris Zilaskos
Christos Kannelopoulos
Marcin Radecki (replacing Tomasz Szepieniec)
Alessandro Paolini
Helene Caudier
Dusan Vugradovic
Mats Nylen
1. Recap of previous meeting actions, for which mails have been sent:
- Helene: feedback on the VO resource assessment CIC feature
- Helene: feedback on alarm/team ticket status
- Dimitris: feedback regarding impact of the increase of the suspension limit to 60/60
- Dimitris: Kostas's comment regarding the increase of the suspension limit.
Helene: The correct URL for the new tool will be sent. Another feature of the VO resource assessment tool is the ability to give a VO a display of the sites where the VO is supported, rather than only which VOs a site supports.
Helene: No timeframes are defined in Biomed for response to tickets; 2-4 hours in the WLCG MoU for alarm tickets for LHC VOs.
Dimitris: Regarding the increase of the thresholds for suspension, we can propose Kostas's ideas for introduction in the 2nd year of the project.
Marcin: Do we need to start measuring this 3 months before the 2nd year, or start right now, begin enforcing it in the 2nd year and have the first suspension in March or January?
Tiziana: The first suspension will occur in May.
Helene: Consider the case of low availability for a site that is about to be decommissioned
Tiziana: Decommissioning goes together with the certification procedure; provide Vera with feedback.
Timeframe for a VO to pull data: 3 months should be enough.
Helene: The documentation should be best practices for certification and decommissioning, but not actually enforced.
Tiziana: A common ground is needed for everyone; for example, accounting is not checked in some NGIs. Follow up with Vera.
Helene: If such a procedure is adopted as official, it should be an EGI requirement from NGIs, not sites.
2. Recap previous meeting decisions
* Remove minimum storage/core requirements
Tiziana: During the meeting with the ACE team, James proposed to adapt the VO feed mechanism, which tells the ATP, using XML, which virtual sites are important for a VO. These could be services that are hosted at different physical sites. This could be expanded to a customizable list of the services that are needed. James also proposed to implement an EGI site for central operations/middleware services. ACE needs no change to calculate on this. Suggestion: web pages visualizing the virtual site on demand, as used by WLCG, are not a clean solution for operations; it is better to have a virtual site configured in the GOCDB. This was proposed to Gilles, who will think about it. The VO feed can be adapted very quickly: a virtual site is created in the GOCDB, and that is then fed to ACE.
Tiziana: Attach timelines for features that do not need operations work, and the relevant timeline for those that need development.
Helene: Who is responsible for creating the virtual site?
Tiziana: Site manager/NGI, then VO. No point in asking WLCG to switch to that or move to GOCDB. Other non-LHC VOs might find it useful. No need to add this to the OLA; it probably belongs more in a VO-site or VO-NGI OLA.
Dimitris: How to add the concept of virtual services to the OLA?
Helene: EGI should be responsible. Add a sentence in the EGI-NGI OLA about fulfillment of requirements of core services for VOs.
Dimitris: Perhaps define that the site should specify at least one service as a minimum requirement, instead of cores and storage as it is currently.
Dusan: At least one service + site BDII, which is mandatory for a site to publish information.
Tiziana: Important not to compare things that are not the same, it is a radical change. NGIs may not like that. EGEE had common case for all to make comparison easier.
Marcin: There are implications for COD work; procedures are based on the results of the critical tests.
Mats: Don't make it completely flexible at the site admin's discretion; there must be a sensible list of services. Otherwise agrees.
Alessandro: Agrees, a site should provide computing and/or storage resources. In theory a site could provide just a WMS or an LFC; the site BDII is mandatory.
Tiziana: There is also the extreme case of an LHCb site taking care only of LHCb services.
Marcin: How is a service categorized as critical? It will affect ROD and COD work, for example for alarms.
Tiziana: Either leave it to site managers, or decide in an agreement with VOs which services are critical, or let the NGI customize the OLA by adding its own services. For comparison, either keep the same list as the current one, or just add the site BDII and one additional service as extra requirements. It requires more thought; for now mention both options.
* GGUS tickets increase 8 hours:
Regarding alarm tickets:
Helene: They may be misused; better to start with team tickets, for well-trained VOs.
Tiziana: It is not clear how they are handled, even by WLCG Tier-1s. Maybe make them available to all communities, and set the limit to acknowledge a ticket to 4 hours.
Tiziana: Perhaps it is better to start with something simple: the ticket is to be acknowledged within the timeframe defined for each GGUS priority.
Dimitris: Should we add these time limits to the OLA, or somewhere else?
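Illustration (not discussed in the meeting): a minimal sketch of what "acknowledged within the timeframe defined for each GGUS priority" could look like as a check. The priority names and all limits in ACK_LIMITS are placeholders assumed for the example; only the 4-hour figure for the most urgent tickets was mentioned above, and no per-priority limits were agreed. Working hours and weekends are ignored.

```python
from datetime import datetime, timedelta

# Hypothetical acknowledgement limits per ticket priority.
# These values are placeholders, NOT agreed figures.
ACK_LIMITS = {
    "top priority": timedelta(hours=4),
    "very urgent": timedelta(hours=8),
    "urgent": timedelta(hours=24),
    "less urgent": timedelta(hours=48),
}


def acknowledged_in_time(priority, created, acknowledged, limits=ACK_LIMITS):
    """Return True if the ticket was acknowledged within the limit for its priority."""
    return acknowledged - created <= limits[priority]


# Example with made-up timestamps: acknowledged 3 hours after creation.
created = datetime(2010, 10, 1, 9, 0)
print(acknowledged_in_time("top priority", created, created + timedelta(hours=3)))
```

Keeping the limits in a single table would make it easy to move them between the OLA and another document, whichever is decided.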
3. Go through the ACE meeting notes and discuss their impact
This was not discussed, as the notes have been sent to the list and some parts have already been discussed.
4. Points from previous meeting agenda that were not discussed due to lack of time (some could be discussed in 2.) :
* Early adopters:
Tiziana: This is related to what is critical and what is not. One option is to make it not critical by extending GOCDB with the ability to flag the site as an early adopter and not include it in the statistics. Advantage: can have an EGI virtual site hosting all early adopters.
Helene: Sites feel they do not have a clear say when a test is failing because of something that is not their fault. A "bottom-up" procedure.
Action for Dimitris: how to make this better
Helene: Nagios comments/downtimes; site admins need to know what is taken into account.
* Core services and operational tools:
Tiziana: ACE needs to handle them as a virtual service at EGI/NGI level. Minimum thresholds: not less than 95%.
Dusan: agrees
Helene: Not sure; it is a little higher than the availability of last year. Doesn't think it is too high, but it has to be checked whether it is reasonable. A gradual increase approach is better.
Tiziana: Would need some history to judge whether the thresholds are reasonable.
Mats: If a service is defined as core, 95% is not very high; 99% is better. Such services are: accounting portal/repository, dashboard, regional ticketing, Nagios. But are they all core services? Accounting is not that important.
Tiziana: core service is central, either to the EGI or NGI
Tiziana: Learn from experience what is reasonable
Helene: Start with a short list first
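Illustration (not discussed in the meeting): to put the 95% and 99% figures above in perspective, the short sketch below converts an availability threshold into the downtime it permits. The 30-day period and the function name are assumptions made for the example; the actual period over which ACE computes availability is not stated in these minutes.

```python
def allowed_downtime_hours(availability_percent, period_hours=30 * 24):
    """Hours of downtime permitted in a period at the given availability threshold."""
    return period_hours * (100 - availability_percent) / 100


# Compare the two thresholds mentioned in the discussion over an assumed 30-day period.
for threshold in (95, 99):
    hours = allowed_downtime_hours(threshold)
    print(f"{threshold}% over 30 days allows {hours:.1f} hours of downtime")
```

Under these assumptions, 95% corresponds to 36 hours of downtime per 30 days and 99% to 7.2 hours.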
Actions for Dimitris:
Provide a short summary of the TF and the work so far for the next meeting
Draft an amended OLA proposal to present to the OMB that will take place in November