EGI Operations Meeting - minutes

Attendance: Jan Kundrat, Edgars Znots, Vladimir Slavnic, Tomas Kouba, Torsten Antoni, Florian Zrenner, Christoph Witzig, Mario David, Dejan Lesjak, Miroslav Dobrucky, Alessandro Paolini, Mingchao Ma, Luuk Uljee, Paolo Veronesi, Emir Imamagic, Szabolcs Hernath, Gabor Roczei, Goncalo Borges, Christos Papachristos, Andrea Cristofori, Anders Wnnen, Onur Temizsoylu, Feyza Eryol, Marcin Radecki, Malgorzata Krakowian, Michaela Lechner, Michael Gronager, CNRS/France over the phone bridge

Minute taker: Marcin

1. Status of gLite releases (Mario)
Mario presented the status of gLite, following the attached presentation. Outstanding remarks:
* New node type - "glite cluster". To be deployed on the CE. It allows publishing subclusters for nodes of different types.
* More early adopters for gLite 3.1 are needed.
* LIP has done sBDII tests - seems OK. Able to publish resources with the GLUE 2.0 schema.
Comments: Michael Gronager raised a complaint about the use of the GGUS DMSU team: we should have something that lets people trigger the update process. Savannah bugs are not good for that, as Savannah is internal to developers and we need to contact sites and users. This needs to be advertised to all production sites and users: they should use GGUS, which will route bugs in the middleware to DMSU.

2. Staged rollout workflow (Mario)
The whole process is depicted in the milestone 402 document. The full workflow will be put on the wiki. There will be shorter guidelines for each team describing what that team has to do (for example, for EA sites).

3. CA releases in the EGI era
The process is described at the given URL. The process needs to be reviewed. *Everybody is encouraged to take a look*
Michaela: The document more or less claims that we have 2 policies while in fact there is one; it also contains a reference to the old ROC managers mailing list, personal addresses, etc. Shall we contact David Groep directly?
Mario: Send it to David G. and maybe CC me. David G. is responsible for updating the wiki.

4. gLite 3.1 to 3.2 migration
(Mario) Feedback from NGIs on supported and end-of-life gLite components. Everybody should make a plan. If gLite 3.1 components need to remain supported, this must be made known. Some NGIs have already started to give feedback: NGI_AEGIS, CERN, NGI_TR, Ibergrid.

5. Operational Tools Deployment (Emir)
a) Ops. tool mailing list - for operational tool administrators: announcing new releases, communication between admins; developers can answer (regional Nagios admins). So far 41 members.
   address: tool-admins@mailman.egi.eu
   registration: https://mailman.egi.eu/mailman/listinfo/tool-admins
b) Migration to the new domain egi.eu - going well; GOCDB is harder to migrate, to be done by the end of the month.
c) Availability calculation for operational tools - RT tickets were opened for the tool developers to provide tools to monitor them; 5-6 tickets in the JRA1 queue.
d) Tools failover configuration - not much progress on that one.
e) Deployment
* Nagios boxes - there are 9 NGI instances and 8 ROC instances (covering 29 EGI countries), South America, 5 project instances (run by CERN); 3 of them have started migration: AP - migration to a ROC instance, Canada - almost finished, Russia - tricky
* Operations portal (developed by CNRS) - distributed to Greek, CZ, SWE. They are validating it.
* There was a minor GGUS release and a minor GOCDB release. GOCDB pointed to the new PI.

6. COD issues (Małgorzata)
Collecting feedback for the May Availability/Reliability report ran into problems with responsiveness. The Russian ROC did not respond. No one representing the Russian ROC was present at the meeting, so the issue will be escalated to the OMB.
Goncalo: DI-Uminho has already been suspended in agreement with the site responsible; the site is not capable of sustaining itself in the EGI production infrastructure.

Proposal to suspend non-responsive sites (Marcin)
Marcin proposed to suspend sites that did not provide any explanation for their low availability/reliability.
No answer means that the NGI was also unable to contact them, so the site can be considered unmanageable.
Goncalo: Some sites with low availability were found in the NGI_IBERGRID A/R report recently, but that seems to be because they were certified in the middle of the month and availability is computed as if they had been in production from the beginning of the month.
Marcin: The issue will be followed up with Dimitris, who is responsible for chasing such cases before the report is announced to NOC managers.
No more comments heard.
Malgorzata: Tickets asking for explanations for June were opened today; please have a look and collect feedback.

Review of open actions (RT "operations" queue)
2 - will be done in the next few days, so EA sites know what they should do.
3 - reported at the meeting today.
4 - may be interesting for other NGIs, please take a look.
5 - Daniele will update the RT ticket. Gilles said it is not difficult to add a new service, but someone has to come up with the procedure, e.g. this type of service we allow, this one not. Daniele's proposal: requests for a new machine type should go through OTAG.
6 - NGIs can comment.
7 - wait for Tiziana to comment.
8 - will be followed in the EGI era.

AOB
Michaela: Dedicated meeting for milestone 407 "integration resources in EGI production infrastructure" - assess the status of interoperability activities, ops. tools procedures. Come together and discuss what to put into the milestone. This Thursday 10.00. An e-mail announcement will be sent.

Next meeting: Mario will not be here in 2 weeks; next meeting 23.08.2010