INTRODUCTION/T. Ferrari

The topic of the EGI/EUDAT/PRACE pilot4 group is information discovery. The objective of this meeting is to understand the commonality of the requirements between MAPPER and VERCE, and to define a mandate once a set of use cases is selected. The duration of the pilot depends on the milestones that will be defined.

VERCE/Iraklis

VERCE is a support infrastructure for seismologists. At the workshop a broader set of requirements was presented.
- The virtual platform for seismologists will be based on existing technologies.
- The platform is based on streams (not file based).
- Private clusters/storage will be integrated with HPC and Grids.
- Data will be stored with different levels of persistency.
- Data will have to be pulled from different storage solutions, with searching based on metadata.
- Non-permanent archiving is also needed.
- An experiment needs to be repeatable.
- Data/metadata may be stored by VERCE partners or elsewhere.
- VERCE also relies on processing of data.

EGI: recording of metadata in EMI is based on AMGA, a distributed catalog of metadata, i.e. key/value pairs describing research data. It allows users to annotate and search files based on their content and descriptions. See manual: https://twiki.cern.ch/twiki/pub/EMI/AMGA/amga-manual_2_3_0.pdf

MAPPER

The relevance of EUDAT services is still to be understood.
At the moment the two most relevant infrastructures are EGI and PRACE.
- services/resources to which access is authorized
- where jobs can be submitted
- availability of resources in the short and medium term (in the future)

Terminology:
- resource --> computing service (cluster)
- service --> storage services, visualization services

Information: properties of services, status of services (storage resources), status of individual jobs.
- Information should be consistently available from different infrastructures.
- Accounting of storage space --> in prototype in EGI
- Job status monitoring --> LB, publishing of job status through messaging
- Some resources can be used as gateways to a given infrastructure; however, in PRACE this is mostly static.

In what form is this information presented? What interfaces are available? --> LDAP

EGI: BDII is the service for publishing of data (mainly static and semi-static information about services). https://tomtools.cern.ch/confluence/display/IS/BDII
- Information can be extracted through LDAP.
- Information is published using the GLUE format: https://wiki.egi.eu/wiki/Other_Operations_Documentation#Standards
- Information about job status can depend on how computing jobs are submitted to the infrastructure: for example, it can be extracted from the Logging and Bookkeeping (LB) service for jobs submitted through WMS (Workload Management System).
- Storage accounting is being prototyped by EGI based on the Storage Accounting usage record standard of OGF. This schema is being considered by EUDAT as well.

ACTIONS

A mandate needs to be defined. In order to proceed, two actions are identified (deadline 15-02-2013).
A1. MAPPER/Ilya to identify the two most important use cases of MAPPER.
A2. VERCE/Iraklis to define the data discovery use cases of VERCE, possibly describing the technical solutions that are being considered by VERCE (if any yet).

The scope of pilot1 needs to be checked after the kickoff to understand if data discovery is part of their mandate.

NEXT MEETING

F2F on the 12th of March, co-located with the EUDAT User Forum.
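
ANNEX: LDAP discovery sketch

As a minimal illustration of the LDAP-based discovery noted under MAPPER (BDII publishing GLUE information), the sketch below builds an LDAP search filter for services of a given type and reads the matching entries from LDIF text. It assumes the GLUE 2.0 LDAP attribute names (GLUE2Service, GLUE2ServiceType, GLUE2ServiceID). The helper names, the sample entries and the service type strings are hypothetical and were not discussed at the meeting. The parser handles plain "attribute: value" lines only.

```python
# Sketch only: helper names and SAMPLE_LDIF are invented for illustration.
# Attribute names are assumed to follow the GLUE 2.0 LDAP rendering.


def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = [
        ("\\", "\\5c"),  # backslash must be handled first
        ("*", "\\2a"),
        ("(", "\\28"),
        (")", "\\29"),
        ("\x00", "\\00"),
    ]
    for char, escaped in replacements:
        value = value.replace(char, escaped)
    return value


def build_service_filter(service_type):
    """Return an LDAP filter selecting GLUE2 services of one type."""
    return "(&(objectClass=GLUE2Service)(GLUE2ServiceType=%s))" % (
        escape_filter_value(service_type)
    )


def parse_ldif(text):
    """Parse simple LDIF text into a list of {attribute: [values]} dicts.

    Entries are separated by blank lines. Base64 values and folded
    (continued) lines are not supported in this sketch.
    """
    entries = []
    current = {}
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # LDIF comment
        attr, sep, value = line.partition(": ")
        if sep:
            current.setdefault(attr, []).append(value)
    if current:
        entries.append(current)
    return entries


def services_of_type(entries, service_type):
    """Return the GLUE2ServiceID of every entry with the given service type."""
    return [
        entry["GLUE2ServiceID"][0]
        for entry in entries
        if service_type in entry.get("GLUE2ServiceType", [])
        and "GLUE2ServiceID" in entry
    ]


# Made-up sample of what an LDAP query result could look like as LDIF.
SAMPLE_LDIF = """\
dn: GLUE2ServiceID=example-ce,GLUE2GroupID=resource,o=glue
objectClass: GLUE2Service
GLUE2ServiceID: example-ce
GLUE2ServiceType: org.example.compute

dn: GLUE2ServiceID=example-se,GLUE2GroupID=resource,o=glue
objectClass: GLUE2Service
GLUE2ServiceID: example-se
GLUE2ServiceType: org.example.storage
"""

if __name__ == "__main__":
    print(build_service_filter("org.example.storage"))
    print(services_of_type(parse_ldif(SAMPLE_LDIF), "org.example.storage"))
```

The generated filter could be passed to a standard LDAP client, for example: ldapsearch -x -H ldap://BDII_HOST:2170 -b o=glue '<filter>'. BDII_HOST is a placeholder, and port 2170 and base o=glue are the values conventionally used by BDII for GLUE 2 data. The same information would have to be obtained through other interfaces on infrastructures that do not publish through BDII.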