Participants:
* Ingemar Haggstrom (EISCAT_3D)
* Yin Chen (ENVRI)
* Malgorzata Krakowian (EGI.eu)
* Gergely Sipos (EGI.eu)
Outcome:
Prepare a short document that outlines the scope of the EISCAT_3D - EGI collaboration. EGI and EISCAT_3D would use the document to invite contributors (NGIs, institutes) into the collaboration.
The title of the document could be 'Towards a big data strategy for EISCAT_3D'.
Key points in the document:
* The EISCAT archive includes 60 TB of correlated data collected during 1981-2013. There are interfaces to interact with the data (e.g. GENESI?), but the data is not searchable.
* The collaboration would make the archive searchable by
1. moving the data to EGI storages (grid or cloud to be decided later)
2. registering the data into EGI file catalogues and metadata services
3. studying, identifying and - if feasible - implementing data mirroring and indexing strategies that would suit the envisaged processing strategies of EISCAT_3D. The studies could cover areas such as the use of Hadoop or other big-data technologies on EGI.
Todo for Yin and Ingemar:
- Harmonise slide 7 with slide 14, e.g. use the same terminology, indicate the 4 data levels, and indicate the data throughput at each level. Indicate how the EISCAT archive fits into this picture.
- Create the first draft of the document (include the diagram), then circulate it to EGI.
- Set up a teleconference with EGI to discuss how to complete the document so that it can be circulated in the EGI and EISCAT_3D collaborations.