Workshop: Design your e-Infrastructure

chaired by Gergely Sipos (EGI.eu), Zhiming Zhao (EGI.eu)
Thursday, 9 May 2019 from 08:30 to 18:00 (Europe/Amsterdam)
at University of Amsterdam (C1.112)
Amsterdam Science Park
Description

The 3rd edition of the Design your e-infrastructure Workshop will take place on 9 May 2019 in Amsterdam, after the EGI Conference 2019.

This interactive workshop will look into the design and setup of e-infrastructures for emerging scientific communities and their use cases.

The event will bring together seven scientific communities, projects and research infrastructures with service and technology providers from the EGI community.

Workshop participants will hear short introductory talks about the EGI services and analyse the use cases that the participating scientific communities bring to the event. Then, in small groups, they will design suitable e-infrastructure setups and define roadmaps to implement them using 'off-the-shelf' and customised solutions from the EGI community.

Use cases

The workshop will focus on the following community use cases and will co-design e-infrastructure setups for them (Find further details about the use cases in the timetable below):

  1. Vincent Negre, INRA, France: PHIS plant phenotyping platform (agricultural sciences)
  2. PIN, Università degli Studi di Firenze, Italy: Access control for the ARIADNEplus services and data (archaeology and cultural heritage)
  3. Zheng Meyer, ASTRON, Netherlands: ASTRON Science Data Centre (radio astronomy)
  4. Baptiste Cecconi, OBSPM, France: Data management and computing services for the NenuFAR telescope (radio astronomy)
  5. Adam Belloum, UvA, Netherlands: LOBCDER system for SKA from the PROCESS project (radio astronomy)
  6. Ingemar Haggstrom, EISCAT association, Sweden: Container computing for EISCAT_3D (atmospheric physics)
  7. Mihai Ciubancan, ELI-NP, Romania: Data and computing for the Nuclear Physics facility of the Extreme Light Infrastructure (laser and nuclear physics)

What will we get out of the workshop?

  • Design and implementation plans for community-specific e-infrastructure use cases
  • Answers to participants' questions concerning EGI services and technologies 
  • Better understanding of the participating scientific use cases
  • Social network formed between use cases and e-infrastructure service providers
  • Gaps in e-infrastructure offerings, and service candidates to fill these gaps
  • Connections among e-infrastructure communities

The workshop is organised for

  • Scientific projects, research infrastructures, research communities and groups that would like to set up, access or operate an e-infrastructure.
  • Technology and service providers from the EGI community. The providers will present their services and will engage with the use cases during the afternoon break-out sessions.

Organisers

Members of the EGI User Community Support Team: Gergely Sipos and Yin Chen. The organisers can be reached via the designworkshop@mailman.egi.eu email list.

Registration

  • Registration is free but compulsory, via the 'Apply here' button below.
  • Your registration will be received by the event managers, who will approve it or get back to you for clarification about your role if needed. Don't book flights until we confirm your registration. Contact the organisers via designworkshop@mailman.egi.eu if you have questions.
  • PLEASE NOTE: Attending the EGI Conference requires separate registration. 

Previous events

This will be the 3rd edition of this event, following similar workshops that were organised at the EGI Conference in 2016, and at the DI4R Conference in 2017. The agenda pages of these previous workshops, the analysed use cases and the designed infrastructure roadmaps are available at https://indico.egi.eu/indico/event/2895/ and https://indico.egi.eu/indico/event/3025/

  • Thursday, 9 May 2019
    • 08:30 - 09:00 Arrival & Coffee
    • 09:00 - 09:15 Welcome and introduction
      - Structure of the workshop
      - Goals of the day
      Material: slides powerpoint file
    • 09:15 - 10:25 EGI services
      During this session the EGI service catalogue will be presented and discussed. This will help us establish a common understanding about existing possibilities with EGI, so later during the day we will be able to map scientific use cases to specific services and service capabilities. 
      
      Material: slides powerpoint file
    • 10:25 - 10:45 Coffee break
    • 10:45 - 12:30 E-infrastructure use cases
      Introduction of the scientific use cases and their expected use of EGI. 
      Convener: Dr. Gergely Sipos (EGI.eu)
      • 10:45 PHIS Plant phenotyping platform 15'
        Plant phenotyping refers to a quantitative description of the plant’s anatomical, ontogenetical, physiological and biochemical properties. The PHIS information system network was brought into the EMPHASIS ESFRI from the PHENOME French project to ease data management and analysis within plant phenotyping communities. The system integrates various open source solutions, such as the PostgreSQL, MongoDB and RDF4J databases, Apache HTTP Server, Apache Tomcat and iRODS. During the workshop we'd like to analyse and understand how the platform could be ported to and hosted on EGI cloud resources, as well as extended with a distributed file system and data archive.
        Speaker: Dr. Vincent Negre (INRA)
        Material: Slides pdf file
      • 11:00 Access control for the ARIADNEplus services and data 15'
        ARIADNEplus is a recently started H2020 project, an integrating activity for the archaeological and cultural heritage research community with a user base larger than 10,000. The project upgrades various systems from the community that offer visual services for graphics and 3D models, and a Natural Language Processing service for knowledge extraction from archaeological texts. Our interest in the workshop is understanding how EGI Authentication-Authorisation services could help us bring our tools into a coherent security framework that would let data owners apply fine-grained access control for users via a Web-based Virtual Research Environment.
        Speaker: Dr. Achille Felicetti (PIN S.c.R.L.)
        Material: Slides powerpoint file
      • 11:15 ASTRON Science Data Centre 15'
        ASTRON (Netherlands Institute for Radio Astronomy) is working on establishing a Science Data Centre in the coming years. The goal of the Science Data Centre is to give astronomers easy access to astronomical data, compute infrastructure and storage. To achieve this goal, a web-based science analysis platform (SAP) is being developed. It provides services such as finding data, staging data, processing data, analysing results, and publishing/sharing results, preferably all through a single sign-on mechanism, for example with the user's institutional identity. Finding data can be realised by interfacing services provided by the Virtual Observatory (VO) with SAP. Processing data can be achieved in two ways: interactively or in batch. Interactive data processing requires a flexible compute environment; it can be done, for example, by running Jupyter notebook/lab on a virtual machine. Batch data processing and data staging will require pre-allocated compute/storage/network resources offered by e-infrastructure providers. Authentication and authorisation for all these SAP services need to be handled consistently, for example through a single sign-on mechanism based on the AARC Blueprint Architecture. This use case aims to explore the EGI services mentioned above and to bring EGI services and the radio astronomy community together.
        Speaker: Mrs. Zheng Meyer (ASTRON)
        Material: Slides pdf file
      • 11:30 Data management and computing services for the NenuFAR telescope 15'
        NenuFAR is a new radio telescope located in Nançay (France). It is an SKA (Square Kilometre Array) precursor, and it will soon enter its commissioning phase. The instrument will produce 3 to 4 PB of data per year. We are now setting up a local data center for reducing and integrating the data prior to delivery to the observer. We are seeking solutions from EGI to set up a pipeline that can: transfer and store the data (3 to 4 PB per year) in an online facility; restrict access to the online data repository to selected observers; provide observers with VMs and computational time, with preconfigured software for post-processing; and store processing results in user space.
        Speakers: Dr. Alan Loh (OBSPM), Dr. Albert Shih (OBSPM)
        Material: Slides pdf file
      • 11:45 LOBCDER system for SKA from the PROCESS project 15'
        PROCESS, an H2020 project, aims to build an infrastructure for exascale applications. The core component is LOBCDER, a virtual distributed file system based on a micro-infrastructure approach, which allows creating a containerized micro-infrastructure of the data services required by a community use case.
        PROCESS is working with an SKA/LOFAR dataset to demonstrate its applicability. In this use case, the project wants to explore whether the infrastructure adopted by PROCESS is easy to integrate with EGI and other existing e-infrastructures.
        Speakers: Dr. Reggie Cushing (University of Amsterdam), Dr. Souley Madougou (Netherlands e-Science Center)
        Material: Slides pdf file
      • 12:00 Container computing for EISCAT_3D 15'
        EISCAT_3D, an ESFRI research infrastructure, is building the world-leading incoherent scatter radar for upper atmosphere observation. EISCAT_3D is testing various EGI services in order to manage the scientific data that will be generated by the radar at high speed and volume. In this use case, EISCAT_3D wants to integrate EGI Notebooks and the EGI Workload Manager service (DIRAC) using Docker. This will enable researchers to process data in an interactive, Web-based environment. The result of the work would be scripts and files, to be used later on the large data sets in non-interactive Docker containers.
        Speaker: Dr. Ingemar Haggstrom (EISCAT)
        Material: Slides presentation file pdf file
      • 12:15 Data and computing for the Nuclear Physics facility of the Extreme Light Infrastructure 15'
        The Nuclear Physics facility of the Extreme Light Infrastructure (ELI-NP) will create a new European laboratory with a broad range of science covering frontier fundamental physics, new nuclear physics and astrophysics, as well as applications in nuclear materials, radioactive waste management, material science and life sciences. At full operational capability, the total amount of data expected to be collected per year is currently estimated at 2.5-3 PB. We are looking for solutions and partnerships in EGI to combine buffer, mid-term and long-term storage systems; HPC and HTC compute facilities; user access controls; and data transfer and software distribution services.
        Speaker: Dr. Mihai Ciubancan (INC-PFSIN)
        Material: Slides powerpoint file pdf file
    • 12:30 - 13:30 Lunch
    • 13:30 - 16:30 Break-out groups - Designing community-specific e-infrastructures (incl. a coffee break between 14:40-15:00)
      Break-out groups will be formed to analyse and further discuss the e-infrastructure use cases. Each group will consist of representatives of one or more use cases and e-infrastructure experts/providers, and will have a convener. The use case analysis will follow a pre-defined template, which will be provided to the conveners. The outcome of the break-out analysis will be e-infrastructure designs and implementation plans. These will be presented to the audience at the end of the day, and will drive post-workshop implementation work.
      
      First break-out: Complete the 'Background' and 'Users' topics of the use case presentations (1h)
      1. Who will be the users? Can the users be characterised? How many are they?
      2. What value will the envisaged system deliver for them (the whole setup)? What will the system exactly deliver to them? (Customer's pains and gains)
      3. How should they use the system? (Customer's job)
      4. What's the timeline for development, testing and large-scale operation? (Consider consecutive releases if possible) 
      
      14:40-15:00 Coffee break
      
      Second break-out: Design the setup and define implementation plan (1.5h)
      1. What should the first version include? - The most basic product prototype imaginable that already brings value to the users (the so-called Minimum Viable Product - MVP)
      2. Which components/services already exist in this architecture?
      3. Which components/services are under development (and by whom)?
      4. Which components/services should be brought into the system from EGI or its partners? Which partners can/should do it?
      5. Are there gaps in the current EGI service catalogue that should be filled in order to implement the use case? Which service provider could fill the gap? 
      Material: Template powerpoint file
    • 16:30 - 16:45 Break
    • 16:45 - 18:00 Reporting back and discussion from each break-out group
      Material: slides powerpoint file pdf file