Overview
Monday, 6 May - Opening Plenary
- Tiziana Ferrari, EGI Foundation
- Ian Bird, CERN
- Andreas Veispak, European Commission, DG-CONNECT
- Alexandre Bonvin, WeNMR
- Sorina Camarasu Pop, CNRS
- Stefano Nativi, JRC
- Sarah Caudill, Nikhef
Wednesday, 8 May - Future Technical Challenges
Speakers
Ian Bird
CERN Large Hadron Collider
Title: Evolving distributed computing for the LHC
Session: Monday, 13:50, Opening Plenary
Abstract: In this talk I will look back at the successes of the global distributed computing environment for LHC, and some of the lessons learned. In looking forward to the High Luminosity upgrade of the LHC, where we anticipate data volumes on the order of several Exabytes per year, there are a number of ongoing R&D projects investigating how the system will evolve to manage the overall capital and operational cost, whilst retaining the key attributes of a globally federated and collaborative data and computing infrastructure. I will describe some of the most important of these activities, and the potential synergies with other large scale and international science projects.
Alexandre Bonvin
Bijvoet Center for Biomolecular Research, Utrecht University
Title: Structural biology in the clouds: past, present and future
Session: Monday, 16:00: Community Presentations
Abstract: Structural biology deals with the characterization of the structural (atomic coordinates) and dynamic (fluctuation of atomic coordinates over time) properties of biological macromolecules and adducts thereof. Gaining insight into 3D structures of biomolecules is highly relevant, with numerous applications in health and food sciences. Since 2010, the WeNMR project has implemented numerous web-based services to facilitate the use of advanced computational tools by researchers in the field, using the grid computational infrastructure provided by EGI. These services have been further developed in subsequent initiatives under the H2020 EGI-ENGAGE and West-Life projects. The WeNMR services are currently operating under the European Open Science Cloud with the EOSC-Hub project. In my talk, I will summarize 10 years of successful use of e-infrastructure solutions to serve a large worldwide community of users (> 12’000 to date), providing them with user-friendly, web-based solutions that allow them to run complex workflows in structural biology.
Helge Meinhard
CERN
Title: Trends in computing technologies and markets
Session: Wednesday, 9:00: Future technical challenges of data and compute intensive sciences
Abstract: Driven by the need to carefully plan the resources for the next data taking periods of CERN’s Large Hadron Collider, sites started a common activity tasked with tracking the evolution of technologies and markets of concern to the data centres. The talk will give an overview of general and semiconductor markets, server markets, CPUs and accelerators, memories, storage and networks; it will highlight some areas of uncertainties and risks. The components and servers that will be covered by the presentation are used in domains other than HEP as well; the activity is presumably relevant for a wider audience than HEP.
Sorina Camarasu Pop
CNRS
Title: BIOMED, VIP and the long tail of science: achievements and lessons learnt
Session: Monday, 16:20: Community Presentations
Abstract: The biomed Virtual Organization (VO) is a large scale international and multi-disciplinary VO supporting communities from the Life Sciences sector, with three main thematic groups: medical image analysis, bioinformatics and drug discovery. The VO is operated on the EGI infrastructure and supported by more than 50 sites, delivering access to a large number of heterogeneous resources. Scientific gateways such as the Virtual Imaging Platform (VIP) allow academic researchers worldwide to easily access high level services built on top of these resources. With more than 1000 registered users, some 20 applications and 54 publications made with results computed on the platform, VIP is an established service constantly striving to answer the researchers’ needs, such as sharing of applications and data, transparent use of distributed computing (CPUs and GPUs) and storage, as well as enabling open and reproducible science. Based on our experience, we will give an overview of the biomed VO and the VIP platform, illustrated with examples and success stories from the community. We will then carefully analyse the lessons learnt and current needs that can help us draft the roadmap for a promising future.
Ivan Rodero
EMSO ERIC
Title: Towards a Data Integration System for European Ocean Observatories – EMSO ERIC’s Perspective
Session: Wednesday, 9:30: Future technical challenges of data and compute intensive sciences
Abstract: The EMSO ERIC is a European environmental research infrastructure distributed throughout European seas, from the North Atlantic across the Mediterranean to the Black Sea, at 11 key environmental sites whose overall objective is to record essential ocean variables to respond to the societal challenges in global change issues. This talk addresses the challenges for the integration and harmonization of data, meta-data and analysis capabilities from the distributed and heterogeneous EMSO sites according to FAIR principles. It also discusses current data management and information technology efforts and short-term goals using cloud-based abstractions and resources provided by EGI. Finally, it outlines open challenges in the EMSO ERIC’s longer-term agenda including models for seamlessly connecting multi-disciplinary large-scale observing systems.
Stefano Nativi
JRC
Title: The Datafication paradigm to face the global changes of our planet
Session: Monday, 16:30: Community Presentations
Abstract: The current Digital Transformation is profoundly changing both economy and society. This is a result of the uptake and integration of numerous digital technologies in all the aspects of our daily lives, both professional and social. In this presentation we focus in particular on the Environmental and Space sectors which are also moving from data collection and provision to implementing the full Datafication paradigm. This revolution builds not only on the ICT technology development (e.g. IoT, AI, HPC and network function virtualization) but also on both economic and social developments (e.g. hyper-connectivity, digital twins, social networks, platform-based economy). We use the example of GEOSS to show how the policy needs expressed in the SDGs, Sendai Framework, and Paris climate change agreement need to be addressed not just with data, but with more advanced products including policy options, assessment of possible impacts, multiple viewpoints from different stakeholders, and so on. This exploits the datafication paradigm and the establishment of platforms to interact with the different stakeholders, gather their feedback and turn it into the semantic context that enriches the original data. This adding-value process is becoming increasingly crucial to make sure that new algorithms can be trained effectively to process the increasing amount of data available and address the global challenges of our planet. In the European context we will show some examples from the EuroGEOSS initiative, leveraging also the EOSC capabilities, to support EU policy.
Vincent Negre
INRA
Title: Data analytics needs in the agrifood sector
Session: Wednesday, 09:00: Future technical challenges of data and compute-intensive sciences
Abstract: The H2020 e-ROSA project has defined a three-layer architecture as a federated e-infrastructure to address societal challenges of Agriculture and Food that require multi-disciplinary approaches. In that direction, the European project AGINFRA+ aims to exploit core e-infrastructures such as EGI.eu, OpenAIRE, EUDAT and D4Science, to provide a sustainable channel addressing adjacent but not fully connected user communities around Agriculture and Food. In this context, a Virtual Research Environment (VRE) has been developed for the Plant Phenotyping Research community. A VRE is a collaborative Web platform which provides various components for data analysis. This VRE has been enriched with data exploration and data retrieval services in order to transparently access multiple sources of phenotyping data. These services are based on OpenSILEX-PHIS, which is an open-source information system designed for plant phenotyping experiments. Several instances of OpenSILEX-PHIS have been deployed for French phenotyping platforms on a national infrastructure. It is planned to deploy other instances on EGI infrastructures for the European partners (Emphasis ESFRI). OpenSILEX-PHIS interoperates with external resources via web services, thereby allowing data integration into other systems. The VRE is also equipped with a JupyterLab provisioned by EGI. JupyterLab is the next-generation web-based user interface for Project Jupyter, allowing users to work with documents and activities such as Jupyter notebooks. EGI also provided a Galaxy server, which is a scientific workflow system used by the plant science community for building multi-step computational analyses, facilitating data analysis persistence. This significant advance makes OpenSILEX-PHIS a representative component of the future Food cloud and gives a basis for the requirements for an e-infrastructure supporting data intensive agri-food sciences.