Webinar: The EGI DataHub to federate distributed data sets for data-intensive applications in the cloud

by Andrea Manzi (Data Solutions Manager), Łukasz Dutka (PLGrid Technical Director, CYFRONET)

Zoom webinar


Are you curious to learn more about the QoS and hybrid cloud data processing scenarios for distributed EOSC environments?

Objectives of the webinar

  • Learn about the EGI DataHub service

  • Learn more about the use of protocols such as POSIX and web services, and how they guarantee easy and scalable access to data from cloud and HTC applications

Target audience

Scientific communities and IT service providers who support research and education.

Webinar programme (1h)

  • Overview of the EGI DataHub service

  • Real use case scenarios to support data-intensive applications in EGI 

  • Q&A

Description of the presentation

The EGI DataHub allows users to make their data available using different levels of access: from completely unrestricted access to open data to authenticated access to closed data sets. This is made possible by seamless integration with the EGI AAI service. The data hosted on the EGI DataHub is readily accessible from cloud Virtual Machines (VMs) or running grid jobs thanks to full integration with EGI Federated Cloud and High-Throughput Compute resources. The use of protocols such as POSIX and web services guarantees easy and scalable access to data from cloud and HTC applications. This ensures maximum compatibility with existing applications and minimum hassle for developers and users alike. The EGI DataHub is built on top of the EGI Open Data Platform, using Onedata technology to connect a wide range of existing storage services regardless of their underlying technology (e.g. Lustre, Amazon S3, Ceph, NFS, or dCache).

During this webinar, Łukasz and Andrea will introduce QoS and hybrid cloud data processing scenarios for distributed EOSC environments, based on EGI DataHub and Onedata solutions.

About the speakers

Łukasz Dutka, PhD, has significant expertise in Grid systems, large-scale systems, development of business applications, and team and project management in commercial projects as well as EU IST projects. He obtained his M.Sc. in Computer Science from the Jagiellonian University, Poland, and a Ph.D. in Computer Science from the University of Science and Technology, Cracow, Poland. He has actively participated in several EU IST projects, including CrossGrid, EGI-Engage, and Indigo-DataCloud. Since 2008, he has been the Technical Director of the PL-GRID project, responsible for full operation of the infrastructure, including R&D tasks. He is in charge of more than 100 employees involved in provisioning the PLGrid distributed infrastructure, with a special focus on distributed cloud processing and distributed data management. For the last 7 years he has led development of the open source Onedata platform, through which he has gained expertise in large-scale distributed data management and transparent access platforms for large-scale hybrid clouds.

Andrea Manzi works as Data Solutions Manager at the EGI Foundation, assisting user communities in integrating scientific applications and platforms with the EGI services and pilots, and validating solutions against user requirements.

He worked at CERN for 11 years and, as part of the storage team, developed storage and data transfer solutions, leading the FTS (File Transfer Service) project, and participated in various EU projects (D4Science I and II, iMarine, XDC). Andrea was previously employed at ISTI-CNR in Italy as a software developer. He holds a Master's degree in Computer Science from the University of Pisa (Italy).