Internal training on the EGI DataHub

Giuseppe La Rocca (

This is the first internal training event organized to introduce the EGI DataHub service. The EGI DataHub service was designed to make data discoverable and available in an easy way across all EGI federated resources.

The EGI DataHub allows users to make their data available using different levels of access: from completely unrestricted open access to open data, to closed data sets. This is possible as a result of the seamless integration with the EGI AAI service.

The data hosted on the EGI DataHub is readily accessible from cloud Virtual Machines (VMs) and from running grid jobs thanks to full integration with the EGI Federated Cloud and High-Throughput Compute (HTC) resources. Access through POSIX and web service interfaces gives cloud and HTC applications easy and scalable access to data. This ensures maximum compatibility with existing applications and minimum hassle for developers and users alike.
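As an illustration of what POSIX access means in practice, here is a minimal Python sketch, assuming a DataHub space has already been mounted as a local directory (for example with oneclient). The mount point /mnt/onedata, the DATAHUB_MOUNT environment variable and the summarize_space helper are hypothetical names chosen for this example and are not part of the EGI DataHub or Onedata. The point is that the code uses only ordinary file system calls.

```python
import os
from pathlib import Path


def summarize_space(mount_point):
    """Return (file_count, total_bytes) for all files under mount_point."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(mount_point):
        for name in files:
            path = Path(root) / name
            # Ordinary stat call: no service-specific API is involved.
            total_bytes += path.stat().st_size
            file_count += 1
    return file_count, total_bytes


if __name__ == "__main__":
    # Hypothetical mount point (an assumption for this sketch);
    # replace it with the directory where the space is mounted.
    mount = os.environ.get("DATAHUB_MOUNT", "/mnt/onedata")
    if os.path.isdir(mount):
        count, size = summarize_space(mount)
        print(f"{count} files, {size} bytes under {mount}")
    else:
        print(f"{mount} is not available on this machine")
```

Because the function relies only on standard directory traversal, the same code should work unchanged on a local directory, which is the practical sense in which existing applications remain compatible.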

The EGI DataHub is built on top of Onedata technology to connect a wide range of existing storage services, regardless of their underlying technology (e.g. Lustre, Amazon S3, Ceph, NFS, or dCache).

Learning objectives

During this session, EGI staff members will learn the basic concepts and functionalities needed to use the Onedata software stack. From a technical perspective, Onedata provides a set of interfaces for easy access to and management of data distributed among the storage providers' resources.

With Onedata, users can access, store, process and publish data using a global data storage backed by computing centers and storage providers worldwide. The Onedata software stack focuses on instant, transparent access to distributed data sets, without unnecessary staging and migration, allowing access to the data directly from your local computer or worker node.

Target audience

This training track is relevant for researchers, IT support staff, EGI staff members and service providers who operate services for Open Science.

Training material

All the training material used during the hands-on session can be found on GitHub.


    • 12:00 12:05
      Welcome and introduction of the training 5m
    • 12:05 12:35
      Introduction of the EGI DataHub service 30m
      Motivation, Terminology, and High-level functionalities
    • 12:35 12:50
      Hands-on session 15m
      Introduction of the hands-on session and the exercises.
      Exercise no.1 - Install the oneclient Docker container to access the volume space
      Exercise no.2 - Analyse the datasets in the volume space
      Exercise no.3 - Register the outcome of the analysis in Zenodo, a general-purpose open-access repository
      Exercise no.4 - How to generate an ACCESS_TOKEN via REST API to access the EGI DataHub
    • 12:50 13:00
      Q&A 10m