ENES Competence Centre #2

Participants

  • Sandro Fiore
  • Fabrizio Antonio
  • Donatello Elia
  • Alessandro D'Anca
  • Feyza Eryol
  • Hakan Bayindir
  • Miguel Caballer
  • Amanda Calatrava

Discussion

ENES is looking for resources with a considerable amount of storage (100 TB to start) that will

Jupyter frontend, backend in cloud/HPC with the data science environment with the different technologies. 

Pilot in HPC: focus on containerisation with udocker, open to compare with other technologies if available.

HPC deployment: would it be possible to have a frontend in Jupyter? Onedata? -> The cloud part will start faster; having a public IP shouldn't be a problem.

Look into cloud deployment first: 1 JupyterHub + a few pre-allocated VMs to start the analysis and ensure some availability, with elasticity. Onedata will probably come later.

EC3 can provide such a deployment, which can be tuned to the use case, e.g. starting several VMs at the same time instead of a single one.
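
The idea of a pre-allocated pool that grows several VMs at a time can be sketched as below. This only illustrates the scaling logic discussed; it is not EC3 configuration or an EC3 API. The names `PoolPolicy` and `nodes_to_start` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolPolicy:
    """Hypothetical scaling policy; every value here is an assumption."""
    min_nodes: int = 3    # VMs kept running so an analysis can start at once
    max_nodes: int = 10   # upper bound on the elastic cluster
    batch_size: int = 2   # start several VMs together instead of one


def nodes_to_start(policy: PoolPolicy, running: int, idle: int, queued_jobs: int) -> int:
    """Return how many extra VMs to launch now (0 if none are needed)."""
    # VMs missing from the pre-allocated floor.
    deficit = max(policy.min_nodes - running, 0)
    # Jobs that cannot be placed on an idle VM (assumes one job per VM).
    unplaced = max(queued_jobs - idle, 0)
    wanted = max(deficit, unplaced)
    if wanted == 0:
        return 0
    # Round up to whole batches so VMs are started in groups.
    batches = -(-wanted // policy.batch_size)
    request = batches * policy.batch_size
    # Never exceed the configured cluster size.
    return max(min(request, policy.max_nodes - running), 0)
```

Rounding up to whole batches means the pool can briefly exceed the floor (an empty cluster with the values above starts 4 VMs, not 3); whether that trade-off is acceptable depends on the use case.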

Sequential tools in the beginning; focus on making the user experience easy.

Storage: needs to be accessible by all the nodes. TUBITAK has an NFS system that should work with SSD caches.

NFS can be critical for parallel computations, but TUBITAK has good experience in tuning NFS for high I/O workloads. Need to test.
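
As a starting point for the test mentioned above, a very rough throughput check could look like the sketch below. It assumes the shared storage is mounted as a normal directory on the VM; the function name `write_read_throughput` and its defaults are hypothetical. It is sequential and single-client, and the read figure may be served from the local page cache, so it does not replace a proper parallel I/O benchmark.

```python
import os
import tempfile
import time


def write_read_throughput(directory: str, size_mb: int = 64, block_kb: int = 1024) -> dict:
    """Write then read one temporary file in `directory`; return MB/s for both."""
    block = os.urandom(block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Timed sequential write, forced to storage with fsync.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        write_s = time.perf_counter() - start

        # Timed sequential read of the same file.
        start = time.perf_counter()
        total = 0
        with open(path, "rb") as f:
            while chunk := f.read(block_kb * 1024):
                total += len(chunk)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)
    mb = total / (1024 * 1024)
    return {"bytes": total, "write_mb_s": mb / write_s, "read_mb_s": mb / read_s}
```

Running it from several VMs at the same time against the same mount would give a first impression of how the NFS server behaves under concurrent load.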

Lustre is used on the HPC side of things, but NFS is being considered.

Need to understand how to map users to the storage.
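
One possible approach, given only as a sketch for discussion, is to derive a stable local account from the identifier the IdP provides for each user. Everything below is an assumption: the UID range, the `/data/home` mount point, and the names `LocalAccount` and `map_user` are hypothetical and do not reflect how TUBITAK or the IdP actually work.

```python
import hashlib
import re
from dataclasses import dataclass

UID_BASE = 100_000          # hypothetical start of the UID range for mapped users
UID_RANGE = 900_000         # hypothetical size of that range
HOME_ROOT = "/data/home"    # hypothetical mount point of the shared storage


@dataclass(frozen=True)
class LocalAccount:
    username: str
    uid: int
    home: str


def map_user(subject: str, preferred_name: str = "") -> LocalAccount:
    """Map an IdP subject identifier to a deterministic local account."""
    digest = hashlib.sha256(subject.encode("utf-8")).hexdigest()
    # Same subject always yields the same UID on every node.
    uid = UID_BASE + int(digest[:12], 16) % UID_RANGE
    # Keep only characters that are safe in a POSIX username.
    stem = re.sub(r"[^a-z0-9]", "", preferred_name.lower())[:16] or "user"
    username = f"{stem}_{digest[:8]}"
    return LocalAccount(username, uid, f"{HOME_ROOT}/{username}")
```

Because the mapping is deterministic, every VM computes the same UID and home directory without sharing state. Hash-derived UIDs can collide, so a real setup would likely need a central registry instead.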

Plan:

  • start with the cloud deployment with EC3 tuning to have some pre-allocated VMs
  • understand how to get storage visible on the VMs
  • understand how different this setup is from HPC
  • move data from cloud to HPC, should be easy as it will be in the same dataset
  • clarify how to map users from the IdP to the resources (very important)
  • next meeting May 4 9.00 CEST

Next steps:

  1. Define the workflow for users to get into the data space and make it as easy as possible. CMCC to provide the minimal data needed from the user
  2. Create a VO to support the deployment (Enol), with Fabrizio Antonio as VO manager. TUBITAK to start supporting it on the infrastructure
  3. EC3 configuration of basic system (cluster with pre-allocated WN + frontend) 