In particular, in the field of genome annotation, we instrumented an application for extended and robust protein sequence annotation over conservative non-hierarchical clusters, based on the Bologna Annotation Resource v3.0. Tests were run on 150 genomes, splitting the data into 8000 pieces and processing them as a loosely parallel computation over the Grid. The computation ran on 200 nodes and took 2 weeks, instead of the months it would have required on a single computer.
Concerning mid-ocean ridge modeling, we simulate mantle flow dynamics to study mantle rock serpentinization, a process central to a variety of chemical exchanges between the solid earth, hydrosphere and biosphere. Sub-lithospheric mantle flow and mantle thermal structure are obtained by solving the Stokes and heat equations with semi-analytical pseudo-spectral and finite-difference techniques, respectively. The Grid infrastructure allowed us to submit n x m jobs in a single instance, obtaining n x m solutions in approximately the time of a single run.
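As an illustration of the n x m submission pattern, the following minimal Python sketch enumerates every combination of two parameter lists as independent job descriptions. The parameter names and values (`spreading_rates`, `mantle_temperatures`) and the helper `build_parameter_grid` are hypothetical and are not taken from the actual model; the sketch only shows how n values of one parameter and m values of another yield n x m independent runs.

```python
from itertools import product


def build_parameter_grid(first_values, second_values):
    """Return one job description per (a, b) pair, indexed from 0."""
    jobs = []
    for index, (a, b) in enumerate(product(first_values, second_values)):
        jobs.append({"job_index": index, "param_a": a, "param_b": b})
    return jobs


# Hypothetical parameter values, chosen only for illustration.
spreading_rates = [10.0, 20.0, 40.0]      # n = 3
mantle_temperatures = [1250.0, 1350.0]    # m = 2

jobs = build_parameter_grid(spreading_rates, mantle_temperatures)
for job in jobs:
    print(job)
```

Because the runs do not depend on each other, each entry can be dispatched to a different worker node, which is why the whole set can finish in roughly the time of one run when enough nodes are free.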
The ensemble method has been used to quantify forecast uncertainty in short-term ocean forecasting systems. In this study, we explore the short-term ensemble forecast variance generated by perturbing the initial conditions. The Grid allowed us to perform several ensemble forecast experiments with 1,000 members: these completed within 5 hours of wall-clock time after submission, and the ensemble variance peaks at the mesoscale.
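The idea of estimating forecast variance from perturbed initial conditions can be sketched with a toy model. The code below is not the ocean forecasting system: `toy_forecast` is a hypothetical scalar stand-in for a model run, and the Gaussian perturbation and its standard deviation are assumptions made only for illustration. Each member is independent of the others, which is what makes the ensemble suitable for grid execution.

```python
import random
import statistics


def toy_forecast(initial_state, steps=10, growth=1.05):
    """Hypothetical stand-in for one model run: simple linear growth."""
    state = initial_state
    for _ in range(steps):
        state = growth * state
    return state


def ensemble_variance(base_state, members, perturbation_std, seed=0):
    """Run `members` forecasts from perturbed initial states and
    return the sample variance of the resulting forecasts."""
    rng = random.Random(seed)
    forecasts = [
        toy_forecast(base_state + rng.gauss(0.0, perturbation_std))
        for _ in range(members)
    ]
    return statistics.variance(forecasts)


# 1,000 members, matching the ensemble size quoted above.
spread = ensemble_variance(base_state=1.0, members=1000, perturbation_std=0.1)
print(spread)
```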
Molecular dynamics simulations were performed with NAMD v2.7, a parallel molecular dynamics code, to study transmembrane lipid translocation processes, which underlie important properties and functions of cell membranes. In this study, the dynamics of a lipid bilayer under the effect of two electric field intensities is investigated. Up to a ten-fold decrease in total computation time was achieved by using a system with three GPUs instead of traditional CPUs.
Description of the work
Experience has been gained with applications in the fields of genome annotation, mid-ocean ridge processes, ensemble methods for ocean forecasting, and molecular systems computation with clusters of grid-accessible GPU-enabled systems. These applications are very CPU-demanding and data-driven, so sequential computation requires months of CPU time. The distinctive features of a grid environment, such as a large number of storage and computing resources and advanced high-level job submission services, make it possible to reduce the computation time considerably by spreading the jobs across different resource sites.
The key solution is to run these applications as loosely parallel applications, splitting the input data into several parts and using the parametric job submission feature of the gLite WMS to optimize the management of the computational tasks.
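The data-splitting step can be sketched as follows. This is a minimal illustration, not the code used in the work: the helper `split_into_chunks` and the sample identifiers are hypothetical. It divides an input list into a fixed number of contiguous parts of nearly equal size, so that each part can be handled by one sub-job of a parametric submission, identified by its chunk index.

```python
def split_into_chunks(items, n_chunks):
    """Split `items` into `n_chunks` contiguous parts whose sizes
    differ by at most one element."""
    if n_chunks <= 0:
        raise ValueError("n_chunks must be positive")
    base, extra = divmod(len(items), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional element.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Hypothetical input: ten sequence identifiers split into three parts.
sequence_ids = list(range(10))
chunks = split_into_chunks(sequence_ids, 3)
for chunk_index, chunk in enumerate(chunks):
    print(chunk_index, chunk)
```

Keeping the chunk sizes balanced matters in this setting because the total turnaround time is set by the slowest sub-job.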
A multidisciplinary context has been set up to build a common distributed computing infrastructure and to support the applications of each partner. Comput-ER (Computing in Emilia Romagna) is based mainly on commodity farms, but new GPU-based farms for parallel applications are being tested. Comput-ER resources are accessed through the gLite middleware. These resources are either real hardware systems or are dynamically provisioned as virtual machines by the Worker Nodes on Demand Service (WNoDeS) grid/cloud virtualization system.
This paper describes how the above applications have been modified to run in a loosely parallel way, and shows the advantages of using most of the available computing resources of a distributed infrastructure, as well as the benefit of knowledge sharing in a multidisciplinary scientific computing environment.