The combined power of computing centres at different CTA institutes, grouped in the Grid infrastructure, allows massive Monte Carlo simulations. Our aim is to keep CTA-specific development to a minimum by staying compliant with grid middleware and relying on tools already used in the grid community. Current simulation production is much more efficient than anything done at single computing centres and allows the future CTA computing requirements to be estimated. The Grid infrastructure can be recommended for large-scale computing in CTA.
Description of the work
CTA is currently in the preparatory phase and no dedicated computing infrastructure is available. Using the Grid infrastructure, the computing centres of participating institutes can provide computing resources for CTA. Currently, the CTA virtual organisation (vo.cta.in2p3.fr) comprises 8 computing centres with several thousand CPUs and about 500 TB of storage. The participating computing centres are Tier 1 and Tier 2 centres and provide very heterogeneous computing resources.
The CTA Computing Grid is currently used for massive Monte Carlo simulations. For job submission and monitoring we use EasiJob (EASy Integrated JOB submission), a tool developed within the MUST framework (Mesocentre de calcul et de stockage ouvert sur la grille EGEE/LCG), the computing infrastructure of LAPP (Laboratoire d'Annecy-le-Vieux de Physique des Particules). This tool includes a web interface to define and configure a grid production.
Configuration and submission are currently done manually by the production manager. In the future we plan to set up an automated data pipeline in which each newly produced simulation file is processed automatically. This automated data processing will be based on a data management system. We are investigating the potential of AMI (ATLAS Metadata Interface) for the needs of CTA. The outcome of this investigation will be presented in the talk.
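The planned automation can be illustrated with a minimal sketch. It assumes that simulation files land in a single directory and that the set of already-processed file names stands in for the data management system. The function names, the `*.simtel.gz` pattern and the handler are hypothetical and are not part of EasiJob or AMI.

```python
import tempfile
from pathlib import Path


def find_unprocessed(production_dir, processed, pattern="*.simtel.gz"):
    """Return files matching the (hypothetical) pattern that are not yet recorded."""
    candidates = Path(production_dir).glob(pattern)
    return sorted(p for p in candidates if p.name not in processed)


def process_new_files(production_dir, processed, handler, pattern="*.simtel.gz"):
    """Apply handler to every new file and record it in the processed set."""
    handled = []
    for path in find_unprocessed(production_dir, processed, pattern):
        handler(path)                # stand-in for the real processing step
        processed.add(path.name)     # stand-in for a catalogue update
        handled.append(path.name)
    return handled


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run001.simtel.gz", "run002.simtel.gz"):
        (Path(tmp) / name).touch()
    catalogue = {"run001.simtel.gz"}   # run001 was processed earlier
    seen = []
    first_pass = process_new_files(tmp, catalogue, seen.append)
    second_pass = process_new_files(tmp, catalogue, seen.append)
```

In this sketch a second pass over the same directory finds nothing to do, which is the property an automated pipeline needs in order to be re-run safely.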
Gamma-ray astronomy is one of the most active topics in astroparticle physics. It involves both particle physicists and astrophysicists and is leading to the design of new types of research infrastructures. The Cherenkov Telescope Array (CTA) is a proposed new project for ground-based gamma-ray astronomy. This project is driven by the CTA consortium, comprising 132 institutes in 25 countries. The CTA Computing Grid (CTACG) uses Grid technology to perform heavy Monte Carlo simulations and to investigate the potential of Grid computing for the future data reduction and analysis pipeline. This talk presents the tools developed in this framework, the performance achieved and the lessons learned.
During 2010, 60,000 files containing 10^9 simtelarray showers were produced, far outnumbering any production on dedicated computer clusters.
The current production jobs are very demanding in memory (up to 4 GB RAM) and disk space (5-10 GB scratch). Not all available computing centres meet these requirements, so only a subset of the sites is currently used. Future plans for upgrading computing centres will take the CTA memory requirements into account. The necessary scratch disk space and disk operations can be reduced by running processes in parallel within one job and piping the output directly from one to the next. It will then be necessary to request several cores on multi-core machines at job submission.
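The idea of piping output directly between processes within one job can be sketched as follows. This is a minimal illustration, not the production setup: the two commands are small stand-ins written in Python, and `run_piped` is a hypothetical helper name. In a real job the producer and consumer would be the simulation stages.

```python
import subprocess
import sys


def run_piped(producer_cmd, consumer_cmd):
    """Run two commands concurrently, streaming producer stdout to consumer stdin.

    No intermediate file is written, so scratch space is only needed
    for the final output of the consumer.
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close the parent's copy so the producer sees a broken pipe
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    if producer.returncode != 0 or consumer.returncode != 0:
        raise RuntimeError("one of the piped processes failed")
    return output


# Stand-in commands: the producer prints the numbers 0..4,
# the consumer sums whatever it reads from standard input.
producer_cmd = [sys.executable, "-c", "for i in range(5): print(i)"]
consumer_cmd = [
    sys.executable,
    "-c",
    "import sys; print(sum(int(line) for line in sys.stdin))",
]
result = run_piped(producer_cmd, consumer_cmd)
```

Because both processes run at the same time, such a job occupies at least two cores, which is why several cores have to be requested at submission.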
The Grid provides the infrastructure that participating institutes need to easily supply their computing power to the CTA community. By relying on official Grid standards, the workload on the technical staff at the computing sites is reduced and the development of the Grid can be easily followed. In this way CTA can concentrate nearly all its efforts on scientific study.