Speaker
Description of Work
The ARCTURUS platform is designed to deliver next-generation cluster services that speed up the deployment of Grid clusters.
Utilising a wide variety of protocols and proven methods, as well as newer data management and configuration techniques, ARCTURUS aims to provide a baseline for WLCG cluster deployments and to extend automation beyond merely deploying software or virtual machines.
The end goal of this research programme is a fully automated expert system which, through multiple open source projects, can build, configure and deploy a cluster, as well as diagnose potential issues within it, with minimal involvement from technical operators.
Printable Summary
With the current trend towards "On Demand Computing" in big data environments, it is becoming crucial that the deployment of services and resources is increasingly automated. Open source projects such as Canonical's MAAS and Red Hat's Spacewalk make automated deployment available for large-scale data centre environments, but these solutions can be too complex and heavyweight for smaller, resource-constrained WLCG Tier-2 sites. Combined with a growing desire for bespoke monitoring and the collection of more Grid-related metrics, this calls for a more lightweight and modular approach. This paper presents work carried out on the test cluster environment at the ScotGrid site at the University of Glasgow.
Progress towards a lightweight automated framework for building WLCG grid sites is presented. The framework is based on "off the shelf" software components such as Cobbler and Puppet, the building blocks of the larger open source projects mentioned above. Additionally, the test cluster is used to investigate these components in a mixed IPv4/IPv6 environment, and to explore emerging OpenFlow technologies for software service provisioning.
As part of the research into an automation framework, the use of IPMI and SNMPv2 for physical device management will be covered, along with the possibility of using SNMPv2 as a monitoring and data sampling layer so that more comprehensive decision making can take place and potentially be automated.
Additionally, the development of virtualised networking and its role in service delivery within a Grid cluster will be investigated.
This could lead to reduced downtime and better performance, as autonomous systems recognise when services are in a non-functional state. Finally, through the use of automated service provisioning and automated device management, the building blocks of a fully automated expert system will be outlined.
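As an illustration of how an autonomous system might recognise a service as non-functional from sampled monitoring data, the following is a minimal sketch only and is not taken from ARCTURUS itself. The class name, the function name, the service names and the threshold of three consecutive failed checks are all assumptions made for the example; in practice the boolean samples would come from a monitoring layer such as SNMP polling.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ServiceHealth:
    """Recent check results for one service (hypothetical helper)."""
    name: str
    threshold: int = 3  # assumed: consecutive failures before flagging
    history: deque = field(default_factory=lambda: deque(maxlen=10))

    def record(self, ok: bool) -> None:
        """Store the outcome of one monitoring sample."""
        self.history.append(ok)

    def is_non_functional(self) -> bool:
        """True if the last `threshold` samples all failed."""
        if len(self.history) < self.threshold:
            return False  # not enough data to decide yet
        recent = list(self.history)[-self.threshold:]
        return not any(recent)


def services_needing_action(services):
    """Names of services an automated system should act on."""
    return [s.name for s in services if s.is_non_functional()]


if __name__ == "__main__":
    a = ServiceHealth("service-a")
    for ok in [True, False, False, False]:
        a.record(ok)
    b = ServiceHealth("service-b")
    for ok in [False, True, False]:
        b.record(ok)
    print(services_needing_action([a, b]))
```

Requiring several consecutive failures rather than a single one is a deliberate choice in this sketch: it avoids triggering automated recovery on a transient glitch, at the cost of a slightly slower reaction.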