Gregor von Laszewski, Fugang Wang, Javier Diaz, Archit K. (please add your name if you have contributed)
The purpose of this document is to coordinate the activities related to the deployment and software development of user-facing dynamic provisioning.
In Phase I/II of FutureGrid we wanted to configure Moab in such a way that it can dynamically provision operating systems provided by the system administrator. This also allows a minimal user-facing deployment possibility, in that a user can contact the system administrator to deploy such an image. Naturally, the application of this process is limited by the following factors: (a) the deployment of the image is not automated and must be conducted by the system administrator; (b) the system administrator must trust the person who generates the image; (c) testing of an image deployment can only be done after step (a) is completed. We term this kind of provisioning “administrator controlled dynamic provisioning”.
However, we found in our own use cases that users would need access to the deployment mechanism to make this feature truly useful (we call this user-facing dynamic provisioning).
Mitigation restrictive access for FG developers: As a mitigation to this issue we decided that a minicluster be made available to the core FG developers to test out such features. As a consequence, the FG “testbed” has restricted some of the services offered to general users to those that could be deployed on production systems such as TeraGrid. The goal here is to set up a system ASAP with the newest versions of XCAT and Moab so we can test the new system BEFORE it is deployed in general. This activity needs to be started now.
Status: the backup of the minicluster is in progress and expected to be completed by June 10, 2011.
A Jira task is filed at http://jira.futuregrid.org/browse/FG-1156
Mitigation restrictive access for FG users: In the next phase of the development we will be working on services that are distinct from those typically offered in production systems, while also focusing on the provisioning of bare-metal environments via “user facing dynamic provisioning”, where users can invoke a process that creates an image for their needs that can then be deployed.
Mitigation incompatible software deployments: One of the issues we have in the current FG is that the versions of some of the key software packages are so different that software development is difficult to coordinate and conduct, due to incompatibilities between even minor versions. As we do not have enough software developers to deal with this, we need to address the issue by aligning the software. Table 1 depicts the status of the software on the various systems as of June 7th, 2011. It was decided to upgrade most of the iDataPlex systems to the same version. The coordination of this activity is a task to be executed by Greg Pike. Recently we also found out that UC is not using XCAT, which was in our original deployment plan. We need to decide how to deal with this. This is important, as questions were recently raised about whether we should be using XCAT at all (???verify with Greg).
Mitigation reusing our own software tools to further develop them: As part of addressing the divergence in the deployed software versions, we could be using the tools we develop ourselves to keep the system up to date. That is, instead of generating an image by hand, we first generate an image with as much of the software installed automatically as possible. Then we provide specialized configuration scripts that modify the images for a particular machine or node and deploy them from a repository. If a new operating system is adopted or a change takes place, we will have documented the process on how to quickly generate a new distribution; an update of the system will then be much less time consuming, and we can focus development and deployment resources elsewhere. The reuse of our own tools within the deployment infrastructure has the following benefits: (a) we can test them; (b) we can enhance them if we find issues or want new features.
Table 1: Installed software versions on the HPC images for the various clusters (green indicates up-to-date software). Items updated on June 7th are indicated with a *. Although the newest version is installed, it is lacking the support to conduct dynamic provisioning using the image deployment tools from Javier. Also, we have identified a number of deployment issues, as some of the configuration files need to be updated. Possibly the best way to deal with this is to conduct a new installation of XCAT.
software | iu-india | iu-xray | tacc-alamo | uc-hotel | ucsd-sierra | ufl-foxtrot |
gcc | 4.1.2 | 4.1.2 | 4.1.2 | 4.1.2 | 3.4.6 | 4.1.2 |
moab | 5.4.0 | 5.3.6 | 5.4.3 | 6.0.3 | 5.4.0 | n/a |
modules | 3.2.7 | 3.1.6 | 3.2.6 | 3.2.8 | 3.2.7 | n/a |
openmpi | 1.4.2 | n/a | 1.4.3 | 1.4.3 | 1.4.2 | n/a |
torque | 2.5.5 | 2.4.4-snap.200912210954 | 2.4.8 | 2.5.5 | 2.4.3 | n/a |
xcat | 2.6* | n/a | n/a | n/a | 2.6* | 2.4.3 |
To motivate the Phase III development of dynamic provisioning, we first summarize a number of use cases that we wish to address.
Development tasks for this activity are to set up an environment that allows the easy integration of new images into the set of images available to the users. The user-facing interfaces are already provided by the queuing system integration.
To simplify the interaction and create an abstraction layer that can in the future be adapted to other environments, we will develop a function called
fg-moab-image -upload -image <filename> -label <label>
This command uploads the image to the repository that can be read by the queuing system.
fg-moab-image -delete -label <label>
This command deletes the image from the repository accessible to Moab.
fg-moab-image -init
This command may be called after images have been uploaded in order to initialize the queuing system. The option -init can also be specified directly when an image is uploaded or deleted and is then executed immediately after the action has been performed, as the last command.
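The intended semantics of the three proposed commands can be sketched as follows. This is a minimal illustration, not the real implementation: the repository path and the init step (which would notify the queuing system) are assumptions.

```python
# Sketch of the proposed fg-moab-image semantics. The repository location
# and the initialization step are hypothetical placeholders; the real tool
# would place images where Moab/xCAT can read them and re-initialize Moab.
import os
import shutil

IMAGE_REPO = "/var/fg/images"  # hypothetical repository location

def upload(filename, label, repo=IMAGE_REPO):
    """fg-moab-image -upload -image <filename> -label <label>:
    copy an image file into the repository under the given label."""
    os.makedirs(repo, exist_ok=True)
    shutil.copy(filename, os.path.join(repo, label))

def delete(label, repo=IMAGE_REPO):
    """fg-moab-image -delete -label <label>:
    remove a labeled image from the repository."""
    os.remove(os.path.join(repo, label))

def init(repo=IMAGE_REPO):
    """fg-moab-image -init: placeholder for re-initializing the queuing
    system; here we only return the labels Moab would be told about."""
    return sorted(os.listdir(repo))
```

A typical session would upload one or more images and then call init once, mirroring the -init option described above.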
The commands are restricted to authorized users. In contrast, in Phase I such commands were not available and the steps were executed by hand by the administrator.
In Story A, the system is provisioned each time the user specifies the image. However, Moab internally has the ability to not reprovision the image in case the requested image is already provisioned.
A possible command for switching this behavior could be developed:
fg-moab -mode perjob|ondemand
switches the behavior of Moab to either per job (Story A) or on demand (Story B).
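The difference between the two modes amounts to a simple decision rule. The sketch below illustrates it; the mode names mirror the proposed -mode flag, and the function itself is hypothetical.

```python
# Sketch of the per-job vs. on-demand provisioning decision. The mode
# strings match the proposed "fg-moab -mode perjob|ondemand" flag.
def needs_provisioning(requested_image, current_image, mode):
    """Return True if the node must be (re)provisioned for this job."""
    if mode == "perjob":
        # Story A: provision on every job, regardless of current state
        return True
    if mode == "ondemand":
        # Story B: skip provisioning if the image is already in place
        return requested_image != current_image
    raise ValueError("unknown mode: %s" % mode)
```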
In certain cases it may be more convenient to generate special queues that are associated with a special image. The queue can be managed through other Moab tools, and its availability may be controlled by the administrator. The advantage is that features could be built into the system to only start jobs if the cost of provisioning the resources is below a certain threshold. An example would be when multiple queued jobs justify the provisioning of the image.
At this time we are not yet considering implementing this use case.
The resources of FG are assigned to different service endpoints. The current services include HPC, Nimbus, and Eucalyptus, but could also include other services. The nodes associated with the HPC service are managed through Moab, and thus all servers will have to be integrated into it. In case a server is moved to the Eucalyptus, Nimbus, or OpenStack clouds, it needs to be removed from the HPC resource pool and marked to be managed as part of the other services.
In order to achieve this interaction, a convenient command line tool will be made available to privileged users. This command will later be included in a metascheduler.
A possible command to conduct this action would be
fg-pool -create <name>
This command creates a named resource pool
fg-pool -list <name>
Lists the resources in the named pool
fg-pool -list
List the names of the resource pools
fg-pool -to <to> -resource <name>
moves the named resource to a particular pool
fg-pool -from <from> -to <to> -resource <name>
moves the named resource to a particular pool, but first checks that the resource was previously in the from pool. If it was not, an error is returned.
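The pool semantics above can be sketched in a few lines. This is an illustrative model only: pools are held in an in-memory dictionary, whereas the real tool would drive Moab and the other service managers.

```python
# Sketch of the proposed fg-pool semantics (create/list/move with an
# optional -from check). Purely illustrative; state is kept in memory.
class PoolError(Exception):
    pass

class Pools:
    def __init__(self):
        self.pools = {}                      # pool name -> set of resources

    def create(self, name):
        """fg-pool -create <name>: create a named resource pool."""
        self.pools.setdefault(name, set())

    def list(self, name=None):
        """fg-pool -list [<name>]: pool names, or members of one pool."""
        if name is None:
            return sorted(self.pools)
        return sorted(self.pools[name])

    def move(self, to, resource, frm=None):
        """fg-pool [-from <from>] -to <to> -resource <name>: move a
        resource; with frm set, fail if it was not in that pool."""
        if frm is not None and resource not in self.pools.get(frm, set()):
            raise PoolError("%s not in pool %s" % (resource, frm))
        for members in self.pools.values():
            members.discard(resource)
        self.pools[to].add(resource)
```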
Our strategy documented within the PEP plan was identified in the early stages of the project and could not be changed at the time. The tools chosen were XCAT, integrated with Moab, while only considering RHEL as the OS. However, since then major events have taken place that motivate us to revisit the issue:
Thus it is timely, before we conduct a software upgrade on all the systems, to identify a more homogeneous solution. We will however be faced with the issue that TACC is using their own cluster management strategy. As TACC provides their own solution, IU may also provide a different one.
As a consequence, we will initially work on user-facing dynamic provisioning only for the india and sierra machines, while foxtrot will be able to leverage that environment if desired.
We will be asking FG management for guidance on how to proceed in driving the decision.
The systems planned to have dynamic provisioning in its various forms available are depicted in Table 2.
Table 2: Proposed dynamic provisioning solutions
Host | dynamic resource provisioning | user-facing OS provisioning |
india | TG11? | TG11? |
sierra | TG11? | TG11? |
foxtrot | to be decided | to be decided |
hotel | Fall 2011 (impl. by IU) | to be decided |
alamo | Fall 2011 (impl. by TACC) | to be decided |
xray | N/A | N/A |
In order to spend software resources wisely, all software development will be using the newest Moab version that is supported by adaptivecomputing.com. It will be important that the development team works together with the system team to enable an environment that can be used for such development activities. It was decided by the system management team to provide a minicluster to the software team that will include all the newest software and mimic the setup of india once that has been completed. The goal here is that the software developed on the minicluster can easily be transferred and deployed onto india.
As there is a variation in the available system software distributed by Adaptive Computing, the first task will be to decide which version to use. From reading the documentation, it will be the newest 64-bit version of Moab, using an ODBC connector. We will also use the newest version of TORQUE from the 2.5 branch (according to Greg, the TORQUE developers recommended not to use the newest TORQUE release). In case we choose XCAT, we would also use the newest version of XCAT.
One of the discussions that took place recently was whether we should use XCAT for our solution. We have to keep in mind that we would like to have dynamic provisioning deployed by TG11, but also that we have limited software development support to provide such solutions. Thus we discuss the impact of two different solutions in the next sections:
The following list provides some information on what is needed for a solution using the newest version of XCAT.
While the previous solution uses the FG image generation framework, this solution uses the XCAT-provided generation that is targeted towards RHEL. It is a different approach.
This method is documented here to provide us with an alternative to XCAT and to bring the software development to a broader user community (e.g. systems that do not use XCAT).
The question arises which solution we should continue with, based on a number of constraints that may conflict with each other.
This includes the long-term plan, the available developer resources, the available expertise, and the deadline for TG11. In order for this decision to be cast, we would like you to include comments in this document and make additional improvements.
(A hybrid solution of 1 and 3 is also possible. First we continue the current approach as experimented with on the minicluster. This requires less effort, so it may be possible to make the TG11 deadline. Meanwhile, we can continue exploring approach 3, which may require significant development effort, but the final output would be a unique solution across all FG sites. The two solutions could coexist side by side, as both of them are clients of the Moab MSM, so switching between the two is easy.) The disadvantage of this approach is that at TG11 we can only say that dynamic provisioning is available on our minicluster (which is not accessible to the users and is considered a test environment).
Based on the available resources and the short timeline, we believe solution 1 is still the one we should continue with.
Service Interface and Portal Access
The service interface of the dynamic provisioning is a related but somewhat independent issue, considering that the same architecture and mechanism could be used for virtually all FG services, namely image management (including generation and repository), dynamic provisioning/RAIN, experiment management, etc.
Currently we adopt Python as the main development language, and would like to continue using it to develop the service interfaces for the image generation/deployment and image repository scripts. A preliminary RESTful interface for the image repository has been started by community volunteers from UIC. We will continue to evaluate the frameworks available in Python for RESTful services, especially those that offer, or are easily integrated with, secured REST services.
CherryPy is the framework that we tested and the UIC group is using. The next step is to find compatible libraries/frameworks/solutions to do HTTPS with user authentication (either HTTP basic or digest). Since each user already has a username/password pair from the FG portal, this credential could be used to access the secured service from a CLI or portal.
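The HTTP basic variant of this check is straightforward to sketch with the standard library alone. The portal lookup below is a stand-in for the real FG portal account database, and the hard-coded credentials are purely illustrative.

```python
# Sketch of verifying an "Authorization: Basic ..." header against FG
# portal credentials. check_portal_password is a hypothetical stand-in
# for the real lookup against the portal's user database.
import base64

def check_portal_password(user, password):
    # placeholder: real code would consult the FG portal accounts
    return (user, password) == ("fguser", "secret")

def authenticate(authorization_header):
    """Parse an HTTP Basic Authorization header and verify it.
    Returns the user name on success, None otherwise."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(authorization_header[6:]).decode("utf-8")
        user, _, password = decoded.partition(":")
    except Exception:
        return None  # malformed base64 or encoding
    return user if check_portal_password(user, password) else None
```

In a CherryPy deployment this logic would typically be wired in through the framework's authentication tooling rather than parsed by hand, with TLS terminating the connection so credentials are never sent in the clear.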
The CLI will be part of the FG software release, which will contain the service clients used to access the secured FG services. A portal interface will also be provided via the current Drupal solution, for which we need a service proxy that connects to the secured FG service on one end and serves the pages (typically generated via PHP/JavaScript) in the Drupal site where the end users initiate the service interaction and receive results.
The service proxy in Drupal is to be developed based on the Services modules of Drupal. We have developed and deployed such services in the portal, but only for unrestricted services. When access control is needed, more exploration is required to do API-key-based access and/or role/username-password-based authentication/authorization.
References for the Portal component GUI development
https://portal.futuregrid.org/rest-services-python
https://portal.futuregrid.org/rain-futuregrid-image-generation
Figure 1 Image management
Figure 2 Image Repository
Available images on FG