Cloud Computing Activities

Status Report: Cloud Computing Activities

September 2015-April 2016

Sean Arms, Julien Chastang, Ethan Davis, Steve Emmerson, Ward Fisher, Tom Hollingshead, Michael James, Ryan May, Jennifer Oxelson, Mike Schmidt, Christian Ward-Garrison, Jeff Weber, Tom Yoksas

Activities Since the Last Status Report

Docker

With the goal of better serving our core community and in fulfillment of objectives articulated in Unidata 2018: Transforming Geoscience through Innovative Data Services, Unidata is investigating how its technologies can best take advantage of cloud computing. To this end, we have been employing Docker container technology to streamline building, deploying, and running Unidata technology offerings in cloud-based resources. Specifically, we have created Docker images for the IDV, LDM, ADDE, RAMADDA, THREDDS, GEMPAK, and Python with Unidata technologies, and we have been experimenting with these Docker containers in the Microsoft Azure and Amazon Web Services (AWS) commercial cloud computing environments. Our preliminary efforts are available in Unidata’s GitHub repositories.

While these efforts are promising initial steps, there are challenges ahead in making these technologies useful to our community. It is unlikely that most of our users will initially use these containers directly; rather, the containers will be leveraged by experts on behalf of the community, or they will be hidden from users by being integrated into a user-friendly workflow. Moreover, in addition to porting present Unidata cyberinfrastructure to the cloud, we may have to rethink workflows for a cloud environment (data-proximate analysis and visualization, for example).

Microsoft Azure for Research Grant and AMS “UniCloud: Docker Use at Unidata” Presentation

Our efforts in the Docker arena were presented at the 2016 AMS Annual Meeting in New Orleans in a presentation entitled “UniCloud: Docker Use at Unidata.” In coordination with this presentation, we staged three “motherlode class” machines (a reference to our motherlode data server at Unidata) on our Microsoft Azure for Research resources: unidata-server, unidata-server-2, and unidata-server-3. These servers provide data supplied by the LDM and served by RAMADDA, the TDS, and ADDE. They can be staged in minutes on cloud virtual machines with Docker, and instructions for doing so are available online.
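As a rough illustration of what staging such a server involves, the sketch below assembles one `docker run` command per service. It is not the procedure used for the servers above: the image names, port mappings, container mount point, and the `SERVICES` table are assumptions chosen for illustration. The script only prints the commands, so it can be read and run without Docker installed.

```python
import shlex

# Hypothetical service table. Image names, ports, and the shared data
# mount are illustrative assumptions, not Unidata's actual configuration.
SERVICES = {
    "ldm": {"image": "unidata/ldm-docker", "ports": [(388, 388)]},
    "thredds": {"image": "unidata/thredds-docker", "ports": [(8080, 8080)]},
    "ramadda": {"image": "unidata/ramadda-docker", "ports": [(8081, 8080)]},
}


def docker_run_command(name, image, ports, data_dir):
    """Build the argument list for a detached `docker run` of one service.

    `ports` is a list of (host_port, container_port) pairs, and `data_dir`
    is a host directory mounted at /data in the container (an assumed path).
    """
    cmd = ["docker", "run", "-d", "--name", name]
    for host_port, container_port in ports:
        cmd += ["-p", f"{host_port}:{container_port}"]
    cmd += ["-v", f"{data_dir}:/data", image]
    return cmd


def staging_commands(data_dir):
    """Return one command per service in SERVICES, in insertion order."""
    return [
        docker_run_command(name, spec["image"], spec["ports"], data_dir)
        for name, spec in SERVICES.items()
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for command in staging_commands("/data"):
        print(shlex.join(command))
```

Keeping the service definitions in a single table means the same script could describe a server on any cloud virtual machine that runs Docker, which is the portability argument made above.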

Our Microsoft Azure for Research equipment grant ends in mid-April 2016. We plan to respond to the April 15th Azure for Research RFP with several new proposals for Azure resources.

XSEDE Jetstream Award

To further investigate how the Unidata community can benefit from Unidata technologies in the cloud, Unidata obtained an XSEDE equipment award on the Jetstream cloud-computing platform. XSEDE (the Extreme Science and Engineering Discovery Environment) is a five-year, $121-million project supported by the National Science Foundation. We wish to continue our research into porting Unidata technology to a variety of cloud environments. Specifically, we would like to deploy a motherlode class machine on the Jetstream cloud with Docker technology in a manner similar to what we accomplished with our Azure resources. As Docker provides a common baseline for cloud computing, this experiment should proceed fairly smoothly, but we will not know until we try. Jetstream became available in February 2016, and we are currently in the very early stages of experimenting with it.

AWS Training/Technical Discussions at the University of Wyoming

A number of Unidata technical staff traveled to Laramie, WY to meet with Amazon Web Services representatives for best-practice training on the use of AWS resources, including S3, and for discussion of efforts related to the NOAA Big Data Project. Meeting outside of Colorado was necessary to protect Amazon’s Colorado sales tax position.

Progress has been made on the following:

Azure for Application Streaming/Unidata Service Hosting

Unidata has received a second year of Azure resources from Microsoft under the “Azure for Research” program. The primary focus of this award is to continue work on creating an application-streaming platform for the IDV and other Unidata technologies. A secondary focus is testing Unidata services in the Azure cloud and examining the performance of Azure when hosting Docker instances.

We have made available an EDEX Data Server in the Azure cloud (edex-cloud.unidata.ucar.edu), and have set up a similar server privately for Embry-Riddle Aeronautical University on an Amazon EC2 instance. This Azure EDEX machine serves data to CAVE clients for Linux, Mac, and Windows, as well as to Python scripts and projects using the AWIPS II Python Data Access Framework (python-awips) and to the latest GEMPAK build (which uses python-awips to request Python data array objects and convert them into a renderable GEMPAK grid).
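To make the python-awips access path more concrete, here is a minimal sketch of a grid request against an EDEX server. It assumes the `DataAccessLayer` interface of python-awips (`changeEDEXHost`, `newDataRequest`, `getAvailableTimes`, `getGridData`). The helper name `fetch_last_grid` and the returned dictionary layout are hypothetical, and this is not the code GEMPAK uses. The data access layer is passed in as an argument so that the function can be exercised without a live server.

```python
def fetch_last_grid(dal, host, model, parameter, level):
    """Request one grid from an EDEX server through a data access layer.

    `dal` is expected to behave like awips.dataaccess.DataAccessLayer
    (an assumption about the installed python-awips version). Returns
    None when the server reports no times or no grids for the request.
    """
    dal.changeEDEXHost(host)

    # Describe what is wanted: one model, one parameter, one level.
    request = dal.newDataRequest()
    request.setDatatype("grid")
    request.setLocationNames(model)
    request.setParameters(parameter)
    request.setLevels(level)

    times = dal.getAvailableTimes(request)
    if not times:
        return None

    # Use the last entry in the list of times returned by the server.
    grids = dal.getGridData(request, [times[-1]])
    if not grids:
        return None

    grid = grids[0]
    lons, lats = grid.getLatLonCoords()
    return {
        "time": times[-1],
        "data": grid.getRawData(),
        "lons": lons,
        "lats": lats,
    }
```

With python-awips installed, a call might pass `awips.dataaccess.DataAccessLayer` as `dal` and `edex-cloud.unidata.ucar.edu` as `host`. The model, parameter, and level names depend on what the server actually offers and are not specified in this report.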


Ongoing Activities

We plan to continue the following activities:

New Activities

Over the next three months, we plan to organize or take part in the following:

Over the next twelve months, we plan to organize or take part in the following:

Beyond a one-year timeframe, we plan to organize or take part in the following:

Areas for Committee Feedback

We are requesting your feedback on the following topics:

  1. Which clouds, commercial or private, is our community using?
  2. Which new cloud technologies is our community using or investigating on its own initiative?

Strategic Focus Areas

We support the following goals described in the Unidata Strategic Plan:

  1. Enable widespread, efficient access to geoscience data
    Making Unidata data streams available via various commercial and private cloud services will allow subscribers to those services to access data quickly and at low cost.
  2. Develop and provide open-source tools for effective use of geoscience data
    Running existing Unidata-developed and supported tools and processes (e.g. IDV, RAMADDA, generation of composite imagery) in a range of cloud environments makes these tools and data streams available to cloud service subscribers at low cost. It also gives us insight into how best to configure existing and new tools for most efficient use in these environments.
  3. Provide cyberinfrastructure leadership in data discovery, access, and use
    Unidata is uniquely positioned in our community to experiment with provision of both data and services in the cloud environment. Our efforts to determine the most efficient ways to make use of cloud resources will allow community members to forgo at least some of the early, exploratory steps toward full use of cloud environments.
  4. Build, support, and advocate for the diverse geoscience community
    [Build a bigger community]

Prepared April 2016