Cloud Computing Activities

Status Report: Cloud Computing Activities

April - September 2015

Sean Arms, Julien Chastang, Ethan Davis, Steve Emmerson, Ward Fisher, Tom Hollingshead, Michael James, Ryan May, Jennifer Oxelson, Mike Schmidt, Christian Ward-Garrison, Jeff Weber, Tom Yoksas

Strategic Focus Areas

We support the following goals described in the Unidata Strategic Plan:

  1. Enable widespread, efficient access to geoscience data
    Making Unidata data streams available via various commercial and private cloud services will allow subscribers to those services to access data quickly and at low cost.
  2. Develop and provide open-source tools for effective use of geoscience data
    Running existing Unidata-developed and supported tools and processes (e.g., IDV, RAMADDA, generation of composite imagery) in a range of cloud environments makes these tools and data streams available to cloud service subscribers at low cost. It also gives us insight into how best to configure existing and new tools for the most efficient use in these environments.
  3. Provide cyberinfrastructure leadership in data discovery, access, and use
    Unidata is uniquely positioned in our community to experiment with the provision of both data and services in the cloud environment. Our efforts to determine the most efficient ways to make use of cloud resources will allow community members to forgo at least some of the early, exploratory steps toward full use of cloud environments.
  4. Build, support, and advocate for the diverse geoscience community
    [Build a bigger community]

Activities Since the Last Status Report

Docker

With the goal of better serving our core community and in fulfillment of objectives articulated in Unidata 2018: __Transforming Geoscience through Innovative Data Services__, Unidata is investigating how its technologies can best take advantage of cloud computing. To this end, we have been employing Docker container technology to streamline building, deploying, and running Unidata technology offerings on cloud-based resources. Specifically, we have created Docker images for the IDV, RAMADDA, THREDDS, and Python with Unidata Technologies, along with an initial attempt at an image for the LDM, and we have been experimenting with these Docker containers in the Microsoft Azure and Amazon AWS commercial cloud computing environments. Our preliminary efforts are available in Unidata’s GitHub repository. In one instance, we are also using Docker technology operationally, for testing IDV bundles in the cloud.

While these efforts are promising initial steps, there are challenges ahead in making these technologies useful to our community. It is unlikely that most of our users will initially use these containers directly; rather, the containers will be leveraged by experts on behalf of the community, or they will be abstracted from users by being integrated into a user-friendly workflow.

AWS Training/Technical Discussions at the University of Wyoming

A number of Unidata technical staff traveled to Laramie, WY to meet with Amazon Web Services representatives for best-practice training on the use of AWS resources, including S3, and for technical discussions of efforts related to the NOAA Big Data Project. Meeting outside of Colorado was necessary to protect Amazon’s Colorado sales tax position.

Progress has been made on the following:

Azure for Application Streaming/Unidata Service Hosting

Unidata has received a second year of Azure resources from Microsoft under the “Azure for Research” program. The primary focus of this award is to continue work on creating an application-streaming platform for the IDV and other Unidata technologies. A secondary focus is testing Unidata services in the Azure cloud and examining the performance of Azure when hosting Docker instances.

With the release of Unidata AWIPS II 14.4.1, we have made available an EDEX Data Server in the Azure cloud (http://edex-cloud.unidata.ucar.edu:9581/services) and have set up a similar server privately for Embry-Riddle Aeronautical University on an Amazon EC2 instance. Without a solid-state drive, these cloud deployments are ingesting a more limited data set than a private EDEX Data Server located on campus can ingest. Bandwidth becomes an issue with very large data sets such as high-resolution gridded model HDF5 files, though the recent compression improvements to EDEX have been shown to reduce data transfer rates by an order of magnitude.

Ongoing Activities

We plan to continue the following activities:

New Activities

Over the next three months, we plan to organize or take part in the following:

Over the next twelve months, we plan to organize or take part in the following:

Beyond a one-year timeframe, we plan to organize or take part in the following:

Areas for Committee Feedback

We are requesting your feedback on the following topics:

  1. Which clouds, commercial or private, is our community using?
  2. Which new cloud technologies is our community using or investigating on its own initiative?

Prepared September 2015