
Status Report: Python

September 2018 - March 2019

Ryan May, John Leeman, Sean Arms, Julien Chastang, Michael James

Areas for Committee Feedback

We are requesting your feedback on the following topics:

  1. How can Unidata’s Python training better serve your needs or the needs of your students? Are there other topics we need to add to the workshop? Are there additional opportunities (e.g. conferences) we should explore as convenient venues?
  2. What are the most useful functionalities in MetPy and Siphon for your needs? What do we do well?
  3. Are there any additions you’d like to make to MetPy’s roadmap? Anything you notice as lacking in MetPy or Siphon?

Activities Since the Last Status Report

Staffing Changes

John Leeman will unfortunately be stepping down from Unidata in May 2019. A search for a new Python Developer, with an emphasis on instructional skills and experience, has begun. We hope to complete the search by the end of Spring 2019. Additionally, Ryan May has assumed more responsibilities by joining Unidata’s Management Team, which requires about 25% of his time. These changes and the activities supporting them have reduced the bandwidth of the Python development team. We do not expect lasting impacts on our capacity, provided the hire is completed successfully.

Python Training Efforts

Python training efforts continue to be a valuable portion of the Python portfolio. We continue to be successful in identifying opportunities to offer training within our resource constraints. Not only do these generate significant goodwill and grow our audience, but they are also a significant source of information that guides our library development. One challenge is to balance time dedicated to creation of training materials, workshop preparation, and logistics against time devoted to support and Python software development. We have also begun looking at ways to unify our various training efforts (gallery, workshop, and online Python training) into a proper “Python Training Portal”. The hope is to make community contributions simpler and to turn this into a better resource for the community’s Python training needs.

Progress has been made on the following:

MetPy

MetPy continues to grow, both in features and in community. The volume of support requests is the most remarkable area of growth; traffic and activity across GitHub, E-Support/E-mail, and Stack Overflow are steady. Community code contributions have been somewhat slower; this may be due in part to less active solicitation of contributions, owing to the uptick in support requests and the aforementioned staffing changes.

Development going forward will continue to be driven by requirements for our dedicated awards (in addition to bug reports and pull requests from community members). The primary efforts will be focused around the GEMPAK-like interface, improved units support, integration with xarray, and data formats. We do anticipate the release of MetPy 1.0 this year. Also, to try to foster more community discussion of MetPy’s goals and plans, we have published a general MetPy roadmap that tries to capture our plans from GitHub in a more friendly format.

Progress has been made on the following:

Siphon and Data Processing

Siphon continues to grow and develop, though at a slower pace than MetPy; its development tends to be driven by obstacles to accessing remote data. The most pressing developments we anticipate for Siphon are improvements to working with Siphon in interactive sessions, such as the Jupyter notebook environment: an improved catalog crawling interface, better string representations, and tab completion. Siphon continues to see community contributions trickle in. We hope to have one of Unidata’s summer interns contribute to Siphon as part of their summer work.
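
As a rough illustration of what catalog crawling involves, the sketch below parses a small THREDDS catalog document and collects the datasets and nested catalog references it contains. It uses only the Python standard library and is not Siphon's implementation or interface; the sample catalog, its dataset names, and the summarize_catalog helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by THREDDS catalog documents.
NS = {
    "t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical catalog, inlined so the sketch runs without network access.
SAMPLE_CATALOG = """<catalog
    xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink" name="Example">
  <dataset name="Example Radar Data">
    <dataset name="scan_0001.nc" urlPath="example/scan_0001.nc"/>
    <dataset name="scan_0002.nc" urlPath="example/scan_0002.nc"/>
    <catalogRef xlink:title="Older scans" xlink:href="older/catalog.xml"/>
  </dataset>
</catalog>
"""


def summarize_catalog(xml_text):
    """Return (datasets, catalog_refs) found in a THREDDS catalog document.

    datasets maps dataset name -> urlPath for datasets that can be accessed
    directly; catalog_refs maps link title -> href for nested catalogs that
    a crawler would visit next.
    """
    root = ET.fromstring(xml_text)

    # Only datasets with a urlPath point at actual data; the others are
    # containers that group child datasets.
    datasets = {
        elem.get("name"): elem.get("urlPath")
        for elem in root.iterfind(".//t:dataset", NS)
        if elem.get("urlPath") is not None
    }

    # catalogRef elements carry their title and target as xlink attributes.
    xlink = "{%s}" % NS["xlink"]
    catalog_refs = {
        elem.get(xlink + "title"): elem.get(xlink + "href")
        for elem in root.iterfind(".//t:catalogRef", NS)
    }
    return datasets, catalog_refs


datasets, catalog_refs = summarize_catalog(SAMPLE_CATALOG)
print(sorted(datasets))
print(catalog_refs)
```

A crawler would repeat this step for each href in catalog_refs, resolved against the URL of the catalog it came from. Much of the interactive-session work described above concerns how results like these are presented and navigated.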

We also continue to maintain the LDM Alchemy repository as a collection of LDM processing scripts in Python. Currently this includes the code powering the AWS NEXRAD archive as well as the program that reconstitutes NOAAPORT GOES-16/17 imagery. As we transition more of our internal data processing to Python, this repository will hold those scripts. We have seen several community questions regarding both the GOES and NEXRAD processing software.

External Participation

The Python team attends conferences and participates in other projects within the scientific Python ecosystem. This allows us to stay informed and to advocate for our community, as well as to keep our community updated on developments. As participants in a broader Open Source software ecosystem, the Python team regularly encounters issues in other projects relevant to our community’s needs. As such, we routinely engage with these projects to address challenges and submit fixes. We also continue to host Jeff Whitaker’s netCDF4-python project repository; Jeff continues to be the active maintainer of the project. This overall involvement helps ensure that important portions of our community’s Python stack remain well-supported. Ryan May continues to serve as a core developer for CartoPy as well as a member of Matplotlib’s Steering Council.

Progress has been made on the following:

Python for AWIPS

We continue to update the Python Data Access Framework (python-awips) package with the latest changes from the AWIPS baseline. This package is used in both AWIPS and GEMPAK for remote retrieval of AWIPS data (grids, geometries, and imagery), as well as independently in Jupyter Notebooks.

Changes to python-awips since the last report include (through release 18.1.7):

Ongoing Activities

We plan to continue the following activities:

New Activities

Over the next three months, we plan to organize or take part in the following:

Over the next twelve months, we plan to organize or take part in the following:

Beyond a one-year timeframe, we plan to organize or take part in the following:

Relevant Metrics

MetPy

Siphon

Python-AWIPS

Strategic Focus Areas

We support the following goals described in the Unidata Strategic Plan:

  1. Enable widespread, efficient access to geoscience data
    Python can facilitate data-proximate computations and analyses through Jupyter Notebook technology. Jupyter Notebook web servers can be co-located with the data source for analysis and visualization through web browsers. This capability, in turn, reduces the amount of data that must travel across computing networks.

  2. Develop and provide open-source tools for effective use of geoscience data
    Our current and forthcoming efforts in the Python arena will facilitate analysis of geoscience data. This goal will be achieved by continuing to develop Python APIs tailored to Unidata technologies. Starting with the summer 2013 Unidata training workshop, we developed an API to facilitate data access from a THREDDS data server. This effort has been encapsulated in the Siphon project, which is an API for accessing remote data, including the THREDDS data server. Moreover, Python technology coupled with the HTML5 Jupyter Notebook technology has the potential to address "very large datasets" problems. Jupyter Notebooks can be co-located with the data source and accessed via a web browser, thereby allowing geoscience professionals to analyze data where the data reside without having to move large amounts of information across networks. This concept fits nicely with the "Unidata in the cloud" vision and the goals outlined in the Unidata 2018 Five-year plan. Lastly, as a general purpose programming language, Python has the capability to analyze and visualize diverse data in one environment through numerous, well-maintained open-source APIs. The additional development of MetPy fills the need for domain-specific analysis and visualization tools in Python.

  3. Provide cyberinfrastructure leadership in data discovery, access, and use
    The TDS catalog crawling capabilities found in Siphon will facilitate access to data remotely served by the Unidata TDS, as well as other TDS instances around the world.

  4. Build, support, and advocate for the diverse geoscience community
    Based on interest from the geoscience community, Unidata, as part of its annual training workshop, now hosts a three-day session to explore Python with Unidata technology. Also, to advance the use of netCDF in Python, Unidata has promoted Jeff Whitaker’s netCDF4-python project, including hosting its repository under Unidata’s GitHub account. Unidata is initiating a project to provide online Python training specifically targeting geoscience students. Unidata is also fostering some community development of meteorology-specific tools under the MetPy project.


Prepared March 2019