Conversion routine for YOTC data

 

At the end of 2014 we introduced a new conversion routine, together with a new file format and new file names. The main differences are:

1)   file names and directory structure follow the CMIP5 DRS conventions;

2)   variable names follow the CMIP5 naming conventions where possible, otherwise we kept the ECMWF short names;

3)   we adapted the metadata to follow the CF conventions, in particular standard names have been used whenever available;

4)    we introduced an MD5 checksum for each variable, stored as an attribute in the file, so the integrity of the data values can be checked even if file names or other metadata are changed in the future;

5)    we stopped using ncl, which had memory issues, was less flexible and required more scripting. Instead we use cdo to convert from grib to netcdf, and the nco tools (ncatted, ncks) to change the metadata.

 

As before, we used a python script to create the bash scripts, which contain the sequence of cdo and nco commands, and to submit them to the queue.
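The generation step can be sketched in a few lines of Python. This is only an illustration, not the script actually used: the function names `build_script` and `write_script` and the template layout are hypothetical, while the module versions and the working directory are copied from the example on this page. Submission to the queue is left out because it depends on the scheduler.

```python
from pathlib import Path

# Hypothetical template: module versions are taken from the example script
# on this page, the layout itself is an assumption.
SCRIPT_TEMPLATE = """#!/bin/bash -l
module load netcdf/4.3.0
module load cdo/1.6.4
module load nco/4.3.8
cd {workdir}
{commands}
"""


def build_script(workdir, commands):
    """Return the text of a bash script that runs `commands` in `workdir`."""
    return SCRIPT_TEMPLATE.format(workdir=workdir, commands="\n".join(commands))


def write_script(path, text):
    """Write the script to disk and make it executable."""
    path = Path(path)
    path.write_text(text)
    path.chmod(0o755)
    return path


if __name__ == "__main__":
    text = build_script(
        "/g/data1/rq7/YOTC/Scripts/Run/oper_an_pl/",
        ["echo 'cdo and nco commands go here'"],  # placeholder command
    )
    print(text)
```

Keeping the script text in a template makes it easy to generate one script per variable and date by changing only the list of commands.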

An example of a bash script to convert eastward air velocity is shown and commented below.

 

Loading necessary modules, moving to working directory:

 

#!/bin/bash -l
module load netcdf/4.3.0
module load cdo/1.6.4
module load nco/4.3.8
cd /g/data1/rq7/YOTC/Scripts/Run/oper_an_pl/

 

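The first line of actual code is a chain of cdo operators. In a chain, cdo applies the operators from right to left, so here -setreftime runs first on the grib file, followed by -seldate and then -setpartabp: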

 

cdo -f nc4 -z zip_5 -setpartabp,/g/data1/rq7/YOTC/Scripts/cdo_param_table -seldate,2008-05-21 -setreftime,2008-05-01,00:00:00 /g/data1/rq7/YOTC/grib/oper_an_pl/U/2008/U_yt_oper_an_pl_0125x0125_90N0E90S35925E_20080521_20080525 ua_6hrs_YOTC_historical_an-pl_20080521.nc

 

“-f nc4” sets the output file format to netCDF-4 (HDF5-based). This format supports per-variable compression; for a reference see https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/NetCDF_002d4-Format.html

 

“-z zip_5” sets the compression level to 5, which is a good compromise between file size and read/write speed. This choice is based both on experiments conducted by the CMS and on the available literature.

{references: www.hdfgroup.org/pubs/papers/2008-06_netcdf4_perf_report.pdf

http://www.unidata.ucar.edu/blogs/developer/entry/converting_grib_to_netcdf_4 }
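The size versus speed trade-off can be explored with a small experiment. The sketch below is not the CMS experiment: it uses Python's zlib module, which implements the same deflate algorithm that netCDF-4 compression relies on, and applies it to a synthetic periodic signal rather than to YOTC data. The function names and the signal are made up for illustration, and the numbers it prints say nothing about real fields.

```python
import math
import struct
import time
import zlib


def synthetic_field(n=200_000):
    """Pack a smooth periodic signal as 32-bit floats (stand-in for a real field)."""
    values = [10.0 * math.sin(2 * math.pi * i / 360.0) for i in range(n)]
    return struct.pack(f"<{n}f", *values)


def compare_levels(raw, levels=(1, 5, 9)):
    """Return {level: (compressed_size_in_bytes, seconds)} for each deflate level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(raw, level)
        elapsed = time.perf_counter() - start
        assert zlib.decompress(packed) == raw  # deflate is lossless
        results[level] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    raw = synthetic_field()
    for level, (size, seconds) in compare_levels(raw).items():
        print(f"level {level}: {size / len(raw):.1%} of original, "
              f"{seconds * 1000:.1f} ms")
```

Running the same comparison on a sample of the real data is the only reliable way to choose a level, since the gain depends strongly on the field being compressed.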

 

“-setpartabp,/g/data1/rq7/YOTC/Scripts/cdo_param_table” tells cdo to use the indicated parameter table to retrieve metadata about the variable.

In the case of U, whose grib parameter code is 131, the table entry is:

 

          param = 131.128
          out_name = ua
          long_name = "U velocity "
          standard_name = eastward_wind
          units = "m s**-1"

 

param is the grib parameter code (131) followed by the parameter table number (128), separated by a dot.

out_name is the name given to the variable in the output file; in this case it follows the CMIP5 conventions.
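To make the structure of an entry concrete, the sketch below reads the key/value lines shown above into a Python dictionary and splits param into its two parts. It is a hypothetical helper, not part of cdo or of the conversion routine, and it handles only the lines quoted on this page; a complete parameter table may contain extra delimiters that this parser ignores.

```python
def parse_entry(text):
    """Parse 'key = value' lines into a dict.

    Surrounding double quotes and leading/trailing spaces are removed
    from each value.
    """
    entry = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank lines and anything that is not key = value
        key, value = line.split("=", 1)
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        entry[key.strip()] = value.strip()
    return entry


# The entry for U as quoted on this page.
ENTRY = '''
    param = 131.128
    out_name = ua
    long_name = "U velocity "
    standard_name = eastward_wind
    units = "m s**-1"
'''

if __name__ == "__main__":
    entry = parse_entry(ENTRY)
    code, table = entry["param"].split(".")
    print(f"grib code {code}, table {table}: "
          f"{entry['out_name']} ({entry['standard_name']}) [{entry['units']}]")
```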

 

Next comes a series of ncatted commands (ncatted is part of nco), which either modify existing global attributes or add new ones. This brings the file in line with the CF conventions and also records extra information, such as when the original grib file was downloaded.

 

 

        ncatted -h -O -a 'title',global,c,c,'ERA-Interim U velocity [m s**-1] analysis on pressure (global 0.125X0.125 lat/lon grid) converted from original grib format' ua_6hrs_YOTC_historical_an-pl_20080521.nc

        ncatted -h -O -a 'institution',global,o,c,'ARCCSS ARC Centre of Excellence for Climate System Science www.climatescience.org.au' ua_6hrs_YOTC_historical_an-pl_20080521.nc

        ncatted -h -O -a 'source',global,c,c,'Original grib files obtained from http://apps.ecmwf.int/datasets/data/interim_full_daily/ on G2008-05-21' ua_6hrs_YOTC_historical_an-pl_20080521.nc

        ncatted -h -O -a 'references',global,c,c,'Please acknowledge both ECMWF for original files and the ARCCSS for conversion to netcdf format' ua_6hrs_YOTC_historical_an-pl_20080521.nc
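The argument to -a has the form name,variable,mode,type,value. In the commands above the variable is always global, the type is c (character), and the mode is either c (create the attribute only if it does not exist yet) or o (overwrite it if it exists). The -h option stops ncatted from appending the command to the history attribute, and -O overwrites the output file without prompting. A generating script could assemble these calls with a helper along the following lines; the function name `ncatted_cmd` is hypothetical and the sketch covers character attributes only.

```python
import shlex


def ncatted_cmd(name, value, filename, mode="c", var="global"):
    """Build an ncatted call that sets a character attribute.

    mode "c" creates the attribute only if it is missing,
    mode "o" overwrites an existing attribute.
    """
    spec = f"{name},{var},{mode},c,{value}"
    return ["ncatted", "-h", "-O", "-a", spec, filename]


if __name__ == "__main__":
    cmd = ncatted_cmd(
        "references",
        "Please acknowledge both ECMWF for original files and the ARCCSS "
        "for conversion to netcdf format",
        "ua_6hrs_YOTC_historical_an-pl_20080521.nc",
    )
    # shlex.join quotes the arguments so the line can be pasted into a bash script
    print(shlex.join(cmd))
```

Building the command as a list and quoting it only when it is written out avoids hand-written quoting mistakes in attribute values that contain spaces or brackets.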

 

Finally, ncks (another nco command) calculates the MD5 checksum of the main variable and of each of its dimensions, and writes each result as a variable attribute:

 

        ncks -O --md5_wrt_att -v ua ua_6hrs_YOTC_historical_an-pl_20080521.nc -o ua_6hrs_YOTC_historical_an-pl_20080521.nc

 

For more information on MD5 digests, see http://nco.sourceforge.net/nco.html#MD5-digests
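The reason a per-variable checksum survives renaming and metadata edits is that it is computed from the data values only. The sketch below illustrates the principle with Python's hashlib on a short list of made-up values. It is not a reimplementation of ncks: the byte layout (little-endian 32-bit floats) is an assumption, so these digests are not expected to match the attribute that ncks writes.

```python
import hashlib
import struct


def md5_of_values(values):
    """MD5 hex digest of the values packed as little-endian 32-bit floats."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return hashlib.md5(raw).hexdigest()


if __name__ == "__main__":
    ua = [1.5, -2.25, 0.0, 7.125]   # made-up values, not YOTC data
    stored = md5_of_values(ua)       # digest recorded at conversion time

    # Renaming the file or editing attributes leaves the values untouched,
    # so recomputing the digest from the same values gives the stored one.
    print("unchanged data matches:", md5_of_values(list(ua)) == stored)

    # Any change to the values themselves produces a different digest.
    ua[2] = 0.5
    print("modified data matches:", md5_of_values(ua) == stored)
```

To check a converted file, the digest would be recomputed with the same tool that wrote it and compared with the stored attribute.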