Wastewater
DataHarmonizer SOP 3.1
Action | Related docs |
1 | Download the zip file (“Source code (zip)”) containing the DataHarmonizer application from the following link: https://github.com/cidgoh/pathogen-genomics-package/releases
Extract the zip file’s contents and navigate into the extracted folder. Open main.html. The validator application will open in your default browser.
The DataHarmonizer enables contextual data harmonization for different pathogens and projects. Select one of the wastewater templates (wastewater/WastewaterSars-CoV-2, wastewater/WastewaterAMR, wastewater/WastewaterPathogenAgnostic) from the Template menu beside the Help button.
Data can be entered manually by typing values into the application’s spreadsheet, or imported from local xlsx, xls, tsv and csv files. To import local data, click File on the top-left toolbar, and then click Open. To enter data in a new file, click File on the top-left toolbar, and then click New. Data entered into the spreadsheet can be copied and pasted.
Note: Only files containing the headers expected by the DataHarmonizer can be opened in the application. If your file is missing the first row, the application will display a warning. Resolve it by declaring “1” as the row in which your column headers reside. |
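Before importing a local file, it can help to check which row actually holds the column headers, since that is the value the header-row prompt asks for. The following is a minimal sketch using only the Python standard library; it is not part of the DataHarmonizer, and the function name `find_header_row` and the field names `sample_id` and `collection_date` are hypothetical placeholders for the headers of your chosen template.

```python
import csv
import io


def find_header_row(text, expected_headers, delimiter="\t"):
    """Return the 1-based number of the first row that contains every
    expected header, or None if no row matches."""
    expected = {h.strip().lower() for h in expected_headers}
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        cells = {cell.strip().lower() for cell in row}
        if expected <= cells:  # every expected header is present in this row
            return row_number
    return None


# Hypothetical two-column TSV content; real templates have many more fields.
sample = "sample_id\tcollection_date\nWW-001\t2024-02-09\n"
print(find_header_row(sample, ["sample_id", "collection_date"]))  # prints 1
```

If the function returns a number other than the one the application expects, declare that row number when prompted.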
2 | Before you begin to curate sample metadata: |
3 | Familiarize yourself with DataHarmonizer functionality by reviewing the “Getting Started” guide. To access “Getting Started”, click on the green Help button on the top-left toolbar, then click Getting Started. Definitions, examples and further guidance are available by double-clicking on the field headers, or by using the “Reference Guide”. To access the “Reference Guide”, click on the Help button, then click Reference Guide. |
4 | Confirm the mapping of your data fields to those in the harmonized template with the data steward (e.g. your supervisor). If data is to be shared with trusted partners or public repositories, confirm with the data steward the level of granularity of information that can be shared. Include the most detailed information allowable. |
5 | Enter data into the validator spreadsheet.
If a desired term is not present in a picklist, use the New Term Request System to request new vocabulary. Alternatively, contact Emma Griffiths at ega12@sfu.ca.
Note: Sometimes there will be constraints on what information can be shared; at other times a field may not be applicable to your sample. Use the null values (controlled vocabulary indicating the reason why information is not provided) in the picklist to report missing data.
Required fields are organized into subsections. A data curation SOP is available with specific instructions for how to fill in fields (see the Wastewater Metadata Curation SOP). The following summarises the required (in bold) and/or recommended fields for each subsection (note: these may vary by template). |
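The null-value guidance above can be sketched as a small pre-processing step run before pasting data into the spreadsheet. This is an illustrative sketch only, not DataHarmonizer code: the labels in `NULL_VALUES` are assumptions and must be replaced with the exact terms offered in your template's picklist, and `apply_null_values` and the field names are hypothetical. Blank fields are never filled automatically; each one needs an explicit reason chosen by the curator.

```python
# Assumed null-value labels; check the picklist in your template for the exact terms.
NULL_VALUES = {
    "Not Applicable",
    "Missing",
    "Not Collected",
    "Not Provided",
    "Restricted Access",
}


def apply_null_values(record, reasons):
    """Fill blank fields with the curator's chosen null value.

    record  -- dict of field name -> value for one sample
    reasons -- dict of field name -> null value chosen by the curator
    Returns (filled_record, unresolved_fields).
    """
    filled = dict(record)
    unresolved = []
    for field, value in record.items():
        if value is not None and str(value).strip():
            continue  # the field already has data
        reason = reasons.get(field)
        if reason in NULL_VALUES:
            filled[field] = reason
        else:
            unresolved.append(field)  # blank, and no valid reason was supplied
    return filled, unresolved


# Hypothetical record with two blank fields, only one of which has a reason.
record = {"sample_id": "WW-001", "collection_date": "", "site_name": ""}
filled, unresolved = apply_null_values(record, {"site_name": "Restricted Access"})
print(filled["site_name"])  # prints Restricted Access
print(unresolved)           # prints ['collection_date']
```

Fields listed as unresolved still need either a value or a null value before validation.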
6 | Validate the entered data by clicking on the Validate button on the top-left toolbar. Missing information and invalid entries in required fields will be highlighted in red.
Note: Row viewing options only appear after a validation attempt has been made. |
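For large files, a rough check for empty required fields can be run before opening the file in the application. The sketch below only illustrates the idea of flagging missing required values; it does not reproduce the DataHarmonizer's own validation, which also checks picklists and formats. The function name `find_missing_required` and the field names are hypothetical, and the code assumes the headers are on row 1 with no blank lines in the file.

```python
import csv
import io


def find_missing_required(text, required_fields, delimiter="\t"):
    """Return a list of (row_number, field) pairs for empty required fields.

    Row numbers are 1-based and assume the header occupies row 1.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    problems = []
    for row_number, row in enumerate(reader, start=2):
        for field in required_fields:
            value = row.get(field)
            if value is None or not value.strip():
                problems.append((row_number, field))
    return problems


# Hypothetical TSV content; the second sample lacks a collection date.
data = "sample_id\tcollection_date\nWW-001\t2024-02-09\nWW-002\t\n"
print(find_missing_required(data, ["sample_id", "collection_date"]))
# prints [(3, 'collection_date')]
```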
7 | Address any invalid data flagged in red in the template.
Note: It is possible to export incomplete or invalid data. Make sure to review any errors prior to exporting. |
8 | Export validated data by clicking File on the top-left toolbar, and then clicking on Save as. Enter the file name and press Save. |
Version | Date | Writer | Description of Change |
1.0 | Feb 09 2024 | Emma Griffiths | Initial release |
1.0 | Feb 09 2024 | Charlie Barclay | Updated links and required fields subsection |
3.1 | Oct 08 2024 | Charlie Barclay | Version compatibility update |