GRDI_AMR
DataHarmonizer SOP 14.5
Action | Related docs |
1 | Download the zip file (“Source code (zip)”) containing the DataHarmonizer application from the following link: https://github.com/cidgoh/pathogen-genomics-package/releases
Extract the zip file’s contents, and navigate into the extracted folder. Open main.html. The validator application will open in your default browser.
The DataHarmonizer enables contextual data harmonization for different pathogens and projects. Select the AMBR template by selecting “grdi/GRDI” from the Template menu beside the Help button.
Data can be entered into the validator application manually, by typing values into the application’s spreadsheet, or imported from local xlsx, xls, tsv and csv files. To import local data, click File on the top-left toolbar, and then click Open. To enter data in a new file, click File on the top-left toolbar, and then click New. Data entered into the spreadsheet can be copied and pasted.
Note: Only files containing the headers expected by the DataHarmonizer can be opened in the application. If the first row is missing from your file, the application will display a warning. Resolve it by declaring “1” as the row in which your column headers reside. | |
2 | Before you begin to curate sample metadata: | |
3 | Familiarize yourself with DataHarmonizer functionality by reviewing the “Getting Started” guide. To access “Getting Started”, click the green Help button on the top-left toolbar, then click Getting Started. Definitions, examples and further guidance are available by double-clicking the field headers, or by using the “Reference Guide”. To access the “Reference Guide”, click the Help button, then click Reference Guide. | |
4 | Confirm the mapping of your data fields to those in the harmonized template with the data steward (e.g. your supervisor). Confirm with the data steward the level of granularity of information that can be shared in IRIDA. Include the most detailed information allowable. | |
5 | Enter data into the validator spreadsheet.
If a desired term is not present in a picklist, use the New Term Request System to request new vocabulary. Alternatively, contact Emma Griffiths at ega12@sfu.ca.
Note: Sometimes there will be constraints on what information can be shared; other times a field may not be applicable to your sample. Use the null values (controlled vocabulary indicating the reason why information is not provided) in the picklist to report missing data.
Required fields are organized into subsections. Data should be entered into the DataHarmonizer in the same manner as other GRDI-AMR curation templates (i.e. the Excel version of the template). A data curation SOP is available with specific instructions for how to fill in fields (see the AMR-GRDI Metadata Curation SOP). | |
6 | Validate the entered data by clicking on the Validate button on the top-left toolbar. Missing information and invalid entries in required fields will be highlighted in red.
Note: Row viewing options only appear after a validation attempt has been made. | |
7 | Address any invalid data that was flagged in red in the template.
Note: It is possible to export incomplete or invalid data. Make sure to review any errors prior to exporting. | |
8 | Export validated data by clicking File on the top-left toolbar, and then clicking Save As. Enter the file name and press Save. | |
11 | Optional: Format validated data for IRIDA submission. You or your team members should have already created a project specific to the GRDI in IRIDA (irida.ca).
Under File, select Save As. Make sure that the “Save as” type is “Excel Workbook” (.xlsx). Save the file with the name and location of your choice.
Open the file and remove the top row containing the section headers. Re-save the file.
Use the IRIDA Metadata Uploader to import your contextual data into the GRDI Project using the instructions provided here: https://phac-nml.github.io/irida-documentation/user/user/sample-metadata/
Note: The IRIDA uploader only accepts Excel files, not csv files. If the top row containing the broad headings (Sample collection and processing, Host information, Sequencing, etc.) is not removed, the IRIDA metadata upload will fail. | Upload to IRIDA SOP: |
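Removing the top section-header row for IRIDA can also be scripted rather than done by hand in Excel. The following is a minimal sketch, not part of the DataHarmonizer or IRIDA tooling. It assumes the third-party openpyxl package is installed, that the exported data is on the workbook's active sheet, and that a column header you know to be in the template (here "sequencing_date", used only as an illustration) appears in row 2 when the section-header row is still present. The function names and file names are hypothetical.

```python
def has_section_header_row(first_row, second_row, known_field):
    """Return True when the column headers sit in row 2, meaning row 1 is
    the section-header row that must be removed before IRIDA upload."""
    first = {str(c).strip() for c in first_row if c is not None}
    second = {str(c).strip() for c in second_row if c is not None}
    return known_field in second and known_field not in first


def strip_section_header_row(src, dst, known_field="sequencing_date"):
    """Save a copy of the workbook at `src` to `dst` without the top
    section-header row. Returns True if a row was removed."""
    # openpyxl is a third-party package (pip install openpyxl). It is
    # imported here so the helper above can be used without it.
    from openpyxl import load_workbook

    wb = load_workbook(src)
    ws = wb.active  # assumption: the exported data is on the active sheet

    # Read only the values of the first two rows.
    rows = list(ws.iter_rows(min_row=1, max_row=2, values_only=True))
    first_row, second_row = rows[0], rows[1]

    removed = has_section_header_row(first_row, second_row, known_field)
    if removed:
        ws.delete_rows(1)  # drop row 1; the column headers move up to row 1
    wb.save(dst)
    return removed


if __name__ == "__main__":
    # Hypothetical file names; replace with your own export.
    changed = strip_section_header_row("grdi_export.xlsx", "grdi_for_irida.xlsx")
    print("Removed section-header row" if changed else "No change needed")
```

Because the row is removed only when the known column header is found in row 2, running the script on a file that has already been trimmed leaves it unchanged. Open the output file and confirm the column headers are in the first row before uploading.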
Version | Date | Writer | Description of Change |
7.6 | May 25, 2023 | Emma Griffiths | Initial release |
7.7 | Sep 29, 2023 | N/A | Version compatibility update |
8.8 | Oct 25, 2023 | N/A | Version compatibility update |
8.9 | Dec 1, 2023 | N/A | Version compatibility update |
9.0 | Feb 5, 2024 | Charlie Barclay | Added new units fields to “Describing the material and/or site sampled” subsection. |
10.0 | Mar 25, 2024 | N/A | Version compatibility update |
11.1 | Apr 19, 2024 | Charlie Barclay | Added new fields for Bioinformatics and QC metrics |
12.2 | Jul 17, 2024 | N/A | Version compatibility update |
13.3 | Aug 15, 2024 | Charlie Barclay | Additional sample collection date/times for composite samples. |
13.4 | Nov 19, 2024 | Charlie Barclay | Version compatibility update |
14.5 | Mar 3, 2025 | Charlie Barclay | Additional field for sequencing_date. |