# | Comments | By | Added
1 | Data Publishers cannot START with a Use Case, as they don't know HOW the data will be used. The data repository must process records on receipt to add value through tests/assertions and fixes that can be exposed to users. | Lee |
2 | Tests/assertions (Specifications in Allan's Framework) are more fundamental and more stable than the software tools that will use them. Recommendation 1: A standard core suite of tests/assertions (Specifications) should ideally be applied by all Data Publishers. There are subtle variations in the tests/assertions (Specifications) and fixes used by the various data repositories. If we had a standard core suite, users could use data from any repository with greater efficiency and confidence. | Lee |
3 | There are no reports on exactly how a data repository has matched names (according to …)? Other examples? Recommendation 2: Name matches must be qualified by authority. | Lee |
4 | User evaluation of FFU (fitness for use) can use Data Publisher record test results/assertions (Specifications) (e.g., record is an environmental outlier) AND user tests against Darwin Core fields (e.g., coordinateUncertaintyInMeters < 10000). | Lee |
5 | Feedback on user FFU tests of Darwin Core fields should be used by Data Publishers to fill gaps where possible and to encourage provision of more DwC fields in the future. | Lee |
6 | Recommendation 3: Build a catalog of user DQ tests that records the Darwin Core fields, the data repository or user assertions, and the application (e.g., if coordinateUncertaintyInMeters > 1000 then reject for SDM of species X). | Lee |
7 | User tests for FFU that are based on record/dataset assertions and Darwin Core fields should be logged via workflows and fed back to Data Publishers. Hopefully workflows will become more standardized. | Lee |
8 | Recommendation 4: Data Publishers must provide two additional fields for each record: a) the number of Darwin Core fields that are populated and b) the number of TRUE assertions (flags/Specifications) applied. | Lee |
9 | The Data Publisher IDs of assertions (Specifications) by data repositories are there to assist linking of identical or similar assertions. Recommendation 5: When a standardized suite of assertions (Specifications) is agreed, these should be GUIDs. | Lee |
10 | Column K in the Tests worksheet shows the Data Publishers that apply the assertion, but I recognised that we should also include specific software tools that use this 'specification'. Allan has added this as Column M. Again, however, one would expect software tools to come and go faster than the Data Publishers. | Lee |
11 | DATA_AMBIGUOUS: Definition: For Validations or Measures, indicates that the data are internally inconsistent in some way that makes the test result ambiguous. Examples: A validation that tests for a collecting event date within the lifespan of a collector, where the event date is only provided as a two-digit year '82 and the collector's lifespan is 1840-1925. A validation test for a coordinate being inside a country where a datum is not provided and the coordinate is closer to the country boundary than the variability between datums. | Paul | 6/12/2016
12 | DATA_PREREQUISITES_NOT_MET: Definition: Some prerequisite inherent in the data for performing the tests or enhancements in the specification was not met. Example: Data value for decimal latitude is not provided for a test that compares georeference to a country boundary. | Paul | 6/12/2016
13 | EXTERNAL_PREREQUISITES_NOT_MET: Definition: Some prerequisite external to the provided data for performing the tests or enhancements in the specification was not met. Note: The result would be expected to differ if the test were run again at a later time (when the external prerequisite was available). Example: A web service consulted by the test was down and could not be consulted per the specification at the time the test was run. | Paul | 6/12/2016
 | Also there is John's suggestion to add a term to the list of quality dimensions: Data quality dimension: Quantity. Example: Measure of the number of records compliant with a set of validations. | John | 6/12/2016
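As an illustration of Recommendation 4 and of the catalog-style test in Recommendation 3, the following is a minimal Python sketch, not an agreed implementation. It assumes a record is a flat mapping of Darwin Core term names to string values and that assertions are a mapping of assertion name to TRUE/FALSE. The function names, the sample values and the assertion names (ENVIRONMENTAL_OUTLIER, COORDINATES_OUT_OF_RANGE) are hypothetical; only coordinateUncertaintyInMeters and the 1000 threshold come from the comments above.

```python
from typing import Mapping, Optional


def count_populated_fields(record: Mapping[str, Optional[str]]) -> int:
    """Recommendation 4a: number of Darwin Core fields holding a non-empty value."""
    return sum(1 for v in record.values() if v is not None and str(v).strip() != "")


def count_true_assertions(assertions: Mapping[str, bool]) -> int:
    """Recommendation 4b: number of assertions (flags) that are TRUE."""
    return sum(1 for flagged in assertions.values() if flagged is True)


def accept_for_sdm(record: Mapping[str, Optional[str]],
                   max_uncertainty_m: float = 1000.0) -> Optional[bool]:
    """Recommendation 3 example: reject when coordinateUncertaintyInMeters > threshold.

    Returns None when the field is missing or not numeric, since the user
    test cannot be evaluated in that case.
    """
    raw = record.get("coordinateUncertaintyInMeters")
    if raw is None or str(raw).strip() == "":
        return None
    try:
        return float(raw) <= max_uncertainty_m
    except ValueError:
        return None


# Hypothetical record and assertions, for illustration only.
record = {
    "scientificName": "Species X",
    "decimalLatitude": "-35.3",
    "decimalLongitude": "149.1",
    "coordinateUncertaintyInMeters": "2500",
    "recordedBy": "",
}
assertions = {"ENVIRONMENTAL_OUTLIER": True, "COORDINATES_OUT_OF_RANGE": False}

populated = count_populated_fields(record)   # empty recordedBy is not counted
flagged = count_true_assertions(assertions)
usable = accept_for_sdm(record)              # 2500 exceeds the 1000 threshold
```

Returning None rather than False for a missing value keeps "could not be tested" separate from "tested and rejected", which is the same distinction that DATA_PREREQUISITES_NOT_MET draws for Data Publisher tests.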