Deduplication
Deduplicating one data table affects every related table, so consult your data model first. Deleting or merging records (e.g., constituent records) deletes an identifier, which may orphan records in other tables (e.g., contact event records).

Deduplication can be labor-intensive and, if done too quickly or carelessly, can cause major damage. Plan, prepare, and test the methodology before making any irreversible changes to the data. Don't swim alone: have at least one other knowledgeable person review the methodology and the test results before the methodology is applied to the master data.
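To make the orphaning risk concrete, here is a minimal sketch of a check for child records whose identifier no longer exists in the parent table. The table contents, the field name constituent_id, and the helper find_orphans are hypothetical, chosen only for illustration; a real data model would more likely be checked with a query in the database itself.

```python
def find_orphans(child_rows, parent_ids, fk_field):
    """Return child rows whose foreign key matches no surviving parent ID."""
    surviving = set(parent_ids)
    return [row for row in child_rows if row[fk_field] not in surviving]


# Hypothetical data: constituent 102 was merged into 101 and then deleted,
# but its contact event was never repointed.
constituent_ids = [101, 103]
contact_events = [
    {"event_id": 1, "constituent_id": 101},
    {"event_id": 2, "constituent_id": 102},  # points at the deleted record
    {"event_id": 3, "constituent_id": 103},
]

orphans = find_orphans(contact_events, constituent_ids, "constituent_id")
print(orphans)  # [{'event_id': 2, 'constituent_id': 102}]
```

Running a check like this against a test copy, before and after the trial deduplication, shows whether the merge step is leaving related records behind.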

Before touching the data (fill in answers)
Goal of the deduplication activity:
Who (people, departments, stakeholders) will be affected?
How might related datasets be affected by deduplication?
Communication plan:
Storage location for the cross-reference of old and new IDs:
Mistake recovery plan:
Uniqueness criteria:
Merge criteria:
Deduplication software:
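The cross-reference of old and new IDs asked for above can be as simple as a two-column table. The sketch below assumes a hypothetical crosswalk held as a Python dict, uses it to repoint related records, and writes it out as CSV text so it can be stored wherever the plan specifies. The names remap_foreign_keys, crosswalk_to_csv, and constituent_id are illustrative, not part of any particular tool.

```python
import csv
import io


def remap_foreign_keys(child_rows, crosswalk, fk_field):
    """Return copies of child rows with retired IDs replaced by surviving IDs."""
    return [
        {**row, fk_field: crosswalk.get(row[fk_field], row[fk_field])}
        for row in child_rows
    ]


def crosswalk_to_csv(crosswalk):
    """Serialize the old-ID to new-ID mapping as CSV text for safekeeping."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["old_id", "new_id"])
    for old_id, new_id in sorted(crosswalk.items()):
        writer.writerow([old_id, new_id])
    return buffer.getvalue()


# Hypothetical merge decisions: records 102 and 105 were merged into 101.
crosswalk = {102: 101, 105: 101}
contact_events = [
    {"event_id": 1, "constituent_id": 101},
    {"event_id": 2, "constituent_id": 102},
    {"event_id": 3, "constituent_id": 105},
]

remapped = remap_foreign_keys(contact_events, crosswalk, "constituent_id")
print(remapped)
print(crosswalk_to_csv(crosswalk))
```

Keeping the crosswalk also supports the mistake recovery plan, since it records which retired identifier each surviving record absorbed.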

Touching the data (check off steps as they are completed)
Profile the data and resolve any tractable discrepancies.
Standardize data as thoroughly as possible.
Test by deduplicating a copy of the data.
Have someone other than the deduplicator review and approve the test deduplication.
Repeat the steps above as many times as necessary!
Dedupe!
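The standardize and test steps above might look like the following sketch, which works only on a copy of the data. The field names, the sample records, and the exact-match rule on name plus email are assumptions standing in for whatever uniqueness criteria were written down during planning; dedicated deduplication software will usually offer fuzzier matching than this.

```python
import copy
from collections import defaultdict


def standardize(record):
    """Return a copy with whitespace collapsed and case folded on name and email."""
    cleaned = dict(record)
    cleaned["name"] = " ".join(record["name"].split()).lower()
    cleaned["email"] = record["email"].strip().lower()
    return cleaned


def candidate_duplicates(records, key_fields):
    """Group record IDs by the chosen key and keep groups with more than one ID."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in key_fields)
        groups[key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical master data; it is never modified below.
master = [
    {"id": 1, "name": "Ada  Lovelace", "email": "ADA@example.org "},
    {"id": 2, "name": "ada lovelace", "email": "ada@example.org"},
    {"id": 3, "name": "Grace Hopper", "email": "grace@example.org"},
]

working_copy = copy.deepcopy(master)  # test on a copy, not the master
standardized = [standardize(record) for record in working_copy]
groups = candidate_duplicates(standardized, ("name", "email"))
print(groups)  # {('ada lovelace', 'ada@example.org'): [1, 2]}
```

The output is a list of candidate groups for the reviewer to approve, not a merge; nothing is deleted until a second person has signed off on the test run.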

After deduplication (check off steps as they are completed)
Re-profile the data if appropriate.
Document patterns of duplication and possible causes.
Document lessons learned.
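Re-profiling can be as light as comparing a few summary counts from before and after the run. The sketch below assumes a hypothetical profile helper that reports the row count plus blank and distinct values per field; the field name and sample rows are made up for illustration.

```python
def profile(records, fields):
    """Summarize row count plus blank and distinct value counts per field."""
    summary = {"rows": len(records), "fields": {}}
    for field in fields:
        values = [record.get(field) for record in records]
        non_blank = [value for value in values if value not in (None, "")]
        summary["fields"][field] = {
            "blank": len(values) - len(non_blank),
            "distinct": len(set(non_blank)),
        }
    return summary


# Hypothetical snapshots taken before and after deduplication.
before = [
    {"id": 1, "email": "ada@example.org"},
    {"id": 2, "email": "ada@example.org"},
    {"id": 3, "email": ""},
]
after = [
    {"id": 1, "email": "ada@example.org"},
    {"id": 3, "email": ""},
]

print(profile(before, ["email"]))
print(profile(after, ["email"]))
```

Here the row count drops from 3 to 2 while the number of distinct emails stays at 1, which is the pattern to expect when only true duplicates were removed. A drop in distinct values would be worth investigating and noting in the lessons learned.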
Contact the author at sfsinger@campaignscientific.com, @sfsinger, 267-414-3119. Guide available for download at bit.ly/DQGuide. Submit feedback at bit.ly/SingleStepsFeedback