| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | From | Number of duplicates introduced, for a particular file, in a particular change. Outcome variable. | Team making (authoring or committing) the change. Exposure variable. | Knowledge about the given component (file or repository) prior to making the change. Latent (unmeasured) variable. | Tendency to keep clean in their practices. Latent (unmeasured) variable. | Existing number of duplicates in the changed file. | Existing complexity in the changed file (McCabe metric). | Number of lines added to the file. Changing an existing line is modeled as one added and one removed line. | Number of deleted (removed) lines in the file, in this file change. Changing an existing line is modeled as one added and one removed line. | ||||||||||||||||
2 | To | INTROD | TEAM | KNOW | CLEAN | DUP | COMP | ADD | REM | ||||||||||||||||
3 | INTROD | ||||||||||||||||||||||||
4 | TEAM | N/A. Teams do not directly affect INTROD, only through added or removed lines, or through the quality of the added and removed lines (e.g. refactorings). | | Teams have different knowledge of different components, based on their prior work experience. Furthermore, we posit that a team with greater knowledge of a component engages in less copy-paste behaviour (or at least refactors and cleans up more after implementing a new feature), so knowledge will impact the amount of added and removed code. | Teams have different tendencies to clean up their code, such as refactoring to remove duplicates after having implemented a new feature. Cleanliness will affect the amount of code added to and changed in the code base. | A component with many existing duplicates will be harder to maintain and extend, especially for a team that is foreign to the component. So TEAM has an impact on how DUP affects the number of added lines and the number of introduced duplicates. | A component that is complex will be harder to maintain and extend, especially for a team that is foreign to the component. So TEAM has an impact on how COMP affects the number of added lines and the number of introduced duplicates. | Author and team have a clear impact on added lines (ADD), as they write the code that is going to be merged. But in our DAG, we mainly model this as coming through the KNOW and CLEAN latent constructs. | Author and team have a clear impact on removed lines (REM), as they write the code that is merged. | ||||||||||||||||
5 | KNOW | N/A. Knowledge of a particular repository does not directly affect the introduction of clones, only indirectly, through added lines. | N/A. A team has knowledge about a component, not the other way around. | | N/A. In our model, teams behave similarly with regard to their tendency to clean their code, regardless of their preexisting knowledge. | N/A. Existing duplicates in a file might impact the knowledge a team has of it, but not the other way around. | N/A. Complexity of a file might impact the knowledge a team has of it, but not the other way around. | We posit that the knowledge a team has of a component will impact the number of added lines. Teams lacking knowledge are more likely to add unnecessary lines (e.g. via copy-paste). | N/A. In this model, we assume that preexisting knowledge does not impact the number of removed lines. | ||||||||||||||||
6 | CLEAN | N/A. The tendency to "keep clean" does not directly affect the introduction of code clones, only indirectly, through added and possibly removed lines (e.g. due to refactorings). | N/A. A team has a certain cleanliness trait ("way of working"), not the other way around. | N/A. The tendency to "keep clean" is modeled as independent of the knowledge a particular team has of a particular component. | | N/A. We posit that a team that has a tendency to "keep clean" will do so irrespective of the existing number of duplicates. | N/A. We posit that a team that has a tendency to "keep clean" will do so irrespective of the existing complexity. | The "cleanliness factor" of a team is likely to influence the number of added lines (e.g. by reducing copy-paste, or by refactoring after feature introduction). | The "cleanliness factor" of a team is likely to influence the number of deleted (or changed) lines (e.g. via refactoring, which will change or remove some lines). | ||||||||||||||||
7 | DUP | Possibly. The effect of team behaviour might be mediated through the existing number of duplicates. Teams with more knowledge, and a tendency to keep clean, will likely add fewer duplicates. | N/A. If an organizational policy mandated that certain teams should handle certain files, there might be a causal effect here. Not so in this organization. | Possibly. Existing duplicates make it harder to acquire the requisite knowledge of the component. | N/A. | | Existing duplicates would affect existing complexity, as each added duplicate makes the file more complex. | Possibly. A file with many duplicates might cause a team to add more lines to it (i.e. to behave less cleanly). | Unclear whether existing duplicates would directly affect the number of removed lines in a file. Most likely, the effect would be secondary (e.g. through refactorings), and would then be covered by the cleanliness causal effect. | ||||||||||||||||
8 | COMP | Possibly. The effect of team behaviour might be mediated through the existing complexity. Teams with more knowledge, and a tendency to keep clean, will likely add fewer duplicates. | N/A. If an organizational policy mandated that certain teams should handle certain files, there might be a causal effect here. Not so in this organization. | Possibly. Existing complexity makes it harder to acquire the requisite knowledge of the component. | N/A. | Duplicates cause complexity, but the reverse is not necessarily true (it is possible to write a very complex file without any duplicates at all). | | Possibly. A complex file might cause a team to add more lines to it (i.e. to behave less cleanly). | Unclear whether existing complexity would directly affect the number of removed lines in a file. Most likely, the effect would be secondary (e.g. through refactorings), and would then be covered by the cleanliness causal effect. | ||||||||||||||||
9 | ADD | Added lines should impact the number of introduced duplicates. If there are no added lines, there are most likely no added duplicates either. | N/A | N/A. Adding lines will not affect the knowledge the team has prior to the change. But the opposite might well be true. | N/A. Adding lines will not affect the tendency to keep clean. But the opposite might well be true. | N/A. Adding lines will not affect the number of duplicates prior to the change. | N/A. Adding lines will not affect the complexity of a file prior to the change. | | N/A. We model the number of added lines as not influencing the number of removed lines. | ||||||||||||||||
10 | REM | Possibly. Causation is at least not implausible (removing lines might cause new duplicates to appear, because the lines that made two fragments differ are removed). | N/A | N/A. Removing lines will not affect the knowledge the team has prior to the change. | N/A. Removing lines will not affect the tendency to keep clean. But the opposite might well be true. | N/A. Removing lines will not affect the number of duplicates prior to the change. | N/A. Removing lines will not affect the complexity of a file prior to the change. | N/A. We model the number of removed lines as not influencing the number of added lines. | |||||||||||||||||
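
The edge justifications in the table can be checked mechanically. The following is a minimal sketch, not the authors' tooling: it encodes one possible reading of the table as a list of (from, to) pairs and verifies that the result is acyclic. The split into `FIRM` and `POSSIBLE` edges, and the helper names `predecessors` and `topological_order`, are assumptions made for illustration. Cells marked "Possibly" are treated as possible edges, "N/A" and "Unclear" cells are left out, the direct TEAM to ADD edge is left out because the table says it is mainly modeled through KNOW and CLEAN, and the moderating role of TEAM on DUP and COMP is not represented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (from, to) pairs read from the table. The FIRM / POSSIBLE split is this
# sketch's own labelling of the cells, not part of the original model.
FIRM = [
    ("TEAM", "KNOW"), ("TEAM", "CLEAN"), ("TEAM", "REM"),
    ("KNOW", "ADD"),
    ("CLEAN", "ADD"), ("CLEAN", "REM"),
    ("DUP", "COMP"),
    ("ADD", "INTROD"),
]
POSSIBLE = [
    ("DUP", "INTROD"), ("DUP", "KNOW"), ("DUP", "ADD"),
    ("COMP", "INTROD"), ("COMP", "KNOW"), ("COMP", "ADD"),
    ("REM", "INTROD"),
]


def predecessors(edges):
    """Map every node to the set of nodes with an edge pointing into it."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(src, set())
        preds.setdefault(dst, set()).add(src)
    return preds


def topological_order(edges):
    """Return the nodes in a causal order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(predecessors(edges)).static_order())


order = topological_order(FIRM + POSSIBLE)
print(order)
```

If a later revision of the table introduced a cycle (for example by adding an edge from ADD back to KNOW), `topological_order` would raise `graphlib.CycleError` instead of returning an order, which makes this a cheap consistency check to run whenever a cell changes.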