Feature name | Group | Purpose | Description | Sumstats only | Type | Limits | Key columns | Required tables
has_sumstats | dummy | Allows the model to learn to ignore variables (e.g. coloc) when sumstats are not available for a given study | Indicator variable showing whether the study has summary stats | | binary | 0 or 1 | study_id | study table
sqtl_coloc_llr_max | coloc | High confidence sQTL assignment | Max log likelihood ratio [log(h4/h3)] value for each (study, locus, gene) aggregating over all sqtl datasets | TRUE | float | -Inf, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation
sqtl_coloc_llr_max_nbh | coloc | Gives "neighbourhood" information about the max sqtl_coloc_llr value across all genes at the locus | sqtl_coloc_llr_max - (max coloc_llr across any sqtl gene at the locus). This is on log2 scale | TRUE | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation
sqtl_coloc_llr_max_neglogp | coloc | Gives an indication of the strength of the sQTL signal with the highest colocalisation evidence | -log10(p-value) of the sQTL selected as sqtl_coloc_llr_max | TRUE | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation; (?) credible set: needed to get p-value for the sentinel variant at the locus
sqtl_pics_clpp_max | pics coloc | sQTL coloc assignment using LD information only (PICS method combined with eCaviar CLPP calculation) | Max log(CLPP): sQTL colocalisation posterior probability based on the posterior probabilities estimated using LD PICS method, and eCaviar method for coloc | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; LD table (w PICS stats); credible set table (QTLs only)
sqtl_pics_clpp_max_nbh | pics coloc | Gives "neighbourhood" information about the max sqtl_pics_clpp_max value across all genes at the locus | sqtl_pics_clpp_max - (max pics_clpp across any sQTL gene at locus). This is in log scale. | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; LD table (w PICS stats); credible set table (QTLs only)
sqtl_pics_clpp_max_neglogp | pics coloc | Gives an indication of the strength of the sQTL signal with the highest PICS colocalisation evidence | -log10(p-value) of the sQTL selected as sqtl_pics_clpp_max | | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation; credible set
eqtl_coloc_llr_max | coloc | High confidence eQTL assignment | Max log likelihood ratio [log(h4/h3)] value for each (study, locus, gene) aggregating over all eqtl datasets | TRUE | float | -Inf, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation
eqtl_coloc_llr_max_nbh | coloc | Gives "neighbourhood" information about the max eqtl_coloc_llr value across all genes at the locus | eqtl_coloc_llr_max - (max coloc_llr across any eqtl gene at the locus). This is on log2 scale | TRUE | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation
eqtl_coloc_llr_max_neglogp | coloc | Gives an indication of the strength of the eQTL signal with the highest colocalisation evidence | -log10(p-value) of the eQTL selected as eqtl_coloc_llr_max | TRUE | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation; (?) credible set: needed to get p-value for the sentinel variant at the locus
eqtl_pics_clpp_max | pics coloc | eQTL coloc assignment using LD information only (PICS method combined with eCaviar CLPP calculation) | Max log(CLPP): eQTL colocalisation posterior probability based on the posterior probabilities estimated using LD PICS method, and eCaviar method for coloc | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; LD table (w PICS stats); credible set table (QTLs only)
eqtl_pics_clpp_max_nbh | pics coloc | Gives "neighbourhood" information about the max eqtl_pics_clpp_max value across all genes at the locus | eqtl_pics_clpp_max - (max pics_clpp across any eQTL gene at locus). This is in log scale. | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; LD table (w PICS stats); credible set table (QTLs only)
eqtl_pics_clpp_max_neglogp | pics coloc | Gives an indication of the strength of the eQTL signal with the highest PICS colocalisation evidence | -log10(p-value) of the eQTL selected as eqtl_pics_clpp_max | | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation; credible set
pqtl_coloc_llr_max | coloc | High confidence pQTL assignment | Max log likelihood ratio [log(h4/h3)] value for each (study, locus, gene) aggregating over all pqtl datasets | TRUE | float | -Inf, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation
pqtl_coloc_llr_max_nbh | coloc | Gives "neighbourhood" information about the max pqtl_coloc_llr value across all genes at the locus | pqtl_coloc_llr_max - (max coloc_llr across any pqtl gene at the locus). This is on log2 scale. | TRUE | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation
pqtl_coloc_llr_max_neglogp | coloc | Gives an indication of the strength of the pQTL signal with the highest colocalisation evidence | -log10(p-value) of the pQTL selected as pqtl_coloc_llr_max | TRUE | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation; (?) credible set (QTL): needed to get p-value for the sentinel variant at the locus
pqtl_pics_clpp_max | pics coloc | pQTL coloc assignment using LD information only (PICS method combined with eCaviar CLPP calculation) | Max log(CLPP): Max pQTL colocalisation posterior probability based on the posterior probabilities estimated using LD PICS method, and eCaviar method for coloc | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; LD table (w PICS stats); credible set table (QTLs only)
pqtl_pics_clpp_max_nbh | pics coloc | Gives "neighbourhood" information about the max pqtl_pics_clpp_max value across all genes at the locus | pqtl_pics_clpp_max - (max pics_clpp across any pQTL gene at locus). This is in log scale. | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; LD table (w PICS stats); credible set table (QTLs only)
pqtl_pics_clpp_max_neglogp | pics coloc | Gives an indication of the strength of the pQTL signal with the highest PICS colocalisation evidence | -log10(p-value) of the pQTL selected as pqtl_pics_clpp_max | | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | top loci; colocalisation; credible set
vep_credset_max | vep | VEP evidence without downweighting by PP (which biologically doesn't make as much sense) | Max VEP score across all tag variants in 95% credible set for each (study, locus, gene) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; v2g
vep_credset_max_nbh | vep | "Neighbourhood" info about max VEP score at locus | vep_credset_max / (max vep score across 95% credset for all genes at locus) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; v2g
vep_ave | vep | VEP evidence weighted by PP will give high confidence for well fine-mapped regions | Mean VEP score weighted by posterior probabilities across all tags with PP > 0.001 | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; v2g
vep_ave_nbh | vep | "Neighbourhood" info about weighted-mean VEP score at locus | vep_ave / (max mean-weighted VEP score across tag variants for any gene at this locus) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; v2g
dist_tss_min | distance tss | Distance to tss information (unweighted) | Minimum log(distance + 1) to gene TSS across all tag variants in 95% credset for each (study, locus, gene) | | float | 0, log10(500001) | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; gene table
dist_tss_min_nbh | distance tss | "Neighbourhood" info about tss min distance | log(min tss distance for any gene across 95% credible set) - dist_tss_min | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; gene table
dist_tss_ave | distance tss | Average distance to tss information (weighted) | log(average tss distance weighted by posterior probability) | | float | 0, log10(500001) | study_id, chrom, pos, ref, alt, gene_id |
dist_tss_ave_nbh | distance tss | Neighbourhood info about average tss distance | min(log(average distance) across any gene at locus) - dist_tss_ave | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id |
dist_foot_min | distance footprint | Distance to gene footprint information (unweighted) | Minimum log(distance + 1) to gene footprint across all tag variants in 95% credset for each (study, locus, gene) | | float | 0, log10(500001) | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; gene table
dist_foot_min_nbh | distance footprint | "Neighbourhood" info about gene footprint min distance | log(min gene footprint distance for any gene across 95% credible set) - dist_foot_min | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; gene table
dist_foot_ave | distance footprint | Average distance to footprint information (weighted) | log(average footprint distance weighted by posterior probability) | | float | 0, log10(500001) | study_id, chrom, pos, ref, alt, gene_id |
dist_foot_ave_nbh | distance footprint | Neighbourhood info about average footprint distance | min(log(average distance) across any gene at locus) - dist_foot_ave | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id |
pchic_max | pchic | PCHiC evidence without weighting | Max CHICAGO score across all tags in 95% credible set for each (study, locus, gene) | | float | 5, Inf | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
pchic_max_nbh | pchic | "Neighbourhood" info about PCHiC evidence without weighting | pchic_max / (Max CHICAGO score across all tags in 95% credible set across all genes) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
pchic_ave | pchic | PCHiC evidence weighted by credible set PP | Mean CHICAGO score weighted by PP for all tags w PP > 0.001 for each (study, locus, gene) | | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
pchic_ave_nbh | pchic | "Neighbourhood" info about PCHiC evidence weighted by credible set PP | pchic_ave / (Mean CHICAGO score weighted by PP for all tags w PP > 0.001 across all genes) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
enhc_tss_max | enhancer-tss correlation | enhc_tss evidence without weighting | Max correlation (r) score across all tags in 95% credible set for each (study, locus, gene) | | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
enhc_tss_max_nbh | enhancer-tss correlation | "Neighbourhood" info about enhc_tss evidence without weighting | enhc_tss_max / (Max correlation (r) score across all tags in 95% credible set across all genes) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
enhc_tss_ave | enhancer-tss correlation | enhc_tss evidence weighted by credible set PP | Mean correlation (r) score weighted by PP for all tags w PP > 0.001 for each (study, locus, gene) | | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
enhc_tss_ave_nbh | enhancer-tss correlation | "Neighbourhood" info about enhc_tss evidence weighted by credible set PP | enhc_tss_ave / (Mean correlation (r) score weighted by PP for all tags w PP > 0.001 across all genes) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
dhs_prmtr_max | DHS-promoter correlation | dhs_prmtr evidence without weighting | Max correlation (r) score across all tags in 95% credible set for each (study, locus, gene) | | float | 0.7, 1 | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
dhs_prmtr_max_nbh | DHS-promoter correlation | "Neighbourhood" info about dhs_prmtr evidence without weighting | dhs_prmtr_max / (Max correlation (r) score across all tags in 95% credible set across all genes) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
dhs_prmtr_ave | DHS-promoter correlation | dhs_prmtr evidence weighted by credible set PP | Mean correlation (r) score weighted by PP for all tags w PP > 0.001 for each (study, locus, gene) | | float | 0, Inf | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
dhs_prmtr_ave_nbh | DHS-promoter correlation | "Neighbourhood" info about dhs_prmtr evidence weighted by credible set PP | dhs_prmtr_ave / (Mean correlation (r) score weighted by PP for all tags w PP > 0.001 across all genes) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | combined credsets; v2g
dist_tss_sentinel | distance tss | Needed to compare performance against basic distance | log(distance + 1) from the sentinel variant to the gene tss | | float | 0, log10(500001) | study_id, chrom, pos, ref, alt, gene_id | top loci; gene table
dist_tss_sentinel_nbh | distance tss | Give neighbourhood information about dist_tss_sentinel | log(min dist_tss_sentinel for any gene at locus) - dist_tss_sentinel | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; gene table
dist_foot_sentinel | distance footprint | Needed to compare performance against basic distance | log(distance + 1) from the sentinel variant to the gene footprint | | float | 0, log10(500001) | study_id, chrom, pos, ref, alt, gene_id | top loci; gene table
dist_foot_sentinel_nbh | distance footprint | Give neighbourhood information about dist_foot_sentinel | log(min dist_foot_sentinel for any gene at locus) - dist_foot_sentinel | | float | -Inf, 0 | study_id, chrom, pos, ref, alt, gene_id | top loci; gene table
count_credset_95 | other | May allow the model to adjust confidence for well / poorly fine-mapped regions | Total number of tag variants in the 95% credible set | | int | 0, Inf | study_id, chrom, pos, ref, alt | combined credsets
polyphen_credset_max | coding (polyphen) | polyphen evidence without downweighting by PP (which biologically doesn't make as much sense) | Max polyphen score across all tag variants in 95% credible set for each (study, locus, gene) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; polyphen
polyphen_credset_max_nbh | coding (polyphen) | "Neighbourhood" info about max polyphen score at locus | polyphen_credset_max / (max polyphen score across 95% credset for all genes at locus) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; polyphen
polyphen_ave | coding (polyphen) | polyphen evidence weighted by PP will give high confidence for well fine-mapped regions | Mean polyphen score weighted by posterior probabilities across all tags with PP > 0.001 | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; polyphen
polyphen_ave_nbh | coding (polyphen) | "Neighbourhood" info about weighted-mean polyphen score at locus | polyphen_ave / (max mean-weighted polyphen score across tag variants for any gene at this locus) | | float | 0, 1 | study_id, chrom, pos, ref, alt, gene_id | top loci; combined credsets; polyphen
gene_count_lte_50k | other | Allows model to learn gene density | Gene count within 50kb of sentinel variant | | int | | chrom, pos, ref, alt | gene dictionary; top loci
gene_count_lte_100k | other | Allows model to learn gene density | Gene count within 100kb of sentinel variant | | int | | chrom, pos, ref, alt | gene dictionary; top loci
gene_count_lte_250k | other | Allows model to learn gene density | Gene count within 250kb of sentinel variant | | int | | chrom, pos, ref, alt | gene dictionary; top loci
gene_count_lte_500k | other | Allows model to learn gene density | Gene count within 500kb of sentinel variant | | int | | chrom, pos, ref, alt | gene dictionary; top loci
proteinAttenuation | other | Genes with high protein attenuation may be less likely to be causal | Protein attenuation compared to cDNA as described here (supplementary table 2): https://www.mcponline.org/content/early/2019/06/25/mcp.RA118.001280.abstractfloat | | float | | gene_id |
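
The `_max`, `_ave` and ratio-style `_nbh` definitions shared by the vep, polyphen, pchic, enhc_tss and dhs_prmtr rows can be illustrated with a minimal sketch. The input layout (`pp`, `in_95`, `scores`) and the function name are hypothetical, not taken from the pipeline. The sketch also assumes that a tag with no score for a gene contributes 0 and that the weighted mean is normalised by the summed PP; the table does not state either choice.

```python
PP_MIN = 0.001  # tags with PP <= 0.001 are ignored for the weighted mean


def credset_features(tags):
    """Aggregate per-tag scores into per-gene max, ave and *_nbh values.

    `tags` holds one dict per tag variant at a single (study, locus):
        {"pp": float, "in_95": bool, "scores": {gene_id: score}}
    This layout is hypothetical; genes missing from "scores" count as 0 (assumption).
    """
    genes = sorted({g for t in tags for g in t["scores"]})
    feats = {}
    for g in genes:
        # *_max: highest score over tags inside the 95% credible set.
        in_set = [t["scores"].get(g, 0.0) for t in tags if t["in_95"]]
        # *_ave: PP-weighted mean over tags with PP > 0.001 (normalisation assumed).
        weighted = [(t["pp"], t["scores"].get(g, 0.0)) for t in tags if t["pp"] > PP_MIN]
        pp_total = sum(pp for pp, _ in weighted)
        feats[g] = {
            "max": max(in_set, default=0.0),
            "ave": sum(pp * s for pp, s in weighted) / pp_total if pp_total else 0.0,
        }
    # *_nbh: this gene's value divided by the best value for any gene at the locus.
    for key in ("max", "ave"):
        best = max((f[key] for f in feats.values()), default=0.0)
        for f in feats.values():
            f[key + "_nbh"] = f[key] / best if best > 0 else 0.0
    return feats


example_tags = [
    {"pp": 0.60, "in_95": True, "scores": {"GENE_A": 1.0, "GENE_B": 0.1}},
    {"pp": 0.36, "in_95": True, "scores": {"GENE_B": 0.5}},
    {"pp": 0.04, "in_95": False, "scores": {"GENE_B": 0.66}},
]
feats = credset_features(example_tags)
for gene, values in feats.items():
    print(gene, values)
```

Dividing by the locus-wide best keeps the `_nbh` values in the 0 to 1 range listed in the Limits column, with the top-scoring gene at the locus receiving 1.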
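
The difference-style `_nbh` features (coloc log likelihood ratios and log distances) subtract on a log scale rather than dividing. The sketch below is one possible reading of those rows, not the pipeline's implementation: the function names and input shapes are hypothetical, log2 is assumed for the likelihood ratio (the table only says the `_nbh` value is on log2 scale), and log10 is assumed for distances because the Limits column uses log10(500001).

```python
import math


def coloc_llr_features(rows):
    """Return {gene_id: (llr_max, llr_max_nbh)} for one (study, locus).

    `rows` is a non-empty list of (gene_id, h3, h4) tuples, one per QTL dataset.
    h3 and h4 are assumed to be strictly positive coloc posteriors.
    """
    llr_max = {}
    for gene, h3, h4 in rows:
        llr = math.log2(h4 / h3)  # log2 is an assumption, see lead-in
        llr_max[gene] = max(llr, llr_max.get(gene, -math.inf))
    locus_best = max(llr_max.values())
    # The best gene at the locus gets 0; every other gene gets a negative value.
    return {g: (v, v - locus_best) for g, v in llr_max.items()}


def dist_features(distances):
    """Return {gene_id: (log_dist, log_dist_nbh)} for one (study, locus).

    `distances` maps gene_id to a distance in base pairs (non-empty dict).
    """
    log_dist = {g: math.log10(d + 1) for g, d in distances.items()}
    nearest = min(log_dist.values())
    # The nearest gene gets 0; more distant genes get negative values.
    return {g: (v, nearest - v) for g, v in log_dist.items()}


coloc = coloc_llr_features([("G1", 0.1, 0.8), ("G1", 0.5, 0.25), ("G2", 0.4, 0.4)])
dist = dist_features({"G1": 999, "G2": 99999})
print(coloc)
print(dist)
```

Both helpers return values whose `_nbh` component lies in the -Inf to 0 range given in the Limits column.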