1
function | 1/9/2026 15:28:49 (auto-updated)
2
USE | Do NOT use | Notes | Example
3
terminase, small subunit | TerS | | Sisi_1
4
terminase | | If there are not two obvious large and small terminase genes in the same genome, just assign the function "terminase". | TM4_4
5
terminase, large subunit | TerL | | Sisi_2
6
terminase, large subunit (ATPase domain) | | Only applicable to Cluster AY genomes (8-21-18), AT genomes (2-28-2020), and DT genomes (7-4-20). AS genomes appear to have a gene 1 with some alignment to the large subunit, but it is unclear if the domains are intact. (10-21-19, 2-21-2020) Also applies to Cluster GD genomes. | Auxilium_gp2
7
terminase, large subunit (nuclease domain) | | Only applicable to Cluster AY genomes (8-21-18), AT genomes (2-28-2020), and DT genomes (7-4-20). AS genomes appear to have a gene 1 with some alignment to the large subunit, but it is unclear if the domains are intact. (10-21-19, 2-21-2020) Also applies to Cluster GD genomes. | Auxilium_gp3
8
small terminase-like protein | | | Toad24, gp58
9
DNA packaging ATPase protein | | for Tectiviridae only | Badulia_12
10
DNA terminal protein | | for podovirus only | PineapplePizza_gp4
11
12
portal protein | head to tail connector | | TM4_5
13
14
15
scaffolding protein | Scaffold | | D29_gp16
16
capsid maturation protease | We are no longer using "capsid morphogenesis protein". | Sometimes the CMP hits to ClpP proteases. If so, look for a serine-type endopeptidase activity. A significant hit to the CMP of D29 and L5 is sufficient evidence. | Langerak_gp4 and D29_gp15
17
major capsid protein | capsid | | Sisi_6
18
major capsid pentamer protein | | | Rosebush gp16 | experimental evidence | https://pubmed.ncbi.nlm.nih.gov/32182721/
19
major capsid hexamer protein | | | Rosebush gp15 | experimental evidence | https://pubmed.ncbi.nlm.nih.gov/32182721/
20
capsid decoration protein | head decoration protein | | Patience_gp29, Rosebush_gp17 | experimental evidence | https://pubmed.ncbi.nlm.nih.gov/32182721/
21
minor capsid protein | | | Patience_gp15, Myrna_gp98 | experimental evidence | https://pubmed.ncbi.nlm.nih.gov/32182721/
22
Hypothetical Protein | MuF-like minor capsid protein | If there is an HHPred alignment to gp15 of phage D29, see capsid maturation protease. (6/16/21)
23
capsid decoration protein, LamD-like | | Look for the several beta strands and 2 alpha helices of PDB 1C5E_A (lambda). | Turuncu_32 | https://www.nature.com/articles/nsb0300_230.pdf
24
25
Capsid maturation protease and VIP2-like ADP-ribosyltransferase toxin | | | Adolin_4
26
major capsid and protease fusion protein | | In this case, the scaffolding function is also part of the fusion, but we don't explicitly write it in the function name. | Cluster AN Arthrobacter phages, EE Microbacterium phages
27
head fiber protein | | | Briton15_18
28
head-to-tail adaptor | We are no longer calling "head-to-tail connector" or "head-to-tail connector complex protein" (3-6-19). | Must have an HHPred alignment to one of the following crystal structures: SPP1 15 (5A21 chain C or D in the macromolecular complex), HK97 gp6, or Bacillus protein yqbG. | GageAP_19 | Please see the portal and head-to-tail connector case study at the links provided. Note: SPP1 gp17 and 17.1 are NOT h-t connectors (they are the tail terminator and major tail subunit). | https://seaphages.org/meetings/33/ | 8/1/23. The prescribed hits (namely the hits to PDB 5A21) are not as likely to show up as they did when this function was added to the approved function list. You can now hit other head-to-tail adaptor calls that were based on this original data. I can include some example genes, but if there is a need for more let me know.
29
head-to-tail stopper | | Must have an HHPred alignment to one of the following crystal structures: SPP1 16 (5A21 chain E or F in the macromolecular complex) or Bacillus protein yqbH. | GageAP_20 | Please see the portal and head-to-tail connector case study at the links provided. Note: SPP1 gp17 and 17.1 are NOT h-t connectors (they are the tail terminator and major tail subunit). | https://seaphages.org/meetings/33/ | 8/1/23. The prescribed hits (namely the hits to PDB 5A21) are not as likely to show up as they did when this function was added to the approved function list. You can now hit other head-to-tail stopper calls that were based on this original data. I can include some example genes, but if there is a need for more let me know.
30
tail terminator | | Must have an HHPred alignment to one of the following: SPP1 17 (5A21 chain G in the macromolecular complex) or Lambda U (3FZ2, chains A through F). | GageAP_22 | Please see the portal and head-to-tail connector case study at the links provided. Note: SPP1 gp17 and 17.1 are NOT h-t connectors (they are the tail terminator and major tail subunit). | https://seaphages.org/meetings/33/ | 8/1/23. The prescribed hits (namely the hits to PDB 5A21) are not as likely to show up as they did when this function was added to the approved function list. You can now hit other tail terminator calls that were based on this original data. I can include some example genes, but if there is a need for more let me know.
31
tail assembly chaperone | Tail scaffolding protein | Evidence needed to call TAC: please see the Bioinformatics Guide for what evidence is needed. | TM4_15; 16 | https://seaphagesbioinformatics.helpdocsonline.com/article-54
32
Hypothetical Protein | tail assembly chaperone | Do not call TAC when there is NO evidence. | Cluster EA1
33
tape measure protein | Tape Measure, tmp, tapemeasure | | TM4_17
34
minor tail protein | tail fiber-like protein, collagen-like, glycine rich | If you have significant hits to either collagen-like or glycine-rich proteins, and they are in the syntenic region of minor tail proteins, you can call them minor tail proteins. | Sisi_15-18, Nebs_gp4
35
minor tail protein, D-ala-D-ala carboxypeptidase | | Must include "minor tail" as part of the functional assignment. | Sisi_19
36
tail sheath protein | | found in contractile-tailed phages | Alice_120
37
38
tailspike protein | | Tailspike has triple beta coils. Make sure you are matching the spike part of the protein and not the N-terminal tail tip binding domain. | Turuncu_23
39
tail needle protein | | Only assign to a podo- or myo-viridae genome. | No good example is available yet. | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2713385/ https://pubmed.ncbi.nlm.nih.gov/36292999/ https://pubmed.ncbi.nlm.nih.gov/29209037/
40
baseplate J protein | | | Alice_133
41
tail tube protein | | found in contractile-tailed phages | RosiePosie_19
42
baseplate wedge protein | | | RosiePosie_38
43
For the next 6 assignments: from the Bxb1 paper (Krista Freeman), https://pubmed.ncbi.nlm.nih.gov/40239650/
44
Dit | | | Violac_gp16
45
Dit/tail tip cage | | | Bxb1_gp23
46
tail tip cage | | | Violac_gp18
47
baseplate hub | | | Bxb1_gp25
48
tail spike | | | Bxb1_gp29
49
tail tube | major tail protein | | Bxb1_gp19
50
Hypothetical Protein | HK97_gp10 | The Hendrix lab (studiers of HK97 and capsid construction) never identified a function for this gene.
51
lysin A | LysA, endolysin A | Only appropriate for Mycobacteriophages, or for an Actino phage in which you can identify a lysin B. If you only have one gene to call a lysin or endolysin, do not add a domain. If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin". | Sisi_30
52
lysin A, protease M15 domain | | Some Gordonia phages have lysin A split into two genes. Make sure to label each domain. If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin, protease M15 domain".
53
lysin A, protease M23 domain | | If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin, protease M23 domain".
54
lysin A, protease C39 domain | | Some Gordonia phages have lysin A split into two genes. Make sure to label each domain. If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin, protease C39 domain".
55
lysin A, glycosyl hydrolase domain | | Some Gordonia phages have lysin A split into two genes. Make sure to label each domain. If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin, glycosyl hydrolase domain".
56
lysin A, L-Ala-D-Glu peptidase domain | | Do not assign domains unless you can find genes that house the other needed domains. If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin, L-Ala-D-Glu peptidase domain".
57
lysin A, N-acetylmuramoyl-L-alanine amidase domain | | Do not assign domains unless you can find genes that house the other needed domains. If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin, N-acetylmuramoyl-L-alanine amidase domain".
58
lysin A, protease domain | | Use if not obviously M15 or C39. Some actinobacteriophages (not yet seen in the mycobacteriophages) have lysin A split into two genes. Make sure to label each domain. Do not assign domains unless you can find genes that house the other needed domains. If not a Mycobacteriophage, must have a lysin B; otherwise it is "endolysin, protease domain".
59
peptidoglycan hydrolase | | a lysin-adjacent gene | GoldenEssence_XX (maybe 138)
60
lysin B | LysB, endolysin B | | Sisi_31
61
endolysin | | Do not use unless you have a lysin-like gene in an odd place. For a period of ~5 years, we changed the name lysin A to endolysin (because there was no lysin B). The current literature suggests that those proteins function like lysin A. More is coming, but not yet. In the meantime, if you hit a single gene that causes lysis in the cell, its function aligns with a lysin A. Therefore, the correct use of this term as of 12/15/2025 is when you have a gene with an endolysin function that does not match a previously described type (lysin A, lysin B). New data is coming: LysP008, LysC, and LysZ are in the literature; we just haven't documented them in our dataset.
62
serine hydrolase | | Lysin B has 2 domains: a peptidoglycan binding domain and a serine hydrolase domain. This protein only has the serine hydrolase domain. | DekHockey33_70
63
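The naming notes in the lysin A rows above can be read as a small decision rule. The sketch below is a hypothetical helper, not part of the approved function list: the function name and its parameters (`is_mycobacteriophage`, `has_lysin_b`, `domain`, `split_across_genes`) are assumptions made for illustration. It encodes only the notes in the lysin A rows and does not capture the 12/15/2025 guidance in the endolysin row.

```python
from typing import Optional


def lysin_label(
    is_mycobacteriophage: bool,
    has_lysin_b: bool,
    domain: Optional[str] = None,
    split_across_genes: bool = False,
) -> str:
    """Suggest a function name following the lysin A row notes (illustrative only)."""
    # "lysin A" applies to Mycobacteriophages, or to other phages with an
    # identifiable lysin B; otherwise the rows fall back to "endolysin".
    base = "lysin A" if (is_mycobacteriophage or has_lysin_b) else "endolysin"
    # A domain is appended only when the lysin is split into more than one gene
    # and the domain of this gene is known; a single lysin gene gets no domain.
    if domain and split_across_genes:
        return f"{base}, {domain} domain"
    return base


print(lysin_label(False, False, "protease M15", split_across_genes=True))
```

With these inputs the helper prints `endolysin, protease M15 domain`, which matches the fallback name given in the "lysin A, protease M15 domain" row.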
64
holin | | Evidence needed to call a holin can include (1) biochemical data, (2) sequence similarity to genes with biochemical data, (3) at least 2 transmembrane domains found and the gene being adjacent to the endolysins, (4) conserved domain hits, and the absence of additional transmembrane domains in the area. The literature suggests that some phages have more than one holin; for now, when we see multiple possibilities for a holin gene, let's call them membrane proteins. | D29_11
65
serine integrase | | | Bxb1_35
66
tyrosine integrase | tyrosine homologous recombinase | | Sisi_43
67
serine homologous recombinase
68
ParA-like dsDNA partitioning protein | | | RedRock_37
69
ParB-like dsDNA partitioning protein | | Do not label anything a ParB or having a ParB partitioning domain without the presence of a ParA partner in the genome. | RedRock_38 | https://pubmed.ncbi.nlm.nih.gov/27146086/
70
ParB-like nuclease domain | | does not have to have a ParA partner
71
ParB N-terminal-like domain | | ParB-like nuclease domain, does not have a ParA partner. | Yotsuba_48
72
methyltransferase with ParB N-terminal-like domain | | At present (7-5-22), this is an orpham. It is a big gene (because it contains the methyltransferase). Do not confuse with the hits to the DNA binding portion of the ParB. | Evaa_gp2
73
RepA-like replication initiator | | | Rachaly_36
74
immunity repressor | Repressor | likely to have an HHPred match to C1 protein in lambda | Sisi_45
75
Imm-like superinfection immunity protein | | This is not this phage's immunity repressor. This is a Pfam hit to T4's superinfection immunity protein; significance was an e-value of 10e-14. | Niza_72
76
hetero-immunity repressor | immunity repressor (Cluster A) | To be used when there is a second immunity repressor that is NOT associated with the phage's own immunity cassette (system). This assignment shares a pham with the Cluster A immunity repressor. However, these phages are not in Cluster A; so far we see members in Clusters C, K, and F. | LRRHood_44, SamScheppers_83, Rialto_43 | https://seaphages.org/forums/topic/5583/?page=1#post-10410
77
excise | Excisionase, Xis; only one per phage; check to see if it is Cro | Do not call a protein excise unless you can identify a tyrosine integrase and the immunity repressor in the phage. A more general "helix-turn-helix DNA binding protein" might be more appropriate if you can't distinguish otherwise. Excision occurs differently in a phage with a serine recombinase. Instead of excise (which it will not have), you may be able to find an RDF (recombination directionality factor) that does the same task. | D29_36
78
recombination directionality factor | RDF | | RedRock_58
79
Cro (control of repressor's operator) | | Do not call a protein Cro unless you can identify the integrase and the immunity repressor in the phage. Cro will be present with both the serine and tyrosine integrases. When you have multiple HTH hits in this region and cannot differentiate which one is the immunity repressor and which is the Cro, consider using the HTH designation. | Che9c_47
80
WhiB family transcription factor | | | Jasmine_32
81
antirepressor | | | Sisi_47
82
DnaE-like DNA polymerase III (alpha) | | | Spud_203
83
DNA polymerase I | | | Luchador_50
84
DNA polymerase III sliding clamp (Beta) | | | Corndog_84
85
DnaC-like helicase loader | | | Alice_189
86
helicase loader | | | Samman98_70
87
DNA helicase | ATP-dependent helicase | | Chah_54
88
DnaB-like dsDNA helicase | | | RedRock_68
89
DNA helicase/methylase | | | Auxilium_73
90
RepA-like helicase | | | Sour_52
91
DNA primase | | | Spud_199
92
DNA primase/helicase | | make sure it has both parts | Schubert_31
93
DNA primase/polymerase | | make sure it has both parts | Rosebush_54
94
DNA primase/polymerase/helicase | | make sure it has all three parts | GreenHearts_47
95
DNA topoisomerase
96
DnaQ-like (DNA polymerase III subunit) | | DnaQ is the exonuclease of Pol III (epsilon subunit) | Sisi_35
97
nucleotidyl transferase | | | Spud_3
98
FIC domain nucleotidyl transferase | | MUST contain HPFxxGNGR motif | Bradissa_34
99
polynucleotide kinase | pnk | | Spud_250
100
Lsr2-like DNA bridging protein | Lsr2 | | Omega_61
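The HPFxxGNGR requirement in the FIC domain nucleotidyl transferase row can be checked with a simple pattern search over the protein sequence. The sketch below assumes that each "x" in the motif stands for any single residue and that the sequence is given in one-letter amino acid code; the function name `has_fic_motif` and the sample sequences are hypothetical and are not taken from any listed gene.

```python
import re

# HPFxxGNGR, reading each "x" as any single residue (an assumption about the notation).
FIC_MOTIF = re.compile(r"HPF..GNGR")


def has_fic_motif(protein_sequence: str) -> bool:
    """Return True if the sequence contains the HPFxxGNGR motif anywhere."""
    # Upper-case the input so lower-case sequence exports are matched as well.
    return FIC_MOTIF.search(protein_sequence.upper()) is not None


# Made-up sequences for illustration only.
print(has_fic_motif("MKTHPFAEGNGRTL"))  # two residues between HPF and GNGR
print(has_fic_motif("MKTHPFAGNGRTL"))   # only one residue between HPF and GNGR
```

A match here only shows that the motif is present; it does not replace the database evidence the other rows call for.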