------------------------------------------------------
EnTAP Run Information - Execution
------------------------------------------------------
Current EnTAP Version: 0.10.4
------------------------------------------------------
Transcriptome Statistics
------------------------------------------------------
Protein sequences found
Total sequences: 45116
Total length of transcriptome(bp): 52477251
Average sequence length(bp): 1163.00
n50: 1575
n90: 600
Longest sequence(bp): 18369 (MD17G1027400)
Shortest sequence(bp): 159 (MD00G1136400)
------------------------------------------------------
Similarity Search - DIAMOND - complete
------------------------------------------------------
Search results:
    Total alignments: 260506
    Total unselected results: 223636
    Total unique transcripts with an alignment: 36870
    Total unique transcripts without an alignment: 8246
    Total unique informative alignments: 26957
    Total unique uninformative alignments: 9913
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)malus domestica: 32264(87.51%)
        2)pyrus x bretschneideri: 2255(6.12%)
        3)prunus avium: 272(0.74%)
        4)prunus persica: 199(0.54%)
        5)prunus mume: 180(0.49%)
        6)rosa chinensis: 171(0.46%)
        7)ziziphus jujuba: 144(0.39%)
        8)fragaria vesca subsp. vesca: 92(0.25%)
        9)quercus suber: 84(0.23%)
        10)quercus lobata: 81(0.22%)
------------------------------------------------------
Compiled Similarity Search - DIAMOND - Best Overall
------------------------------------------------------
    Total unique transcripts with an alignment: 36870
    Total unique transcripts without an alignment: 8246
    Total unique informative alignments: 26997
    Total unique uninformative alignments: 9873
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)malus domestica: 32312(87.64%)
        2)pyrus x bretschneideri: 2212(6.00%)
        3)prunus avium: 265(0.72%)
        4)prunus persica: 203(0.55%)
        5)prunus mume: 177(0.48%)
        6)rosa chinensis: 173(0.47%)
        7)ziziphus jujuba: 142(0.39%)
        8)fragaria vesca subsp. vesca: 91(0.25%)
        9)quercus suber: 86(0.23%)
        10)quercus lobata: 81(0.22%)
------------------------------------------------------
Gene Family - Gene Ontology and Pathway - EggNOG
------------------------------------------------------
Statistics for overall Eggnog results:
    Total unique sequences with family assignment: 40593
    Total unique sequences without family assignment: 4523
    Top 10 Taxonomic Scopes Assigned:
        1)Viridiplantae: 38967(95.99%)
        2)Eukaryotes: 1338(3.30%)
        3)Ancestor: 265(0.65%)
        4)Arthropoda: 11(0.03%)
        5)Animals: 6(0.01%)
        6)Fungi: 4(0.01%)
        7)Bacteria: 2(0.00%)
    Total unique sequences with at least one GO term: 40584
    Total unique sequences without GO terms: 9
    Total GO terms assigned: 1917035
    Total molecular_function terms (lvl=1): 47083
    Total unique molecular_function terms (lvl=1): 31
    Top 10 molecular_function terms assigned (lvl=1):
        1)GO:0005488-binding(L=1): 19296(40.98%)
        2)GO:0003824-catalytic activity(L=1): 18745(39.81%)
        3)GO:0005215-transporter activity(L=1): 2485(5.28%)
        4)GO:0001071-nucleic acid binding transcription factor activity(L=1): 2301(4.89%)
        5)GO:0005198-structural molecule activity(L=1): 1158(2.46%)
        6)GO:0009055-electron carrier activity(L=1): 934(1.98%)
        7)GO:0060089-molecular transducer activity(L=1): 692(1.47%)
        8)GO:0004871-signal transducer activity(L=1): 692(1.47%)
        9)GO:0016209-antioxidant activity(L=1): 415(0.88%)
        10)GO:0000988-transcription factor activity, protein binding(L=1): 148(0.31%)
    Total cellular_component terms (lvl=1): 66348
    Total unique cellular_component terms (lvl=1): 15
    Top 10 cellular_component terms assigned (lvl=1):
        1)GO:0005623-cell(L=1): 22762(34.31%)
        2)GO:0043226-organelle(L=1): 17480(26.35%)
        3)GO:0016020-membrane(L=1): 12510(18.86%)
        4)GO:0032991-macromolecular complex(L=1): 4751(7.16%)
        5)GO:0030054-cell junction(L=1): 2264(3.41%)
        6)GO:0055044-symplast(L=1): 2249(3.39%)
        7)GO:0005576-extracellular region(L=1): 2122(3.20%)
        8)GO:0031974-membrane-enclosed lumen(L=1): 2042(3.08%)
        9)GO:0009295-nucleoid(L=1): 89(0.13%)
        10)GO:0019012-virion(L=1): 34(0.05%)
    Total overall terms (lvl=1): 222809
    Total unique overall terms (lvl=1): 194
    Top 10 overall terms assigned (lvl=1):
        1)GO:0008152-metabolic process(L=1): 23459(10.53%)
        2)GO:0005623-cell(L=1): 22762(10.22%)
        3)GO:0009987-cellular process(L=1): 21619(9.70%)
        4)GO:0005488-binding(L=1): 19296(8.66%)
        5)GO:0003824-catalytic activity(L=1): 18745(8.41%)
        6)GO:0043226-organelle(L=1): 17480(7.85%)
        7)GO:0044699-single-organism process(L=1): 13119(5.89%)
        8)GO:0016020-membrane(L=1): 12510(5.61%)
        9)GO:0050896-response to stimulus(L=1): 10731(4.82%)
        10)GO:0065007-biological regulation(L=1): 9230(4.14%)
    Total biological_process terms (lvl=1): 109378
    Total unique biological_process terms (lvl=1): 148
    Top 10 biological_process terms assigned (lvl=1):
        1)GO:0008152-metabolic process(L=1): 23459(21.45%)
        2)GO:0009987-cellular process(L=1): 21619(19.77%)
        3)GO:0044699-single-organism process(L=1): 13119(11.99%)
        4)GO:0050896-response to stimulus(L=1): 10731(9.81%)
        5)GO:0065007-biological regulation(L=1): 9230(8.44%)
        6)GO:0051179-localization(L=1): 5169(4.73%)
        7)GO:0032501-multicellular organismal process(L=1): 4953(4.53%)
        8)GO:0032502-developmental process(L=1): 4913(4.49%)
        9)GO:0071840-cellular component organization or biogenesis(L=1): 4107(3.75%)
        10)GO:0023052-signaling(L=1): 3076(2.81%)
    Total unique sequences with at least one pathway (KEGG) assignment: 10080
    Total unique sequences without pathways (KEGG): 30513
    Total pathways (KEGG) assigned: 32802
------------------------------------------------------
Final Annotation Statistics
------------------------------------------------------
Total Sequences: 45116
Similarity Search
    Total unique sequences with an alignment: 36870
    Total unique sequences without an alignment: 8246
Gene Families
    Total unique sequences with family assignment: 40593
    Total unique sequences without family assignment: 4523
    Total unique sequences with at least one GO term: 33846
    Total unique sequences with at least one pathway (KEGG) assignment: 9985
Totals
    Total unique sequences annotated (similarity search alignments only): 485
    Total unique sequences annotated (gene family assignment only): 4208
    Total unique sequences annotated (gene family and/or similarity search): 41078
    Total unique sequences unannotated (gene family and/or similarity search): 4038