------------------------------------------------------
EnTAP Run Information - Execution
------------------------------------------------------
Current EnTAP Version: 0.10.4

------------------------------------------------------
Transcriptome Statistics
------------------------------------------------------
Protein sequences found
Total sequences: 54935
Total length of transcriptome(bp): 65779896
Average sequence length(bp): 1197.00
n50: 1509
n90: 627
Longest sequence(bp): 14577 (Eucgr.J02836.1|PACid:18802728)
Shortest sequence(bp): 288 (Eucgr.F01998.1|PACid:18761587)

------------------------------------------------------
Similarity Search - DIAMOND - complete
------------------------------------------------------
Search results:
Total alignments: 266941
Total unselected results: 219918
Total unique transcripts with an alignment: 47023
Total unique transcripts without an alignment: 7912
Total unique informative alignments: 6642
Total unique uninformative alignments: 40381
Total unique contaminants: 0(0.00%):
Top 10 alignments by species:
    1)eucalyptus grandis: 38656(82.21%)
    2)syzygium oleosum: 3835(8.16%)
    3)rhodamnia argentea: 2177(4.63%)
    4)camellia sinensis: 180(0.38%)
    5)quercus suber: 149(0.32%)
    6)ziziphus jujuba: 103(0.22%)
    7)prosopis alba: 75(0.16%)
    8)populus trichocarpa: 73(0.16%)
    9)durio zibethinus: 73(0.16%)
    10)nicotiana tomentosiformis: 64(0.14%)

------------------------------------------------------
Compiled Similarity Search - DIAMOND - Best Overall
------------------------------------------------------
Total unique transcripts with an alignment: 47023
Total unique transcripts without an alignment: 7912
Total unique informative alignments: 6668
Total unique uninformative alignments: 40355
Total unique contaminants: 0(0.00%):
Top 10 alignments by species:
    1)eucalyptus grandis: 38635(82.16%)
    2)syzygium oleosum: 3899(8.29%)
    3)rhodamnia argentea: 2135(4.54%)
    4)camellia sinensis: 181(0.38%)
    5)quercus suber: 150(0.32%)
    6)ziziphus jujuba: 100(0.21%)
    7)populus trichocarpa: 80(0.17%)
    8)prosopis alba: 77(0.16%)
    9)durio zibethinus: 73(0.16%)
    10)nicotiana tomentosiformis: 64(0.14%)

------------------------------------------------------
Gene Family - Gene Ontology and Pathway - EggNOG
------------------------------------------------------
Statistics for overall Eggnog results:
Total unique sequences with family assignment: 49681
Total unique sequences without family assignment: 5254
Top 10 Taxonomic Scopes Assigned:
    1)Viridiplantae: 46977(94.56%)
    2)Eukaryotes: 2068(4.16%)
    3)Ancestor: 621(1.25%)
    4)Mammals: 6(0.01%)
    5)Bacteria: 5(0.01%)
    6)Animals: 2(0.00%)
    7)Fungi: 1(0.00%)
    8)Arthropoda: 1(0.00%)
Total unique sequences with at least one GO term: 49680
Total unique sequences without GO terms: 1
Total GO terms assigned: 2331598
Total molecular_function terms (lvl=1): 60708
Total unique molecular_function terms (lvl=1): 28
Top 10 molecular_function terms assigned (lvl=1):
    1)GO:0005488-binding(L=1): 25055(41.27%)
    2)GO:0003824-catalytic activity(L=1): 24844(40.92%)
    3)GO:0005215-transporter activity(L=1): 3138(5.17%)
    4)GO:0001071-nucleic acid binding transcription factor activity(L=1): 2490(4.10%)
    5)GO:0009055-electron carrier activity(L=1): 1315(2.17%)
    6)GO:0005198-structural molecule activity(L=1): 1067(1.76%)
    7)GO:0060089-molecular transducer activity(L=1): 948(1.56%)
    8)GO:0004871-signal transducer activity(L=1): 948(1.56%)
    9)GO:0016209-antioxidant activity(L=1): 482(0.79%)
    10)GO:0000988-transcription factor activity, protein binding(L=1): 159(0.26%)
Total cellular_component terms (lvl=1): 76634
Total unique cellular_component terms (lvl=1): 15
Top 10 cellular_component terms assigned (lvl=1):
    1)GO:0005623-cell(L=1): 26343(34.38%)
    2)GO:0043226-organelle(L=1): 19743(25.76%)
    3)GO:0016020-membrane(L=1): 15252(19.90%)
    4)GO:0032991-macromolecular complex(L=1): 4867(6.35%)
    5)GO:0030054-cell junction(L=1): 2759(3.60%)
    6)GO:0055044-symplast(L=1): 2739(3.57%)
    7)GO:0005576-extracellular region(L=1): 2736(3.57%)
    8)GO:0031974-membrane-enclosed lumen(L=1): 2009(2.62%)
    9)GO:0009295-nucleoid(L=1): 104(0.14%)
    10)GO:0019012-virion(L=1): 42(0.05%)
Total overall terms (lvl=1): 272619
Total unique overall terms (lvl=1): 175
Top 10 overall terms assigned (lvl=1):
    1)GO:0008152-metabolic process(L=1): 29645(10.87%)
    2)GO:0009987-cellular process(L=1): 26437(9.70%)
    3)GO:0005623-cell(L=1): 26343(9.66%)
    4)GO:0005488-binding(L=1): 25055(9.19%)
    5)GO:0003824-catalytic activity(L=1): 24844(9.11%)
    6)GO:0043226-organelle(L=1): 19743(7.24%)
    7)GO:0044699-single-organism process(L=1): 16180(5.94%)
    8)GO:0016020-membrane(L=1): 15252(5.59%)
    9)GO:0050896-response to stimulus(L=1): 14293(5.24%)
    10)GO:0065007-biological regulation(L=1): 11054(4.05%)
Total biological_process terms (lvl=1): 135277
Total unique biological_process terms (lvl=1): 132
Top 10 biological_process terms assigned (lvl=1):
    1)GO:0008152-metabolic process(L=1): 29645(21.91%)
    2)GO:0009987-cellular process(L=1): 26437(19.54%)
    3)GO:0044699-single-organism process(L=1): 16180(11.96%)
    4)GO:0050896-response to stimulus(L=1): 14293(10.57%)
    5)GO:0065007-biological regulation(L=1): 11054(8.17%)
    6)GO:0051179-localization(L=1): 6315(4.67%)
    7)GO:0032501-multicellular organismal process(L=1): 5720(4.23%)
    8)GO:0032502-developmental process(L=1): 5428(4.01%)
    9)GO:0023052-signaling(L=1): 4410(3.26%)
    10)GO:0071840-cellular component organization or biogenesis(L=1): 4336(3.21%)
Total unique sequences with at least one pathway (KEGG) assignment: 11469
Total unique sequences without pathways (KEGG): 38212
Total pathways (KEGG) assigned: 40477

------------------------------------------------------
Final Annotation Statistics
------------------------------------------------------
Total Sequences: 54935
Similarity Search
    Total unique sequences with an alignment: 47023
    Total unique sequences without an alignment: 7912
Gene Families
    Total unique sequences with family assignment: 49681
    Total unique sequences without family assignment: 5254
    Total unique sequences with at least one GO term: 41495
    Total unique sequences with at least one pathway (KEGG) assignment: 11399
Totals
    Total unique sequences annotated (similarity search alignments only): 415
    Total unique sequences annotated (gene family assignment only): 3073
    Total unique sequences annotated (gene family and/or similarity search): 50096
    Total unique sequences unannotated (gene family and/or similarity search): 4839