------------------------------------------------------
EnTAP Run Information - Execution
------------------------------------------------------
Current EnTAP Version: 0.10.4

------------------------------------------------------
Transcriptome Statistics
------------------------------------------------------
Protein sequences found
Total sequences: 73013
Total length of transcriptome(bp): 98232639
Average sequence length(bp): 1345.00
n50: 1659
n90: 726
Longest sequence(bp): 16335 (Potri.017G039100.1|PACid:26983338)
Shortest sequence(bp): 138 (Potri.006G004700.3|PACid:27006951)

------------------------------------------------------
Similarity Search - DIAMOND - complete
------------------------------------------------------
Search results:
    Total alignments: 605223
    Total unselected results: 539536
    Total unique transcripts with an alignment: 65687
    Total unique transcripts without an alignment: 7326
    Total unique informative alignments: 50255
    Total unique uninformative alignments: 15432
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)populus trichocarpa: 60745(92.48%)
        2)populus euphratica: 2174(3.31%)
        3)hevea brasiliensis: 298(0.45%)
        4)jatropha curcas: 135(0.21%)
        5)manihot esculenta: 132(0.20%)
        6)quercus suber: 117(0.18%)
        7)ricinus communis: 97(0.15%)
        8)pistacia vera: 96(0.15%)
        9)camellia sinensis: 96(0.15%)
        10)quercus lobata: 94(0.14%)

------------------------------------------------------
Compiled Similarity Search - DIAMOND - Best Overall
------------------------------------------------------
    Total unique transcripts with an alignment: 65687
    Total unique transcripts without an alignment: 7326
    Total unique informative alignments: 50403
    Total unique uninformative alignments: 15284
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)populus trichocarpa: 60864(92.66%)
        2)populus euphratica: 2056(3.13%)
        3)hevea brasiliensis: 299(0.46%)
        4)jatropha curcas: 134(0.20%)
        5)manihot esculenta: 130(0.20%)
        6)quercus suber: 120(0.18%)
        7)ricinus communis: 99(0.15%)
        8)pistacia vera: 96(0.15%)
        9)camellia sinensis: 96(0.15%)
        10)quercus lobata: 89(0.14%)

------------------------------------------------------
Gene Family - Gene Ontology and Pathway - EggNOG
------------------------------------------------------
Statistics for overall Eggnog results:
    Total unique sequences with family assignment: 68488
    Total unique sequences without family assignment: 4525
    Top 10 Taxonomic Scopes Assigned:
        1)Viridiplantae: 61868(90.33%)
        2)Eukaryotes: 5850(8.54%)
        3)Ancestor: 761(1.11%)
        4)Bacteria: 5(0.01%)
        5)Archaea: 3(0.00%)
        6)Animals: 1(0.00%)
    Total unique sequences with at least one GO term: 68484
    Total unique sequences without GO terms: 4
    Total GO terms assigned: 3723177
    Total molecular_function terms (lvl=1): 82834
    Total unique molecular_function terms (lvl=1): 35
    Top 10 molecular_function terms assigned (lvl=1):
        1)GO:0005488-binding(L=1): 34335(41.45%)
        2)GO:0003824-catalytic activity(L=1): 31948(38.57%)
        3)GO:0005215-transporter activity(L=1): 4357(5.26%)
        4)GO:0001071-nucleic acid binding transcription factor activity(L=1): 4290(5.18%)
        5)GO:0005198-structural molecule activity(L=1): 2167(2.62%)
        6)GO:0009055-electron carrier activity(L=1): 1473(1.78%)
        7)GO:0060089-molecular transducer activity(L=1): 1360(1.64%)
        8)GO:0004871-signal transducer activity(L=1): 1360(1.64%)
        9)GO:0016209-antioxidant activity(L=1): 716(0.86%)
        10)GO:0000988-transcription factor activity, protein binding(L=1): 382(0.46%)
    Total cellular_component terms (lvl=1): 120090
    Total unique cellular_component terms (lvl=1): 15
    Top 10 cellular_component terms assigned (lvl=1):
        1)GO:0005623-cell(L=1): 40512(33.73%)
        2)GO:0043226-organelle(L=1): 32092(26.72%)
        3)GO:0016020-membrane(L=1): 21820(18.17%)
        4)GO:0032991-macromolecular complex(L=1): 9206(7.67%)
        5)GO:0030054-cell junction(L=1): 4128(3.44%)
        6)GO:0031974-membrane-enclosed lumen(L=1): 4086(3.40%)
        7)GO:0055044-symplast(L=1): 3992(3.32%)
        8)GO:0005576-extracellular region(L=1): 3775(3.14%)
        9)GO:0009295-nucleoid(L=1): 201(0.17%)
        10)GO:0045202-synapse(L=1): 148(0.12%)
    Total overall terms (lvl=1): 402367
    Total unique overall terms (lvl=1): 264
    Top 10 overall terms assigned (lvl=1):
        1)GO:0005623-cell(L=1): 40512(10.07%)
        2)GO:0008152-metabolic process(L=1): 40397(10.04%)
        3)GO:0009987-cellular process(L=1): 37938(9.43%)
        4)GO:0005488-binding(L=1): 34335(8.53%)
        5)GO:0043226-organelle(L=1): 32092(7.98%)
        6)GO:0003824-catalytic activity(L=1): 31948(7.94%)
        7)GO:0044699-single-organism process(L=1): 23197(5.77%)
        8)GO:0016020-membrane(L=1): 21820(5.42%)
        9)GO:0050896-response to stimulus(L=1): 19190(4.77%)
        10)GO:0065007-biological regulation(L=1): 17103(4.25%)
    Total biological_process terms (lvl=1): 199443
    Total unique biological_process terms (lvl=1): 214
    Top 10 biological_process terms assigned (lvl=1):
        1)GO:0008152-metabolic process(L=1): 40397(20.25%)
        2)GO:0009987-cellular process(L=1): 37938(19.02%)
        3)GO:0044699-single-organism process(L=1): 23197(11.63%)
        4)GO:0050896-response to stimulus(L=1): 19190(9.62%)
        5)GO:0065007-biological regulation(L=1): 17103(8.58%)
        6)GO:0032502-developmental process(L=1): 9815(4.92%)
        7)GO:0032501-multicellular organismal process(L=1): 9789(4.91%)
        8)GO:0051179-localization(L=1): 9517(4.77%)
        9)GO:0071840-cellular component organization or biogenesis(L=1): 8292(4.16%)
        10)GO:0000003-reproduction(L=1): 5673(2.84%)
    Total unique sequences with at least one pathway (KEGG) assignment: 17995
    Total unique sequences without pathways (KEGG): 50493
    Total pathways (KEGG) assigned: 62847

------------------------------------------------------
Final Annotation Statistics
------------------------------------------------------
Total Sequences: 73013
Similarity Search
    Total unique sequences with an alignment: 65687
    Total unique sequences without an alignment: 7326
Gene Families
    Total unique sequences with family assignment: 68488
    Total unique sequences without family assignment: 4525
    Total unique sequences with at least one GO term: 57177
    Total unique sequences with at least one pathway (KEGG) assignment: 17859
Totals
    Total unique sequences annotated (similarity search alignments only): 520
    Total unique sequences annotated (gene family assignment only): 3321
    Total unique sequences annotated (gene family and/or similarity search): 69008
    Total unique sequences unannotated (gene family and/or similarity search): 4005