------------------------------------------------------
EnTAP Run Information - Execution
------------------------------------------------------
Current EnTAP Version: 0.10.4

------------------------------------------------------
Transcriptome Statistics
------------------------------------------------------
Protein sequences found
Total sequences: 34015
Total length of transcriptome(bp): 44065911
Average sequence length(bp): 1295.00
n50: 1596
n90: 705
Longest sequence(bp): 16392 (Acc12792.1)
Shortest sequence(bp): 45 (Acc31635.1)

------------------------------------------------------
Similarity Search - DIAMOND - complete
------------------------------------------------------
Search results:
    Total alignments: 499342
    Total unselected results: 467656
    Total unique transcripts with an alignment: 31686
    Total unique transcripts without an alignment: 2329
    Total unique informative alignments: 24376
    Total unique uninformative alignments: 7310
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)camellia sinensis: 25199(79.53%)
        2)vitis vinifera: 606(1.91%)
        3)olea europaea var. sylvestris: 519(1.64%)
        4)sesamum indicum: 434(1.37%)
        5)coffea arabica: 358(1.13%)
        6)nicotiana tomentosiformis: 280(0.88%)
        7)quercus suber: 224(0.71%)
        8)quercus lobata: 219(0.69%)
        9)hevea brasiliensis: 183(0.58%)
        10)pistacia vera: 166(0.52%)

------------------------------------------------------
Compiled Similarity Search - DIAMOND - Best Overall
------------------------------------------------------
    Total unique transcripts with an alignment: 31686
    Total unique transcripts without an alignment: 2329
    Total unique informative alignments: 24386
    Total unique uninformative alignments: 7300
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)camellia sinensis: 25221(79.60%)
        2)vitis vinifera: 591(1.87%)
        3)olea europaea var. sylvestris: 534(1.69%)
        4)sesamum indicum: 389(1.23%)
        5)coffea arabica: 379(1.20%)
        6)nicotiana tomentosiformis: 279(0.88%)
        7)quercus lobata: 223(0.70%)
        8)quercus suber: 208(0.66%)
        9)hevea brasiliensis: 194(0.61%)
        10)pistacia vera: 165(0.52%)

------------------------------------------------------
Gene Family - Gene Ontology and Pathway - EggNOG
------------------------------------------------------
Statistics for overall Eggnog results:
    Total unique sequences with family assignment: 33035
    Total unique sequences without family assignment: 980
    Top 10 Taxonomic Scopes Assigned:
        1)Viridiplantae: 32245(97.61%)
        2)Eukaryotes: 713(2.16%)
        3)Ancestor: 76(0.23%)
        4)Nematodes: 1(0.00%)
    Total unique sequences with at least one GO term: 33032
    Total unique sequences without GO terms: 3
    Total GO terms assigned: 1548261

    Total molecular_function terms (lvl=1): 37040
    Total unique molecular_function terms (lvl=1): 26
    Top 10 molecular_function terms assigned (lvl=1):
        1)GO:0005488-binding(L=1): 15741(42.50%)
        2)GO:0003824-catalytic activity(L=1): 13906(37.54%)
        3)GO:0001071-nucleic acid binding transcription factor activity(L=1): 2348(6.34%)
        4)GO:0005215-transporter activity(L=1): 1874(5.06%)
        5)GO:0005198-structural molecule activity(L=1): 935(2.52%)
        6)GO:0009055-electron carrier activity(L=1): 672(1.81%)
        7)GO:0004871-signal transducer activity(L=1): 508(1.37%)
        8)GO:0060089-molecular transducer activity(L=1): 508(1.37%)
        9)GO:0016209-antioxidant activity(L=1): 291(0.79%)
        10)GO:0000988-transcription factor activity, protein binding(L=1): 113(0.31%)

    Total cellular_component terms (lvl=1): 55632
    Total unique cellular_component terms (lvl=1): 14
    Top 10 cellular_component terms assigned (lvl=1):
        1)GO:0005623-cell(L=1): 19183(34.48%)
        2)GO:0043226-organelle(L=1): 14943(26.86%)
        3)GO:0016020-membrane(L=1): 10213(18.36%)
        4)GO:0032991-macromolecular complex(L=1): 4050(7.28%)
        5)GO:0030054-cell junction(L=1): 1890(3.40%)
        6)GO:0055044-symplast(L=1): 1882(3.38%)
        7)GO:0031974-membrane-enclosed lumen(L=1): 1759(3.16%)
        8)GO:0005576-extracellular region(L=1): 1601(2.88%)
        9)GO:0009295-nucleoid(L=1): 58(0.10%)
        10)GO:0019012-virion(L=1): 29(0.05%)

    Total overall terms (lvl=1): 181234
    Total unique overall terms (lvl=1): 135
    Top 10 overall terms assigned (lvl=1):
        1)GO:0005623-cell(L=1): 19183(10.58%)
        2)GO:0008152-metabolic process(L=1): 18365(10.13%)
        3)GO:0009987-cellular process(L=1): 17526(9.67%)
        4)GO:0005488-binding(L=1): 15741(8.69%)
        5)GO:0043226-organelle(L=1): 14943(8.25%)
        6)GO:0003824-catalytic activity(L=1): 13906(7.67%)
        7)GO:0044699-single-organism process(L=1): 10790(5.95%)
        8)GO:0016020-membrane(L=1): 10213(5.64%)
        9)GO:0050896-response to stimulus(L=1): 8604(4.75%)
        10)GO:0065007-biological regulation(L=1): 7898(4.36%)

    Total biological_process terms (lvl=1): 88562
    Total unique biological_process terms (lvl=1): 95
    Top 10 biological_process terms assigned (lvl=1):
        1)GO:0008152-metabolic process(L=1): 18365(20.74%)
        2)GO:0009987-cellular process(L=1): 17526(19.79%)
        3)GO:0044699-single-organism process(L=1): 10790(12.18%)
        4)GO:0050896-response to stimulus(L=1): 8604(9.72%)
        5)GO:0065007-biological regulation(L=1): 7898(8.92%)
        6)GO:0032502-developmental process(L=1): 4302(4.86%)
        7)GO:0032501-multicellular organismal process(L=1): 4237(4.78%)
        8)GO:0051179-localization(L=1): 4125(4.66%)
        9)GO:0071840-cellular component organization or biogenesis(L=1): 3499(3.95%)
        10)GO:0023052-signaling(L=1): 2392(2.70%)

    Total unique sequences with at least one pathway (KEGG) assignment: 8044
    Total unique sequences without pathways (KEGG): 24991
    Total pathways (KEGG) assigned: 26207

------------------------------------------------------
Final Annotation Statistics
------------------------------------------------------
Total Sequences: 34015
Similarity Search
    Total unique sequences with an alignment: 31686
    Total unique sequences without an alignment: 2329
Gene Families
    Total unique sequences with family assignment: 33035
    Total unique sequences without family assignment: 980
    Total unique sequences with at least one GO term: 27537
    Total unique sequences with at least one pathway (KEGG) assignment: 7947
Totals
    Total unique sequences annotated (similarity search alignments only): 72
    Total unique sequences annotated (gene family assignment only): 1421
    Total unique sequences annotated (gene family and/or similarity search): 33107
    Total unique sequences unannotated (gene family and/or similarity search): 908