------------------------------------------------------
EnTAP Run Information - Execution
------------------------------------------------------
Current EnTAP Version: 0.10.4
------------------------------------------------------
Transcriptome Statistics
------------------------------------------------------
Protein sequences found
Total sequences: 32222
Total length of transcriptome(bp): 35721177
Average sequence length(bp): 1108.00
n50: 1419
n90: 576
Longest sequence(bp): 11307 (Hma1.2p1_1353F.1_g277680)
Shortest sequence(bp): 186 (Hma1.2p1_0237F.1_g099430)
------------------------------------------------------
Similarity Search - DIAMOND - complete
------------------------------------------------------
Search results:
    Total alignments: 539266
    Total unselected results: 512403
    Total unique transcripts with an alignment: 26863
    Total unique transcripts without an alignment: 5359
    Total unique informative alignments: 18713
    Total unique uninformative alignments: 8150
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)camellia sinensis: 12516(46.59%)
        2)olea europaea var. sylvestris: 1115(4.15%)
        3)coffea arabica: 989(3.68%)
        4)vitis vinifera: 963(3.58%)
        5)sesamum indicum: 861(3.21%)
        6)nicotiana tomentosiformis: 785(2.92%)
        7)quercus suber: 517(1.92%)
        8)quercus lobata: 479(1.78%)
        9)coffea eugenioides: 450(1.68%)
        10)cynara cardunculus var. scolymus: 366(1.36%)
------------------------------------------------------
Compiled Similarity Search - DIAMOND - Best Overall
------------------------------------------------------
    Total unique transcripts with an alignment: 26863
    Total unique transcripts without an alignment: 5359
    Total unique informative alignments: 18746
    Total unique uninformative alignments: 8117
    Total unique contaminants: 0(0.00%):
    Top 10 alignments by species:
        1)camellia sinensis: 12327(45.89%)
        2)olea europaea var. sylvestris: 1148(4.27%)
        3)coffea arabica: 1083(4.03%)
        4)vitis vinifera: 929(3.46%)
        5)sesamum indicum: 894(3.33%)
        6)nicotiana tomentosiformis: 777(2.89%)
        7)quercus suber: 492(1.83%)
        8)quercus lobata: 479(1.78%)
        9)coffea eugenioides: 388(1.44%)
        10)cynara cardunculus var. scolymus: 376(1.40%)
------------------------------------------------------
Gene Family - Gene Ontology and Pathway - EggNOG
------------------------------------------------------
Statistics for overall Eggnog results:
    Total unique sequences with family assignment: 31179
    Total unique sequences without family assignment: 1043
    Top 10 Taxonomic Scopes Assigned:
        1)Viridiplantae: 29640(95.06%)
        2)Eukaryotes: 1365(4.38%)
        3)Ancestor: 130(0.42%)
        4)Fungi: 18(0.06%)
        5)Bacteria: 10(0.03%)
        6)Arthropoda: 6(0.02%)
        7)Opisthokonts: 4(0.01%)
        8)Animals: 4(0.01%)
        9)Nematodes: 1(0.00%)
        10)Mammals: 1(0.00%)
    Total unique sequences with at least one GO term: 31177
    Total unique sequences without GO terms: 2
    Total GO terms assigned: 1456089
    Total cellular_component terms (lvl=1): 49875
    Total unique cellular_component terms (lvl=1): 15
    Top 10 cellular_component terms assigned (lvl=1):
        1)GO:0005623-cell(L=1): 17057(34.20%)
        2)GO:0043226-organelle(L=1): 13330(26.73%)
        3)GO:0016020-membrane(L=1): 8956(17.96%)
        4)GO:0032991-macromolecular complex(L=1): 3764(7.55%)
        5)GO:0030054-cell junction(L=1): 1713(3.43%)
        6)GO:0055044-symplast(L=1): 1700(3.41%)
        7)GO:0031974-membrane-enclosed lumen(L=1): 1617(3.24%)
        8)GO:0005576-extracellular region(L=1): 1584(3.18%)
        9)GO:0009295-nucleoid(L=1): 93(0.19%)
        10)GO:0045202-synapse(L=1): 27(0.05%)
    Total molecular_function terms (lvl=1): 35540
    Total unique molecular_function terms (lvl=1): 30
    Top 10 molecular_function terms assigned (lvl=1):
        1)GO:0005488-binding(L=1): 15132(42.58%)
        2)GO:0003824-catalytic activity(L=1): 13911(39.14%)
        3)GO:0001071-nucleic acid binding transcription factor activity(L=1): 1719(4.84%)
        4)GO:0005215-transporter activity(L=1): 1706(4.80%)
        5)GO:0005198-structural molecule activity(L=1): 904(2.54%)
        6)GO:0009055-electron carrier activity(L=1): 678(1.91%)
        7)GO:0060089-molecular transducer activity(L=1): 418(1.18%)
        8)GO:0004871-signal transducer activity(L=1): 418(1.18%)
        9)GO:0016209-antioxidant activity(L=1): 349(0.98%)
        10)GO:0000988-transcription factor activity, protein binding(L=1): 121(0.34%)
    Total overall terms (lvl=1): 165990
    Total unique overall terms (lvl=1): 170
    Top 10 overall terms assigned (lvl=1):
        1)GO:0008152-metabolic process(L=1): 17506(10.55%)
        2)GO:0005623-cell(L=1): 17057(10.28%)
        3)GO:0009987-cellular process(L=1): 16138(9.72%)
        4)GO:0005488-binding(L=1): 15132(9.12%)
        5)GO:0003824-catalytic activity(L=1): 13911(8.38%)
        6)GO:0043226-organelle(L=1): 13330(8.03%)
        7)GO:0044699-single-organism process(L=1): 9560(5.76%)
        8)GO:0016020-membrane(L=1): 8956(5.40%)
        9)GO:0050896-response to stimulus(L=1): 7768(4.68%)
        10)GO:0065007-biological regulation(L=1): 6738(4.06%)
    Total biological_process terms (lvl=1): 80575
    Total unique biological_process terms (lvl=1): 125
    Top 10 biological_process terms assigned (lvl=1):
        1)GO:0008152-metabolic process(L=1): 17506(21.73%)
        2)GO:0009987-cellular process(L=1): 16138(20.03%)
        3)GO:0044699-single-organism process(L=1): 9560(11.86%)
        4)GO:0050896-response to stimulus(L=1): 7768(9.64%)
        5)GO:0065007-biological regulation(L=1): 6738(8.36%)
        6)GO:0032502-developmental process(L=1): 3663(4.55%)
        7)GO:0051179-localization(L=1): 3647(4.53%)
        8)GO:0032501-multicellular organismal process(L=1): 3623(4.50%)
        9)GO:0071840-cellular component organization or biogenesis(L=1): 3247(4.03%)
        10)GO:0051704-multi-organism process(L=1): 2191(2.72%)
    Total unique sequences with at least one pathway (KEGG) assignment: 7583
    Total unique sequences without pathways (KEGG): 23596
    Total pathways (KEGG) assigned: 24980
------------------------------------------------------
Final Annotation Statistics
------------------------------------------------------
Total Sequences: 32222
Similarity Search
    Total unique sequences with an alignment: 26863
    Total unique sequences without an alignment: 5359
Gene Families
    Total unique sequences with family assignment: 31179
    Total unique sequences without family assignment: 1043
    Total unique sequences with at least one GO term: 25749
    Total unique sequences with at least one pathway (KEGG) assignment: 7519
Totals
    Total unique sequences annotated (similarity search alignments only): 58
    Total unique sequences annotated (gene family assignment only): 4374
    Total unique sequences annotated (gene family and/or similarity search): 31237
    Total unique sequences unannotated (gene family and/or similarity search): 985
EnTAP has completed!
Total runtime (minutes): 432
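
The totals reported in this log can be cross-checked against one another. The following is a minimal sketch, not part of EnTAP's output: the variable names are illustrative, the figures are copied by hand from the log, and it assumes that "gene family and/or similarity search" means the union of the two annotation sets.

```python
# Figures copied by hand from the EnTAP log; names are illustrative only.
total_sequences = 32222
with_alignment, without_alignment = 26863, 5359
informative, uninformative = 18713, 8150
with_family, without_family = 31179, 1043
similarity_only, family_only = 58, 4374
annotated, unannotated = 31237, 985

# Assumption: "annotated" is the union of both sets, so the remainder
# after removing the two "only" groups is the overlap.
both = annotated - similarity_only - family_only

checks = {
    "alignment split sums to total": with_alignment + without_alignment == total_sequences,
    "informative split sums to aligned": informative + uninformative == with_alignment,
    "family split sums to total": with_family + without_family == total_sequences,
    "annotated split sums to total": annotated + unannotated == total_sequences,
    "overlap consistent with alignments": both + similarity_only == with_alignment,
    "overlap consistent with families": both + family_only == with_family,
    # Top species share is reported relative to transcripts with an alignment.
    "camellia sinensis share": abs(100 * 12516 / with_alignment - 46.59) < 0.005,
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

The same pattern can be extended to the other percentage lists by dividing each count by the section total it is reported against.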