Dear arthropod scientists,
Publicly available RNA-Seq data sets from the NCBI Sequence Read Archive (SRA) are listed here:
http://arthropods.eugenes.org/EvidentialGene/arthropods/rnasets_srapublic/
These are suitable for assembly into complete species gene sets. Some of these
arthropod species lack public gene sets, or have only fragmented, low-quality gene models.
Assembling them into good-quality gene sets will be interesting and valuable, and
the EvidentialGene pipeline is now ready to do this.
Please see the subset lists, sra_rnaseq*.arpods,insects,flies.listan, where
arpods = arthropods other than insects, insects = insects other than Diptera,
and flies = dipterans (Drosophila, etc.).
Gene set assemblies for some of these species are in EvidentialGene/arthropods/
Ixodes scapularis (deer tick) is an assembly in progress. A few more good gene sets
from non-insect arthropods will help flesh out the arthropod gene taxonomy.
Potential collaborators, or biologists with an interest in species in these RNA sets,
please contact me. As I am out of funds just now, anyone who can pay for expert
gene construction from mRNA has priority.
Don Gilbert, gilbertd at indiana.edu, EvidentialGene project at euGenes.org
.....................
EvidentialGene: Gene-omes from mRNA-seq assembly overtake genome gene-predictions.
An existing dogma in genome projects, that the quality of a gene set depends on the
quality of the genome assembly, is no longer accurate. mRNA-seq assembly now does as well as
or better than genome-gene modelling.
Learn more about this at http://arthropods.eugenes.org/EvidentialGene/
Gene set completeness (percent alignment to reference species proteins)
for mRNA-genes and Genome-genes of select Arthropods
             mRNA      Genome
Clade        Evigene   Genes     Organisms
-------------------------------------------------------------------
Insects      75%       56%       beetles, whitefly, locust
Crustacea    77%       66%       waterfleas, black & white shrimps
Ticks        74%       57%       zebra-tick, spider-mite, deer-tick
-------------------------------------------------------------------
Ticks, adding recent Ixodes mRNA-seq assembly
Human genes found (n=16631)
geneset      hit%   alnh   alnt   Gene set method, species
................................................................
ixodes.evg   95.7   434    415    mRNA-assembly, deer tick
ztick.evg    91.4   416    380    mRNA-assembly, zebra tick
ixodes.gno   89.5   364    326    genome-predict, deer tick
tetur.gno    83.2   399    332    genome-predict, spider mite
................................................................
hit% = percent of ref genes found
alnh = alignment average, for hit genes
alnt = alignment average, for all ref genes
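
The three columns appear to be related arithmetically: if reference genes with no hit
are counted as zero alignment (an assumption, not stated in the table), then alnt should
be roughly alnh * hit% / 100. The short Python sketch below checks this against the four
rows of the table. The function names and the tolerance of 1.0 are illustrative only and
are not part of EvidentialGene.

```python
# Rows copied from the table above: (geneset, hit%, alnh, alnt).
ROWS = [
    ("ixodes.evg", 95.7, 434, 415),
    ("ztick.evg", 91.4, 416, 380),
    ("ixodes.gno", 89.5, 364, 326),
    ("tetur.gno", 83.2, 399, 332),
]


def expected_alnt(hit_pct, alnh):
    """Estimate the all-reference-gene average from the hit-gene average.

    Assumption (not stated in the source): reference genes without a hit
    contribute zero alignment to the overall average.
    """
    return alnh * hit_pct / 100.0


def check_rows(rows, tolerance=1.0):
    """Return (geneset, estimated alnt, within-tolerance flag) for each row."""
    results = []
    for name, hit_pct, alnh, alnt in rows:
        estimate = expected_alnt(hit_pct, alnh)
        results.append((name, round(estimate, 1), abs(estimate - alnt) <= tolerance))
    return results


if __name__ == "__main__":
    for name, estimate, ok in check_rows(ROWS):
        print(name, estimate, "consistent" if ok else "differs")
```

Under that assumption all four rows agree with the reported alnt to within rounding,
which suggests alnt can be read as alnh discounted by the fraction of reference genes missed.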