I am interested in assembling some EST sequences that I have downloaded
from Entrez. So far, I have been using the csplit command in Unix to split
the FASTA file, then a Perl script to rename the files, and finally a shell
script to generate fake phd files for Consed. This approach works well if I
have fewer than 100 sequences, because csplit only splits into up to 99
files. I'd like to know how to split and rename the FASTA file according to
the gi numbers in the definition lines when I have a large number of
sequences to assemble. A hint on how to write a Perl script for this
purpose would be greatly appreciated.
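
The splitting step described above could be sketched roughly as follows.
This is only a minimal sketch, written in Python rather than Perl, though
the same loop should carry over to Perl almost line by line. It assumes the
definition lines look like ">gi|<digits>|..." (the usual NCBI style); the
function name split_fasta_by_gi, the "<gi>.fasta" output naming, and the
"unnamed_N" fallback for records without a gi number are all hypothetical
choices made for illustration.

```python
import os
import re

# Assumed defline layout: ">gi|<digits>|..." (NCBI-style header).
GI_PATTERN = re.compile(r"^>gi\|(\d+)\|")


def split_fasta_by_gi(fasta_path, out_dir):
    """Write each FASTA record to out_dir/<gi>.fasta and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    out = None
    unnamed = 0  # counts records whose defline carries no gi number
    try:
        with open(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # A new record starts: close the previous output file.
                    if out is not None:
                        out.close()
                    match = GI_PATTERN.match(line)
                    if match:
                        name = match.group(1)
                    else:
                        unnamed += 1
                        name = "unnamed_%d" % unnamed
                    path = os.path.join(out_dir, name + ".fasta")
                    out = open(path, "w")
                    written.append(path)
                # Lines before the first defline are skipped.
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Calling split_fasta_by_gi("ests.fasta", "split") would then leave one file
per sequence in the "split" directory, with no limit tied to the number of
records. Note that two records sharing the same gi number would overwrite
each other in this sketch.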