In article <7u5hg0$g1l$1 at news.tamu.edu>, Mei <hmpeng at ppserver.tamu.edu> wrote:
>I am interested in assembling some of the EST sequences that I have downloaded
>from Entrez. So far, I have been using the csplit command in Unix, then a Perl
>script to rename the files, and finally a shell script to generate fake phd
>files for Consed. This approach works well if I have fewer than 100
>sequences, because csplit can only split into up to 99 files. I'd like to know how
>to split and rename the FASTA file according to the gi numbers in the
>definition lines when I have a large number of sequences to assemble. A hint
>on how to write a Perl script for this purpose would be greatly appreciated.
Incidentally, the GNU version of split does not have these limitations.
It's part of the GNU textutils package, I think.
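
As for splitting and naming by gi number in one pass, here is a minimal
sketch of the idea, written in Python rather than the Perl you asked for
(the same loop ports to Perl almost line for line). It assumes the
definition lines look like ">gi|12345|gb|..." as Entrez FASTA output
usually did; the function name split_fasta_by_gi, the ".fasta" suffix and
the "unknown_N" fallback name are my own choices, not anything Consed
requires.

```python
import os
import re

# Assumption: definition lines start with ">gi|<digits>|".
GI_PATTERN = re.compile(r"^>gi\|(\d+)")


def split_fasta_by_gi(fasta_path, out_dir="."):
    """Write each record of a multi-FASTA file to <out_dir>/<gi>.fasta.

    Returns the list of paths written, in input order.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    out = None
    unnamed = 0
    with open(fasta_path) as fasta:
        for line in fasta:
            if line.startswith(">"):
                # A new record starts: close the previous output file.
                if out is not None:
                    out.close()
                match = GI_PATTERN.match(line)
                if match:
                    name = match.group(1)
                else:
                    # No gi number found: fall back to a counter-based name.
                    unnamed += 1
                    name = "unknown_%d" % unnamed
                path = os.path.join(out_dir, name + ".fasta")
                out = open(path, "w")
                written.append(path)
            # Lines before the first ">" are skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return written
```

Calling split_fasta_by_gi("ests.fasta", "split") would leave one file per
sequence in the "split" directory, with no limit on the number of files.
Note that if the same gi number appears twice, the later record overwrites
the earlier file.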