Building Suffix Trees using DNA sequences
I am a newbie in bioinformatics and I have been using Windows all my life.
If you do not mind, I hope you
can help me out. I am now working on a project on building suffix trees from DNA sequences.
This is my understanding of how to build a suffix tree from DNA sequences:
1) Go to the NCBI GenBank FTP site and download files that contain the
genomes in FASTA format
2) Set up a PC with a Linux OS that includes a C or C++ compiler, as most
suffix tree construction programs are written in C or C++
3) Run the program with the downloaded FASTA file as input and benchmark the
suffix tree construction (time and memory) using some benchmarking program
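
As a rough illustration of steps 1 to 3, the sketch below parses FASTA-formatted text, builds an index over the sequence, and times the construction. It is only a sketch under several assumptions: it is written in Python rather than C or C++, it builds a naive suffix trie (quadratic time and memory) instead of a true linear-time suffix tree, the tiny inline record stands in for a downloaded genome file, and the names parse_fasta, build_suffix_trie and contains are made up for illustration.

```python
import time


def parse_fasta(lines):
    """Parse FASTA-formatted lines into a dict {header: sequence}."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A '>' line starts a new record; the rest of the line is its name.
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def build_suffix_trie(text):
    """Naive suffix trie as nested dicts (NOT a linear-time suffix tree).

    Every suffix of text + '$' is inserted character by character,
    so this is only usable on short sequences.
    """
    text += "$"  # terminator so that no suffix is a prefix of another
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern occurs somewhere in the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


# Hypothetical toy record; a real run would pass an open file handle instead,
# e.g. parse_fasta(open("genome.fasta")), where the file name is an assumption.
example = [">toy_sequence", "GATTACA", "GATTACA"]
sequences = parse_fasta(example)

for name, seq in sequences.items():
    t0 = time.perf_counter()
    trie = build_suffix_trie(seq)
    elapsed = time.perf_counter() - t0
    print(f"{name}: {len(seq)} bases indexed in {elapsed:.6f} s")
    print("contains TTACAG:", contains(trie, "TTACAG"))
```

For whole genomes a construction like this would run out of memory very quickly, which is why a compiled linear-time implementation is normally used for real benchmarks.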
Is this roughly the correct procedure? Are there any websites or documents
that could serve as a guide on how to go about doing it?