Extending Genomic Contiguous Sequence Using Databases


Software Required: DNAStar (SeqMan and EditSeq modules) and a web browser.


  1. Open the file containing the growing contiguous sequence (contig), which is a SeqMan Document.
  2. Double click on the contig name to open it. A horizontal array of overlapping reads should appear with a consensus sequence at the top. Resize the window so that all the reads can be seen. The names of the reads are listed vertically in another window.
  3. Use this URL (or others) to search for raw sequencing data ("reads"): http://dictygenome.bcm.tmc.edu/bd/dicty_blast.html. Paste the query sequence into the query box on the page.
  4. For the query sequence, use about 100 bp of sequence that is consistent with the other reads in the contig. If the web site you are using allows it, reduce the expect (E) value threshold to about 10^-5.
  5. A list of reads should appear. Some will be reads that are already in your contig. Hits labeled "contig" may seem new, but usually contain no new information.
  6. Retrieve a new read by double clicking on it. Select the sequence and copy it into a new EditSeq file. Use the name of the read for the name of the new file. It is convenient to save the read in the same folder as the growing contig.
  7. Return to SeqMan. Under the "Sequence" menu select "Add one..." and open the new read.
  8. If the new sequence shares enough identity with the growing contig, it will be incorporated. Otherwise, a new contig will be added.
  9. Edit the new read. Delete polylinker sequence if present. Adding gaps or deleting bases may help the read align. The new sequence may extend the contig or help resolve sequence ambiguities.
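
The bookkeeping in steps 4 to 6 can also be sketched in a few lines of Python. This is a minimal sketch and not part of DNAStar or the BLAST site. The function names, the (name, E-value) tuple format for hits, and the idea of transcribing the hit list by hand are all assumptions made for illustration. It picks roughly 100 bp from one end of the consensus as a query, keeps only hits at or below the E-value cutoff that are not already in the contig and are not labeled "contig", and formats a retrieved read as FASTA text.

```python
def terminal_query(consensus, length=100, end="right"):
    """Return up to `length` bases from one end of the consensus.

    Gap characters ('-') are removed first so the query contains
    only bases. `end` is "right" or "left" (hypothetical parameter).
    """
    bases = consensus.replace("-", "").upper()
    if end == "right":
        return bases[-length:]
    return bases[:length]


def select_new_reads(hits, reads_in_contig, max_evalue=1e-5):
    """Return names of hits that are worth retrieving.

    `hits` is assumed to be a list of (read_name, e_value) tuples
    copied from the search results; this format is an assumption.
    Hits are skipped when they exceed the E-value cutoff, are already
    in the contig, or carry "contig" in their name.
    """
    known = set(reads_in_contig)
    new_reads = []
    for name, evalue in hits:
        if evalue > max_evalue:
            continue
        if name in known or "contig" in name.lower():
            continue
        new_reads.append(name)
    return new_reads


def to_fasta(name, sequence, width=60):
    """Format one read as FASTA text, named after the read itself."""
    lines = [">" + name]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"
```

For example, with a consensus string copied out of the contig, `terminal_query(consensus)` gives the query to paste into the search page, and `select_new_reads(hits, names_already_in_contig)` lists the reads still worth adding. Taking the query from the end of the contig is a design choice aimed at extension; a query from an ambiguous interior region would instead look for reads that help resolve that region.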