CROSS
a graphic BLAST cruncher

Summary

The BLAST series of programs (Altschul et al.) was designed to compare nucleotide and peptide sequences with those in sequence databases. The programs do this in a statistically valid way, producing a series of high-scoring segment pairs (HSPs): aligned pairs of sequence segments, one from the query and one from the database (sbjct). Each HSP is identified by its loci in the query and sbjct sequences as well as by its similarity score and its percentage identity and similarity.
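As an illustration of what one HSP carries, here is a minimal Python sketch of an HSP record. It is not taken from CROSS; the class name `HSP` and all of its field names are hypothetical, chosen only to mirror the description above (loci in query and sbjct, a score, a percentage identity). The strand rule assumes that a hit on the opposite strand is reported with one pair of coordinates running backwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSP:
    """One aligned pair of segments (hypothetical field names)."""
    query_id: str
    sbjct_id: str
    q_start: int        # locus in the query
    q_end: int
    s_start: int        # locus in the sbjct
    s_end: int
    score: float
    pct_identity: float

    @property
    def strand(self) -> str:
        """'+' if query and sbjct run in the same direction, else '-'."""
        q_forward = self.q_end >= self.q_start
        s_forward = self.s_end >= self.s_start
        return "+" if q_forward == s_forward else "-"

    @property
    def query_span(self) -> int:
        """Number of query positions covered by this HSP."""
        return abs(self.q_end - self.q_start) + 1


# Made-up example values, for illustration only.
hsp = HSP("query1", "sbjct1", 100, 250, 900, 750, 212.0, 96.5)
print(hsp.strand, hsp.query_span)
```

A frozen dataclass keeps each HSP immutable, which suits a list that is only filtered and displayed once it has been parsed.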

Output from BLAST is typically human-readable text, which is intelligible when the output is small, as it usually is for short sequences. When long sequences are compared, however, many HSPs result. A small genome comparison typically yields approximately 10000-20000 HSPs and an output file of several megabytes of text.


CROSS is a Windows 9x/NT program which does the following:

  1. Reads BLAST output (multiple sbjct - multiple query) either from the system clipboard or from a file
  2. Parses the data into HSP lists (scores, query and sbjct loci)
  3. Presents the data in several interactive forms:
      1. Diagonal plot (dot plot)
      2. Map plot (positions of query on multiple sbjct lines)
  4. Filters the presented data:
      1. Minimum - maximum displayed score ranges
      2. Color by strand
      3. Statistics display (dynamic display of the object under the cursor)
      4. Repeat finder (color coded haloes)
      5. Indel finder (generates lists of gaps in query/sbjct)
  5. Outputs various data to the system clipboard:
      1. Graphics - copy and paste (straight into MS Office apps)
      2. Text - tab delimited data for spreadsheets (copy and paste to Excel)
          1. Hits
          2. Indels
          3. Sbjct list
          4. Repeats
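The read, parse, filter and export steps above can be sketched in a few lines of Python. This is not CROSS's own code, and it makes one simplifying assumption: instead of the human-readable text that CROSS reads, the sketch takes the 12-column tabular output of BLAST+ (`-outfmt 6`). The function names `parse_tabular`, `filter_by_score` and `to_tab_delimited`, and the sample rows, are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_tabular(text):
    """Parse tabular BLAST output into a list of HSP dictionaries."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(FIELDS, row))
        s_start, s_end = int(rec["sstart"]), int(rec["send"])
        hits.append({
            "query": rec["qseqid"],
            "sbjct": rec["sseqid"],
            "q_start": int(rec["qstart"]),
            "q_end": int(rec["qend"]),
            "s_start": s_start,
            "s_end": s_end,
            "score": float(rec["bitscore"]),
            # blastn reports minus-strand hits with sbjct start > end
            "strand": "+" if s_start <= s_end else "-",
        })
    return hits


def filter_by_score(hits, min_score, max_score=float("inf")):
    """Keep only hits whose score lies in [min_score, max_score]."""
    return [h for h in hits if min_score <= h["score"] <= max_score]


def to_tab_delimited(hits):
    """Render hits as tab-delimited text, ready to paste into a spreadsheet."""
    cols = ["query", "sbjct", "q_start", "q_end",
            "s_start", "s_end", "score", "strand"]
    lines = ["\t".join(cols)]
    lines += ["\t".join(str(h[c]) for c in cols) for h in hits]
    return "\n".join(lines)


# Two made-up hits: one on the plus strand, one on the minus strand.
sample = (
    "genomeA\tgenomeB\t98.5\t200\t3\t0\t1\t200\t501\t700\t1e-100\t350\n"
    "genomeA\tgenomeB\t91.0\t100\t9\t0\t300\t399\t950\t851\t2e-30\t120\n"
)
hits = parse_tabular(sample)
strong = filter_by_score(hits, min_score=200)
print(to_tab_delimited(strong))
```

Keeping the filter separate from the parser means the score range can be changed repeatedly without re-reading the BLAST output, which is what an interactive display needs.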

Less obvious at first is that different kinds of BLAST comparison produce output that is useful for different purposes.

Examples:

Blast    Query          Sbjct          Information
blastN   Whole genome   Whole genome   Dot plot - reveals large scale inversions, transpositions and indels (Pyrococcus spp., Mycobacterium spp., Helicobacter pylori and Campylobacter jejuni)
blastN   Whole genome   self           Repeats and inversions
blastN   Whole genome
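To make the dot plot reading concrete: each HSP can be drawn as a line segment from (query start, sbjct start) to (query end, sbjct end). Same-strand hits then rise along a diagonal, while inverted regions run along the anti-diagonal. In a self comparison, every hit other than the trivial identity hit is a candidate repeat. The Python sketch below illustrates this under those assumptions; the function names `diagonal_segments` and `off_diagonal_self_hits` and the coordinates are hypothetical and do not come from CROSS.

```python
def diagonal_segments(hsps):
    """Turn (q_start, q_end, s_start, s_end) tuples into plot segments.

    Returns a list of ((x0, y0), (x1, y1), strand), with the query on
    the x axis and the sbjct on the y axis.
    """
    segments = []
    for q_start, q_end, s_start, s_end in hsps:
        # Same direction on both axes: slope +1. Opposite: slope -1.
        same_direction = (q_end - q_start) * (s_end - s_start) >= 0
        strand = "+" if same_direction else "-"
        segments.append(((q_start, s_start), (q_end, s_end), strand))
    return segments


def off_diagonal_self_hits(hsps):
    """In a self comparison, drop hits that map a region onto itself."""
    return [h for h in hsps if not (h[0] == h[2] and h[1] == h[3])]


# Made-up genome-versus-genome hits: the second one is inverted.
pair_hits = [(1, 1000, 1, 1000), (1001, 2000, 3000, 2001)]
segments = diagonal_segments(pair_hits)
inverted = [s for s in segments if s[2] == "-"]

# Made-up self comparison: one identity hit plus one repeat seen twice.
self_hits = [(1, 5000, 1, 5000), (100, 200, 700, 800), (700, 800, 100, 200)]
repeats = off_diagonal_self_hits(self_hits)
print(len(inverted), len(repeats))
```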

Screen Shot of CROSS

A blastN comparison of P. horikoshii vs. P. furiosus is displayed in the Diagonal window.
The coverage of the query sequence against the single subject is shown in the Map window below.
The statistics window to the right shows the statistics for the HSP under the cursor.
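The coverage shown in the Map window can be thought of as the union of the query intervals of all HSPs against one subject. A minimal Python sketch of that idea follows; it is only an illustration, not how CROSS computes or draws coverage. The function names `merge_intervals` and `covered_fraction` and the sample intervals are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent (start, end) intervals.

    Intervals may be given in either orientation; they are normalized
    so that start <= end before merging.
    """
    normalized = sorted((min(a, b), max(a, b)) for a, b in intervals)
    merged = []
    for start, end in normalized:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


def covered_fraction(intervals, query_length):
    """Fraction of the query covered by at least one interval."""
    covered = sum(end - start + 1 for start, end in merge_intervals(intervals))
    return covered / query_length


# Made-up query loci of three HSPs on a 400-base query.
query_loci = [(1, 100), (50, 150), (300, 201)]
blocks = merge_intervals(query_loci)
fraction = covered_fraction(query_loci, query_length=400)
print(blocks, fraction)
```

Merging first avoids counting overlapping HSPs twice, which would otherwise overstate coverage in repeat-rich regions.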

[Image: CROSS screen shot]


Email Dennis Maeder (m a e d e r [at] umbi [dot] umd [dot] edu) with any comments or suggestions.