Sequence homology searches are commonly performed using the Basic Local Alignment Search Tool (BLAST; www.ncbi.nlm.nih.gov/BLAST). However, the BLAST results interface does not provide simple, direct access to useful sequence analysis tools, such as restriction enzyme maps for DNA sequences and secondary structure prediction for protein sequences. Moreover, it is difficult to navigate smoothly between different sequence alignment records. There is therefore a need for a single interface that allows a user to link each sequence in an alignment report directly to the different domains/servers offering analysis and manipulation options. To meet this need, we developed SEQUEROME, a Java-based tool that acts as a front-end to BLAST queries and provides simplified access to web-distributed resources for protein and nucleic acid analysis.
While there are a number of web portals to bioinformatics tools and resources, such as Bioinformatic Harvester (1) and the Helmholtz Network for Bioinformatics (2), these existing systems generally do not incorporate BLAST analysis. Similarly, the National Center for Supercomputing Applications (NCSA) Biology Workbench (3,4) requires users to perform the BLAST search externally and then import the sequence hits into the program for further analysis. Finally, although there are commercial products such as Lasergene® (DNASTAR, Madison, WI, USA; www.dnastar.com) that facilitate sequence analysis, these programs mainly provide basic statistics about the sequences (e.g., nucleotide usage and molecular weight predictions) and are not intended for more sophisticated queries such as protein motif analysis or viewing tertiary structure information.
SEQUEROME can be accessed freely and directly as a web server at sequerome.georgetown.edu. The basic display includes three panes: (i) the Query pane; (ii) the Results pane; and (iii) the Search History pane (Figure 1). Each browser session starts with the Query pane, which takes input nucleic acid or protein sequences in any format. Users also have the choice of alternative inputs, including entering a GenBank® accession number, uploading sequences from FASTA files stored on the local computer, or obtaining sequences from a third-party URL (using the appropriate box) and directly dragging and dropping the relevant sequence into the query box. In addition, the user can perform a variety of other sequence manipulation operations by using the Analyze/Manipulate button.
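To illustrate what accepting sequences "in any format" can involve, the following minimal Python sketch normalizes pasted input (FASTA records, raw sequence, or numbered sequence lines) and guesses whether each record is nucleic acid or protein. It is not SEQUEROME's actual input handling, which is not described here; the function names `parse_fasta` and `guess_type` and the 90% threshold are assumptions made for illustration only.

```python
from typing import Iterator, Tuple

# Letters treated as nucleotide codes (assumption: A, C, G, T, U plus N).
NUCLEOTIDE_CODES = set("ACGTUN")


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA or raw sequence text.

    Raw sequence without a '>' line is returned with an empty header.
    Digits and whitespace inside sequence lines are discarded, so
    numbered sequence lines are tolerated.
    """
    header, chunks = "", []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if chunks:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        else:
            cleaned = "".join(ch for ch in line if ch.isalpha()).upper()
            if cleaned:
                chunks.append(cleaned)
    if chunks:
        yield header, "".join(chunks)


def guess_type(sequence: str, threshold: float = 0.9) -> str:
    """Heuristic: call a sequence 'nucleotide' if at least `threshold`
    of its letters are nucleotide codes, otherwise 'protein'."""
    if not sequence:
        raise ValueError("empty sequence")
    hits = sum(1 for ch in sequence.upper() if ch in NUCLEOTIDE_CODES)
    return "nucleotide" if hits / len(sequence) >= threshold else "protein"
```

A composition-based guess of this kind can misclassify short or unusual sequences, so a real front-end would presumably also let the user state the sequence type explicitly.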
Once the results of the sequence alignment are obtained from BLAST, they are presented in the Results pane. The View Sequence link allows instant visualization of the aligned record as highlighted regions (Figure 1), while View Match shows the alignment in the familiar BLAST format. After selecting the record of interest, the user simply clicks the button corresponding to the web-based server/tool of interest for direct analysis of the sequence. This workflow contrasts with servers that provide plain links to the various resources, leaving the user to input the sequence for analysis every time. Analysis tools include a translate function (see Figure 1), GenBank (www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide), Protein Data Bank (PDB; www.rcsb.org/pdb), and Restriction Enzyme Database (REBASE; rebase.neb.com/rebase/rebase.html). Most of the results from third-party servers/domains can be viewed directly in the Results pane without opening additional browser windows. While the user is performing these operations, the results of each operation are recorded in the Search History pane with unique identifiers, thus enabling instant retrieval and selection of earlier results (Figure 1). Users also have the option of archiving each of the results separately using the print, save, and mail links on top of each icon in the panel.
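The Search History behavior described above (each operation recorded under a unique identifier and retrievable later) can be sketched as a small in-memory store. This is an illustrative Python sketch only; the class names `SearchHistory` and `HistoryEntry` and the `H0001`-style identifiers are hypothetical, and SEQUEROME's own implementation is not described at this level of detail.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HistoryEntry:
    entry_id: str   # unique identifier shown in the history pane
    operation: str  # e.g. "BLAST", "Translate" (labels are illustrative)
    result: str     # stored result text for later retrieval


class SearchHistory:
    """Records each operation's result under a unique, ordered identifier."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self._entries: Dict[str, HistoryEntry] = {}

    def record(self, operation: str, result: str) -> str:
        """Store a result and return the identifier assigned to it."""
        entry_id = f"H{next(self._counter):04d}"
        self._entries[entry_id] = HistoryEntry(entry_id, operation, result)
        return entry_id

    def retrieve(self, entry_id: str) -> HistoryEntry:
        """Return an earlier result; raises KeyError for unknown identifiers."""
        return self._entries[entry_id]

    def identifiers(self) -> List[str]:
        """List identifiers in the order the operations were performed."""
        return list(self._entries)
```

Keying results by identifier rather than by position means an earlier result can be reselected without rerunning the operation that produced it.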
SEQUEROME's flexibility allows for a number of applications. Users can run a BLAST analysis on a given DNA sequence, translate a resulting sequence hit, analyze the reading frames, perform a second (protein) BLAST with the translated protein, and then view any available 3-dimensional (3-D) structural information of retrieved records. Other typical applications would include performing a repetitive series of sequence manipulations/operations from a BLAST report or following a complete open reading frame (ORF) analysis with repeated operations on each reading frame (e.g., carrying out secondary structure prediction or finding nuclear localization signals).
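As a concrete illustration of the translation and reading-frame step in such a workflow, the following Python sketch translates a DNA sequence in all six reading frames using the standard genetic code and reports the longest methionine-initiated stretch in a translated frame. It is a minimal sketch under stated assumptions, not the translate function used by SEQUEROME: the function names are hypothetical, ambiguous codons are rendered as "X", and a trailing stretch without a stop codon is still counted.

```python
import itertools
from typing import Dict

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): _AMINO[i]
    for i, codon in enumerate(itertools.product(_BASES, repeat=3))
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement; non-ACGT letters are left unchanged."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon
    and 'X' marks a codon not in the table (e.g. one containing N)."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> Dict[str, str]:
    """Translate the three forward (+1..+3) and three reverse (-1..-3) frames."""
    forward, reverse = dna.upper(), reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


def longest_orf(protein: str) -> str:
    """Longest stretch that starts at a methionine and runs to the next
    stop codon (or to the end of the frame); '' if there is none."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best
```

Calling `six_frames` on a BLAST hit and then `longest_orf` on each frame gives one candidate protein per frame, each of which could be submitted to a second (protein) BLAST search.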
SEQUEROME has a three-tiered architecture that uses Java Servlet and JavaServer Pages technologies with Java database connectivity (JDBC), making it both server- and platform-independent. SEQUEROME is compatible with essentially all Java-enabled graphical browsers, although it works best with Internet Explorer. It can be run on most operating systems equipped with a Java Virtual Machine (JVM) and the Jakarta Tomcat server. For viewing structures of molecules, end-users must have installed appropriate plugins, such as Cn3D (ncbi.nih.gov/Structure/CN3D/cn3d.shtml), Rasmol/Protein Explorer (www.umass.edu/microbio/rasmol), and Swiss-PdbViewer (au.expasy.org/spdbv/).
SEQUEROME has been supported in part by funds from the Multidisciplinary University Research Initiative (MURI) grant DAAD19-00-1-0165 and the National Science Foundation (NSF) grant CCR 0098271.
The authors declare no competing interests.