How to use
- BLAST search
Search for sequences homologous to the query sequence using BLAST.
- Select the data format of the query protein in the [Input] field. Four formats are accepted: a PDB ID, a file in the PDB format, a file containing an amino acid sequence in the FASTA format, and an amino acid sequence entered as text in the FASTA format.
- PDB ID
- Enter the PDB ID of the query protein in lowercase. If the PDB ID is found in our database, you can select a chain ID from all chain IDs in the PDB entry.
- PDB format file
- Select the 3D structure file of the query protein in the PDB format.
- FASTA format file
- Select the amino acid sequence file of the query protein in the FASTA format.
- FASTA format text
- Enter the amino acid sequence of the query protein in the FASTA format.
- Modify the following BLAST parameters in the [BLAST parameters] field if needed.
- Database
- Name of the database searched by BLAST
- E-value
- Expectation value threshold
- Click [BLAST] to run a BLAST search.
Command: blastp -query (amino acid sequence of query) -db (database name) -evalue (Expectation value threshold)
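As an illustration only, the command above can be assembled in a script as follows. This is a minimal sketch, not the server's own code: the function name `build_blastp_command`, the file name `query.fasta`, and the database name `pdbaa` are hypothetical placeholders. It assumes the query sequence has been written to a FASTA file, because the `-query` option of blastp expects a file path.

```python
from typing import List


def build_blastp_command(query_fasta: str, database: str, evalue: float) -> List[str]:
    """Return the blastp argument list corresponding to the command shown above.

    query_fasta: path to a FASTA file holding the query sequence (assumed).
    database:    name of the BLAST database to search.
    evalue:      expectation value threshold.
    """
    if evalue <= 0:
        raise ValueError("E-value threshold must be positive")
    return ["blastp", "-query", query_fasta, "-db", database, "-evalue", str(evalue)]


# Hypothetical file and database names, used only for illustration.
cmd = build_blastp_command("query.fasta", "pdbaa", 1e-3)
print(" ".join(cmd))  # blastp -query query.fasta -db pdbaa -evalue 0.001
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` on a machine where BLAST+ and the database are installed.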
- Homologous sequences to query data
Select sequences from the BLAST results and submit them to the evolutionary trace analysis. In this analysis, a phylogenetic tree is constructed from the multiple alignment of the selected sequences by the neighbor-joining (NJ) method, and the sequences are grouped on the tree.
- BLAST results are shown in the table below (click an ID to view the pairwise alignment of the query and subject sequences). Sequences satisfying the default conditions for sequence identity (≥20%) and coverage (≥90%) are shown in red and are initially selected. To change the selection, modify the conditions in the drop-down lists or check or uncheck individual checkboxes.
- Modify the settings for the evolutionary trace analysis if needed.
- Multiple alignment method
- Program for multiple alignment. ClustalW2 or MAFFT can be used.
Command-line options for ClustalW2: -align -infile -outfile
Command-line options for MAFFT: --auto --clustalout
- Click [Exec trace] to perform evolutionary trace analysis.
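The default selection of BLAST hits described above (sequence identity ≥20% and coverage ≥90%) amounts to a simple filter. The sketch below shows the idea under stated assumptions: the `Hit` record and the `select_hits` function are hypothetical names, and both values are taken as percentages already computed for each hit. How the server defines coverage is not stated here, so the sketch does not compute it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    subject_id: str
    identity: float  # percent sequence identity to the query
    coverage: float  # percent coverage (definition assumed to be given)


def select_hits(hits: List[Hit], min_identity: float = 20.0,
                min_coverage: float = 90.0) -> List[Hit]:
    """Keep hits that meet both thresholds (inclusive, as in the defaults)."""
    return [h for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage]


# Made-up example hits, for illustration only.
hits = [
    Hit("seqA", identity=35.0, coverage=95.0),
    Hit("seqB", identity=18.0, coverage=99.0),  # identity too low
    Hit("seqC", identity=60.0, coverage=70.0),  # coverage too low
]
selected = [h.subject_id for h in select_hits(hits)]
print(selected)  # ['seqA']
```

Changing the drop-down conditions corresponds to passing different `min_identity` and `min_coverage` values.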
Results of evtrace
Results of the evolutionary trace analysis are shown.
The following parameters for sequence grouping and clustering of trace residues can be changed here.
- Tident
- Cutoff value of identity for sequence grouping. Click a radio button to change the results to those under the corresponding cutoff condition.
- Tdist
- Distance threshold for nearest-neighbor clustering of trace residues. Two clusters are merged if they are closer than this threshold. Click [Change] after entering a value to update results.
- Tresnum
- Minimum cluster size for nearest-neighbor clustering of trace residues. Clusters with fewer residues than this threshold are ignored. Click [Change] after entering a value to update results.
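The roles of Tdist and Tresnum can be sketched as a single-linkage (nearest-neighbor) clustering followed by a size filter. This is a minimal illustration, not the server's implementation: the function name `cluster_trace_residues` is hypothetical, and each residue is assumed to be represented by one 3D coordinate (which atom or point the server actually uses is not stated here).

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def cluster_trace_residues(coords: Dict[int, Coord], tdist: float,
                           tresnum: int) -> List[List[int]]:
    """Cluster residues by single linkage, then drop small clusters.

    coords:  residue number -> representative 3D coordinate (assumed).
    tdist:   two residues closer than this distance are joined.
    tresnum: clusters with fewer residues than this are discarded.
    """
    ids = sorted(coords)
    parent = {i: i for i in ids}

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of residues that lies closer than the threshold.
    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if math.dist(coords[a], coords[b]) < tdist:
                parent[find(a)] = find(b)

    clusters: Dict[int, List[int]] = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    kept = [c for c in clusters.values() if len(c) >= tresnum]
    return sorted(kept, key=lambda c: c[0])


# Made-up coordinates: three residues in a chain and one isolated residue.
example = {10: (0.0, 0.0, 0.0), 11: (3.0, 0.0, 0.0),
           12: (6.0, 0.0, 0.0), 50: (30.0, 0.0, 0.0)}
print(cluster_trace_residues(example, tdist=4.0, tresnum=2))  # [[10, 11, 12]]
```

Residues 10 and 12 are 6.0 apart, farther than Tdist, but end up in the same cluster because each is close to residue 11. The isolated residue 50 forms a cluster of size one and is removed by Tresnum = 2.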
Description of the result
The following results are shown on the tabs.
- [Sequence group]
- List of sequences with their group IDs. All sequences selected on the previous page are grouped using the phylogenetic tree shown in the [Tree] tab. Click an ID to view the pairwise alignment of the query and subject sequences.
- [Multiple alignment (group)]
- Summary of the multiple alignment obtained by the selected program. Only the consensus sequence of each group and the query sequence are shown here. The invariant and class-specific trace residues are shown in pink and gray, respectively. Click [Download raw data] to download the result file.
- [Multiple alignment (sequence)]
- Multiple alignment obtained by the selected program. All selected sequences and the query sequence are shown here. The invariant and class-specific trace residues are shown in pink and gray, respectively. Click [Download the multiple alignment] to download the result file.
- [Trace residues (TR)]
- Invariant (pink) and class-specific (gray) trace residues on the amino acid sequence and 3D structure.
- [Clustering TRs]
- Spatial clusters of trace residues on the amino acid sequence and 3D structure. Trace residues at internal positions (surface accessibility < 0.1) are excluded from the clustering.
- [Tree]
- Phylogenetic trees of the amino acid sequences.
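The distinction between invariant and class-specific trace residues shown on the tabs above can be illustrated with a short sketch. It follows the general idea of evolutionary trace analysis and is an assumption about the rule, not a description of this server's exact procedure: a column is treated as invariant when every group is conserved with the same residue, and as class-specific when every group is conserved internally but the residue differs between groups. The function names and the example sequences are hypothetical.

```python
from typing import Dict, List


def group_consensus(seqs: List[str]) -> str:
    """Per-column consensus of one group of aligned sequences.

    A column keeps its residue only if all sequences in the group agree
    and the residue is not a gap; otherwise it is marked '-'.
    """
    return "".join(
        col[0] if len(set(col)) == 1 and col[0] != "-" else "-"
        for col in zip(*seqs)
    )


def classify_columns(groups: Dict[str, List[str]]) -> str:
    """Label each alignment column.

    'I' = invariant, 'C' = class-specific, '.' = not a trace residue.
    """
    consensuses = [group_consensus(seqs) for seqs in groups.values()]
    labels = []
    for col in zip(*consensuses):
        if "-" in col:
            labels.append(".")  # at least one group is not conserved
        elif len(set(col)) == 1:
            labels.append("I")  # same residue in every group
        else:
            labels.append("C")  # conserved per group, differs between groups
    return "".join(labels)


# Made-up aligned sequences in two groups, for illustration only.
groups = {
    "group1": ["MKLV", "MKLI"],
    "group2": ["MRLV", "MRLA"],
}
print(classify_columns(groups))  # ICI.
```

In this example, columns 1 and 3 are invariant, column 2 is class-specific (K in one group, R in the other), and column 4 is not a trace residue because neither group is conserved there.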