
PreDom Help

Predict 3D structure of multi-domain proteins

How to use

Select the data format of the query by changing the [Input type] pulldown menu.

FASTA: The whole 3D structure of the protein is predicted from its amino acid sequence in FASTA format.
PDB: The whole 3D structure of the query protein is predicted from the 3D structure data of each domain. Currently, PreDom:Structure accepts up to three domain structures.
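Before submitting a query, it can help to check the input locally. The sketch below is not part of PreDom; the helper names `read_fasta` and `list_chain_ids` are hypothetical. It relies only on two format facts: a FASTA record starts with a `>` header line, and in PDB format the chain ID is the 22nd character of each ATOM/HETATM record.

```python
def read_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def list_chain_ids(pdb_text):
    """Return chain IDs in order of first appearance in ATOM/HETATM records."""
    chains = []
    for line in pdb_text.splitlines():
        # Column 22 (index 21) of a coordinate record holds the chain ID.
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]
            if chain not in chains:
                chains.append(chain)
    return chains
```

A file that yields more than one record from `read_fasta`, or no chain IDs from `list_chain_ids`, is worth inspecting by hand before upload.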

prediction using amino acid sequence data

Select “Amino acid sequence in FASTA format” or “FASTA format file”. If you select “Amino acid sequence in FASTA format”, input the amino acid sequence of the query protein in FASTA format into the text area field. If you select “FASTA format file”, click [Browse…] and select the FASTA format file of the query protein.
Click [Start Prediction] to execute prediction.
Depending on the query protein sequence, one of four types of prediction result is returned.
1. If 3D structure data for a protein almost identical to the query protein (sequence identity ≥ 95%) is found in the PDB by the BLAST search, the 3D structure of that protein is shown in the JSmol viewer. The alignment obtained from the BLAST result is also shown.
2. If 3D structure data for a homologous protein (sequence identity ≥ 25%) is found in the PDB by the BLAST search, you can perform homology modeling using the found structure as a template. Homology modeling requires a MODELLER license key.
  1. Click the [Show] button to display the alignment obtained from the BLAST result. The 3D structure of the template protein is also shown in the JSmol viewer.
  2. Click the [Click here] button in the "Status" row to open a new window for setting the parameters of the homology modeling run.
  3. Enter your job name (optional), e-mail address and MODELLER license key in the window, and click [Run prediction] to run homology modeling.
  4. You are notified by e-mail when the run completes.
  5. Open the result page at the URL given in the e-mail. It shows the homology modeling results: five predicted structures of the query, information on the template structure, the alignment of the query and template proteins obtained from MODELLER, and quality scores of the model structures (molpdf, DOPE score and GA341 score).
  6. You can download the coordinate data of the predicted structures in PDB format.
3. If the 3D structures of two consecutive domains in the query protein sequence are already known or can be predicted by homology modeling, the whole 3D structure of the two domains is predicted using the DINE score.
  1. Click the [Show] button to display the alignment obtained from the BLAST result. The 3D structure of the PDB entry is also shown in the JSmol viewer.
  2. Click the [Click here] button in the "Status" row to open a new window for setting the parameters of the DINE score. The parameters are described in the "prediction using 3D structure data" section.
  3. Enter your job name (optional) and e-mail address, and click [Run prediction] to run the prediction. If homology modeling is required to predict the 3D structure of one or both domains, a MODELLER license key is also needed.
  4. You are notified by e-mail when the run completes.
  5. Open the result page at the URL given in the e-mail to see the prediction results. See the "2-domain protein" paragraph in the "Description of the result page" section.
4. If the 3D structures of more than two domains in the query protein sequence are already known or can be predicted by homology modeling, the known structures or the template structures for homology modeling are shown.
  1. Click the [Show] button to display the alignment obtained from the BLAST result. The 3D structure of the PDB entry is also shown in the JSmol viewer.
  2. Click the [Click here] button in the "Status" row to open a new window for setting the parameters of homology modeling.
  3. Enter your job name (optional), e-mail address and MODELLER license key. Then click [Run prediction] to run homology modeling.
  4. You are notified by e-mail when the run completes.
  5. Open the result page at the URL given in the e-mail to see the structures predicted by homology modeling. If the query contains only three domains and all domain structures are already known or can be modeled by homology modeling, the whole structure of the three domains is predicted. See the "multi-domain protein composed of more than 2 domains" paragraph in the "Description of the result page" section.
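As a rough illustration of the two sequence identity cut-offs used in cases 1 and 2, the sketch below sorts a BLAST hit into one of three bins. The function name and the returned labels are hypothetical, and the sketch does not attempt to reproduce how PreDom decides between the whole-protein cases and the domain-level cases 3 and 4.

```python
# Identity cut-offs taken from the text above (percent).
NEAR_IDENTICAL_MIN = 95.0
HOMOLOG_MIN = 25.0


def classify_hit(identity_percent):
    """Map a BLAST sequence identity (percent) to an illustrative label."""
    if identity_percent >= NEAR_IDENTICAL_MIN:
        # Case 1: the known structure is shown directly.
        return "show_known_structure"
    if identity_percent >= HOMOLOG_MIN:
        # Case 2: the hit can serve as a template for homology modeling.
        return "homology_modeling"
    # Below both cut-offs the hit is not usable in either way.
    return "no_template"
```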

prediction using 3D structure data

Click [Browse...] next to one of the [PDB format file] fields and select the PDB format file of the domain. Then click the [Find chains] button next to it to list all the chain IDs in the specified PDB data, and select the chain ID from the list. To clear the chain list, click [Reset]. (This also clears the [PDB format file] field.)
Repeat this procedure for each remaining domain.
Modify the values of the following parameters if needed.
- Parameters for KIP method
  
  Threshold of sequence identity
  
  The minimum sequence identity to the target domain that a homologous domain must have to be used for domain-domain interface prediction.
  
  Threshold of sequence coverage
  
  The minimum sequence coverage by the target domain that a homologous domain must have to be used for domain-domain interface prediction.
  
  Threshold of BLAST E-value
  
  The maximum E-value allowed in the BLAST search.
- Parameters for IP method
  
  Threshold of IP score
  
  The minimum IP score required (a score evaluated on the basis of the occurrence of each amino acid residue on the domain-domain interface).
  
  Threshold of residue contact
  
  The maximum distance allowed between atoms in the candidate domain-domain interface residues. A residue pair is regarded as being in contact when all the distances between atoms of each residue are within this value.
  
  Minimum size of interface
  
  The minimum number of contacting residues required. Predicted domain-domain interaction residues are discarded when the number of contacting residues among them is less than this value.
You are notified by e-mail when the run completes. Enter an arbitrary name in the [Job name] field to identify the job, and enter the e-mail address that should receive the notification in the [E-Mail] field.
Click [Start Prediction] to execute prediction.
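The three KIP thresholds act together as a filter on the homologous domains found by BLAST. The following is a minimal sketch of that idea, not the server's code: the `Hit` record and `filter_hits` function are hypothetical, and the thresholds are passed in explicitly because this page does not state the default values.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One homologous domain found by BLAST (hypothetical record)."""
    name: str
    identity: float  # sequence identity to the target domain, percent
    coverage: float  # sequence coverage, percent
    evalue: float    # BLAST E-value


def filter_hits(hits, min_identity, min_coverage, max_evalue):
    """Keep hits that satisfy all three thresholds at once."""
    return [
        h for h in hits
        if h.identity >= min_identity
        and h.coverage >= min_coverage
        and h.evalue <= max_evalue
    ]


# Illustrative values only; they are not the server defaults.
hits = [
    Hit("hom1", identity=62.0, coverage=91.0, evalue=1e-40),
    Hit("hom2", identity=28.0, coverage=95.0, evalue=1e-8),
    Hit("hom3", identity=70.0, coverage=40.0, evalue=1e-20),
]
kept = filter_hits(hits, min_identity=30.0, min_coverage=70.0, max_evalue=1e-3)
print([h.name for h in kept])
```

Raising the identity or coverage threshold, or lowering the E-value threshold, leaves fewer homologous domains available for interface prediction.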

Description of the result page

2-domain protein

List of the candidate structures (top table)

The ten most plausible candidate structures are listed in descending order of DINE score.

The DINE score is defined as:

DINE = w_dock * S_dock + w_int * S_int + w_ete * S_ete

S_dock, S_int, S_ete:

Scores for docking (physicochemical complementarity of the interface, evaluated by ZRANK), interface (ratio of predicted interface residues in the domain-domain interface) and end-to-end distance (agreement of the distance between the domain ends with the statistically expected one), respectively.

w_dock, w_int, w_ete:

Weights of the docking score, domain interface score and end-to-end distance score, respectively.
The default values of w_dock, w_int and w_ete are 7, 8 and 1, respectively. You can change the weight values in this list and re-calculate the DINE score of each candidate structure with the new weights. The candidate structures are then re-sorted by the re-calculated values.
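The weighted sum and the re-sorting step can be reproduced offline. The sketch below uses the formula and default weights given above; the function names, the dictionary layout and the component scores are made up for illustration.

```python
def dine_score(s_dock, s_int, s_ete, w_dock=7.0, w_int=8.0, w_ete=1.0):
    """DINE = w_dock * S_dock + w_int * S_int + w_ete * S_ete."""
    return w_dock * s_dock + w_int * s_int + w_ete * s_ete


def rank_candidates(candidates, **weights):
    """Sort candidates in descending order of DINE score."""
    return sorted(
        candidates,
        key=lambda c: dine_score(c["s_dock"], c["s_int"], c["s_ete"], **weights),
        reverse=True,
    )


# Made-up component scores for two candidate structures.
candidates = [
    {"name": "cand_a", "s_dock": 0.9, "s_int": 0.1, "s_ete": 0.0},
    {"name": "cand_b", "s_dock": 0.2, "s_int": 0.8, "s_ete": 0.5},
]

default_order = [c["name"] for c in rank_candidates(candidates)]
dock_heavy = [c["name"] for c in rank_candidates(candidates, w_dock=10.0, w_int=1.0)]
print(default_order, dock_heavy)
```

With the default weights the interface term dominates and `cand_b` ranks first; weighting the docking term more heavily moves `cand_a` to the top, which is the kind of re-ordering the weight fields on the result page produce.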

Bottom area

Information about a candidate structure selected by the radio buttons in the above list is shown in this area.

Amino acid sequence (top left box)

The amino acid sequence of the candidate structure. By placing the mouse pointer over one of these residues, the residue name and the residue number are displayed in the “interaction residue” field below. Numerical data about the domain-domain interface residues is also shown in the legend.

List of the domains (bottom left table)

List of the input domains.

Field name	Description
domain number	domain number of each query
residue number	range of the residue number of the domain and the linkers
show	switch to show/hide the domain (uncheck to hide)
scheme of predicted interface residues	buttons to change the scheme of the domain-domain interface residues

3D Structure (right box)

The JSmol visualization of the candidate structure. Initially, the domain-domain interface residues are shown in wireframe format, while the other residues are shown in cartoon format. As in the amino acid sequence, the domain linker residues are indicated in a deep color. When you place the mouse pointer over one of the interface residues or domain linker residues in the amino acid sequence, the corresponding residue in this figure turns white to show its position in the structure.

Results of the docking simulation (below the 3D structure)

Numerical results of the docking simulation are shown here. Click the [Download PDB] button to download a PDB file of the above structure.

multi-domain protein composed of more than 2 domains

If the query protein is composed of more than two domains, the 3D structure is predicted as follows.

3D structures of each pair of adjacent domains (domain1-domain2, domain2-domain3 and so on) are predicted using the DINE score.
Decoy structures are constructed by superposing domain-n of the predicted domain-(n-1)-domain-n and domain-n-domain-(n+1) structures.
Decoy structures in which no domains collide are sorted by the harmonic mean of the DINE scores of domain1-domain2, domain2-domain3 and so on.
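The final filtering and sorting step can be sketched as follows. This is an illustration under assumptions rather than the server's implementation: the decoy records and field names are hypothetical, the sort is assumed to be descending (higher is better, as in the 2-domain list), and the pairwise DINE scores are assumed to be positive, since `statistics.harmonic_mean` rejects negative values.

```python
from statistics import harmonic_mean


def rank_decoys(decoys):
    """Drop decoys with domain collisions, then sort by harmonic mean."""
    clash_free = [d for d in decoys if not d["has_collision"]]
    return sorted(
        clash_free,
        key=lambda d: harmonic_mean(d["pair_scores"]),
        reverse=True,  # assumption: a higher mean ranks first
    )


# Made-up DINE scores for domain1-domain2 and domain2-domain3.
decoys = [
    {"name": "decoy_1", "pair_scores": [2.0, 6.0], "has_collision": False},
    {"name": "decoy_2", "pair_scores": [4.0, 4.0], "has_collision": False},
    {"name": "decoy_3", "pair_scores": [10.0, 10.0], "has_collision": True},
]

ranked = [d["name"] for d in rank_decoys(decoys)]
print(ranked)
```

The harmonic mean penalizes a decoy in which one domain pair scores poorly: `decoy_1` and `decoy_2` have the same arithmetic mean, but the balanced `decoy_2` ranks higher.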

Using the pull-down menu on the top left, you can display any one of the ten most plausible candidate structures.

Amino acid sequence (top left box)

List of the domains (bottom left table)

List of the input domains.

Field name	Description
domain number	domain number of each query
residue number	range of the residue number of the domain and the linkers
show	switch to show/hide the domain (uncheck to hide)
scheme of predicted interface residues	buttons to change the scheme of the domain-domain interface residues

3D Structure (right box)
