TY - JOUR
T1 - Use of amino acid environment-dependent substitution tables and conformational propensities in structure prediction from aligned sequences of homologous proteins. II. Secondary structures
AU - Wako, Hiroshi
AU - Blundell, Tom L.
N1 - Copyright 2017 Elsevier B.V., All rights reserved.
PY - 1994/5/19
Y1 - 1994/5/19
N2 - A three-step method is presented to predict secondary structures of proteins, by utilizing aligned sequences of homologous proteins. Mean propensities and amino acid substitution patterns at a given site in the aligned sequences are first evaluated for four conformational states (i.e. α-helix, β-strand, buried coil and exposed coil). Capping rules are applied in order to define boundaries of the secondary-structure segments more precisely. In the second step, β-strand is predicted by searching regions predicted as coil for the two patterns characteristic of alternating and fully buried β-strands. The complete sequences of the solvent-accessibility classes predicted by substitution tables and propensities are also searched using Fourier transform methods for α-helical periodicity. After applying capping rules, the α-helices and β-strands predicted in the second step replace, where appropriate, the conformational states predicted in the first step. Finally, in the third step, if one of the four conformational states is assigned to the residues at an equivalent site of aligned sequences in more than a given fraction of the proteins, such a state is reassigned to all the residues at that site. The method is applied to 13 protein families, which contain four folding types, α, β, α/β and α+β. The accuracy of the prediction ranges from 60 to 79% (mean percentage over the 13 families is 69%). For comparison the Garnier-Osguthorpe-Robson (GOR) method is also applied to them. Although the mean prediction accuracy for the GOR method, 58%, can be improved to 63% by applying the second and third steps in this method, there remain four families with less than 55% accuracy. The mean accuracy is relatively higher and poor predictions are reduced in this method.
AB - A three-step method is presented to predict secondary structures of proteins, by utilizing aligned sequences of homologous proteins. Mean propensities and amino acid substitution patterns at a given site in the aligned sequences are first evaluated for four conformational states (i.e. α-helix, β-strand, buried coil and exposed coil). Capping rules are applied in order to define boundaries of the secondary-structure segments more precisely. In the second step, β-strand is predicted by searching regions predicted as coil for the two patterns characteristic of alternating and fully buried β-strands. The complete sequences of the solvent-accessibility classes predicted by substitution tables and propensities are also searched using Fourier transform methods for α-helical periodicity. After applying capping rules, the α-helices and β-strands predicted in the second step replace, where appropriate, the conformational states predicted in the first step. Finally, in the third step, if one of the four conformational states is assigned to the residues at an equivalent site of aligned sequences in more than a given fraction of the proteins, such a state is reassigned to all the residues at that site. The method is applied to 13 protein families, which contain four folding types, α, β, α/β and α+β. The accuracy of the prediction ranges from 60 to 79% (mean percentage over the 13 families is 69%). For comparison the Garnier-Osguthorpe-Robson (GOR) method is also applied to them. Although the mean prediction accuracy for the GOR method, 58%, can be improved to 63% by applying the second and third steps in this method, there remain four families with less than 55% accuracy. The mean accuracy is relatively higher and poor predictions are reduced in this method.
KW - Homologous sequences
KW - Protein structure prediction
KW - Secondary structure
KW - Substitution tables
UR - http://www.scopus.com/inward/record.url?scp=0028304961&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0028304961&partnerID=8YFLogxK
U2 - 10.1006/jmbi.1994.1330
DO - 10.1006/jmbi.1994.1330
M3 - Article
C2 - 8182744
AN - SCOPUS:0028304961
SN - 0022-2836
VL - 238
SP - 693
EP - 708
JO - Journal of Molecular Biology
JF - Journal of Molecular Biology
IS - 5
ER -