TY - JOUR
T1 - Local interactions that contribute minimal frustration determine foldability
AU - Zou, Taisong
AU - Woodrum, Brian W.
AU - Halloran, Nicholas
AU - Campitelli, Paul
AU - Bobkov, Andrey A.
AU - Ghirlanda, Giovanna
AU - Ozkan, Sefika Banu
N1 - Funding Information:
Support from NSF MCB Award 1715591 is gratefully acknowledged by S.B.O. and G.G. We also thank Andrei Bobkov, Protein Analysis Core, Sanford Burnham Prebys Medical Discovery Institute, for his assistance with the binding affinity analysis of our designed WW domain. We also acknowledge computing time from Arizona State University Research Computing.
Publisher Copyright:
© 2021 The Authors. Published by American Chemical Society.
PY - 2021/3/18
Y1 - 2021/3/18
N2 - Earlier experiments suggest that the evolutionary information (conservation and coevolution) encoded in protein sequences is necessary and sufficient to specify the fold of a protein family. However, no computational work has quantified the effect of such evolutionary information on the folding process. Here, through a combination of computational and experimental methods, we explore the role of early folding steps for sequences designed using coevolution and conservation. We simulated a repertoire of native and designed WW domain sequences to analyze early local contact formation and found that, in unfoldable sequences, the N-terminal β-hairpin turn does not form correctly because of strong non-native local contacts. Through a maximum likelihood approach, we identified five local contacts that play a critical role in folding, suggesting that a small subset of amino acid pairs can be used to solve the “needle in the haystack” problem of designing foldable sequences. Using the contact probability of those five local contacts, which form during the early stage of folding, we built a classification model that predicts the foldability of a WW sequence with 81% accuracy. This classification model was used to redesign WW domain sequences that could not fold due to frustration and to make them foldable by introducing a few mutations that stabilize these critical local contacts. The experimental analysis shows that a redesigned sequence folds and binds to polyproline peptides with an affinity similar to that observed for native WW domains. Overall, our analysis shows that evolutionarily designed sequences should not only satisfy folding stability but also ensure a minimally frustrated folding landscape.
UR - http://www.scopus.com/inward/record.url?scp=85103228876&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103228876&partnerID=8YFLogxK
U2 - 10.1021/acs.jpcb.1c00364
DO - 10.1021/acs.jpcb.1c00364
M3 - Article
C2 - 33687216
AN - SCOPUS:85103228876
SN - 1520-6106
VL - 125
SP - 2617
EP - 2626
JO - Journal of Physical Chemistry B
JF - Journal of Physical Chemistry B
IS - 10
ER -