Rosetta method:
- Based on the assumption that the distribution of conformations sampled by a local segment of the amino acid chain is similar to the distribution of conformations adopted by the same sequence segment in known structures in the PDB
- Principle:
Step 1. Represent the protein sequence as 3- and 9-residue segments (3-grams and 9-grams).
Step 2. For each n-gram (n = 3, 9), find occurrences of the same sequence in the PDB and collect the local conformations they adopt.
Step 3. Run a Monte Carlo search through the structure space spanned by these fragment conformations. Because an unguided Monte Carlo search would take too long, the search is guided by an energy function that favors compact structures with paired beta-strands and buried hydrophobic residues.
Step 4. Cluster the structures obtained from runs with different random seeds.
Step 5. Select the center of the largest cluster as the predicted structure.
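
The five steps above can be sketched as a toy pipeline. This is only an illustration under strong assumptions, not the Rosetta implementation: a conformation is reduced to one (phi, psi) torsion pair per residue, `known` is a hypothetical stand-in for the PDB, `toy_energy` is a placeholder score (it does not model compactness, strand pairing, or hydrophobic burial), and all function names and parameters are invented for this example.

```python
import math
import random

def ngrams(seq, n):
    """Step 1: return every (start, fragment) window of length n in seq."""
    return [(i, seq[i:i + n]) for i in range(len(seq) - n + 1)]

def build_library(known, sizes=(3, 9)):
    """Step 2: map each n-gram to the torsion windows it adopts in `known`.

    `known` is a hypothetical stand-in for the PDB: a list of
    (sequence, [(phi, psi), ...]) pairs with one angle pair per residue.
    """
    library = {}
    for seq, angles in known:
        for n in sizes:
            for i, frag in ngrams(seq, n):
                library.setdefault(frag, []).append(angles[i:i + n])
    return library

def toy_energy(conf):
    """Placeholder score, NOT the Rosetta energy: penalizes torsion jumps
    between neighboring residues so the search has something to minimize."""
    return sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
               for a, b in zip(conf, conf[1:])) / 100.0

def monte_carlo(seq, library, steps, rng, temperature=1.0, sizes=(3, 9)):
    """Step 3: fragment-insertion Monte Carlo with a Metropolis criterion."""
    conf = [(-120.0, 120.0)] * len(seq)  # arbitrary starting conformation
    windows = [(i, f) for n in sizes for i, f in ngrams(seq, n) if f in library]
    if not windows:
        return conf  # no matching fragments, nothing to sample
    energy = toy_energy(conf)
    for _ in range(steps):
        i, frag = rng.choice(windows)
        insert = rng.choice(library[frag])
        trial = conf[:i] + list(insert) + conf[i + len(frag):]
        delta = toy_energy(trial) - energy
        # Always accept downhill moves; accept uphill moves with probability exp(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            conf, energy = trial, energy + delta
    return conf

def distance(a, b):
    """RMS torsion difference (ignores angle wrap-around for brevity)."""
    total = sum((p1 - p2) ** 2 + (s1 - s2) ** 2
                for (p1, s1), (p2, s2) in zip(a, b))
    return math.sqrt(total / len(a))

def largest_cluster_center(decoys, cutoff):
    """Steps 4-5: return the decoy with the most neighbors within `cutoff`."""
    def neighbors(d):
        return sum(1 for other in decoys if distance(d, other) <= cutoff)
    return max(decoys, key=neighbors)

def predict(seq, known, n_seeds=5, steps=200, cutoff=30.0):
    """Run one Monte Carlo search per seed, then pick the cluster center."""
    library = build_library(known)
    decoys = [monte_carlo(seq, library, steps, random.Random(seed))
              for seed in range(n_seeds)]
    return largest_cluster_center(decoys, cutoff)

# Usage with made-up data: two "known" chains with invented torsion angles.
known = [
    ("ACDEFGHIKLMN", [(-60.0, -45.0)] * 12),
    ("MNACDEFGHIKL", [(-120.0, 130.0)] * 12),
]
model = predict("ACDEFGHIKL", known)
print(len(model), "residues placed")
```

Picking the decoy with the most neighbors within a cutoff is one simple way to approximate "center of the largest cluster"; a real pipeline would compare 3D coordinates rather than raw torsion angles.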