Topics in Bioinformatics


Rosetta method:

References:

Bonneau and Baker (2001)

Bonneau et al. (2001)

- Based on the assumption that the distribution of conformations sampled by a local segment of the amino acid chain is similar to the distribution of conformations observed for the same sequence segment in structures in the PDB (Protein Data Bank).

- Principle:

Step 1. Represent the protein sequence as 3- and 9-grams, i.e. windows of 3 and 9 consecutive residues.

Step 2. Find occurrences of the same n-gram sequences (n = 3, 9) in the PDB and collect the local conformations they adopt there.

Step 3. Search the structure space spanned by these local conformations with Monte Carlo. Because an unguided Monte Carlo search would take too long, the search is scored with an energy function that favors compact structures with paired beta-strands and buried hydrophobic residues.

Step 4. Cluster the structures obtained from runs started with different random seeds.

Step 5. Select the center of the largest cluster as the predicted structure.
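
The five steps above can be sketched as a toy pipeline. This is a minimal illustration under strong simplifying assumptions, not the Rosetta implementation: a conformation is reduced to one angle per residue, the "database" is a hand-written list standing in for the PDB, `toy_energy` is a placeholder that has nothing to do with the real energy function, and the distance between structures ignores angle periodicity. All function names (`ngrams`, `build_library`, `monte_carlo`, `largest_cluster_center`) are hypothetical and chosen only for this example.

```python
import math
import random
from collections import defaultdict

def ngrams(seq, n):
    """Step 1: all windows of length n, paired with their start position."""
    return [(i, seq[i:i + n]) for i in range(len(seq) - n + 1)]

def build_library(database, sizes=(3, 9)):
    """Step 2: map each n-gram in the database to the local conformations
    (here: slices of a per-residue angle list) observed for it."""
    library = defaultdict(list)
    for seq, angles in database:
        for n in sizes:
            for i, gram in ngrams(seq, n):
                library[gram].append(angles[i:i + n])
    return library

def toy_energy(conf):
    """Placeholder score (assumption, NOT the Rosetta energy): rewards
    angles near -60 so that the search has something to minimize."""
    return sum((a + 60.0) ** 2 for a in conf) / 1000.0

def monte_carlo(seq, library, energy, steps=2000, temperature=1.0,
                seed=0, sizes=(3, 9)):
    """Step 3: Metropolis Monte Carlo where each move inserts one
    library fragment at the position of a matching n-gram."""
    rng = random.Random(seed)
    moves = [(i, frag)
             for n in sizes
             for i, gram in ngrams(seq, n)
             for frag in library.get(gram, [])]
    conf = [180.0] * len(seq)  # start from an arbitrary extended state
    e = energy(conf)
    if not moves:
        return conf
    for _ in range(steps):
        i, frag = rng.choice(moves)
        trial = conf[:i] + list(frag) + conf[i + len(frag):]
        e_trial = energy(trial)
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with Boltzmann probability.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / temperature):
            conf, e = trial, e_trial
    return conf

def rmsd(a, b):
    """Root-mean-square difference between two angle lists (toy distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def largest_cluster_center(decoys, radius):
    """Steps 4-5: treat the decoy with the most neighbours within
    `radius` as the center of the largest cluster."""
    def neighbours(c):
        return [d for d in decoys if rmsd(c, d) <= radius]
    center = max(decoys, key=lambda c: len(neighbours(c)))
    return center, neighbours(center)

# Made-up stand-in for the PDB: (sequence, one angle per residue).
database = [("ACDEFGHIKLMN", [-60.0] * 12), ("ACDKLM", [-120.0] * 6)]
library = build_library(database)

target = "ACDEFGHIK"
decoys = [monte_carlo(target, library, toy_energy, seed=s) for s in range(5)]
center, members = largest_cluster_center(decoys, radius=30.0)
print(len(members), "of", len(decoys), "decoys in the largest cluster")
print(center)
```

The neighbour-count clustering used here is only one simple way to realize Steps 4 and 5; any clustering method with a structural distance measure could be substituted.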