Homology modeling
- needs at least ~30% sequence identity between target and template; in practice 40-50% is usually needed for the model to be useful
- note that ~30% of genomes have sequence identity of 20% (reference)
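
To make the identity thresholds concrete, here is a minimal Python sketch that computes percent identity from a pair of already-aligned sequences. The function name `percent_identity` and the choice to count only columns where neither sequence has a gap are assumptions for illustration; other conventions (for example, dividing by the shorter sequence length) give different numbers.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs are rows of the same pairwise alignment,
    so they must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Toy alignment: 3 identical residues out of 4 gap-free columns.
identity = percent_identity("AC-EF", "ACDEY")
print(f"{identity:.1f}% identity")
```

A value below roughly 30% would, by the rule of thumb above, suggest the template is too distant for homology modeling.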
Step 1. Align the target sequence to the template sequence
Step 2. Identify loops (generally between secondary structure elements)
Step 3. Replace template sidechains with those of the target sequence
Step 4. Remove obvious steric clashes
Step 5. Perform limited energy minimization
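
As an illustration of Step 2 only, the sketch below finds loop segments in a per-residue secondary structure string. It assumes a DSSP-like one-letter encoding in which `H` (helix) and `E` (strand) mark secondary structure elements and every other character is treated as loop; the function name `loop_segments` is hypothetical, and real modeling tools use more elaborate criteria.

```python
import re


def loop_segments(ss, sse_states="HE"):
    """Return (start, end) index ranges of loops between secondary structure elements.

    Indices are 0-based and half-open. Runs of non-SSE residues at either
    terminus are skipped because they are not between two elements.
    """
    pattern = re.compile(f"[^{sse_states}]+")
    loops = []
    for match in pattern.finditer(ss):
        if match.start() == 0 or match.end() == len(ss):
            continue  # terminal tail, not flanked by elements on both sides
        loops.append((match.start(), match.end()))
    return loops


# Helix (2-5), loop (6-8), strand (9-12), with coil tails at both ends.
print(loop_segments("CCHHHHCCCEEEECC"))
```

The returned ranges are the regions where the template backbone is least likely to transfer directly to the target.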
Collection of homology models
* MODBASE
Uses PSI-BLAST for sequence search and alignment and MODELLER for model building; the resulting model coordinates are stored in the database.
* SWISS-MODEL
Automated homology modeling server for protein structure prediction.