Homology modeling

- requires at least 30% sequence identity between target and template; in practice 40-50% is usually needed for a useful model

- note that ~30% of genomes have sequence identity of 20% (reference)
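The identity thresholds above can be checked with a short calculation. The following is a minimal sketch, assuming the two sequences are already aligned with `-` as the gap character and that identity is counted over columns where neither sequence has a gap (other denominators are also in use). The function names `percent_identity` and `modeling_outlook` are hypothetical and chosen only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns that contain no gap.

    Assumption: both strings come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    aligned_cols = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        aligned_cols += 1
        if a.upper() == b.upper():
            matches += 1

    if aligned_cols == 0:
        return 0.0
    return 100.0 * matches / aligned_cols


def modeling_outlook(identity: float) -> str:
    """Map an identity value onto the rough thresholds from the notes."""
    if identity < 30.0:
        return "below the 30% minimum"
    if identity < 40.0:
        return "above the minimum, but likely of limited use"
    return "in or above the 40-50% range where models are usually useful"


if __name__ == "__main__":
    target = "ACDE-FG"
    template = "ACDQKFG"
    pid = percent_identity(target, template)
    print(f"{pid:.1f}% identity: {modeling_outlook(pid)}")
```

The cutoffs in `modeling_outlook` simply restate the numbers in the notes; they are rules of thumb, not hard limits.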

 

Step 1. align the target sequence to the template sequence

Step 2. identify loops (generally between secondary structure elements)

Step 3. replace the template sidechains with those of the target sequence

Step 4. remove obvious steric clashes

Step 5. perform limited energy minimization
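Step 2 can be illustrated with a small example. This is a minimal sketch, assuming the template's secondary structure is available as a per-residue string in which `H` marks helix and `E` marks strand, and every other character is treated as non-regular. The function name `find_loops` and the example string are hypothetical; real modeling packages use their own loop definitions.

```python
def find_loops(ss: str, regular: str = "HE") -> list[tuple[int, int]]:
    """Return (start, end) index pairs, 0-based and inclusive, for loop
    segments that lie between two secondary structure elements.

    Assumption: `ss` has one character per residue; characters in
    `regular` are helix/strand, anything else is non-regular.
    Unstructured termini are not reported, since they are not flanked
    by elements on both sides.
    """
    loops = []
    start = None  # index where the current non-regular run began
    for i, state in enumerate(ss):
        if state in regular:
            # A run that started at index 0 is an N-terminal tail, not a loop.
            if start is not None and start > 0:
                loops.append((start, i - 1))
            start = None
        elif start is None:
            start = i
    # A run still open here is a C-terminal tail and is ignored.
    return loops


if __name__ == "__main__":
    ss = "CCHHHHCCCEEEECC"
    for start, end in find_loops(ss):
        print(f"loop at residues {start}-{end}: {ss[start:end + 1]}")
```

Segments found this way are the candidates for loop rebuilding, particularly where the alignment from Step 1 places insertions or deletions.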

 

Collection of homology models

* MODBASE

    database of models built automatically with PSI-BLAST (template search and alignment) plus MODELLER (model building); the resulting coordinates are stored in the database

* SWISS-MODEL

    automated homology modeling server