DALI, Distance matrix alignment
L. Holm, C. Sander
Assumption: if two residues are in contact in one protein, then the residues aligned to them in the query protein should also be in contact
Method:
Step 1. Compute a matrix of contact patterns for each of the proteins to be compared
Step 2. Seek the maximal matching submatrix (using certain approximations)
- has been used to classify protein domain structures from an all-against-all comparison of the structures in the PDB
- runs very fast, i.e. it is feasible to compare your own query protein against the entire PDB
- was able to recognize the TIM-barrel fold, whose members have almost no sequence similarity (sequence conservation well below 20%; e.g. mouse adenosine deaminase and Pseudomonas phosphotriesterase share 13% identity)
- as long as the fraction of identical residues stays above 40-50%, the deformation of the main-chain atoms is less than 1 Å
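
The assumption and Step 1 above can be illustrated with a minimal Python sketch. It is not the DALI implementation: it assumes one C-alpha coordinate per residue, an 8 Å contact cutoff, and an alignment that is already given, and the function names (distance_matrix, contact_map, contact_agreement) are hypothetical. The search for the maximal matching submatrix (Step 2) is not shown; the sketch only scores how well a given alignment preserves contacts.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def contact_map(dist, cutoff=8.0):
    """Boolean matrix: True where two different residues lie within `cutoff` (assumed 8 Å)."""
    n = len(dist)
    return [[i != j and dist[i][j] <= cutoff for j in range(n)] for i in range(n)]

def contact_agreement(map_a, map_b, alignment):
    """Fraction of contacts between aligned residues of A that are also contacts in B.

    `alignment` is a list of (index_in_a, index_in_b) pairs, assumed to be given.
    """
    total = preserved = 0
    for pos, (i, k) in enumerate(alignment):
        for j, l in alignment[pos + 1:]:
            if map_a[i][j]:
                total += 1
                if map_b[k][l]:
                    preserved += 1
    return preserved / total if total else 0.0

# Toy coordinates (in Å), invented for illustration only.
protein_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
protein_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (50.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]

map_a = contact_map(distance_matrix(protein_a))
map_b = contact_map(distance_matrix(protein_b))
print(contact_agreement(map_a, map_b, alignment))
```

A score near 1 means the alignment maps contacting residue pairs onto contacting pairs, which is what the assumption above asks for; a real search would have to explore many candidate alignments rather than score a single fixed one.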