Topics in Bioinformatics

Macromolecular protein-ligand docking: Example Protein-Protein Interactions

In contrast to small-molecule docking, macromolecular docking brings together two macromolecules, such as a protein and DNA, or two proteins. It differs from small-molecule docking in two respects:

- large contact area

- molecules have fixed overall shape

=> methods based on geometric properties alone, such as shape complementarity, can be used efficiently to generate energetically favorable complexes

Need for macromolecular docking:

There are 3000 single-protein structures but only 300 protein-complex structures, so two structures need to be combined into a complex structure by computational means. This is useful for drug design: for example, several lead compounds have been designed based on the structure of a protein receptor interacting with a small-molecule ligand (Colman (1994) Structure-based drug design. Current Opinion in Structural Biology 4, 868-874).

Principles of molecular recognition

- from examining known protein-protein complexes

- major problem: conformational changes on interaction (induced fit), although some systems are approximated well by the lock-and-key model

Protein-protein docking strategies

1. FTDOCK

Principle:

first use a rigid-body approach, then introduce conformational flexibility at a later stage of modeling

- good for docking two proteins of size 50-500 amino acids

Step 1. Rigid body docking

- search for complexes that are favorable in terms of shape complementarity and electrostatics

- two requirements:

1. realistic computation time to get a set of coarse complex models that contains the true one

2. scoring functions need to be soft to allow for conformational changes upon complex formation

- Fourier correlation approach meets these requirements (FTDOCK1 and 2)

1. generate a grid representation (discretise): whenever a grid cell contains an atomic position, it is turned "on". Grid cells within 1.8 Å of an atom are also turned "on". The surface of the resulting set of "on" cells represents the surface of the molecule.

2. evaluate shape complementarity. Evaluating the shape complementarity of two grids directly is computationally intensive; it is sped up by using discrete Fourier transforms.

3. rotate molecule to perform global search

4. include electrostatic effects in the Fourier correlation approach

Result: a set of putative complexes (on the order of 10,000)
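The grid and Fourier correlation steps above can be sketched as follows. This is a minimal illustration, not the FTDOCK implementation: the function names, the one-cell-thick surface layer, the core penalty of -15 and the toy coordinates are all assumptions made for the example. It scores every translation of the mobile molecule in one pass using the correlation theorem; the loop over rotations (step 3) and the electrostatic term (step 4) are left out.

```python
import numpy as np

def discretise(coords, n, spacing, radius=1.8):
    """Turn "on" every grid cell whose centre lies within `radius` of an atom."""
    axis = (np.arange(n) - n // 2) * spacing
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.zeros((n, n, n), dtype=bool)
    for x, y, z in coords:
        grid |= (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2 <= radius ** 2
    return grid

def surface_and_core(grid):
    """Split an occupancy grid into surface cells and buried core cells.

    Assumption: a cell is "core" if all six face neighbours are also on;
    every other occupied cell counts as surface.
    """
    core = grid.copy()
    for ax in range(3):
        for shift in (1, -1):
            core &= np.roll(grid, shift, axis=ax)
    return grid & ~core, core

def correlate(static, mobile, core_penalty=-15.0):
    """Score all cyclic translations t of `mobile` against `static`.

    scores[t] = sum_x a[x] * b[x - t], computed with FFTs instead of a
    direct sum. Surface overlap is rewarded (+1 per cell) and penetration
    into the core is penalised (core_penalty per cell, an assumed value).
    """
    surf, core = surface_and_core(static)
    a = surf.astype(float) + core_penalty * core
    b = mobile.astype(float)
    return np.fft.ifftn(np.fft.fftn(a) * np.conj(np.fft.fftn(b))).real

# Usage with hypothetical coordinates (in Angstrom) on a 16 x 16 x 16 grid.
static_grid = discretise([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)], 16, 1.0)
mobile_grid = discretise([(0.0, 0.0, 0.0)], 16, 1.0)
scores = correlate(static_grid, mobile_grid)
best_shift = np.unravel_index(np.argmax(scores), scores.shape)
```

Because the FFT treats the grid as periodic, indices above half the grid size correspond to negative shifts, and the grid has to be large enough that the two molecules cannot overlap through the wrap-around.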

Step 2. Residue-residue scoring scheme to select good models

- same approaches as used in fold recognition
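As a rough illustration of a residue-residue scoring scheme, the sketch below sums empirical pair scores over residue pairs that face each other across the interface. The score table, its values, the 8 Å cutoff and the function name are hypothetical and chosen only for the example; a real scheme would derive the table from known complexes.

```python
import math

# Hypothetical pair scores for a few residue-type pairs (illustrative values only).
PAIR_SCORE = {
    frozenset(["LEU"]): 0.4,          # LEU-LEU
    frozenset(["LEU", "ILE"]): 0.3,
    frozenset(["LYS", "ASP"]): 0.5,
    frozenset(["LYS"]): -0.6,         # LYS-LYS
}

def interface_score(residues_a, residues_b, cutoff=8.0):
    """Sum pair scores over residue pairs (one from each protein) within `cutoff`.

    Each residue is (type, (x, y, z)), with one representative point per residue.
    Pairs missing from the table contribute 0.
    """
    total = 0.0
    for type_a, pos_a in residues_a:
        for type_b, pos_b in residues_b:
            if math.dist(pos_a, pos_b) <= cutoff:
                total += PAIR_SCORE.get(frozenset([type_a, type_b]), 0.0)
    return total

# Usage: score one putative complex built from hypothetical coordinates.
receptor = [("LEU", (0.0, 0.0, 0.0)), ("LYS", (10.0, 0.0, 0.0))]
ligand = [("ILE", (0.0, 5.0, 0.0)), ("ASP", (10.0, 5.0, 0.0)), ("LYS", (30.0, 0.0, 0.0))]
model_score = interface_score(receptor, ligand)
```

Applying such a function to each putative complex from Step 1 and sorting by score gives a ranking from which the better models can be selected.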

Step 3. Use of distance constraints

This step acts only as a filter, removing complexes according to whether pairs of atoms are closer than a 4.5 Å cutoff. Experimental distance constraints cannot be used in the method to reduce the search space unless they are available for both molecules.

Step 4. Refinement

- allows for conformational changes in side-chains (MULTIDOCK)

- use potential energy functions

2. Low-resolution docking using Fourier Correlation

by Vakser

uses same Fourier correlation approach as above, but a lower resolution grid is used

3. HEX

using spherical polar Fourier correlations

4. DOCK

using spheres, see small molecule docking

is the predominant rigid body approach

5. Matching critical points

another rigid body approach

based on defining the knobs and holes on two interacting surfaces

6. Lenhof approach

another rigid body approach

based on identification of points on the two surfaces that could be equivalenced in a close-packed association

7. ESCHER

another rigid body approach

starts with shape complementarity based on slices of the protein surface mapped onto sets of polygons

8. Flexible protein-protein docking

starting conformations

Monte Carlo search with random rigid-body shifts

vary side chain torsion angles
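A Monte Carlo search of this kind can be sketched as below. This is a toy example under stated assumptions: the energy function is a placeholder supplied by the caller, rigid-body moves are restricted to translations for brevity (a real search would also rotate the molecule), and all names, step sizes and the temperature factor kT are hypothetical. Moves are accepted with the standard Metropolis criterion.

```python
import math
import random

def monte_carlo_dock(energy, shift, torsions, steps=2000, kT=0.6,
                     max_shift=1.0, max_twist=15.0, seed=0):
    """Metropolis search over a rigid-body shift and side-chain torsion angles.

    `energy(shift, torsions)` is any caller-supplied function returning a float.
    Returns the lowest-energy state visited and its energy.
    """
    rng = random.Random(seed)
    current = (list(shift), list(torsions))
    e_current = energy(*current)
    best, e_best = current, e_current
    for _ in range(steps):
        new_shift, new_torsions = list(current[0]), list(current[1])
        if new_torsions and rng.random() < 0.5:
            # Side-chain move: perturb one torsion angle (degrees).
            i = rng.randrange(len(new_torsions))
            new_torsions[i] = (new_torsions[i] + rng.uniform(-max_twist, max_twist)) % 360.0
        else:
            # Rigid-body move: random translation of the whole molecule.
            new_shift = [c + rng.uniform(-max_shift, max_shift) for c in new_shift]
        e_new = energy(new_shift, new_torsions)
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-dE / kT).
        if e_new <= e_current or rng.random() < math.exp(-(e_new - e_current) / kT):
            current, e_current = (new_shift, new_torsions), e_new
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best

def toy_energy(shift, torsions):
    """Hypothetical energy with its minimum at shift (3, 0, 0) and torsion 60 degrees."""
    d2 = sum((c - t) ** 2 for c, t in zip(shift, (3.0, 0.0, 0.0)))
    return d2 + sum(1.0 - math.cos(math.radians(a - 60.0)) for a in torsions)

# Usage: start away from the minimum and let the search improve the state.
start_energy = toy_energy([0.0, 0.0, 0.0], [180.0])
best_state, best_energy = monte_carlo_dock(toy_energy, [0.0, 0.0, 0.0], [180.0])
```

Running the search from several starting conformations and keeping the lowest-energy result from each run is one simple way to reduce the dependence on the initial placement.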