What Is Molecular Docking?


The goal of molecular docking is to predict the structure of a ligand-receptor complex using computational approaches. Docking is carried out in two steps:
  1. sampling ligand conformations in the protein's active site
  2. ranking these conformations using a scoring function
Ideally, the sampling method should be capable of reproducing the experimental binding mode, and the scoring function should rank it first among all generated conformations. The sections that follow give a quick introduction to basic docking theory from these two perspectives.




Why Molecular Docking?

The molecular docking approach can be used to model the interaction between a small molecule and a protein at the atomic level. This allows us to characterize the behavior of small molecules in the binding site of target proteins and to elucidate essential biological processes.

How Does Molecular Docking Work?






What is a Sampling Algorithm in Molecular Docking?

With six degrees of translational and rotational freedom, as well as conformational degrees of freedom for both the ligand and the protein, there are numerous possible binding modes between two molecules. Generating all of them computationally would be prohibitively expensive, so docking programs rely on sampling algorithms that explore only part of this space.
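To give a feel for why exhaustive enumeration is impractical, the sketch below counts poses on a uniform grid. The grid resolutions and the number of rotatable bonds are illustrative assumptions, not values from any docking program, and treating the three rotations as independent grid axes is a simplification.

```python
def count_rigid_poses(translations_per_axis, rotations_per_axis):
    """Grid poses for a rigid ligand: 3 translational and 3 rotational axes."""
    return translations_per_axis ** 3 * rotations_per_axis ** 3


def count_flexible_poses(translations_per_axis, rotations_per_axis,
                         n_rotatable_bonds, steps_per_torsion):
    """Each rotatable bond multiplies the count by its number of torsion steps."""
    rigid = count_rigid_poses(translations_per_axis, rotations_per_axis)
    return rigid * steps_per_torsion ** n_rotatable_bonds


# Assumed example: 20 grid points per translation axis, 10-degree rotation
# steps (36 per axis), 8 rotatable bonds sampled every 30 degrees (12 steps).
rigid = count_rigid_poses(20, 36)
flexible = count_flexible_poses(20, 36, 8, 12)
print(f"rigid: {rigid:.2e} poses, flexible ligand: {flexible:.2e} poses")
```

Even before any protein flexibility is considered, the flexible-ligand count is many orders of magnitude larger than the rigid one.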


What are Matching Algorithms?

Matching algorithms (MA) are based on molecular shape: they map a ligand onto a protein's active site using shape features and chemical information. Both the protein and the ligand are represented as pharmacophores. The distances between pharmacophore points within the protein and within the ligand are compared to find a match, and new ligand conformations are governed by the distance matrix between the pharmacophore points and the corresponding ligand atoms. Chemical features such as hydrogen-bond donors and acceptors can be taken into account during the match. Matching algorithms have the advantage of speed, so they can be used to enrich active compounds from very large libraries.
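The distance-matrix comparison described above can be sketched as follows. This is a minimal illustration, not the method of any particular program: the function names, the feature labels, the 0.5 Å tolerance, and the brute-force enumeration are all assumptions chosen for clarity, and real matching algorithms use much more efficient searches.

```python
import itertools

import numpy as np

# Assumed complementarity rules: which ligand feature pairs with a site feature.
COMPLEMENT = {"donor": "acceptor", "acceptor": "donor", "hydrophobic": "hydrophobic"}


def pairwise_distances(points):
    """Return the full matrix of Euclidean distances between points."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def find_matches(site_points, site_types, ligand_points, ligand_types, tolerance=0.5):
    """Find assignments of ligand atoms to site points.

    An assignment is accepted when every feature pair is complementary and
    every ligand-ligand distance agrees with the site-site distance.
    """
    site_d = pairwise_distances(site_points)
    ligand_d = pairwise_distances(ligand_points)
    matches = []
    for candidate in itertools.permutations(range(len(ligand_points)), len(site_points)):
        idx = list(candidate)
        if any(ligand_types[j] != COMPLEMENT[site_types[i]] for i, j in enumerate(idx)):
            continue
        if np.all(np.abs(ligand_d[np.ix_(idx, idx)] - site_d) <= tolerance):
            matches.append(candidate)
    return matches
```

Because only internal distances are compared, a match is independent of how the ligand is currently positioned; the matched pairs can then be superimposed to place the ligand in the site.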

What is Incremental Construction?

Incremental construction (IC) approaches place the ligand into an active site in a fragmented and stepwise manner. The ligand is divided into several fragments by breaking its rotatable bonds, and one of these fragments is chosen to dock first in the active site. This anchor is typically the largest fragment, or a fragment that plays an important functional role or interacts with the protein. The remaining fragments are then added incrementally. Different orientations are generated to fit the active site, which accounts for the ligand's flexibility.
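The stepwise growth from an anchor can be illustrated with a toy beam search. Everything here is an assumption made for illustration: the ligand is reduced to points in 2D, each fragment is a single bond of fixed length, `score` is a user-supplied function where lower is better, and keeping the best `beam_width` partial placements is one possible pruning rule, not the rule of a specific program.

```python
import math


def grow_ligand(anchor, n_fragments, angles, score, beam_width=3, bond_length=1.5):
    """Add fragments one at a time, keeping only the best partial placements.

    anchor: (x, y) position of the already docked anchor fragment.
    angles: candidate directions (radians) tried for every new fragment.
    score:  function mapping a list of (x, y) points to a number (lower is better).
    """
    partials = [[anchor]]
    for _ in range(n_fragments):
        extended = []
        for placement in partials:
            x, y = placement[-1]
            for angle in angles:
                new_point = (x + bond_length * math.cos(angle),
                             y + bond_length * math.sin(angle))
                extended.append(placement + [new_point])
        # Prune: keep only the most promising partial ligands for the next step.
        extended.sort(key=score)
        partials = extended[:beam_width]
    return partials[0]
```

Pruning after every added fragment is what keeps the search tractable: the number of placements considered grows linearly with the number of fragments instead of exponentially.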



What is Multiple Copy Simultaneous Search (MCSS)?

MCSS is a fragment-based strategy for designing new ligands and for modifying existing ligands to improve their binding to the target protein. MCSS generates 1,000 to 5,000 copies of a functional group, which are randomly placed in the binding site of interest and subjected to energy minimization and/or quenched molecular dynamics in the protein's force field.
The copies interact only with the protein; interactions between the copies are ignored.
The interaction energies are then used to identify a set of energetically favorable binding positions and orientations for the functional group.
The binding site is mapped in this way using several different functional groups. Linking those functional groups together allows new molecules to be built that fit the binding site well.
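The core idea, many non-interacting copies minimized in the protein's field, can be sketched in one dimension. The potential below is a made-up stand-in for a protein force field (two minima, at x = -1 and x = 2), plain gradient descent stands in for the minimizer, and all names and parameter values are assumptions for illustration only.

```python
import numpy as np


def protein_field(x):
    """Toy 1D 'protein' potential with minima at x = -1 and x = 2."""
    return (x + 1.0) ** 2 * (x - 2.0) ** 2


def field_gradient(x):
    """Analytical derivative of protein_field."""
    return 2.0 * (x + 1.0) * (x - 2.0) * (2.0 * x - 1.0)


def map_binding_sites(n_copies=1000, n_steps=2000, step_size=0.01, seed=0):
    """Scatter copies at random, minimize each one independently, group the results."""
    rng = np.random.default_rng(seed)
    copies = rng.uniform(-3.0, 4.0, size=n_copies)
    for _ in range(n_steps):
        # Each copy feels only the protein field; copies never see each other.
        copies = copies - step_size * field_gradient(copies)
    sites, counts = np.unique(np.round(copies, 2), return_counts=True)
    return sites, counts


sites, counts = map_binding_sites()
for site, count in zip(sites, counts):
    print(f"site at x = {site:+.2f}: {count} copies, energy {protein_field(site):.3f}")
```

Because the copies are independent, the whole set can be updated in one vectorized step, which is also why ignoring copy-copy interactions makes the procedure cheap.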


What is LUDI?

LUDI focuses on the hydrogen bonds and hydrophobic contacts that could form between the ligand and the protein. Its central concept is the interaction site: a discrete position in space suitable for forming a hydrogen bond or filling a hydrophobic pocket. A set of interaction sites is generated either by searching a database or by applying rules. Fragments are then fitted onto the interaction sites and evaluated using distance criteria. In the final stage, some or all of the fitted fragments are connected into a single molecule.


What is the Monte Carlo Method?

Monte Carlo (MC) methods generate ligand poses through bond rotation and rigid-body translation or rotation. The conformation obtained by such a move is tested against an energy-based selection criterion. If it passes, it is saved and then modified further to generate the next conformation. The iterations continue until a predefined number of conformations has been collected. The main advantage of MC is that a single move can be large, allowing the ligand to cross energy barriers on the potential energy surface, which is difficult to achieve with molecular dynamics simulations.
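A minimal sketch of such a search is shown below. The text above does not say which energy-based criterion is used; the Metropolis rule here is an assumption, as are the toy one-dimensional energy function and all parameter values. The ligand "pose" is reduced to a single coordinate.

```python
import math
import random


def tilted_double_well(x):
    """Toy energy: local minimum near x = +1, deeper global minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def metropolis_search(energy, start, n_steps=5000, max_move=2.0, temperature=0.5, seed=0):
    """Random moves accepted with the Metropolis criterion (an assumed choice)."""
    rng = random.Random(seed)
    current, current_e = start, energy(start)
    best, best_e = current, current_e
    for _ in range(n_steps):
        trial = current + rng.uniform(-max_move, max_move)
        trial_e = energy(trial)
        delta = trial_e - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


best, best_e = metropolis_search(tilted_double_well, start=1.0)
print(f"best pose x = {best:.3f}, energy = {best_e:.3f}")
```

Started in the shallower well at x = 1, a single large move can land directly in the deeper well without climbing the barrier in between, which is the advantage the paragraph describes.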

What is a Genetic Algorithm?

In a genetic algorithm (GA), the degrees of freedom of the ligand are encoded as binary strings called genes. These genes make up the "chromosome", which represents the pose of the ligand. Mutation and crossover are the two kinds of genetic operators in a GA. Mutation makes random changes to the genes; crossover exchanges genes between two chromosomes. Applying the genetic operators to the genes produces a new ligand structure. New structures are assessed by the scoring function, and those that survive (i.e., exceed a threshold) are used for the next generation.
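The loop of scoring, selection, crossover, and mutation can be sketched as below. This is an illustrative toy, not a docking implementation: the chromosome is a plain list of bits, `score` is any user-supplied function where lower is better, and survivors are the better half of the population (truncation selection) rather than the fixed threshold mentioned above. All names and parameter values are assumptions.

```python
import random


def evolve(score, n_bits, pop_size=20, n_generations=40, mutation_rate=0.05, seed=0):
    """Evolve bit-string chromosomes; returns the best one and the best score per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(n_generations):
        population.sort(key=score)
        history.append(score(population[0]))
        # Selection: the better half survives unchanged into the next generation.
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            # Crossover: splice the two parents at a random cut point.
            cut = rng.randrange(1, n_bits)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = survivors + children
    population.sort(key=score)
    history.append(score(population[0]))
    return population[0], history
```

In a docking setting, groups of bits would be decoded into a translation, an orientation, and torsion angles before scoring; here the bits are scored directly to keep the sketch short.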

 

What is Molecular Dynamics?

Molecular dynamics (MD) [68-70] is a widely used simulation method in many areas of molecular modeling. Because it moves each atom separately in the field of the remaining atoms, MD represents the flexibility of both the ligand and the protein better than other docking techniques. However, MD simulations progress in very small steps, which makes it difficult to cross high-energy conformational barriers and may result in inadequate sampling. On the other hand, MD simulations are often effective at local optimization. A current strategy is therefore to use a random search to identify candidate ligand conformations and then to refine them with MD simulations.
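The small-step behavior can be seen in a one-particle sketch. The velocity Verlet integrator is a standard MD scheme, but its use here, the toy double-well potential, the unit mass, and the time step are all assumptions for illustration.

```python
def double_well_force(x):
    """Force for the toy potential V(x) = (x^2 - 1)^2, whose barrier at x = 0 has height 1."""
    return -4.0 * x * (x * x - 1.0)


def velocity_verlet(position, velocity, force, mass=1.0, dt=0.01, n_steps=5000):
    """Integrate Newton's equations with the velocity Verlet scheme."""
    trajectory = [position]
    f = force(position)
    for _ in range(n_steps):
        velocity += 0.5 * dt * f / mass   # half-step velocity update
        position += dt * velocity         # full-step position update
        f = force(position)               # force at the new position
        velocity += 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append(position)
    return trajectory, velocity


# Start in the right-hand well with kinetic energy 0.125, well below the barrier.
trajectory, _ = velocity_verlet(position=1.0, velocity=0.5, force=double_well_force)
print(f"visited x in [{min(trajectory):.2f}, {max(trajectory):.2f}]")
```

With less kinetic energy than the barrier height, the particle oscillates inside its starting well and never reaches the other minimum, which is the sampling limitation described above and also why MD works well for local refinement.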




Scoring Functions

The purpose of a scoring function is to distinguish quickly between correct and incorrect poses, or between binders and inactive compounds. However, scoring functions estimate rather than calculate the binding affinity between the protein and the ligand, and they rely on a variety of assumptions and simplifications. There are three types of scoring functions:
  1. Force-field-based
  2. Empirical
  3. Knowledge-based
Table 2 displays various scoring function formulae from each of the three scoring function classes.





What is a Classical Force-Field Scoring Function?

Classical force-field-based scoring functions estimate the binding energy by summing the non-bonded (electrostatic and van der Waals) interactions. The electrostatic terms are calculated with a Coulombic formulation. Because such point-charge calculations have difficulty representing the protein's actual environment, a distance-dependent dielectric function is commonly used to modulate the contribution of charge-charge interactions.
A Lennard-Jones potential function describes the van der Waals interactions. The "hardness" of the Lennard-Jones potential, which controls how close a contact between protein and ligand atoms is tolerated, can be varied by using different parameter sets.
Force-field-based scoring functions also suffer from slow computation. A cut-off distance is therefore applied to the non-bonded interactions, at the cost of reduced accuracy for the long-range effects involved in binding.
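The terms described above can be combined into a short sketch. The specific choices are assumptions, not the formula of any named program: a 12-6 Lennard-Jones form, a dielectric of the form ε(r) = 4r, the Coulomb constant in kcal·Å/(mol·e²), simple arithmetic and geometric combining rules, and a 10 Å cut-off. Each atom is a hypothetical tuple `(xyz, charge, epsilon, sigma)`.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2)


def pair_energy(r, q_i, q_j, epsilon_ij, sigma_ij, dielectric_slope=4.0):
    """Coulomb term with dielectric eps(r) = slope * r, plus 12-6 Lennard-Jones."""
    coulomb = COULOMB_CONSTANT * q_i * q_j / (dielectric_slope * r * r)
    sr6 = (sigma_ij / r) ** 6
    lennard_jones = 4.0 * epsilon_ij * (sr6 * sr6 - sr6)
    return coulomb + lennard_jones


def force_field_score(protein_atoms, ligand_atoms, cutoff=10.0):
    """Sum pair energies over protein-ligand atom pairs within the cut-off.

    Overlapping atoms (r == 0) are not handled in this sketch.
    """
    total = 0.0
    for p_xyz, p_q, p_eps, p_sig in protein_atoms:
        for l_xyz, l_q, l_eps, l_sig in ligand_atoms:
            r = math.dist(p_xyz, l_xyz)
            if r > cutoff:
                continue  # pairs beyond the cut-off are ignored entirely
            epsilon_ij = math.sqrt(p_eps * l_eps)  # geometric mean
            sigma_ij = 0.5 * (p_sig + l_sig)       # arithmetic mean
            total += pair_energy(r, p_q, l_q, epsilon_ij, sigma_ij)
    return total
```

The double loop makes the cost proportional to the number of protein atoms times the number of ligand atoms, which is why a cut-off (and, in practice, precomputed grids) is attractive despite the loss of long-range contributions.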



What are Empirical Scoring Functions?

In empirical scoring functions, the binding energy is decomposed into several energy components, such as hydrogen bonds, ionic interactions, hydrophobic effects, and binding entropy. Each component is multiplied by a coefficient, and the products are summed to give a final score. The coefficients are obtained by regression analysis fitted to a training set of ligand-protein complexes with known binding affinities.
The energy terms in empirical scoring functions are relatively simple to evaluate. However, it is unclear how well these functions transfer to ligand-protein complexes that were not in the training set.
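The weighted sum and the regression step can be sketched with ordinary least squares. The choice of terms, the function names, and all numbers in the usage example are invented for illustration; real empirical functions define their terms and fitting procedures in their own ways.

```python
import numpy as np


def fit_coefficients(term_matrix, affinities):
    """Least-squares fit of one coefficient per energy term.

    term_matrix: one row per training complex, one column per term
                 (e.g. hydrogen bonds, ionic contacts, hydrophobic contact).
    affinities:  known binding affinities for the same complexes.
    """
    coefficients, *_ = np.linalg.lstsq(np.asarray(term_matrix, dtype=float),
                                       np.asarray(affinities, dtype=float),
                                       rcond=None)
    return coefficients


def empirical_score(terms, coefficients):
    """Score a new complex as the weighted sum of its terms."""
    return float(np.dot(terms, coefficients))


# Made-up training data: 5 complexes, 3 terms each.
training_terms = [[3, 1, 4], [2, 0, 6], [5, 2, 3], [1, 1, 8], [4, 0, 5]]
training_affinities = [-5.6, -4.4, -7.5, -6.4, -6.5]
coefficients = fit_coefficients(training_terms, training_affinities)
print("fitted coefficients:", np.round(coefficients, 3))
print("score for a new complex:", round(empirical_score([2, 1, 5], coefficients), 3))
```

The fitted coefficients are only as general as the training complexes, which is the transferability concern raised above.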


What is a Knowledge-Based Scoring Function?

Knowledge-based scoring functions are derived from statistical analysis of the crystal structures of ligand-protein complexes: contacts that are observed more often than expected are assumed to be energetically favorable. The appeal of knowledge-based functions is their computational simplicity, which makes them suitable for screening very large compound databases. They can also model uncommon interactions, such as sulfur-aromatic or cation-π, which are often poorly handled by empirical approaches. However, some interactions are underrepresented in the limited training sets of crystal structures, and the selection of proteins is biased toward those whose structures could be determined successfully. As a result, the derived parameters may not be suitable for general use, particularly for interactions involving metals or halogens.
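One common way to turn contact statistics into a score is an inverse Boltzmann relation, sketched below. The text above does not specify a formula, so this form, the reference-state counts, the pseudocount, and the value of kT (about 0.593 kcal/mol near room temperature) are assumptions for illustration.

```python
import numpy as np


def pair_potential(observed_counts, reference_counts, kT=0.593, pseudocount=1e-6):
    """Distance-binned potential w(r) = -kT * ln(f_observed(r) / f_reference(r)).

    observed_counts:  how often an atom-type pair occurs in each distance bin
                      across a set of crystal structures.
    reference_counts: counts expected for a reference state without specific
                      interactions.
    """
    observed = np.asarray(observed_counts, dtype=float) + pseudocount
    reference = np.asarray(reference_counts, dtype=float) + pseudocount
    observed_freq = observed / observed.sum()
    reference_freq = reference / reference.sum()
    return -kT * np.log(observed_freq / reference_freq)


# Made-up counts for three distance bins of a single atom-type pair.
potential = pair_potential([10, 30, 10], [20, 20, 20])
print(np.round(potential, 3))
```

Bins that are overpopulated relative to the reference get a negative (favorable) value, and sparsely populated bins get a positive one. The pseudocount also shows the weakness noted above: for rarely observed interactions the estimate rests on very few counts.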


What is a Consensus Scoring Function?

Consensus scoring is a more recent strategy that combines several scoring functions to assess a docking conformation. A ligand pose or potential binder is accepted only if it scores well under several different scoring schemes. Consensus scoring typically improves enrichment (the percentage of strong binders among high-scoring ligands) in virtual screening, as well as the prediction of bound conformations and poses. However, the predicted binding energies may still be inaccurate. Furthermore, the benefit of consensus scoring diminishes when the terms in the different scoring functions are strongly correlated.
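One simple way to combine several scoring functions is to average the rank each ligand receives, as sketched below. Averaging ranks is only one of several possible combination rules and is an assumption here, as are the ligand names and scores in the usage example. Lower scores are taken to be better.

```python
def consensus_rank(score_tables):
    """Order ligands by their mean rank across several scoring functions.

    score_tables: list of dicts mapping ligand name -> score, one dict per
                  scoring function, all covering the same ligands.
    Returns the ligands from best to worst and the mean rank of each.
    """
    ligands = list(score_tables[0])
    mean_rank = {name: 0.0 for name in ligands}
    for table in score_tables:
        ordered = sorted(ligands, key=lambda name: table[name])
        for rank, name in enumerate(ordered, start=1):
            mean_rank[name] += rank / len(score_tables)
    return sorted(ligands, key=lambda name: mean_rank[name]), mean_rank


# Made-up scores from three hypothetical scoring functions on different scales.
order, mean_rank = consensus_rank([
    {"A": -9.0, "B": -7.5, "C": -6.0},
    {"A": -30.0, "B": -45.0, "C": -20.0},
    {"A": -1.1, "B": -0.9, "C": -1.0},
])
print(order, mean_rank)
```

Working with ranks sidesteps the different units and scales of the individual functions, but it does nothing about the correlation problem mentioned above: two functions built from nearly the same terms effectively cast the same vote twice.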

What are the Various Methods for Handling Receptor Flexibility in Molecular Docking?


 


Conclusion

Receptor flexibility remains a significant challenge in docking research, particularly backbone flexibility and the movement of the secondary structure elements of the receptor that are involved in ligand binding and catalysis. In some cases, strategies for handling side-chain flexibility have proven effective and adequate. For global flexibility, docking against an ensemble of protein conformations is a popular approach that follows conformer selection principles. It requires an effective method for obtaining and selecting reliable protein structures, because the ensemble must include structures into which the ligand can fit. Computational expense is a further disadvantage of this strategy.

References







