Next Article in Journal
Solubility and Permeability Enhancement of Oleanolic Acid by Solid Dispersion in Poloxamers and γ-CD
Next Article in Special Issue
Prediction of Drug–Target Interaction Using Dual-Network Integrated Logistic Matrix Factorization and Knowledge Graph Embedding
Previous Article in Journal
Biotechnological Approaches for Production of Artemisinin, an Anti-Malarial Drug from Artemisia annua L.
Previous Article in Special Issue
Prediction of Drug-Drug Interaction Using an Attention-Based Graph Neural Network on Drug Molecular Graphs
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Accelerating AutoDock Vina with GPUs

1
School of Geographic and Biological Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
2
Smart Health Big Data Analysis and Location Services Engineering Research Center of Jiangsu Province, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
3
VeriMake Research, Nanjing Renmian Integrated Circuit Technology Co., Ltd., Nanjing 210088, China
4
National ASIC System Engineering Technology Research Center, Southeast University, Nanjing 210096, China
5
School of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
*
Author to whom correspondence should be addressed.
Molecules 2022, 27(9), 3041; https://doi.org/10.3390/molecules27093041
Submission received: 28 March 2022 / Revised: 1 May 2022 / Accepted: 2 May 2022 / Published: 9 May 2022
(This article belongs to the Special Issue Recent Advances in Artificial Intelligence-Based Drug Discovery)

Abstract

:
AutoDock Vina is one of the most popular molecular docking tools. In the latest benchmark CASF-2016 for comparative assessment of scoring functions, AutoDock Vina won the best docking power among all the docking tools. Modern drug discovery is facing a common scenario of large virtual screening of drug hits from huge compound databases. Due to the seriality characteristic of the AutoDock Vina algorithm, there is no successful report on its parallel acceleration with GPUs. Current acceleration of AutoDock Vina typically relies on the stack of computing power as well as the allocation of resource and tasks, such as the VirtualFlow platform. The vast resource expenditure and the high access threshold of users will greatly limit the popularity of AutoDock Vina and the flexibility of its usage in modern drug discovery. In this work, we proposed a new method, Vina-GPU, for accelerating AutoDock Vina with GPUs, which is greatly needed for reducing the investment for large virtual screens and also for wider application in large-scale virtual screening on personal computers, station servers or cloud computing, etc. Our proposed method is based on a modified Monte Carlo using simulating annealing AI algorithm. It greatly raises the number of initial random conformations and reduces the search depth of each thread. Moreover, a classic optimizer named BFGS is adopted to optimize the ligand conformations during the docking progress, before a heterogeneous OpenCL implementation was developed to realize its parallel acceleration leveraging thousands of GPU cores. Large benchmark tests show that Vina-GPU reaches an average of 21-fold and a maximum of 50-fold docking acceleration against the original AutoDock Vina while ensuring their comparable docking accuracy, indicating its potential for pushing the popularization of AutoDock Vina in large virtual screens.

Graphical Abstract

1. Introduction

Molecular docking studies how two or more molecular structures (e.g., drug and target) fit together. Molecular docking analysis has become one of the most common methods for modern drug discovery [1]. It allows the prediction of molecular interactions where a protein and a ligand can be inducted to fit together in the bound state. Additionally, molecular docking tools provide an efficient and cheap method, in the early stage of drug design, for the identification of leading compounds and their binding affinities [2,3,4]. Among all molecule docking tools, the AutoDock suite is the most popular, which consists of various tools including AutoDock4 [5], AutoDock Vina [6], AutoDock Vina 1.2.0 [7], AutoDock FR [8], AutoDock Crank Pep [9,10], AutoDock-GPU [11,12] etc. AutoDock Vina is usually recommended as the first-line tool in the implementation of molecular docking due to its docking speed and accuracy [13]. Moreover, it shows the best docking power in the last benchmark CASF-2016 for comparative assessment of scoring functions [14] and the best scoring power under the comparison with 10 docking programs on a diverse protein—ligand complex sets [15]. AutoDock Vina uses a Monte-Carlo iterated local search method, which comprises iterations of sampling, scoring and optimization. First, an initial random conformation is sampled for a given ligand, which is represented by its position, orientation and torsion (POT). Then, its position, orientation or torsion is randomly mutated with a disturbance. Finally, the affinity is evaluated for the binding pose of a ligand and a protein. In AutoDock Vina, the binding affinity is calculated by a scoring function which describes the sum of the intermolecular energy (ligand–receptor) and the intramolecular energy (ligand–ligand). Moreover, the conformation is optimized with a Broyden–Fletcher–Goldfarb–Shanno (BFGS) method [16] that considers the gradients of the scoring function. These gradients can guide the ligand to achieve a better conformation with a lower docking score. In addition, a metropolis acceptance rule [17], which relies on the difference between the docking score of the initial conformation and that of the optimized conformation, is arranged to decide whether the optimized conformation can be accepted or not. The accepted conformation will be recorded as the initial conformation and further optimized in the next iterations. As all we know, the Monte-Carlo based iterated local search method in AutoDock Vina is highly serialized because the ongoing iteration depends on the previous outputs.
Preceding virtual screens pipeline typically operates only on a scale of 10 6 10 7 compound molecules. Such scale of compounds will heavily descend the capability and increase the failure risk of modern drug discovery. Fortunately, the whole chemical space of drug-like molecules has been estimated to reach more than 10 60 [18]. The scale of compounds in virtual screens is vital since the more candidate compounds there are to be screened, the lower rate of failure and the more favorable quality of leading compounds can be reached [19]. Hence, the virtual screens on huge compound databases are urgent for identifying excellent drug candidates in modern drug discovery. However, original virtual screening with AutoDock Vina on huge databases is very slow, which cannot meet the need for modern drug discovery. Therefore, an acceleration of AutoDock Vina has become a central problem in current virtual screens of drug hits from huge compound databases. Till now, there have been several attempts for the acceleration of AutoDock Vina in large virtual screens [20,21,22]. For instance, VirtualFlow provides a drug discovery platform that speeds up AutoDock Vina in virtual screening of an ultra-large database with more than 1.4 billion molecules by leveraging over 160,000 CPUs [20]. Such huge resource investment and expenditure, as well as the high entry threshold for users, seriously weaken the popularization of AutoDock Vina and the flexibility of customer’s usage (such as a self-defined target and small molecule dataset). Due to the overall serial design of the AutoDock Vina algorithm, its parallelization mostly depends on the stacking of computing powers as well as the allocation of resources and tasks under such a common scenario that faces large virtual screens in modern drug discovery. A reduction of computational resource investment and user access threshold will advance the broad spread of AutoDock Vina in large virtual screening for modern drug discovery.
Graphic processing unit (GPU) is a powerful parallel programmable processor with thousands of computing cores that provide a tremendous computational performance. GPU has become an integral part of mainstream computing systems due to the high price–performance ratio and the ease of developing an implementation with well-established standards such as Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL). GPU has been applied to accelerate molecular docking in several tools [11,12,23,24,25,26,27,28,29,30,31,32]. For example, AutoDock-GPU provides an OpenCL implementation of AutoDock4 to exploit both GPU and CPU parallel architectures. By exploring three levels of parallelism (runs, individual, fine-grained tasks) on the Lamarckian Genetic Algorithm (LGA) algorithm, AutoDock-GPU achieves the maximum of 350-fold acceleration of total runtime against the single-threaded CPU implementation [11]. Recently, an attempt has been made into the GPU parallel acceleration of AutoDock Vina where the Viking method tried to rewrite the pose search stage of AutoDock Vina on GPU [30]. Till now, no positive acceleration result has been reported. The reasons for its deficiency probably involve the following three points. Firstly, the Monte Carlo-based optimization process in AutoDock Vina is the most time-consuming (typically more than 90 % ) and highly dependent whose next iteration relies on the previous outputs. Secondly, each ligand file was presented as a heterogeneous tree structure whose nodes are traversed recursively. Thirdly, the CUDA architecture on NVIDIA GPU cards limits its cross-platform portability.
In this work, we proposed an efficient parallel method, namely Vina-GPU, to accelerate the classic algorithm—Monte-Carlo based simulated annealing as well as the optimizer—BFGS in AutoDock Vina with GPUs. First, Vina-GPU applies a large-scale parallelism on the Monte-Carlo based iterated docking threads and significantly reduces the search depth in each thread. Second, a heterogeneous OpenCL implementation was efficiently deployed on Vina-GPU by converting the heterogeneous tree structure into a list structure whose nodes are stored in the traversed order. The two implementations ensure that Vina-GPU can leverage thousands of computational cores on GPU and achieve a large-scale parallelization and acceleration, and realizes the cross-platform portability on both CPUs and GPUs. Large benchmark tests show that Vina-GPU reaches an average of 21.66× and a maximum of 50.80× speed-up on NVIDIA Geforce RTX 3090 against the original AutoDock Vina on a 20-threaded CPU while ensuring their comparable docking accuracy. To further enlarge its potential of pushing the popularity of AutoDock Vina in large virtual screening, more efforts have been taken, as follows. First, we fitted a heuristic function for automatically determining the most important hyperparameter (search_depth) on the basis of large testing experiments in order to lower the usage threshold. Second, we developed a user-friendly graphical user interface (GUI) for a convenient operation of Vina-GPU. Third, we enable the implementation of Vina-GPU to be built on Windows, Linux and macOS, ensuring their usability on personal computers, station servers and cloud computations etc. The code and tool of Vina-GPU are freely available at https://github.com/DeltaGroupNJUPT/Vina-GPU (accessed on 27 March 2022) or http://www.noveldelta.com/Vina_GPU (accessed on 27 March 2022).

2. Methodology

The heterogeneous OpenCL implementation of Vina-GPU is depicted in Figure 1, which consists of a host part (on CPU) and a device part (on GPU). The host part is mainly in charge of the preparation and post-refinement of the conformations. The device part focuses on the acceleration of the most time-consuming Monte Carlo-iterated local search method by enlarging the scale of parallelism as well as reducing the number of iterations.

2.1. Host Part

The host part consists of two sections (see Figure 1). The first section includes four operations, which are the file reading, the OpenCL setup, the data preparation and the device memory allocation, and all operations are implemented for the input to the device part. Specifically, the file-reading operation is to read the ligand and protein files in .pdbqt format, and the OpenCL setup operation is to setup the OpenCL environment (platform, device, context, queue, program and kernels). Furthermore, the host part prepares all the required data, including grid cache (for calculating the energy of a conformation), random maps (for generating probability random numbers) and random initial conformations (for the Monte Carlo-based method to start from). The data is then re-organized to load in the device memory according to how it is accessed (read-only or read-write). The read-only grid cache, random maps and random conformations are allocated in the constant device memory while the read–write of best conformations returned by the device part is allocated in the global device memory. Such kind of memory management could efficiently boost the speed of reading and writing on GPU. The second section includes multiple operations after the device part. All the best conformations returned from the device part are clustered and sorted in the container by their docking scores. The best 20 conformations will be concretely refined and optimized before generating the final output ligand files.
In the data preparation operation, AutoDock Vina treats each conformation as a heterogeneous tree structure whose nodes are stored with its frame information and a pointer to its children node. Each node is traversed by a depth-first search policy to calculates the conformation energy in a recursive process. However, in Vina-GPU, the OpenCL standard cannot support any recursion in kernels because the allocation of stack space for thousands of threads is too expensive. In addition, various ligands would generate different heterogeneous trees that are not suitable for the OpenCL implementation. Therefore, we transformed the heterogeneous tree structure into a list type (see Figure 2), each node of which is stored in line with their traversed order. These nodes can be traversed simply by the order of the node list. In addition, a children map was created to denote the relationship among these nodes. For example, the node 0 has two children-nodes (the node 1 and the node 4) and so the row 0 has two “T”s (indicating “True”) in the 1st and 4th column (Figure 2). Thus, the recursive traverse of the heterogeneous tree can be converted into an iterative traverse of the node list and children map, which fits the OpenCL standard.

2.2. Device Part

On the device part, the allocated constant memory (highlighted in orange in Figure 1) is assigned for the initialization and the calculation during the reduced-step Monte-Carlo iterated local search processes (highlighted in green in Figure 1) and the final best conformations are stored in global memory (highlighted in gold in Figure 1).
Vina-GPU enables thousands of reduced-steps iterated local search processes running concurrently within the GPU computational cores. We denote each reduced-step iterated local search process as a docking thread. Within each thread, an OpenCL work item is assigned to a randomly initialized conformation C , which can be represented by its position, orientation and torsion (POT):
C = { x , y , z , a , b , c , d , ψ 1 , ψ 2 , , ψ N r o t }
where x , y , z correspond to the position of the conformation in a pre-determined searching space; a , b , c , d denote its orientation as a rigid body in the quaternion form; ψ 1 , ψ 2 , , ψ N r o t represent torsions of N r o t rotatable bonds. Then, each conformation C is to be randomly mutated in one of its POT with the uniform distribution. The conformation will be continuously evaluated with a scoring function that quantifies the potential energy of the binding pose. Generally, the potential energy e is calculated with the sum of intermolecular energy and intramolecular energy:
e = e i n t e r + e i n t r a
where e i n t e r represents the interaction energy between the ligand and the receptor, and it is calculated using trilinear interpolation that approximates the energy of each atom pair by looking up the grid cache; and e i n t r a indicates the interaction energy of the pairwise atoms within the ligand. Considering that both e i n t e r and e i n t r a are related to the binding pose, the scoring function SF can be denoted as a function of POT variables:
SF = f ( x , y , z , a , b , c , d , ψ 1 , ψ 2 , , ψ N r o t )
After the energy evaluation, a Broyden–Fletcher–Goldfarb–Shanno (BFGS) [16] optimization is applied to update the ligand conformation by minimizing of the scoring function SF . Essentially, the BFGS method is to substitute the Hessian matrix H R ( 7 + N r o t ) × ( 7 + N r o t ) with an approximate matrix B R ( 7 + N r o t ) × ( 7 + N r o t ) whose inverse matrix B 1 is iteratively updated by the first-order derivatives SF ( C ) R ( 7 + N r o t ) . B k + 1 1 in the ( k + 1 ) th iteration can be calculated by
B k + 1 1 = B k 1 + s k T y k + y k B k 1 y k s k s k T s k T y k 2 B k 1 y k s k T + s k y k T B k 1 s k T y k
where
s k = α k B k 1 S F C k
α k = a r g m i n S F C k + α p k
y k = S F C k + α k p k S F C k
where B 0 is initiated with identity matrix E and the detailed calculation of SF ( C ) is described in [6]. Furthermore α ( α k ) means the step size in the direction p ( p k ), along the decrease of the SF value. Next, a metropolis-acceptance criterion is adopted to decide whether to accept the optimized conformation or not, by comparing the energy e 0 before the mutation and the energy e o p t after the optimization. Here, the accept probability P is represented by:
P = 1 e 0 > e o p t exp ( e 0 e o p t ) 1.2 e 0 e o p t
This indicates that the accepted conformation is more likely to have a lower energy. Once accepted, the conformation will be further evaluated and optimized by BFGS. Then, the next iteration continues to update the previous optimized conformations until convergence. Finally, all the best conformations found by work items are returned to the host part. Algorithm 1 proposed the pseudocode of our Vina-GPU. In Algorithm 1, M u t a t i o n ( . ) means a random mutation of the POT in a ligand conformation; B F G S ( . ) represents the BFGS optimization method which is described in Equations (4)–(7); S c o r i n g ( . ) is the potential energy of a binding pose described in Equations (2) and (3); M e t r o p o l i s ( . ) is the metropolis acceptance criterion described in Equation (8); and C l u s t e r i n g & S o r t i n g ( . ) is the aggregation and reordering (based on the docking score) of all ligand conformations among all threads.
Algorithm 1 Vina-GPU method
Input: random ligand conformations: C t m p 0 , C t m p 1 , , C t m p N
Output: top k ligand conformations C 0 , C 1 , , C k 1 C t m p 0 , C t m p 1 , , C t m p N
  1:
 for all  C t m p i i = 0 , 1 , , N  concurrently do
  2:
  for all s e a r c h _ d e p t h = 0 , 1 , 2 , , r do
  3:
    C c a n d i = M u t a t i o n C t m p i
  4:
    C c a n d i , e c a n d i = B F G S C c a n d i & S c o r i n g C c a n d i
  5:
   if  s e a r c h _ d e p t h = = 0 M e t r o p o l i s e c a n d i , e t m p i  then
  6:
     C t m p i = C c a n d i
  7:
     C c a n d i , e c a n d i = B F G S C t m p i & S c o r i n g C t m p i
  8:
   end if
  9:
  end for
10:
 end for
11:
  C l u s t e r i n g & S o r t i n g ( C t m p 0 , C t m p 1 , , C t m p N )
12:
 return C 0 , C 1 , , C k 1 C t m p 0 , C t m p 1 , , C t m p N

3. Results and Discussion

3.1. Experimental Settings

All 140 complexes in the AutoDock-GPU study [12] are assigned as our experimental dataset, which is comprised of 85 complexes from the Astex Diversity Set [33], 35 complexes from CASF-2013 [34], and 20 complexes from the Protein Data Bank [35]. They cover a wide range of ligand complexities and targets properties. Each complex file includes an X-ray structure, an initial random pose of its ligand and the corresponding receptor (in .pdbqt format). We created a config.txt file for each complex (see the example in Supplementary Table S1), which involves the center (indicated by c e n t e r x , c e n t e r y , c e n t e r z ) and the recommended volume of the docking box (indicated by s i z e x , s i z e y , s i z e z ). We classified the 140 complexes into three subsets by their atom sizes N a t o m (small—5–23 atoms; medium—24–36 atoms; large—37–108 atoms). The details of our experimental data can be seen in Supplementary Table S2.
AutoDock Vina was executed on Intel (R) Core (TM) i9-10900K CPU @ 3.7 GHz using Windows 10 Operating System with 64 GB RAM. AutoDock Vina was customized by several configurable arguments, including the center and the volume of searching spaces, the number of CPU cores (cpu ) to be utilized and the docking runs ( e x h a u s t i v e n e s s ) etc. The argument e x h a u s t i v e n e s s was set to 128 [36], and the argument c p u was set to the maximum value of 20 for taking a full use of the CPU computational power.
Vina-GPU was developed with OpenCL v.3.0 and executed on three different GPUs (Nvidia Geforce GTX 1080Ti, Nvidia Geforce RTX 2080Ti, Nvidia Geforce RTX 3090 (Nvidia Corporation, CA, USA) ) under single-precision floating-point format (FP32). Details are included in Supplementary Table S3. In our Vina-GPU, we replaced c p u and e x h a u s t i v e n e s s with the number of threads ( t h r e a d ) and the size of searching iterations in each thread ( s e a r c h _ d e p t h ). These two hyperparameters are of the most importance, and their values are vital to the docking performance of Vina-GPU. For a convenient usage of Vina-GPU, as in AutoDock Vina [6], a heuristic formula was fitted to automatically determine the proper size of s e a r c h _ d e p t h for a given complex. Specifically, a large number of tests were executed on all 140 complexes to examine their docking performance under various sizes of s e a r c h _ d e p t h , where the proper s e a r c h _ d e p t h that guarantees a comparable docking performance was selected. Then, the least squares method was used to fit an empirical formula of the proper s e a r c h _ d e p t h with respect to the N a t o m (the number of atoms) and N r o t (the number of rotatable bonds) in a ligand. The heuristic formula is given as follows,
s e a r c h _ d e p t h = m a x ( 1 , f l o o r ( 0.24 N a t o m + 0.29 N r o t 3.41 ) )
where the function f l o o r ( ) gives the largest integer less than or equal to ∗.

3.2. Influence of Hyperparameters

We evaluated the influence of the hyperparameters t h r e a d (from 100 to 15,000) and s e a r c h _ d e p t h (from 1 to 50) on the docking accuracy (evaluated by docking score and RMSD) as well as the docking runtime of Vina-GPU. The docking score represents the binding affinity between a ligand and a receptor (the lower the score, the better) and the RMSD measures the atom distance difference between an output conformation and the ground truth X-ray structure (again, the lower the better) [6]. An acceptable docking is defined if the least RMSD among all output conformations of Vina-GPU is smaller than 2 Å [13]. Three complexes (5tim, 2bm2 and 1jyq) were randomly selected, which represent various levels of complexities (small, medium and large). The influence of t h r e a d and s e a r c h _ d e p t h on the docking score, RMSD and docking runtime are shown in Figure 3 and Figure 4, respectively. All experiments were executed under NVIDIA Geforce RTX 3090 GPU card.
With the increase of t h r e a d , the docking score improves and it becomes convergent when the size of t h r e a d reaches around 6000 for 2bm2 and 1jyq, and about 1000 for 5tim (Figure 3a). The same trend is also observed on the RMSD performance (Figure 3b), where 2bm2 and 1jyq converge at around 8000, and 5tim fluctuates slightly nearby 2 Å. In Figure 3c, with the raise of t h r e a d , the docking runtimes of all three complexes increase slowly. Although it is enough for the small complex 5tim to obtain the best docking accuracy with 1000 t h r e a d , the size of t h r e a d needs to be set around 8000 for the medium complex 2bm2 and large complex 1jyq. Thus, the size of t h r e a d was set to be 8000 for all 140 complexes in this paper.
For the small-complex 5tim, with the increase of s e a r c h _ d e p t h , its docking score, RMSD and docking runtime stay steady. The size of s e a r c h _ d e p t h does not influence the docking results, because Vina-GPU can achieve the best performance with a few s e a r c h _ d e p t h for such a small complex (Figure 4). For the medium complex 2bm2, the docking score and RMSD converge quickly with the raise of s e a r c h _ d e p t h , and the docking runtime increases slowly. This is because a medium complex needs more s e a r c h _ d e p t h to reach the convergence (Figure 4). For the large complex 1jyq, the docking score and RMSD converge slowly with s e a r c h _ d e p t h . The docking runtime for 1jyq increases rapidly with s e a r c h _ d e p t h , because the device runtime for such a large complex 1jyq takes the major part of the total (host + device) docking runtime, increasing s e a r c h _ d e p t h leads to a great expense on the total docking runtime.

3.3. Docking Accuracy

We compare the overall docking accuracy of Vina-GPU with AutoDock Vina in terms of the docking score and RMSD performances on all 140 complexes (Figure 5). The color bar encodes the number of atoms in a ligand. For the docking score, most complexes distribute around the diagonal line and fall into the lavender margin of 0.5 kcal/mol difference and their Pearson correlation coefficient of the scores is 0.965 (Figure 5a), which denotes a significant positive correlation. The average docking score of AutoDock Vina and Vina-GPU are −8.9 and −8.7, respectively. These results show that our Vina-GPU achieves the very close docking scores with AutoDock Vina on CPU core.
A docking conformation is typically acceptable when its RMSD difference with the ground truth structure is smaller than 2 Å [12]. In Figure 5b, the red dashed line distinguishes whether a docking conformation is acceptable or not from the RMSD aspect. Figure 5b demonstrates that most complexes fall into the lower left region where both Vina-GPU and AutoDock Vina succeed to obtain the acceptable docking. For Vina-GPU, 107 out of 140 RMSD results are within 2 Å, while 114 out of 140 for AutoDock Vina. The average RMSD of AutoDock Vina and Vina-GPU are 1.5 and 1.7, respectively. These results show that our Vina-GPU achieves the similar docking RMSD with AutoDock Vina. Thus, these findings indicate that Vina-GPU exhibits the comparable docking accuracy with respect to AutoDock Vina on both docking score and RMSD.

3.4. Runtime Comparison

The runtime acceleration ( A c c ) of Vina-GPU against AutoDock Vina is defined by
A c c = t v i n a t v i n a - g p u
where t v i n a and t v i n a - g p u is the runtime of AutoDock Vina and Vina-GPU, respectively. Figure 6 shows the runtime acceleration ( A c c ) on various scales of complexity (small—5–23 atoms, medium—24–36 atoms, large—37–108 atoms) and different GPU cards (Nvidia Geforce GTX 1080Ti, Nvidia Geforce RTX 2080Ti, Nvidia Geforce RTX 3090, respectively). The average acceleration is highlighted by a white dot in the center.
As indicated in Figure 6, Vina-GPU achieves the maximal acceleration of 50.80×, as well as the average of 8.84×, 12.70× and 21.66× on the 1080ti, 2080ti and 3090 GPU cards, respectively. The results show that the average acceleration increases along with the complexity of the complex (from small to large) and also raises with higher-end GPU cards (from NVIDIA Geforce GTX 1080ti to NVIDIA Geforce GTX 3090). Figure 7 shows the A c c performance of all 140 complexes along with different N a t o m and N r o t . Each bar represents a complex coupling with its corresponding acceleration. As shown in Figure 7, the acceleration varies from 1.03× to 50.80×. The maximal acceleration 50.80× is achieved on the 1xm6 (PDBid) complex under Nvidia Geforce RTX 3090 GPU card.
Among the whole Vina-GPU program, the Monte Carlo-based optimization process is the most time-consuming part (typically more than 90%), which is performed in the “device” part utilizing GPU computational cores. For a better exhibition of the acceleration in the most time-consuming part, we defined the device runtime acceleration A c c d as
A c c d = t m c t d
where t m c is the Monte Carlo-based optimization part runtime of AutoDock Vina and t d is the device part runtime of Vina-GPU. As shown in Figure 8 and Figure 9, Vina-GPU achieves the maximum of 191.68× and the average of 18.94×, 43.58× and 48.48× acceleration on Nvidia Geforce GTX 1080ti, Nvidia Geforce RTX 2080Ti and Nvidia Geforce RTX 3090 GPU cards, respectively.

3.5. Conformation Spaces Analysis

To verify their equivalence in molecular docking, we intend to analyze the full conformation spaces explored by AutoDock Vina and our Vina-GPU. Firstly, we discussed the searching strategy of Vina-GPU and explain why Vina-GPU can achieve a great acceleration on the premise of comparable docking accuracy. Then, we visualized and compared their whole searching of conformation spaces.
Vina-GPU enables thousands of docking threads to run concurrently. These docking threads divide the whole search space into thousands of subspaces, and in each subspace an initial conformation is being optimized. We define the search space that covers all possible conformations as a high-dimensional space S = C 0 , C 1 , C 2 , . By dividing S into n sub-spaces, we have
S = S s u b 0 , S s u b 1 , , S s u b n
and each initial conformation belongs to a sub-space
C i S s u b i i = 0 , 1 , 2 ,
For each initial conformation C i , the corresponding searching space S s u b i is much smaller than the whole searching space S . Therefore, we can greatly reduce the searching iterations of each initial conformation in each S s u b i . By clustering and sorting all the best conformations, Vina-GPU ensures a comparable docking accuracy with original AutoDock Vina.
Then, we detailed a case (PDBid: 2bm2, N a t o m = 33 , N r o t = 7 ) and visualized their full searching of conformation spaces in Figure 10. AutoDock Vina was executed with the configuration of “ c p u = 1 , e x h a u s t i v e n e s s = 1 and s e a r c h _ d e p t h = 22 , 365 (default value)”. Vina-GPU was executed under various strategies, where different sizes of t h r e a d and s e a r c h _ d e p t h were used. The whole searching spaces ( t h r e a d s × s e a r c h _ d e p t h ) were kept almost the same as that of original AutoDock Vina. All conformations searched by AutoDock Vina or our Vina-GPU are indicated as orange or blue dots, respectively. Each conformation is represented by its POT in Cartesian coordinates, where a principal component analysis (PCA) method was used to reduce the dimensionality of orientation and torsion into three. The best conformation is shown by the red star (indicated by an arrow).
As shown in Figure 10a,b, the whole conformation space reached by Vina-GPU or AutoDock Vina is almost the same in their position, orientation or torsion. With the increase of Vina-GPU on its parallel threads and the reduce of s e a r c h _ d e p t h in each thread, these observations stay unchanged (Figure 10c,d). Moreover, the best conformations found by AutoDock Vina or our Vina-GPU are very close to each other. These results demonstrate that our Vina-GPU can achieve comparable docking accuracy with the original AutoDock Vina.

3.6. Comparison with the Implementation of Vina-GPU on CPUs

Due to the inherently serial characteristic of the AutoDock Vina algorithm, our Vina-GPU proposed an improved algorithm and then accelerated it with GPUs. For evaluating the contributions of the algorithm improvement and the GPU hardware acceleration separately, we gave out the performance comparison with the implementation of Vina-GPU on CPUs (Figure 11 and Figure 12). The implementation of Vina-GPU on CPUs was executed on Intel (R) Core (TM) i9-10900K CPU @ 3.7 GHz. Both these implementations were executed on all 140 complexes with the same settings of t h r e a d (8000) and s e a r c h _ d e p t h . The results of our Vina-GPU on GPUs are identical to those in Figure 6.
For the docking score, most complexes lie around the diagonal line and fall into the lavender margin of a 0.5 kcal/mol difference, only with a few exceptions due to the randomness of Vina-GPU algorithm (Figure 11a). The Pearson correlation coefficient of their docking scores is 0.966 (Figure 11a). The average docking score of Vina-GPU on CPUs and GPUs are −8.6 and −8.7, respectively. The results show that the implementation of Vina-GPU on CPUs achieves comparable docking scores. For the docking RMSD, most complexes gather in the bottom-left region, which indicates that they are acceptable dockings for the implementations of Vina-GPU on CPUs and GPUs (Figure 11b). And for Vina-GPU on CPUs, 104 out of 140 RMSD results are within 2 Å, while 107 out of 140 on GPUs. The average RMSD of Vina-GPU on CPUs and GPUs are 1.8 and 1.7, respectively. The comparable docking scores and RMSD mean that the implementation of Vina-GPU on CPU obtains the almost same docking accuracy. For the docking runtime, the acceleration of the implementation of Vina-GPU on GPUs is higher than that on CPUs, and the latter one only achieves the maximal acceleration of 26.19× and the average of 3.49×, 8,81× and 18.76× on the small, medium and large complexes against the original AutoDock Vina, respectively (Figure 12). These results indicate that our improved algorithm with the implementation on both the CPUs and GPUs gains the comparable docking accuracy, and it is more suitable for the implementation on GPU hardware to achieve higher accelerations.

3.7. A Case for Virtual Screening

To show the acceleration effect of our Vina-GPU in implementing real virtual screens of compound databases, a case was detailed on the receptor 1xm6 (PDBid) with the docking of DrugBank [37]. The receptor 1xm6 is the catalytic domain of human phosphodiesterase 4B in complex with (R)-mesopram, and DrugBank is one of the most popular drug databases that contains comprehensive information on drugs and drug targets. A total of 9125 molecules were downloaded from the DrugBank database at https://go.drugbank.com/releases/latest#structures (accessed on 27 March 2022). Both Vina-GPU and AutoDock Vina were executed on the same computer with Intel (R) Core (TM) i9-10900K CPU @ 3.7 GHz and NVIDIA Geforce RTX 3090 GPU card. The e x h a u s t i v e n e s s and c p u of AutoDock Vina were set to 128 and 20, respectively. The t h r e a d and s e a r c h _ d e p t h of Vina-GPU were set to 8000 and the heuristic value, respectively. Only ~9.44 h was needed to execute the whole docking process by Vina-GPU while ~133.90 h was needed by AutoDock Vina, indicating that the acceleration of 14.18× are achieved by our Vina-GPU. The docking scores of all 9125 molecules on Vina-GPU and AutoDock Vina are shown in Supplementary Data S1. We evaluated the similarity of top i compounds with the lowest docking scores on AutoDock Vina or our Vina-GPU by Jaccard index [38] as defined by
J i = T v i n a i T v i n a - G P U i T v i n a i T v i n a - G P U i
where i = 15 , 50 , 100 , 200 , 300 , and T v i n a i and T v i n a - G P U i represents subset of top i compounds of AutoDock Vina and Vina-GPU, respectively. Table 1 shows that all the Jacard indexes are larger than 0.8, indicating a high similarity of the docking results of Vina-GPU and AutoDock Vina.
Figure 13 shows the comparison of docking scores between Vina-GPU and AutoDock Vina, where most compounds lie around the diagonal line and within the margin (in lavender) of 0.5 kcal/mol difference on the docking score. The Pearson correlation coefficient of their docking scores is 0.981. The average docking socre of AutoDock Vina and Vina-GPU are −7.9 and −7.8, respectively. These results show that our Vina-GPU achieves highly similar docking scores to AutoDock Vina on CPU.

3.8. Usage of Vina-GPU

We developed a user-friendly graphic user interface (GUI) instead of the original terminal form. Our GUI can be utilized without installation and is described in Supplementary Figure S1. In addition, we provided a detailed guideline on how to build and run Vina-GPU on mainstream operating systems (Windows, Linux and MacOS), and it can also ensure the usability of Vina-GPU on personal computers, station servers and cloud computations, etc. (see Supplementary Text S1). All source codes and tools of Vina-GPU are freely available at http://www.noveldelta.com/Vina_GPU (accessed on 27 March 2022) or https://github.com/DeltaGroupNJUPT/Vina-GPU (accessed on 27 March 2022).

4. Conclusions

In modern drug discovery, huge resource investment and high entry threshold seriously weaken the popularity of AutoDock Vina in large virtual screening from compound databases. To advance the spread of AutoDock Vina in large virtual screens, we proposed a novel method Vina-GPU to speed up AutoDock Vina with GPUs. Vina-GPU obtains a large-scale of parallelism on the classic algorithm—Monte Carlo-based simulated annealing—and greatly reduces the search depth in each iteration. With one of the traditional optimizers, BFGS, Vina-GPU can update the ligand conformation on the consideration of its gradient. Furthermore, a heterogeneous OpenCL implementation of Vina-GPU was efficiently assigned by transforming the heterogeneous tree structure into a list structure whose nodes are visited in the traversed line. Vina-GPU can fully utilize abundant computational GPU cores to reach a large-scale of parallelization and acceleration. Moreover, Vina-GPU can realize cross-platform operation on both CPUs and GPUs. Large benchmarks demonstrate that Vina-GPU achieves an average speed-up of 21-fold and a maximal speed-up of 50-fold on NVIDIA Geforce RTX 3090 over the original AutoDock Vina when keeping their comparable docking accuracy. To further enlarge the popularity of AutoDock Vina in large virtual screens, more efforts were taken, as the follows. A heuristic function was fitted based on large testing experiments to automatically set the most important hyperparameter ( s e a r c h _ d e p t h ). Moreover, a graphical user interface (GUI) was designed for a convenient usage of Vina-GPU. In addition, an extension of Vina-GPU was provided on Windows, Linux and macOS, and also ensure its usage on personal computers, station servers, and cloud computations, etc. The source codes of Vina-GPU can be freely accessible at https://github.com/DeltaGroupNJUPT/Vina-GPU (accessed on 27 March 2022) or http://www.noveldelta.com/Vina_GPU (accessed on 27 March 2022). In future studies, the following aspects would be taken into consideration for pushing the popularization of AutoDock Vina in large virtual screens. We will further analyze and mend the AutoDock Vina algorithm so that it can obtain a higher acceleration with GPUs. In addition, we will study other mainstream tools in the AutoDock Vina suites and accelerate them with GPUs. Moreover, we will rewrite the AutoDock Vina algorithm to realize its acceleration on FPGA with higher price–performance ratio and more flexibility. Furthermore, we will further consider the influence of the hyperparameters (GPU threads and depths) on model performance for more levels of complexity of compounds or protein targets.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules27093041/s1, Figure S1: Graphic user interface (GUI) of Vina-GPU; Table S1: An example of config.txt file; Table S2: Descriptions of experimental data; Table S3: Details of GPU cards used in this work; Text S1: Guideline for running Vina-GPU; Data S1: Virtual screening docking results on DrugBank using AutoDock Vina and Vina-GPU.

Author Contributions

Conceptualization, J.W., M.L. (Ming Ling) and H.H.; software, S.T., R.C. and M.L. (Mengru Lin); methodology, S.T., M.L. (Ming Ling), R.C., Q.L. and Y.Z.; writing—original draft preparation, J.W., S.T. and M.L. (Ming Ling); formal analysis, S.T. and J.D.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (61872198, 81771478 and 61971216); the Basic Research Program of Science and Technology Department of Jiangsu Province (BK20201378).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the source codes, the documentation and the updates related to Vina-GPU have been uploaded to our website http://www.noveldelta.com/Vina_GPU (accessed on 27 March 2022) (in the userguideline section) under an Apache-2.0 license. The 140 complexes described in Section 3.1 are available at https://zenodo.org/record/4031961#.Yags3NBByUk (accessed on 27 March 2022) and the 9125 Drug- Bank molecules used in Section 3.7 are available at https://go.drugbank.com/releases/latest#structures (accessed on 27 March 2022). If there is any problem in reproducing our work, please feel free to contact us.

Acknowledgments

We acknowledge the support from Xianqiang Shi for providing the Huawei cloud service and Yemin Diao for offering us a work place.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Meng, X.Y.; Zhang, H.X.; Mezei, M.; Cui, M. Molecular docking: A powerful approach for structure-based drug discovery. Curr. Comput.-Aided Drug Des. 2011, 7, 146–157. [Google Scholar] [CrossRef] [PubMed]
  2. Lengauer, T.; Rarey, M. Computational methods for biomolecular docking. Curr. Opin. Struct. Biol. 1996, 6, 402–406. [Google Scholar] [CrossRef]
  3. Cherkasov, A.; Muratov, E.N.; Fourches, D.; Varnek, A.; Baskin, I.I.; Cronin, M.; Dearden, J.; Gramatica, P.; Martin, Y.C.; Todeschini, R.; et al. QSAR modeling: Where have you been? Where are you going to? J. Med. Chem. 2014, 57, 4977–5010. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  4. Golbraikh, A.; Shen, M.; Xiao, Z.; Xiao, Y.D.; Lee, K.H.; Tropsha, A. Rational selection of training and test sets for the development of validated QSAR models. J. Comput.-Aided Mol. Des. 2003, 17, 241–253. [Google Scholar] [CrossRef]
  5. Morris, G.M.; Huey, R.; Lindstrom, W.; Sanner, M.F.; Belew, R.K.; Goodsell, D.S.; Olson, A.J. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J. Comput. Chem. 2009, 30, 2785–2791. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Trott, O.; Olson, A.J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2010, 31, 455–461. [Google Scholar] [CrossRef] [Green Version]
  7. Eberhardt, J.; Santos-Martins, D.; Tillack, A.; Forli, S. AutoDock Vina 1.2. 0: New docking methods, expanded force field, and Python bindings. J. Chem. Inf. Model. 2021, 61, 3891–3898. [Google Scholar] [CrossRef]
  8. Ravindranath, P.A.; Forli, S.; Goodsell, D.S.; Olson, A.J.; Sanner, M.F. AutoDockFR: Advances in protein-ligand docking with explicitly specified binding site flexibility. PLoS Comput. Biol. 2015, 11, e1004586. [Google Scholar] [CrossRef] [Green Version]
  9. Zhang, Y.; Sanner, M.F. AutoDock CrankPep: Combining folding and docking to predict protein—Peptide complexes. Bioinformatics 2019, 35, 5121–5127. [Google Scholar] [CrossRef]
  10. Zhang, Y.; Sanner, M.F. Docking flexible cyclic peptides with AutoDock CrankPep. J. Chem. Theory Comput. 2019, 15, 5161–5168. [Google Scholar] [CrossRef]
  11. Santos-Martins, D.; Eberhardt, J.; Bianco, G.; Solis-Vasquez, L.; Ambrosio, F.A.; Koch, A.; Forli, S. D3R Grand Challenge 4: Prospective pose prediction of BACE1 ligands with AutoDock-GPU. J. Comput.-Aided Mol. Des. 2019, 33, 1071–1081. [Google Scholar] [CrossRef] [PubMed]
  12. Santos-Martins, D.; Solis-Vasquez, L.; Tillack, A.F.; Sanner, M.F.; Koch, A.; Forli, S. Accelerating AutoDock4 with GPUs and gradient-based local search. J. Chem. Theory Comput. 2021, 17, 1060–1073. [Google Scholar] [CrossRef] [PubMed]
  13. Goodsell, D.S.; Sanner, M.F.; Olson, A.J.; Forli, S. The AutoDock suite at 30. Protein Sci. 2021, 30, 31–43. [Google Scholar] [CrossRef] [PubMed]
  14. Su, M.; Yang, Q.; Du, Y.; Feng, G.; Liu, Z.; Li, Y.; Wang, R. Comparative assessment of scoring functions: The CASF-2016 update. J. Chem. Inf. Model. 2018, 59, 895–913. [Google Scholar] [CrossRef]
  15. Wang, Z.; Sun, H.; Yao, X.; Li, D.; Xu, L.; Li, Y.; Tian, S.; Hou, T. Comprehensive evaluation of ten docking programs on a diverse set of protein–ligand complexes: The prediction accuracy of sampling power and scoring power. Phys. Chem. Chem. Phys. 2016, 18, 12964–12975. [Google Scholar] [CrossRef]
  16. Fletcher, R. Practical Methods of Optimization; John Wiley & Sons: Hoboken, NJ, USA, 2013. [Google Scholar]
  17. Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H.; Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 1953, 21, 1087–1092. [Google Scholar] [CrossRef] [Green Version]
  18. Bohacek, R.S.; McMartin, C.; Guida, W.C. The art and practice of structure-based drug design: A molecular modeling perspective. Med. Res. Rev. 1996, 16, 3–50. [Google Scholar] [CrossRef]
  19. Lyu, J.; Wang, S.; Balius, T.E.; Singh, I.; Levit, A.; Moroz, Y.S.; O’Meara, M.J.; Che, T.; Algaa, E.; Tolmachova, K.; et al. Ultra-large library docking for discovering new chemotypes. Nature 2019, 566, 224–229. [Google Scholar] [CrossRef]
  20. Gorgulla, C.; Boeszoermenyi, A.; Wang, Z.F.; Fischer, P.D.; Coote, P.W.; Das, K.M.P.; Malets, Y.S.; Radchenko, D.S.; Moroz, Y.S.; Scott, D.A.; et al. An open-source drug discovery platform enables ultra-large virtual screens. Nature 2020, 580, 663–668. [Google Scholar] [CrossRef]
  21. Li, H.; Leung, K.S.; Wong, M.H. idock: A multithreaded virtual screening tool for flexible ligand docking. In Proceedings of the 2012 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), San Diego, CA, USA, 9–12 May 2012; pp. 77–84. [Google Scholar]
  22. Jaghoori, M.M.; Bleijlevens, B.; Olabarriaga, S.D. 1001 Ways to run AutoDock Vina for virtual screening. J. Comput.-Aided Mol. Des. 2016, 30, 237–249. [Google Scholar] [CrossRef] [Green Version]
  23. Mermelstein, D.J.; Lin, C.; Nelson, G.; Kretsch, R.; McCammon, J.A.; Walker, R.C. Fast and flexible gpu accelerated binding free energy calculations within the amber molecular dynamics package. J. Comput. Chem. 2018, 39, 1354–1358. [Google Scholar] [CrossRef] [PubMed]
  24. Hwu, W.M.W. GPU Computing Gems Emerald Edition; Morgan Kaufmann Publishers Inc.: Burlington, MA, USA, 2011. [Google Scholar]
  25. Stone, J.E.; Hynninen, A.P.; Phillips, J.C.; Schulten, K. Early experiences porting the NAMD and VMD molecular simulation and analysis software to GPU-accelerated OpenPOWER platforms. In Proceedings of the International Conference on High Performance Computing, Frankfurt, Germany, 19–23 June 2016; pp. 188–206. [Google Scholar]
  26. LeGrand, S.; Scheinberg, A.; Tillack, A.F.; Thavappiragasam, M.; Vermaas, J.V.; Agarwal, R.; Larkin, J.; Poole, D.; Santos-Martins, D.; Solis-Vasquez, L.; et al. GPU-accelerated drug discovery with docking on the summit supercomputer: Porting, optimization, and application to COVID-19 research. In Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Virtual Event, 21–24 September 2020; pp. 1–10. [Google Scholar]
  27. Fan, M.; Wang, J.; Jiang, H.; Feng, Y.; Mahdavi, M.; Madduri, K.; Kandemir, M.T.; Dokholyan, N.V. Gpu-accelerated flexible molecular docking. J. Phys. Chem. B 2021, 125, 1049–1060. [Google Scholar] [CrossRef] [PubMed]
  28. Ding, X.; Wu, Y.; Wang, Y.; Vilseck, J.Z.; Brooks III, C.L. Accelerated CDOCKER with GPUs, parallel simulated annealing, and fast Fourier transforms. J. Chem. Theory Comput. 2020, 16, 3910–3919. [Google Scholar] [CrossRef] [PubMed]
  29. Imbernón, B.; Serrano, A.; Bueno-Crespo, A.; Abellán, J.L.; Pérez-Sánchez, H.; Cecilia, J.M. METADOCK 2: A high-throughput parallel metaheuristic scheme for molecular docking. Bioinformatics 2021, 37, 1515–1520. [Google Scholar] [CrossRef] [PubMed]
  30. Shin, J.H.; Kim, J.; Chae, J.; Yun, S.J. GPU-Accelerated Autodock Vina: Viking. 2020. Available online: https://www.morressier.com/o/event/5e733c5acde2b641284a7e27/article/5e73656bcde2b641284aa4e5 (accessed on 27 March 2022).
  31. Solis-Vasquez, L.; Santos-Martins, D.; Tillack, A.F.; Koch, A.; Eberhardt, J.; Forli, S. Parallelizing Irregular Computations for Molecular Docking. In Proceedings of the 2020 IEEE/ACM 10th Workshop on Irregular Applications: Architectures and Algorithms (IA3), Atlanta, GA, USA, 11 November 2020; pp. 12–21. [Google Scholar]
  32. Kannan, S.; Ganji, R. Porting autodock to CUDA. In Proceedings of the IEEE Congress on Evolutionary Computation, Barcelona, Spain, 18–23 July 2010; pp. 1–8. [Google Scholar]
  33. Hartshorn, M.J.; Verdonk, M.L.; Chessari, G.; Brewerton, S.C.; Mooij, W.T.; Mortenson, P.N.; Murray, C.W. Diverse, high-quality test set for the validation of protein-ligand docking performance. J. Med. Chem. 2007, 50, 726–741. [Google Scholar] [CrossRef]
  34. Li, Y.; Han, L.; Liu, Z.; Wang, R. Comparative assessment of scoring functions on an updated benchmark: 2. Evaluation methods and general results. J. Chem. Inf. Model. 2014, 54, 1717–1736. [Google Scholar] [CrossRef]
  35. Berman, H.M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.N.; Weissig, H.; Shindyalov, I.N.; Bourne, P.E. The protein data bank. Nucleic Acids Res. 2000, 28, 235–242. [Google Scholar] [CrossRef] [Green Version]
  36. Handoko, S.D.; Ouyang, X.; Su, C.T.T.; Kwoh, C.K.; Ong, Y.S. QuickVina: Accelerating AutoDock Vina using gradient-based heuristics for global optimization. IEEE/ACM Trans. Comput. Biol. Bioinform. 2012, 9, 1266–1272. [Google Scholar] [CrossRef]
  37. Wishart, D.S.; Feunang, Y.D.; Guo, A.C.; Lo, E.J.; Marcu, A.; Grant, J.R.; Sajed, T.; Johnson, D.; Li, C.; Sayeeda, Z.; et al. DrugBank 5.0: A major update to the DrugBank database for 2018. Nucleic Acids Res. 2018, 46, D1074–D1082. [Google Scholar] [CrossRef]
  38. Jaccard, P. The distribution of the flora in the alpine zone. New Phytol. 1912, 11, 37–50. [Google Scholar] [CrossRef]
Figure 1. The OpenCL architecture for implementing Vina-GPU, which consists of a host (CPU) and a device (GPU) part of execution. The device part implements thousands of docking threads, each of which is assigned with an OpenCL work item to perform a Monte Carlo-based local search method with largely reduced search iterations.
Figure 1. The OpenCL architecture for implementing Vina-GPU, which consists of a host (CPU) and a device (GPU) part of execution. The device part implements thousands of docking threads, each of which is assigned with an OpenCL work item to perform a Monte Carlo-based local search method with largely reduced search iterations.
Molecules 27 03041 g001
Figure 2. Transformation the original node tree structure into the node-list format. The heterogeneous node tree was reconstructed in its traversed order (depth-first) into the node list where an additional children map was built to reflect the relationship among the nodes. For example, the node 0 has two children-nodes (the node 1 and the node 4) and so the row 0 has two “T”s (indicating “True”) in the 1st and 4th column.
Figure 2. Transformation the original node tree structure into the node-list format. The heterogeneous node tree was reconstructed in its traversed order (depth-first) into the node list where an additional children map was built to reflect the relationship among the nodes. For example, the node 0 has two children-nodes (the node 1 and the node 4) and so the row 0 has two “T”s (indicating “True”) in the 1st and 4th column.
Molecules 27 03041 g002
Figure 3. Influence of the size of t h r e a d on docking accuracy (score and RMSD) and docking runtime of Vina-GPU in (a), (b) and (c), respectively. Three typical PDB complexes are randomly selected from all 140 complexes which represent small, medium and large ones, respectively (5tim—small, 5 atoms; 2bm2—medium, 33 atoms; 1jyq—large, 60 atoms). All experiments were executed on NVIDIA RTX 3090 GPU card.
Figure 3. Influence of the size of t h r e a d on docking accuracy (score and RMSD) and docking runtime of Vina-GPU in (a), (b) and (c), respectively. Three typical PDB complexes are randomly selected from all 140 complexes which represent small, medium and large ones, respectively (5tim—small, 5 atoms; 2bm2—medium, 33 atoms; 1jyq—large, 60 atoms). All experiments were executed on NVIDIA RTX 3090 GPU card.
Molecules 27 03041 g003
Figure 4. Influence of the size of s e a r c h _ d e p t h on docking accuracy (score and RMSD) and docking runtime of Vina-GPU in (a), (b) and (c), respectively. Three typical PDB complexes were randomly selected from all 140 complexes which represent small, medium and large ones, respectively (5tim—small, 5 atoms; 2bm2—medium, 33 atoms; 1jyq—large, 60 atoms). All experiments were executed on NVIDIA RTX 3090 GPU card.
Figure 4. Influence of the size of s e a r c h _ d e p t h on docking accuracy (score and RMSD) and docking runtime of Vina-GPU in (a), (b) and (c), respectively. Three typical PDB complexes were randomly selected from all 140 complexes which represent small, medium and large ones, respectively (5tim—small, 5 atoms; 2bm2—medium, 33 atoms; 1jyq—large, 60 atoms). All experiments were executed on NVIDIA RTX 3090 GPU card.
Molecules 27 03041 g004
Figure 5. Comparable docking accuracy between AutoDock Vina and our Vina-GPU on all 140 complexes. The color bar encodes the number of atoms in a ligand. A margin of 0.5 kcal/mol difference on the docking score between Vina-GPU and AutoDock is highlighted in lavender in (a). The Pearson correlation coefficient of their docking scores is 0.965 (indicated by “pearson”). The RMSD value that indicates an acceptable binding pose (<2 Å) are separated by a red dashed line in (b).
Figure 5. Comparable docking accuracy between AutoDock Vina and our Vina-GPU on all 140 complexes. The color bar encodes the number of atoms in a ligand. A margin of 0.5 kcal/mol difference on the docking score between Vina-GPU and AutoDock is highlighted in lavender in (a). The Pearson correlation coefficient of their docking scores is 0.965 (indicated by “pearson”). The RMSD value that indicates an acceptable binding pose (<2 Å) are separated by a red dashed line in (b).
Molecules 27 03041 g005
Figure 6. Acceleration of docking time ( A c c ) of our Vina-GPU against AutoDock Vina on three different GPUs and various scales of complexity (small—5–23 atoms, medium—24–36 atoms, large—37–108 atoms). 1080ti—NVIDIA Geforce GTX 1080ti; 2080ti—Nvidia Geforce RTX 2080Ti; 3090—Nvidia Geforce RTX 3090.
Figure 6. Acceleration of docking time ( A c c ) of our Vina-GPU against AutoDock Vina on three different GPUs and various scales of complexity (small—5–23 atoms, medium—24–36 atoms, large—37–108 atoms). 1080ti—NVIDIA Geforce GTX 1080ti; 2080ti—Nvidia Geforce RTX 2080Ti; 3090—Nvidia Geforce RTX 3090.
Molecules 27 03041 g006
Figure 7. Details for the acceleration of docking time ( A c c ) of our Vina-GPU against AutoDock Vina on all 140 complexes. The complexity is depicted with their number of atoms ( N a t o m ) and rotatable bonds ( N r o t ). The vertical axis ranges from 0 to 60, and each bar represents a complex coupling with its corresponding acceleration ( A c c ). Vina-GPU 1080ti, Vina-GPU 2080ti and Vina-GPU 3090 mean that Vina-GPU was executed on Nvidia Geforce GTX 1080ti, Nvidia Geforce RTX 2080Ti and Nvidia Geforce RTX 3090, respectively. The color depth of each bar indicates the value of accelerations (the darker the higher).
Figure 7. Details for the acceleration of docking time ( A c c ) of our Vina-GPU against AutoDock Vina on all 140 complexes. The complexity is depicted with their number of atoms ( N a t o m ) and rotatable bonds ( N r o t ). The vertical axis ranges from 0 to 60, and each bar represents a complex coupling with its corresponding acceleration ( A c c ). Vina-GPU 1080ti, Vina-GPU 2080ti and Vina-GPU 3090 mean that Vina-GPU was executed on Nvidia Geforce GTX 1080ti, Nvidia Geforce RTX 2080Ti and Nvidia Geforce RTX 3090, respectively. The color depth of each bar indicates the value of accelerations (the darker the higher).
Molecules 27 03041 g007
Figure 8. Acceleration ratio on A c c d of our Vina-GPU against AutoDock Vina on three different GPUs and various scales of complexity according to their N a t o m sizes (small—5–23 atoms, medium—24–36 atoms, large—37–108 atoms). 1080ti—NVIDIA Geforce GTX 1080ti; 2080ti—Nvidia Geforce RTX 2080Ti; 3090—Nvidia Geforce RTX 3090.
Figure 8. Acceleration ratio on A c c d of our Vina-GPU against AutoDock Vina on three different GPUs and various scales of complexity according to their N a t o m sizes (small—5–23 atoms, medium—24–36 atoms, large—37–108 atoms). 1080ti—NVIDIA Geforce GTX 1080ti; 2080ti—Nvidia Geforce RTX 2080Ti; 3090—Nvidia Geforce RTX 3090.
Molecules 27 03041 g008
Figure 9. Details for the acceleration of docking time ( A c c d ) of our Vina-GPU against AutoDock Vina on all 140 complexes. The complexity is depicted with their number of atoms ( N a t o m ) and rotatable bonds ( N r o t ). The vertical axis ranges from 0 to 210, and each bar represents a complex coupling with its corresponding acceleration ratio. Vina-GPU 1080ti, Vina-GPU 2080ti and Vina-GPU 3090 mean that Vina-GPU was executed on Nvidia Geforce GTX 1080ti, Nvidia Geforce RTX 2080Ti and Nvidia Geforce RTX 3090, respectively. The color depth of each bar indicates the value of accelerations (the darker the higher).
Figure 9. Details for the acceleration of docking time ( A c c d ) of our Vina-GPU against AutoDock Vina on all 140 complexes. The complexity is depicted with their number of atoms ( N a t o m ) and rotatable bonds ( N r o t ). The vertical axis ranges from 0 to 210, and each bar represents a complex coupling with its corresponding acceleration ratio. Vina-GPU 1080ti, Vina-GPU 2080ti and Vina-GPU 3090 mean that Vina-GPU was executed on Nvidia Geforce GTX 1080ti, Nvidia Geforce RTX 2080Ti and Nvidia Geforce RTX 3090, respectively. The color depth of each bar indicates the value of accelerations (the darker the higher).
Molecules 27 03041 g009
Figure 10. Similar distribution of full conformation spaces (PDBid: 2bm2) explored by AutoDock Vina or our Vina-GPU with various hyperparameter settings. AutoDock Vina ( c p u = 1 , e x h a u s t i v e n e s s = 1 ) was executed with one initial conformation and s e a r c h _ d e p t h = 22 , 365 by default. Vina-GPU was performed under different scales of docking threads with various s e a r c h _ d e p t h in each thread. The whole searching iterations (= t h r e a d × s e a r c h _ d e p t h ) keep almost the same. Each conformation is represented by its position, orientation and torsion (POT), all of which are plotted with orange dots in AutoDock Vina and bule dots in Vina-GPU. The principal component analysis (PCA) method was used to reduce the dimensions of orientation or torsion into three. The best conformations for their final output are highlighted with red stars (pointed by arrows). (a) AutoDock Vina: 1 initial conformation with s e a r c h _ d e p t h = 22 , 365 . (b) Vina-GPU: 10 docking threads with s e a r c h _ d e p t h = 2237 . (c) Vina-GPU: 100 docking threads with s e a r c h _ d e p t h = 224 . (d) Vina-GPU: 1000 docking threads with s e a r c h _ d e p t h = 22 .
Figure 10. Similar distribution of full conformation spaces (PDBid: 2bm2) explored by AutoDock Vina or our Vina-GPU with various hyperparameter settings. AutoDock Vina ( c p u = 1 , e x h a u s t i v e n e s s = 1 ) was executed with one initial conformation and s e a r c h _ d e p t h = 22 , 365 by default. Vina-GPU was performed under different scales of docking threads with various s e a r c h _ d e p t h in each thread. The whole searching iterations (= t h r e a d × s e a r c h _ d e p t h ) keep almost the same. Each conformation is represented by its position, orientation and torsion (POT), all of which are plotted with orange dots in AutoDock Vina and bule dots in Vina-GPU. The principal component analysis (PCA) method was used to reduce the dimensions of orientation or torsion into three. The best conformations for their final output are highlighted with red stars (pointed by arrows). (a) AutoDock Vina: 1 initial conformation with s e a r c h _ d e p t h = 22 , 365 . (b) Vina-GPU: 10 docking threads with s e a r c h _ d e p t h = 2237 . (c) Vina-GPU: 100 docking threads with s e a r c h _ d e p t h = 224 . (d) Vina-GPU: 1000 docking threads with s e a r c h _ d e p t h = 22 .
Molecules 27 03041 g010
Figure 11. Comparable docking accuracy by the implementation of Vina-GPU on CPUs. All 140 complexes were used, and the color bar encodes the number of atoms in a ligand. A margin of 0.5 kcal/mol difference on the docking score for the implementation of Vina-GPU on CPUs or GPUs is highlighted with lavender in (a). The Pearson correlation coefficient of their docking scores is 0.966 (indicated by “pearson”). The RMSD value that indicates an acceptable binding pose (<2 Å) is separated by a red dashed line in (b).
Figure 11. Comparable docking accuracy by the implementation of Vina-GPU on CPUs. All 140 complexes were used, and the color bar encodes the number of atoms in a ligand. A margin of 0.5 kcal/mol difference on the docking score for the implementation of Vina-GPU on CPUs or GPUs is highlighted with lavender in (a). The Pearson correlation coefficient of their docking scores is 0.966 (indicated by “pearson”). The RMSD value that indicates an acceptable binding pose (<2 Å) is separated by a red dashed line in (b).
Molecules 27 03041 g011
Figure 12. Acceleration of docking time ( A c c ) of the Vina-GPU implementations on CPUs (indicated by “CPU”) against on GPUs (indicated by “GPU”). All 140 complexes are classified into three sub-datasets with different complexity (small—5–23 atoms, medium—24–36 atoms, large—37–108 atoms). The average acceleration is highlighted with a white dot in the center. The implementation of Vina-GPU on GPUs was executed on Nvidia Geforce RTX 3090 (identical to the results in Figure 6) and that on CPUs was executed on Intel (R) Core (TM) i9-10900K CPU @ 3.7 GHz.
Figure 12. Acceleration of docking time ( A c c ) of the Vina-GPU implementations on CPUs (indicated by “CPU”) against on GPUs (indicated by “GPU”). All 140 complexes are classified into three sub-datasets with different complexity (small—5–23 atoms, medium—24–36 atoms, large—37–108 atoms). The average acceleration is highlighted with a white dot in the center. The implementation of Vina-GPU on GPUs was executed on Nvidia Geforce RTX 3090 (identical to the results in Figure 6) and that on CPUs was executed on Intel (R) Core (TM) i9-10900K CPU @ 3.7 GHz.
Molecules 27 03041 g012
Figure 13. Comparable docking scores between AutoDock Vina and our Vina-GPU on all 9125 compounds from the drugbank dataset. The color bar encodes the number of atoms in one ligand. A margin of 0.5 kcal/mol difference on the docking score between AutoDock and our Vina-GPU is highlighted in lavender. The Pearson correlation coefficient of their docking scores is 0.981 (indicated by “pearson”).
Figure 13. Comparable docking scores between AutoDock Vina and our Vina-GPU on all 9125 compounds from the drugbank dataset. The color bar encodes the number of atoms in one ligand. A margin of 0.5 kcal/mol difference on the docking score between AutoDock and our Vina-GPU is highlighted in lavender. The Pearson correlation coefficient of their docking scores is 0.981 (indicated by “pearson”).
Molecules 27 03041 g013
Table 1. The Jacard indexes J i on the top i subsets of Vina-GPU and AutoDock Vina.
Table 1. The Jacard indexes J i on the top i subsets of Vina-GPU and AutoDock Vina.
Top i T vina i T vina - GPU i T vina i T vina - GPU i Jacard Index
1514160.875
5046540.852
100911090.835
2001872130.878
3002773230.858
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Tang, S.; Chen, R.; Lin, M.; Lin, Q.; Zhu, Y.; Ding, J.; Hu, H.; Ling, M.; Wu, J. Accelerating AutoDock Vina with GPUs. Molecules 2022, 27, 3041. https://doi.org/10.3390/molecules27093041

AMA Style

Tang S, Chen R, Lin M, Lin Q, Zhu Y, Ding J, Hu H, Ling M, Wu J. Accelerating AutoDock Vina with GPUs. Molecules. 2022; 27(9):3041. https://doi.org/10.3390/molecules27093041

Chicago/Turabian Style

Tang, Shidi, Ruiqi Chen, Mengru Lin, Qingde Lin, Yanxiang Zhu, Ji Ding, Haifeng Hu, Ming Ling, and Jiansheng Wu. 2022. "Accelerating AutoDock Vina with GPUs" Molecules 27, no. 9: 3041. https://doi.org/10.3390/molecules27093041

APA Style

Tang, S., Chen, R., Lin, M., Lin, Q., Zhu, Y., Ding, J., Hu, H., Ling, M., & Wu, J. (2022). Accelerating AutoDock Vina with GPUs. Molecules, 27(9), 3041. https://doi.org/10.3390/molecules27093041

Article Metrics

Back to TopTop