Exporting docking results

Exporting docking results#

AutoDock writes results in PDBQT format, which lacks bond orders and hydrogens bonded to carbon. When other software reads PDBQT files, it needs to infer bond orders, which can be impossible to do accurately for some molecules. As an example, consider the following conversion, in which OpenBabel adds an extra double bond:

obabel -:"C1C=CCO1" -o pdbqt --gen3d | obabel -i pdbqt -o smi
[C]1=[C][C]=[C]O1

Meeko can export docking results to SDF while preserving bond orders because when it writes the PDBQT file for a ligand (as input for docking) it writes a SMILES string and index mapping information in remark lines. Both AutoDock-GPU and AutoDock-Vina preserve the remarks in the output files, and Meeko then uses them to create an accurate RDKit molecule and write an SD file.

When docking with flexible sidechains, Meeko can write a PDB file for the entire receptor with the modified sidechain positions, or include the sidechains as additional fragments in the SDF for the ligand.

Since hydrogens bonded to carbon are excluded by default in AutoDock, exported positions of these hydrogens are calculated by RDKit. This can be annoying if a careful forcefield minimization is employed before docking, as probably rigorous Hs positions will be replaced by the RDKit geometry rules, which are empirical and much simpler than most force fields.

Command line usage#

The script reads vina output files (.pdbqt extension):

mk_export.py vina_results.pdbqt -s vina_results.sdf

And AutoDock-GPU output files (.dlg extension):

mk_export.py autodock-gpu_results.dlg -s autodock-gpu_results.sdf

It can also read files zipped with gzip (.gz extension), and multiple filenames can be passed, for example using wildcards. In this case, the output file names will be the input filenames with an added suffix. The suffix is _docked by default but can be modified with the --suffix flag:

mk_export.py docking_results/*.dlg.gz --suffix _docking_result

To write a PDB file with the entire receptor including modified sidechain positions, the prepared receptor written by mk_prepare_receptor.py or by the equivalent Python code is required using the -j/--read_json option, and the -p/--write_pdb option sets the filename of the output PDB file:

mk_export.py results.pdbqt -s results.sdf -p results.pdb -j rec.json

To write flexible sidechains as additional fragments to the ligand SDF, use the -k/--keep_flexres_sdf flag.

In Python#

This example converts an output file from AutoDock-GPU into a list of RDKit molecules, each with multiple conformers corresponding to a different docked pose:

from meeko import PDBQTMolecule
from meeko import RDKitMolCreate

fn = "autodock-gpu_results.dlg"
pdbqt_mol = PDBQTMolecule.from_file(fn, is_dlg=True, skip_typing=True)
rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(
    pdbqt_mol,
    only_cluster_leads=True,
    keep_flexres=False,
)

The length of rdkitmol_list is one if there are no sidechains and only one ligand was docked. If multiple ligands and/or sidechains are docked simultaneously, each will be an individual RDKit molecule in rdkitmol_list. In this example, keep_flexres=False would exclude sidechains from the RDKit molecules even if such sidechains existed. Note that docking multiple ligands simultaneously is only available in Vina, and it differs from docking multiple ligands one after the other. Each failed creation of an RDKit molecule for a ligand or sidechain results in a None in rdkitmol_list.

For Vina’s output PDBQT files, omit is_dlg=True:

pdbqt_mol = PDBQTMolecule.from_file("vina_results.pdbqt", skip_typing=True)

When using Vina from Python, the output string can be passed directly. See [the docs](https://autodock-vina.readthedocs.io/en/latest/docking_python.html) for context on the v object.

vina_output_string = v.poses()
pdbqt_mol = PDBQTMolecule(vina_output_string, is_dlg=True, skip_typing=True)
rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)