Preprocess#

preprocess workflow

Ligand preparation

Build 3D ligand structures from SMILES and export docking-ready files such as SDF, PDB, or PDBQT.

Receptor preparation

Repair, minimize, clean, and convert receptor structures into docking-ready outputs for downstream engines.

Grid box definition

Estimate the docking search region from ligand coordinates and export Vina-compatible box parameters.

Ligand preparation#

LigandPrep

Convert SMILES into 3D ligand structures for docking workflows.

Use LigandPrep when you want to prepare ligands from SMILES input and export them into standard formats such as SDF, PDB, or PDBQT.

Typical use cases include:

  • preparing small ligand batches from lists,

  • reading ligands from tables or DataFrames,

  • generating docking-ready PDBQT files,

  • keeping generated MolBlocks in memory.

from prodock.preprocess import LigandPrep

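# Queue two ligands by SMILES, then build 3D structures and export them as PDBQT.
ligands = (
    LigandPrep(output_dir="ligands")  # prepared files are written under ligands/
    .from_smiles_list(
        [
            "COC1=C(C=C2C(=C1)N=CN=C2NC3=CC(=C(C=C3)F)Cl)OCCCN4CCOCC4",
            "COCCOC1=C(C=C2C(=C1)C(=NC=N2)NC3=CC=CC(=C3)C#C)OCCOC",
        ],
        names=["gefitinib", "erlotinib"],  # one name per SMILES, in the same order
    )
    .set_output_format("pdbqt")
    .process_all()
)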

print(ligands.summary)
print(ligands.output_paths)
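
For the table use case listed above, the SMILES strings and names first have to be pulled out of the table. The sketch below does this with the standard-library csv module only. The helper read_smiles_table, the column names "smiles" and "name", and the ligand_<row> fallback naming are assumptions made for illustration and are not part of ProDock.

```python
import csv
import io


def read_smiles_table(handle, smiles_col="smiles", name_col="name"):
    """Return (smiles, names) lists from a CSV table, skipping rows without a SMILES."""
    smiles, names = [], []
    for i, row in enumerate(csv.DictReader(handle)):
        smi = (row.get(smiles_col) or "").strip()
        if not smi:
            continue  # nothing to prepare for this row
        # Hypothetical fallback: unnamed rows get a name based on their row index.
        name = (row.get(name_col) or "").strip() or f"ligand_{i}"
        smiles.append(smi)
        names.append(name)
    return smiles, names


# In-memory table standing in for a real CSV file.
table = io.StringIO("name,smiles\nethanol,CCO\nbenzene,c1ccccc1\n,CC(=O)O\n")
smiles, names = read_smiles_table(table)
print(smiles)
print(names)
```

The two lists line up index by index, so they can be handed to from_smiles_list(smiles, names=names) in the same way as the hard-coded lists in the example above.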

Receptor preparation#

ReceptorPrep

Clean, minimize, and export a receptor into a docking-ready artifact.

Use ReceptorPrep when you start from a receptor PDB and want a prepared output for docking.

The high-level workflow handles:

  • receptor fixing,

  • minimization,

  • conversion to PDB or PDBQT,

  • fallback handling when one preparation route fails.

from prodock.preprocess import ReceptorPrep

# Prepare the receptor PDB and write the PDBQT output under receptor_out/.
receptor = ReceptorPrep().prep(
    input_pdb="EGFR_1M17.pdb",
    output_dir="receptor_out",
    out_fmt="pdbqt",
)

print(receptor.final_artifact)
print(receptor.last_simulation_report)
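
The fallback handling mentioned in the workflow list can be pictured as trying preparation routes in order until one succeeds. The sketch below shows that general pattern only. It does not reflect how ReceptorPrep is implemented, and the function prepare_with_fallback and the routes strict_route and lenient_route are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence, Tuple

Route = Tuple[str, Callable[[str], str]]


def prepare_with_fallback(
    input_pdb: str, routes: Sequence[Route]
) -> Tuple[str, str, List[str]]:
    """Try each route in order; return (route_name, artifact, failures so far)."""
    failures: List[str] = []
    for name, route in routes:
        try:
            return name, route(input_pdb), failures
        except Exception as exc:  # a failed route is recorded, not fatal
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all preparation routes failed: " + "; ".join(failures))


# Hypothetical routes standing in for real preparation backends.
def strict_route(pdb: str) -> str:
    raise ValueError("missing heavy atoms")


def lenient_route(pdb: str) -> str:
    return pdb.replace(".pdb", ".pdbqt")


used, artifact, failures = prepare_with_fallback(
    "EGFR_1M17.pdb", [("strict", strict_route), ("lenient", lenient_route)]
)
print(used, artifact, failures)
```

Keeping the list of failures alongside the result makes it possible to report which route produced the final artifact and why earlier ones were skipped.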

Grid box definition#

GridBox

Define the docking search region from ligand geometry.

Use GridBox when you want to derive a docking box from one ligand or from several reference ligands.

This is commonly used for:

  • reference-ligand-guided docking,

  • reproducible docking box generation,

  • Vina-compatible center and size export.

from prodock.preprocess import GridBox

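box = (
    GridBox()
    .load_ligand("AQ4.sdf")  # reference ligand; AQ4 is the ligand ID of erlotinib in PDB entry 1M17
    .from_ligand_pad(pad=4.0, isotropic=False)  # pad the ligand extent; isotropic=False keeps per-axis sizes
)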

print(box.center)
print(box.size)
print(box.to_vina_lines())
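
To make the idea of a ligand-derived box concrete, the sketch below computes a padded bounding box from raw atom coordinates and formats it with the center_x/size_x keys used in AutoDock Vina configuration files. It is a minimal illustration under two assumptions: that the padding is added on both sides of every axis, and that an isotropic box uses the largest padded edge on all axes. GridBox may define either differently. The functions padded_box and vina_lines are hypothetical helpers, not ProDock API.

```python
def padded_box(coords, pad=4.0, isotropic=False):
    """Return (center, size) of the axis-aligned box around coords, padded on each side."""
    xs, ys, zs = zip(*coords)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    # Assumption: pad is applied on both sides, so each edge grows by 2 * pad.
    size = tuple((hi - lo) + 2 * pad for lo, hi in zip(mins, maxs))
    if isotropic:
        size = (max(size),) * 3  # cube with the largest padded edge
    return center, size


def vina_lines(center, size):
    """Format center and size as Vina-style 'key = value' lines."""
    keys = ("center_x", "center_y", "center_z", "size_x", "size_y", "size_z")
    return "\n".join(f"{k} = {v:.3f}" for k, v in zip(keys, center + size))


# Two made-up atom positions (in Å) spanning the ligand extent.
coords = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
center, size = padded_box(coords, pad=4.0, isotropic=False)
cube_center, cube_size = padded_box(coords, pad=4.0, isotropic=True)
print(vina_lines(center, size))
```

With isotropic=False the box follows the ligand's shape along each axis, which keeps the search volume smaller for elongated ligands.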

Minimal end-to-end example#

Example

Preprocess a full docking setup

from prodock.preprocess import LigandPrep, ReceptorPrep, GridBox

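# 1. Ligands: build 3D structures from SMILES and export them as PDBQT.
ligands = (
    LigandPrep(output_dir="project/ligands")
    .from_smiles_list(
        ["CCO", "c1ccccc1"],
        names=["ethanol", "benzene"],
    )
    .set_output_format("pdbqt")
    .process_all()
)

# 2. Receptor: prepare the PDB and export it as PDBQT.
receptor = ReceptorPrep().prep(
    input_pdb="EGFR_1M17.pdb",
    output_dir="project/receptor",
    out_fmt="pdbqt",
)

# 3. Search box: derive it from the reference ligand, then snap it to a 0.25 step.
box = (
    GridBox()
    .load_ligand("AQ4.sdf")
    .from_ligand_pad(pad=4.0, isotropic=False)
    .snap(step=0.25)
)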

print("Ligands:", ligands.summary)
print("Receptor:", receptor.final_artifact)
print("Box:")
print(box.to_vina_lines())
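
The page does not describe what snap(step=0.25) does internally. One plausible reading, sketched below, is that box values are moved onto a regular grid so that repeated runs report identical numbers: centers are rounded to the nearest multiple of the step, and sizes are rounded up so the box never shrinks. Both rules, the helper names snap_center and snap_size, and the sample numbers are assumptions for illustration only.

```python
import math


def snap_center(value, step=0.25):
    """Round a center coordinate to the nearest multiple of step."""
    return round(value / step) * step


def snap_size(value, step=0.25):
    """Round a box edge up to the next multiple of step."""
    # The small epsilon keeps values already on the grid from being bumped up by float noise.
    return math.ceil(value / step - 1e-9) * step


# Made-up box values, not taken from any real structure.
center = tuple(snap_center(c) for c in (21.97, 0.61, 52.13))
size = tuple(snap_size(s) for s in (17.31, 14.0, 19.88))
print(center)
print(size)
```

Rounding sizes up rather than to the nearest step is a conservative choice: the snapped box always contains the unsnapped one along each axis.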

API and next steps#

See also#

  • Preprocess API — full reference for LigandPrep, ReceptorPrep, and GridBox

  • Structure API — low-level conversion and structure utilities

  • Dock — continue to the docking stage