Perform a structure-based virtual screen on the command line
12. This tutorial carries on from last week's LIDAEUS tutorial. Now we want to dock the top 10 compounds (as ranked by LIDAEUS) using Vina, so make a new directory on the University filesystem (so ssh into mscoc2 first) and move into it:
mkdir vina
cd vina
13. The script sdfgrep.ksh is useful for manipulating SD files in various ways. Running it with no options or arguments will show all the things it can do (try it). Take the top 10 compounds from the LIDAEUS output and redirect them to a new file (the '../' part denotes that the output.sdf file is in the directory above your working directory):
~douglas/scripts/docking/sdfgrep.ksh -t ../output.sdf 10 > top10.sdf
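To confirm that the new file really holds 10 compounds, you can count its records. This is a minimal sketch that relies only on the SD file convention that every record ends with a line of four dollar signs; the helper name count_sdf_records is made up for illustration and is not part of sdfgrep.ksh:

```shell
# Hypothetical helper: count the records in an SD file.
# Every record in an SD file is terminated by a line starting with "$$$$".
count_sdf_records() {
    grep -c '^\$\$\$\$' "$1"
}

# For the file written in step 13 this should print 10.
if [ -f top10.sdf ]; then
    count_sdf_records top10.sdf
fi
```

If the number is not what you expect, re-check the path to output.sdf before continuing.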
14. Copy your protein and sitepoints files from the directory above to your working directory:
cp ../protein.pdb .
cp ../sites.mol2 .
15. Now you need to create a file called hostnames.txt that will tell the "job farming" program which computers you want Vina to run on. Type 'emacs hostnames.txt' to open a new file in the text editor Emacs (see here for how to use Emacs: http://doors.stanford.edu/~sr/computing/emacs.html). You need to enter into this file the name of the computer for each processor core you will use. For example, to use 2 cores on the mscoc2 server, your hostnames.txt file will look like this:
mscoc2.bch.ed.ac.uk
mscoc2.bch.ed.ac.uk
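If you prefer not to use a text editor, the same file can be written from the shell. The sketch below is only one way of doing it; make_hostnames is a hypothetical helper (not one of the course scripts) that prints the given hostname once per core:

```shell
# Hypothetical helper: print one hostname line per processor core.
make_hostnames() {
    host=$1
    ncores=$2
    i=0
    while [ "$i" -lt "$ncores" ]; do
        printf '%s\n' "$host"
        i=$((i + 1))
    done
}

# Two cores on mscoc2, matching the example above.
make_hostnames mscoc2.bch.ed.ac.uk 2 > hostnames.txt
cat hostnames.txt
```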
16. You are now ready to dock the top 10 compounds using Vina. The script 'autoAD' automates the process of running the 10 separate docking jobs. It also automates the process of converting the compounds from .sdf format to individual Vina-compatible .pdbqt files, and converting the protein from plain .pdb format to .pdbqt format. Running 'autoAD' with no options or arguments will show all the options available. Try that first and read the output, then perform the docking with:
~douglas/scripts/docking/autoAD.ksh -cqpjv top10.sdf protein.pdb sites.mol2
17. It may ask whether you are sure you wish to connect; if so, type 'yes'. It may then prompt you to enter your password twice. With only 10 compounds, the run should take only a few minutes, and autoAD.ksh will exit automatically once all jobs are done. Next, extract the docking scores from the logfiles that were created (one per compound/docking job) and create a single SDF file containing all your docking poses. The script 'rankAD.ksh' does this: the -c option tells it to check the Vina output for certain errors, the -v option tells it that its input is Vina (not Autodock) output, and the -s option creates the SDF file containing the docked poses. As with autoAD.ksh, running it with no arguments shows a help page (try this first).
~douglas/scripts/docking/rankAD.ksh -cvs top10.sdf *.dlg
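If the ranking step reports missing results, a quick check is to compare the number of logfiles with the number of compounds, since there should be one logfile per docking job. This is a rough sketch, assuming the logfiles are the *.dlg files in the current directory (as in the command above); count_logfiles is a hypothetical helper, not part of the course scripts:

```shell
# Hypothetical helper: count the *.dlg logfiles in a directory
# (defaults to the current directory).
count_logfiles() {
    n=0
    for f in "${1:-.}"/*.dlg; do
        # The unexpanded pattern is left as-is when nothing matches,
        # so only count entries that actually exist.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Compare against the number of records in the input SD file.
if [ -f top10.sdf ]; then
    compounds=$(grep -c '^\$\$\$\$' top10.sdf || true)
    echo "compounds: $compounds, logfiles: $(count_logfiles .)"
fi
```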
18. You need to copy the Vina results directory to your laptop, so in the appropriate UWIN window type:
scp -r ED\\yourusername@mscoc2.bch.ed.ac.uk:LIDAEUS_tutorial/vina .
cd vina
Now you can inspect the docking results by loading the relevant files into Pymol. Again, look for interactions such as hydrogen bonds.
19. You should also look at the rankedlist.txt file as this contains a summary of the docking scores:
more rankedlist.txt
20. We now want to dock the top 4 Vina hits using Autodock. On mscoc2, first create an Autodock directory inside the Vina directory and set up your files:
mkdir autodock
cd autodock
cp ../protein.pdb .
cp ../sites.mol2 .
~douglas/scripts/docking/sdfgrep.ksh -t ../reranked_dockedposes.sdf 4 > top4.sdf
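The Autodock run also needs a hostnames.txt file in this directory. One option, assuming the file you made in step 15 is still in the vina directory one level up, is to copy it rather than retype it. The helper name copy_hostnames is hypothetical:

```shell
# Hypothetical helper: copy an existing hostnames.txt into the current
# directory. The default source path assumes the file from step 15 is
# in the parent (vina) directory.
copy_hostnames() {
    src=${1:-../hostnames.txt}
    if [ -f "$src" ]; then
        cp "$src" .
    else
        echo "No hostnames file at $src" >&2
        return 1
    fi
}

copy_hostnames || echo "Create hostnames.txt by hand instead."
```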
21. The autoAD script also automates the Autodock run, as it did with Vina, only this time different options are used. Remember to create (or copy) your hostnames.txt file first! The options are similar to those used for Vina, except that -j and -v are not required; instead, -g prepares the grid parameter .gpf file, -m runs Autogrid, -d prepares the docking parameter .dpf file, and -s starts the Autodock job farming (in place of the Vina job farming option -v used previously).
~douglas/scripts/docking/autoAD.ksh -cqpgmds top4.sdf protein.pdb sites.mol2
22. Once the run is finished, you can log out and again inspect the results visually using Pymol. Remember to first run rankAD.ksh to extract the docked poses from the Autodock logfiles:
~douglas/scripts/docking/rankAD.ksh -cas top4.sdf *.dlg