LIDAEUS tutorial
First set up your environment variables by copying the .profile file I sent you to your UWIN home directory. Then start a new instance of UWIN.
Now, typing into the UWIN terminal, log in to our University filesystem ('mscoc2' is the name of the server we use for MSc courses):
ssh ED\\yourusername@mscoc2.bch.ed.ac.uk
1. Use the commands 'pwd' and 'ls' to see which directory (folder) you're in (your working directory) and to list its contents. If you're happy with the directory you're in, make a new directory inside it:
mkdir LIDAEUS_tutorial
2. Change your working directory to the new directory:
cd LIDAEUS_tutorial
3. Open a second UWIN window on your laptop - you will now have one window where you are logged in to the University filesystem, and another where you are on your own computer. In this new window, make a new directory and 'cd' into it. Copy the PDB file of the protein-ligand complex to the directory you're in by using 'scp'. This is a special type of cp command that can copy files over the internet. 'mscoc2.bch.ed.ac.uk' is the address of the University MSc server. So the following command will copy from the specified directory on the server to the current directory on your computer:
scp ED\\yourusername@mscoc2.bch.ed.ac.uk:~douglas/teaching/tutorial/3LBL.pdb .
TAKE A MINUTE TO UNDERSTAND EXACTLY HOW THIS WORKS!
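As an aid to reading that command, here is a minimal sketch that splits the remote argument into its parts using ordinary shell parameter expansion. The string is copied from the command above ('yourusername' is still a placeholder), and the variable names are illustrative only. Nothing is contacted over the network.

```shell
# The remote argument to scp has the form  user@host:path
# Single quotes keep the backslash literal, so this is the same string the
# shell produces from the unquoted ED\\yourusername in the command above.
src='ED\yourusername@mscoc2.bch.ed.ac.uk:~douglas/teaching/tutorial/3LBL.pdb'

login=${src%%:*}        # everything before the first ':'  (user@host)
remote_path=${src#*:}   # everything after the first ':'   (path on the server)
user=${login%@*}        # part before '@' (account name, with the ED\ prefix)
host=${login#*@}        # part after '@'  (server address)

printf 'user: %s\n' "$user"
printf 'host: %s\n' "$host"
printf 'path: %s\n' "$remote_path"
```

In the path, '~douglas' is expanded on the server to the home directory of the user 'douglas'. The final '.' in the scp command is the destination: the current directory on your own computer.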
4. Open this file in Pymol. Note that there are 3 protein-ligand complexes in the asymmetric unit. Select one of the ligands and orient on its position in the active site of its receptor protein. Now remove all the water molecules, then save one of the protein molecules as protein.pdb and its ligand as ligand.pdb. You now need to copy these two files to your LIDAEUS_tutorial directory on mscoc2 (good luck!).
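Before copying the files, it can be worth checking from the terminal that no waters are left in protein.pdb. The following is a minimal sketch, assuming the waters appear as HETATM records with residue name HOH (the usual PDB convention). The function name and the toy file with made-up coordinates are illustrative only; point the function at your own protein.pdb instead.

```shell
# Count water records (residue name HOH) in a PDB file.
# PDB is a fixed-column format: the residue name occupies columns 18-20.
count_waters() {
    awk 'substr($0, 1, 6) == "HETATM" && substr($0, 18, 3) == "HOH" { n++ }
         END { print n + 0 }' "$1"
}

# Toy stand-in for protein.pdb (made-up coordinates, illustration only).
workdir=$(mktemp -d)
cat > "$workdir/protein.pdb" <<'EOF'
ATOM      1  N   GLY A  16      11.104   6.134  -6.504  1.00 20.00           N
ATOM      2  CA  GLY A  16      11.639   6.071  -5.147  1.00 20.00           C
HETATM 2001  O   HOH A 301      10.000  10.000  10.000  1.00 20.00           O
END
EOF

waters=$(count_waters "$workdir/protein.pdb")
printf 'water records still present: %s\n' "$waters"
rm -rf "$workdir"
```

The toy file deliberately contains one water, so the sketch reports 1. For a correctly cleaned protein.pdb the count should be 0.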
5. Mapgen is a script that automates the calculation of LIDAEUS sitepoints. Note that, like many UNIX programs, it displays usage instructions when you run it without options or arguments:
~douglas/teaching/tutorial/mapgenWin.ksh
Calculate energy maps and create sitepoints by running the mapgen script on your ligand and protein files:
~douglas/teaching/tutorial/mapgenWin.ksh -m ligand.pdb protein.pdb sites.mol2
Mapgen will ask you for values that determine the size of the box to draw around your ligand and the energy cutoffs below which a sitepoint will be generated. If you press return without typing a number, the default value (displayed in brackets) will be used. The larger your pad value, the slower the process will be: a larger box is defined, so larger map files have to be calculated.
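To get a feel for why the pad value matters so much, here is a rough back-of-the-envelope sketch. It rests on an assumption that is NOT taken from mapgen itself: that the box is simply the ligand's extent plus the pad on every side. The function name and the ligand extent of 10 angstroms are made up for illustration.

```shell
# HYPOTHETICAL model, not taken from mapgen: assume each edge of the box is
# the ligand extent plus the pad on both sides, i.e. extent + 2 * pad.
box_volume() {
    extent=$1    # ligand extent along each axis, in angstroms (made-up value)
    p=$2         # pad, in angstroms
    side=$((extent + 2 * p))
    echo $((side * side * side))
}

for pad in 2 4 8; do
    printf 'pad %s -> box volume %s cubic angstroms\n' "$pad" "$(box_volume 10 "$pad")"
done
```

Under this assumption the volume, and with it the amount of map data, grows with the cube of the box edge, so each increase in pad costs more than the previous one.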
6. Once the calculations are finished, you should inspect the location of the sitepoints using Pymol, making sure they are in your binding site of interest. This will require you to 'scp' the sites.mol2 file from mscoc2 to your computer. Sitepoints denote locations where certain atom types would be favourable; red denotes hydrogen bond acceptors and blue denotes hydrogen bond donors. If any sitepoints are outside the binding site, delete them and save the remaining sitepoints as sites.mol:
File -> Save molecule -> select sites -> select file type .mol from the drop-down menu
This .mol file needs to be converted to mol2 format. 'scp' it to mscoc2 and use the program OpenBabel to do this:
babel -imol sites.mol -omol2 sites.mol2
Then 'scp' the converted sites.mol2 file back to your computer.
7. You want sufficient sitepoints to be located within the active site pocket of the protein target, and a balance between the three different types of sitepoints. If you decide that there are not enough sitepoints, or not enough of a particular type (e.g. acceptors - red), rerun mapgen, this time altering the parameters from their defaults. Increasing the pad parameter, decreasing the energy cutoffs, and increasing the target number of sitepoints will all increase the number of sitepoints written out. Be careful not to have too many sitepoints: fewer than 220 (after making deletions in Pymol) is best, as more will slow the LIDAEUS run, but you want at least 100.
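The sitepoint count can also be checked from the terminal. The following is a minimal sketch, assuming sites.mol2 is a standard Tripos mol2 file with one ATOM record per sitepoint (an assumption about the file layout, not something stated by mapgen). The function names and the three-point toy file, including its atom names and types, are made up for illustration.

```shell
# Count records in the @<TRIPOS>ATOM section of a mol2 file
# (ASSUMED here: one ATOM record per sitepoint).
count_sitepoints() {
    awk '/^@<TRIPOS>/ { in_atoms = ($0 ~ /^@<TRIPOS>ATOM/); next }
         in_atoms && NF > 0 { n++ }
         END { print n + 0 }' "$1"
}

# Classify a count against the guideline in step 7:
# at least 100 and fewer than 220.
check_sitepoints() {
    if [ "$1" -lt 100 ]; then
        echo "too few"
    elif [ "$1" -ge 220 ]; then
        echo "too many"
    else
        echo "ok"
    fi
}

# Toy stand-in for sites.mol2 (made-up content, illustration only).
workdir=$(mktemp -d)
cat > "$workdir/sites.mol2" <<'EOF'
@<TRIPOS>MOLECULE
sites
3 0 0 0 0
SMALL
NO_CHARGES

@<TRIPOS>ATOM
1 O1 1.000 2.000 3.000 O.2 1 SIT 0.000
2 N1 4.000 5.000 6.000 N.3 1 SIT 0.000
3 C1 7.000 8.000 9.000 C.3 1 SIT 0.000
@<TRIPOS>BOND
EOF

n_sites=$(count_sitepoints "$workdir/sites.mol2")
verdict=$(check_sitepoints "$n_sites")
printf '%s sitepoints: %s\n' "$n_sites" "$verdict"
rm -rf "$workdir"
```

Run on your own file, count_sitepoints sites.mol2 gives the number to compare against the 100 to 220 guideline after you have made your deletions in Pymol.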
8. Once you are happy with your sitepoints, you can set up the LIDAEUS run.
Log in to the LIDAEUS website: http://opus.bch.ed.ac.uk/lidaeus/
Click through each of the stages, following the instructions for uploading your files. Upload your sitepoints file, sites.mol2.
Type your name into the "Run name" box
Set the "Purpose of this run" to "Computational tutorial"
On the Options page, select the "Test set (1000)" from the drop-down menu
Keep the top 20 results.
Set the sitepoint matching tolerance to 0.04 A.
Once you have uploaded all your files and submitted your options, click the link to Submit. The LIDAEUS run should be very quick as you are only docking 1000 compounds; the top 20 of these will be in a file called output.sdf which will be in a zip file emailed to you. Download this zip file to your working directory and unpack it. You will see the file output.sdf - this contains your docked compounds.
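A quick sanity check after unpacking is to count the compounds in output.sdf. This sketch relies on the SD file convention that every record ends with a line of four dollar signs. The function name and the two-record toy file are illustrative only; with the settings above, your real file would be expected to hold 20 records.

```shell
# Count the records in an SD file: each record ends with a "$$$$" line.
count_sdf_records() {
    grep -c '^\$\$\$\$' "$1"
}

# Toy stand-in for output.sdf with two empty records (illustration only).
workdir=$(mktemp -d)
cat > "$workdir/output.sdf" <<'EOF'
compound_1
  toy

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
$$$$
compound_2
  toy

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
$$$$
EOF

n_records=$(count_sdf_records "$workdir/output.sdf")
printf 'records in output.sdf: %s\n' "$n_records"
rm -rf "$workdir"
```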
9. Before we can look at your top 20 hits in Pymol, we need to convert them to a Pymol-friendly format. This must be done on mscoc2, so you will need to 'scp' the file output.sdf to the University filesystem first. Then log in, 'cd' to the correct directory, and type:
~douglas/scripts/fixsdf.ksh output.sdf
10. 'scp' the output.sdf file back to your computer. Open your protein file, sitepoints and output.sdf using Pymol.
11. You can scroll through the 20 docked compounds using the grey arrows in the lower right pane. They are ranked in order of LIDAEUS score, with the compound predicted to bind most tightly coming first. Do the compounds make good interactions with the protein? Make a note of what kind of interactions are made (e.g. hydrogen bonds, van der Waals, electrostatics). Are they positioned within the active site pocket? Are they similar to each other or is there a range of chemical classes?