
DEVELOPMENT OF METHODS FOR DOCKING AND DESIGNING SMALL MOLECULES WITHIN THE ROSETTA CODE FRAMEWORK

A doctoral dissertation defense presented by

GORDON HOWARD LEMMON

ROSETTA

Tuesday September 18th 2012

Outline of presentation

A. What is structural biology?


B. Protein modeling and ligand docking

C. Introduction to Rosetta software


D. HIV-1 PR/PI binding affinity prediction

E. Rosetta software development


F. Ligand docking with waters using improved Rosetta ligand docking code


What is structural biology?


Structural biology is the study of the structure and function of biological molecules such as DNA, RNA, and proteins.

DNA

Proteins

How big are proteins?


1 Angstrom (Å) = 1 ten-millionth of a millimeter

Water (H2O): 1.51 Å
HIV-1 Protease (PR): ~54 Å, 3163 atoms
Amprenavir: ~17 Å, 72 atoms

Proteins consist of amino acid chains



Protein sequence determines structure

Protein structure determines function


HIV-1 protease cleaves polyprotein precursors to form functional proteins

HIV-1 protease
Peptide chain

Proteins are dynamic



B. Protein modeling and ligand docking

What is protein modeling?



Prediction of protein structure from


1. Sequence alone (de novo folding)

HIV-1 PR Amino Acid Sequence


ANPCCSNPCQNRGECMSTGFDQ YKCDCTRTGFYGENCTTPEFLTRI KLLLKPTPNTVHYILTHFKGVWNIV NNIPFLRSLIMKYVLTSRSYLIDSP PTYNVHYGYKSWEAFSNLSYYTR ALPPVADDCPTPMGVKGNKELPD SKEVLEKVLLRREFIPDPQGSNM MFAFF


2. Sequence similarity (Comparative modeling)

HIV-1 PR Sequence
PQITLWKRPLVTIRIGGQL KEALLDTGADDTVLEEMN LPGRWKPKMIGGIGGFIK VRQYDQIPIEICGHKAIGT VLVGPTPTNVIGRNLLTQI GCTLNF

HIV-1 PR

HIV-2 PR
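
Comparative modeling depends on how similar the target sequence is to a template of known structure. The following is a minimal sketch of one common similarity measure, percent identity over an already aligned pair. The function name and the toy sequences are assumptions made for illustration; they are not from the slides and are not real HIV-1 or HIV-2 PR data. Real pipelines would use a dedicated alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, gap-free columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (hypothetical, for illustration only).
target = "ACDEFGHIKL"
template = "ACDQFGHI-L"
print(f"{percent_identity(target, template):.1f}% identity")
```

Gap columns are excluded from the denominator here; other conventions exist, so the choice should be stated whenever identities are compared.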

What is ligand docking?


Prediction of structure of protein/ligand interface
Prediction of ligand binding affinity

C. Introduction to Rosetta software


Rosetta protein modeling consists of sampling and scoring


RosettaLigand docking consists of sampling and scoring
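
The interplay of sampling and scoring can be illustrated with a Metropolis Monte Carlo loop. This is a minimal sketch on a one-dimensional toy "pose", not Rosetta code: `toy_score`, the step size, and the temperature `kt` are all hypothetical choices made for the example.

```python
import math
import random


def toy_score(x: float) -> float:
    """Hypothetical stand-in for a score function; lower is better, minimum at x = 2."""
    return (x - 2.0) ** 2


def metropolis_accept(old_score: float, new_score: float, kt: float,
                      rng: random.Random) -> bool:
    """Metropolis criterion: always accept improvements, sometimes accept worse scores."""
    if new_score <= old_score:
        return True
    return rng.random() < math.exp(-(new_score - old_score) / kt)


def monte_carlo_search(start: float, steps: int, step_size: float,
                       kt: float, seed: int = 0):
    """Alternate random perturbation (sampling) with evaluation (scoring)."""
    rng = random.Random(seed)
    current, current_score = start, toy_score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = current + rng.uniform(-step_size, step_size)  # sampling
        candidate_score = toy_score(candidate)                    # scoring
        if metropolis_accept(current_score, candidate_score, kt, rng):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


best_x, best_s = monte_carlo_search(start=10.0, steps=2000, step_size=0.5, kt=0.1)
print(best_x, best_s)
```

Occasionally accepting a worse score is what lets the search escape local minima; the best pose seen is tracked separately so it is never lost.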


RosettaLigand score function


Knowledge-based score terms

Score term      Default weight
attractive      0.8
repulsive       0.4
solvation       0.6
dunbrack        0.4
pair            0.8
hbond_lr_bb     2.0
hbond_bb_sc     2.0
hbond_sc        2.0
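
A weighted score function of this kind is a linear combination of the individual terms. The sketch below applies the default weights listed above to raw term values; the raw values and the helper name `total_score` are hypothetical and only illustrate the arithmetic.

```python
# Default weights as listed on the slide.
DEFAULT_WEIGHTS = {
    "attractive": 0.8,
    "repulsive": 0.4,
    "solvation": 0.6,
    "dunbrack": 0.4,
    "pair": 0.8,
    "hbond_lr_bb": 2.0,
    "hbond_bb_sc": 2.0,
    "hbond_sc": 2.0,
}


def total_score(raw_terms: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of raw score terms; terms missing from raw_terms count as 0."""
    return sum(weight * raw_terms.get(name, 0.0) for name, weight in weights.items())


# Hypothetical raw term values for one docked model (illustrative only).
example_terms = {"attractive": -10.0, "repulsive": 2.0, "solvation": 3.0, "hbond_sc": -1.0}
print(total_score(example_terms))
```

Because the total is linear in the weights, changing a weight rescales only the contribution of its own term.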

D. HIV-1 PR/PI binding affinity prediction

HIV-1 PR is flexible

Simmerling 2005


HIV-1 PR becomes rigid upon PI binding

HIV-1 protease mutations



WHO drug resistance mutations in red

Mutation leads to conformational diversity

FDA approved protease inhibitors (PIs)



Lopinavir Tipranavir

Darunavir

Atazanavir

Previous PR/PI ΔG predictions failed

Experimental vs Predicted HIV-1 PR ΔG

Score Function                   Correlation (N=112)
Number of non-hydrogen atoms     0.172
X-Score (HPScore)                0.341
SYBYL (ChemScore)                0.276
DS (PMF04)                       0.183
DrugScore (PairSurf)             0.225
AutoDock                         0.38

Cheng (2009); Jenwitheesuk E, Samudrala R (2003)

Defining ΔG and ΔΔG

176 experimental PR/PI ΔGs
171 PR template structures

176 PR/PI ΔGs: sequence but not structure
34 sequences
10 distinct protease inhibitors

171 PR structures represent PR flexibility

RosettaLigand PR/PI ΔG predictions


PR/PI ΔG prediction workflow

176 sequence/PI pairs
171 PR template structures
30,096 Rosetta inputs
10 Rosetta relaxed models per input (300,960 models)
1000 RosettaLigand docked models per relaxed model (300,960,000 docked models)
Top 10% of models by total score for each sequence/PI pair
Top models by interface score for each sequence/PI pair

RosettaLigand Docking

Random 5 Å translation, complete rotation of PI
0.1 Å, 5° PI movements
Side chain and ligand rotamer sampling
x6
Energy filter
MC accept
Minimization of PR side chain and PI torsion angles
Minimize backbone torsion angles
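
The model counts in the workflow follow from simple multiplication, and the final selection is a two-stage filter. The sketch below reproduces the counts and illustrates the filter. The function `select_models`, its `n_final` parameter, and the example scores are assumptions made for illustration; the slides do not state how many top models by interface score were kept.

```python
N_SEQUENCE_PI_PAIRS = 176
N_TEMPLATES = 171
RELAXED_PER_INPUT = 10
DOCKED_PER_RELAXED = 1000

inputs = N_SEQUENCE_PI_PAIRS * N_TEMPLATES   # every pair threaded onto every template
relaxed = inputs * RELAXED_PER_INPUT
docked = relaxed * DOCKED_PER_RELAXED
print(f"{inputs:,} inputs, {relaxed:,} relaxed models, {docked:,} docked models")


def select_models(models, top_fraction=0.10, n_final=1):
    """Keep the best fraction by total score, then rank survivors by interface score.

    `models` is a list of (total_score, interface_score) tuples; lower is better.
    """
    by_total = sorted(models, key=lambda m: m[0])
    n_keep = max(1, int(len(by_total) * top_fraction))
    survivors = by_total[:n_keep]
    return sorted(survivors, key=lambda m: m[1])[:n_final]


# Hypothetical scores for one sequence/PI pair (illustrative only).
example_models = [
    (-120.0, -3.0), (-110.0, -6.0), (-100.0, -5.0), (-95.0, -4.0), (-90.0, -9.0),
    (-85.0, -2.0), (-80.0, -7.0), (-70.0, -1.0), (-60.0, -8.0), (-50.0, -20.0),
]
print(select_models(example_models, top_fraction=0.2))
```

Filtering by total score first means a model with an excellent interface score but a poor overall score, such as the last entry in the example, is never selected.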

Reweighting score terms improves HIV-1 PR/PI ΔG predictions

Score term

Default weight

Optimized weights

attractive repulsive solvation dunbrack pair hbond_lr_bb hbond_bb_sc hbond_sc

0.8 0.4 0.6 0.4 0.8 2.0 2.0 2.0 0.16

0.71 -0.01 0.68 0.29 0.80 0.85 0.09 -0.35 0.38

0.31 0.17 0.15 0.43 0.80 0.11 -0.20 1.71

CORRELATIONS (R)

0.51

Assuming constant unbound ΔG improves PR/PI ΔG predictions

Standard approach Constant unbound approach
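
The difference between the two approaches can be sketched as follows, under the assumption that the standard approach subtracts a per-complex unbound score from the bound score, while the constant unbound approach replaces the unbound score with a single shared constant. The function names and all scores are hypothetical and are not data from the study.

```python
def binding_score_standard(bound_score: float, unbound_score: float) -> float:
    """Standard approach: bound score minus the unbound score of the same complex."""
    return bound_score - unbound_score


def binding_score_constant_unbound(bound_score: float,
                                   unbound_constant: float = 0.0) -> float:
    """Constant unbound approach: the unbound contribution is one shared constant."""
    return bound_score - unbound_constant


# Hypothetical (bound, unbound) scores for three complexes (illustrative only).
complexes = {"A": (-30.0, -12.0), "B": (-28.0, -5.0), "C": (-25.0, -11.0)}

standard = {k: binding_score_standard(b, u) for k, (b, u) in complexes.items()}
constant = {k: binding_score_constant_unbound(b) for k, (b, _) in complexes.items()}

print(sorted(standard, key=standard.get))  # ranking depends on both states
print(sorted(constant, key=constant.get))  # ranking depends on the bound state only
```

With a shared constant, the ranking of complexes and any correlation with experiment depend only on the bound-state scores, so errors in modeling the flexible unbound state no longer enter the prediction.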

Correlation plots
Experimental on X, predicted on Y

Default weights: R=0.16
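
The R values quoted for these plots are correlation coefficients between experimental and predicted values. A minimal sketch of the Pearson correlation is given here, assuming that is the statistic reported; the data points are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical experimental (X) and predicted (Y) values, illustrative only.
experimental = [-12.0, -11.0, -13.5, -10.0, -12.5]
predicted = [-11.0, -11.5, -13.0, -9.0, -11.0]
print(round(pearson_r(experimental, predicted), 2))
```

Pearson's R is unchanged by adding a constant to, or positively rescaling, the predictions, so it measures how well the predictions track the trend in the experimental values rather than their absolute agreement.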

Previous PR/PI ΔG predictions failed

Experimental vs Predicted HIV-1 PR ΔG

Score Function                   Correlation (N=112)
Number of non-hydrogen atoms     0.172
X-Score::HPScore                 0.341
SYBYL::ChemScore                 0.276
DS::PMF04                        0.183
DrugScorePDB::PairSurf           0.225
AutoDock                         0.38
RosettaLigand                    0.71

E. Rosetta software development

Flexibility through fragments

1. Fragment the ligand
2. Search database for fragments
3. Assemble rotamer libraries
4. Sample from libraries during docking

Ligand fragment rotamers allow efficient flexibility



Ligand rotamer docking


Ligand docking with interface design



DHT

RosettaLigand prediction
A54R

Enlarged prostate gland

L50Y

C9R

prostate cancer

DHT

DHT: Dihydrotestosterone
HisF: imidazole glycerol phosphate synthase

HisF

Fragment based screening can greatly expand sampling space


Traditional screening
Fragment based screening

Congreve, M. et al. Drug Discov. Today 2003, 8, 876-877

Common drug based Fragments


[Figure: chemical structures of common drug-based fragments]

Hartshorn, M. J.; Murray, C. W. et al. J. Med. Chem. 2005, 48, 403-413

RosettaLigandDesign

Library of small molecule fragments
Place fragments in protein binding site
Dock ligand with flexible protein side-chains and backbone
Select low energy models for refinement

Figure labels: -10, -12, -7, -5

RosettaLigandDesign

Library of small molecule fragments
Place fragments in protein binding site
Dock ligand with flexible protein side-chains and backbone
Select low energy models for refinement

Figure labels: -8, -15, -10, -18, -12

Examples of fragments
Core fragment
Carbon

1 connection

Oxygen

Nitrogen

2 connections
CH2 connections

Ntrp connections

Random assembly of fragments



Rosetta ligand design in action


A. Low-res search for starting fragment
B. Refine (dock) starting fragment
C. Grow small-molecule using fragment library
D. Refine (dock) 2-fragment complex
E. Grow small-molecule using fragment library
F. Refine (dock) 3-fragment complex
G. Add hydrogens to unsatisfied connection points

Protein binding sites are complex



Inorganic phosphate

Dethiobiotin (DTB)

Mg Ions

ADP

Multiple Ligand docking may capture induced fit effects



Serial Docking

Simultaneous Docking

Rosetta multiple ligand docking



F. Ligand docking with waters using improved Rosetta ligand docking code


Binding of HIV-1 protease inhibitors involves H2O

Translation of water and PI



Rotation of water and PI



RMSD measures accuracy of docked models

Root mean square deviation

6 Angstrom (Å) RMSD

2 Angstrom (Å) RMSD
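
RMSD between a docked ligand pose and a reference pose can be computed directly from matched atom coordinates. The sketch below assumes both poses list the same atoms in the same order and share a coordinate frame, so no superposition is applied. The coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical three-atom ligand; the docked pose is shifted 2 Angstrom along x.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(x + 2.0, y, z) for x, y, z in native]
print(rmsd(native, docked))
```

A rigid shift of every atom by the same distance gives an RMSD equal to that distance, which makes this a convenient sanity check for any RMSD routine.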


Protein-centric waters improve HIV-1 protease placement


Ligand-centric waters improve CSAR inhibitor placement


CSAR: Community Structure-Activity Resource
299 protein/ligand structures with interface waters

RMSDs vs Rosetta scores



Waters improve docking in noncrowded interfaces



Interface crowdedness correlates with helpfulness of water docking

Conclusions

Binding affinity predictions can be improved by
  Optimizing Rosetta score term weights
  Ignoring the unbound state

New RosettaLigand code allows
  Multiple ligand docking
  Fragment based rotamers for greater flexibility
  Fragment based design of ligands

Docking with waters helps in spacious binding cavities, hurts in crowded binding cavities

Professional acknowledgements

Meiler Lab: Jens Meiler, Kristian Kaufmann, Sam Deluca, Steven Combs

Committee: David Tabb, Richard D'Aquila, Brian Bachmann, Jarrod Smith

RosettaCommons
Molecular Biophysics Training Grant (NIH)

Personal acknowledgements

Church Friends

