
Lecture 1 Representation of biological systems as networks

Introduction to Network Analysis in Systems Biology
Avi Ma'ayan, Ph.D.
Department of Pharmacology and Systems Therapeutics
Systems Biology Center New York (SBCNY)
Mount Sinai School of Medicine
New York, NY

Two Fundamental Ways to Abstract Biochemical Reactions

Eisenberg et al. Nature 405:823 (2000)

Different Levels of System Representation

[Figure: five panels (A-E) drawing a network with node labels A1-A3, B1, C1-C2, D1-D3 and E1-E3; in the more detailed panels some nodes are split into sub-nodes such as A1.1/A1.2, C2.1/C2.2 and D3.1/D3.2]

A - gene ontology
B - protein-protein interactions (undirected graphs)
C - signaling network diagrams (mixed graphs, directed/undirected)
D - ODE modeling of signaling pathways (directed and weighted)
E - PDE modeling of signaling pathways considering space (directed, weighted, and nodes can move or be in different compartments)
Ma'ayan et al. Annu Rev Biophys Biomol Struct. 34:319-349 (2005)

Graph Theory - Basic Concepts


G = {V, E, A}
G - graph
V - vertices/nodes
E - edges/links
A - arcs/directed edges/arrows

Planar Graphs: graphs that can be drawn with no edge crossings

http://en.wikipedia.org

Bipartite Graphs: two sets of nodes; links only connect members of different sets, never two members of the same set
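
Whether a network is bipartite can be checked by trying to split its nodes into two sets with a breadth-first search. The sketch below is a minimal illustration, assuming an undirected network stored as an adjacency dictionary; the function name `two_color` and the toy node names are hypothetical and not taken from the slides.

```python
from collections import deque


def two_color(adjacency):
    """Return a node -> 0/1 assignment if the graph is bipartite, else None."""
    color = {}
    for start in adjacency:
        if start in color:
            continue  # already reached from an earlier start node
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in color:
                    # Put each neighbor in the opposite set.
                    color[neighbor] = 1 - color[node]
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    return None  # a link joins two nodes of the same set
    return color


# Hypothetical toy network: enzymes are linked only to substrates.
toy_metabolic = {
    "enzyme1": ["glucose", "G6P"],
    "enzyme2": ["G6P", "F6P"],
    "glucose": ["enzyme1"],
    "G6P": ["enzyme1", "enzyme2"],
    "F6P": ["enzyme2"],
}
# A triangle can never be split into two such sets.
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

print(two_color(toy_metabolic) is not None)
print(two_color(triangle) is None)
```

In the toy network the two enzymes end up in one set and the three substrates in the other, which is the property the metabolic network slides rely on.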

Metabolic Networks

Glycolysis

Bourqui et al. BMC Systems Biology 1:29 (2007)

Two types of nodes: enzymes and substrates
Reactions can be directional or bidirectional
Bipartite graph: reactions are not connected to each other, and substrates are not connected to each other
Berg et al. Biochemistry
New York: W. H. Freeman and Co.; c2002

Cell Signaling Pathways
Gi/o Pathway
Nodes are proteins, metabolites, lipids, second messengers, or peptides
Interactions designate information flow, can be activation or inhibition, and are direct and physical

Ma'ayan A, et al. Sci Signal. 2:cm1 (2009)

Cell Signaling Networks

Signaling pathways are not isolated and can be merged into large networks

Ma'ayan et al. Science 309:1078 (2005)

Indirect Signaling Interactions from Literature
Pseudo-nodes are used as placeholders to fill in unknown links and components

Li et al. PLoS Biol. 4:e312 (2006)

Kinase-Substrate Network
Protein kinase-substrate networks are directed bipartite graphs that connect kinases to their substrates through protein phosphorylation

Tan et al. Sci Signal. 2009 Jul 28;2(81):ra39

Example of Gene Regulation Networks


Stem cell differentiation regulation

Nodes are genes and transcription factors


Interactions can be directional or bidirectional
Interactions can be activation or inhibition

MacArthur et al., PLoS ONE 3: e3086 (2008)

Another Example of a Gene Regulation Network
Drosophila Segment Polarity Expression Pattern

Nodes are genes, transcription factors or signaling components
Interactions are directional and can be activation or inhibition

Albert R, Othmer HG. J Theor Biol. 2003 223(1):1-18.

Network Construction from Legacy Literature


Manual
Semi-automated (e.g. preBIND)
Natural Language Processing (NLP) (e.g. PathwayStudio)
preBIND

Donaldson I, et al. BMC Bioinformatics. 4:11 (2003)

PPI Networks from Y2H Screens

Yeast

Does the small overlap between the two studies mean that high-throughput Y2H screens are not identifying real interactions?

PPI Networks from Y2H Screens

Fly

Worm

Li et al. Science 303:540 (2004)

Giot et al. Science 302:1727 (2003)

PPI Networks from Y2H Screens

Human

Defined different levels of confidence
Identified disease genes
Assessed overlap with literature-based interactions
Used GO annotation

Blue - literature
Red - Y2H screen (~78% verified by Co-IP)

Epistasis Networks: Inferring Networks by Double Deletion Mutants
291 genetic interactions among 204 yeast genes

Tong et al. Science 294:2364 (2001)

Epistasis Interactions in Yeast Metabolism
Two types of links: buffering and aggravating
Links can be directional or bidirectional

Segre et al., Nature Genetics 37:77 (2004)

Inferring Networks from Time Series Microarrays

Zou M, Conzen SD. Bioinformatics. 2005 21(1):71-9.

Perturbations and Bayesian Networks

Networks can be inferred using targeted perturbations

Sachs et al. Science. 2005 308:523-9

Disease Gene Networks

Goh et al. Proc Natl Acad Sci USA. (2007) 104:8685-90


Each node corresponds to a distinct disorder, colored based on the disorder class. The size of
each node is proportional to the number of genes in the corresponding disorder, and the link
thickness is proportional to the number of genes shared by the disorders connected by the link.


Drug-Target Networks
Drugs can be connected to their known protein targets

Ma'ayan et al. Mt Sinai J Med (2007) 74:27

Yildirim et al. Nat Biotechnol. (2007) 25:1110

Bipartite Networks for Data Integration
Gene IDs can be used as anchors for integrating different omics datasets

Tanay et al. PNAS (2004) 101:2981

Pajek - Free Windows Software to Visualize Networks

http://vlado.fmf.uni-lj.si/pub/networks/pajek/

Cytoscape - Leading Academic Network Analysis and Visualization Software

Shannon et al. Genome Res. 2003 13(11):2498-504

Summary
Different types of biological intracellular molecular networks can be represented by different types of graphs
Networks can be created by collecting interactions published in many papers, or they can be reconstructed directly from data
Protein interaction networks and cell signaling networks can be connected to drugs and diseases
Network representation can be used to integrate different datasets using genes as anchors

Lecture 2 Milestones and key concepts in network analysis


Konigsberg Bridge Problem


P. Erdős and A. Rényi. Publ. Math. (Debrecen) 6, 290-297 (1959)
In the 1960s Paul Erdős and Alfréd Rényi studied the properties of random graphs.
What are the mathematical consequences of throwing a random number of buttons on the floor and randomly connecting them with a random number of links?
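
The buttons-and-links picture can be sketched as a G(n, p) random graph, in which every possible pair of nodes is linked independently with the same probability. This is one common formulation, chosen here for illustration; the function names and parameter values are assumptions, not something given on the slide.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return the edge list of a G(n, p) random graph on nodes 0..n-1."""
    rng = random.Random(seed)
    # Each of the n*(n-1)/2 possible links is included independently
    # with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def mean_degree(n, edges):
    """Average number of links per node (each edge touches two nodes)."""
    return 2 * len(edges) / n


edges = erdos_renyi(n=200, p=0.05, seed=1)
# For large n the mean degree should be close to p * (n - 1).
print(mean_degree(200, edges))
```

In such a graph most nodes have a degree near the mean, which is the contrast drawn later against scale-free networks.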

Real Networks are Small World

Watts DJ, Strogatz SH. Collective dynamics of 'small-world' networks. Nature. 1998 Jun 4;393(6684):440-2.

Clustering Coefficient

Ravasz et al. Science 297, 1551 (2002)

Characteristic Path Length


Average shortest path length between all possible pairs of nodes
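
Both measures can be computed directly from an adjacency structure. The sketch below assumes an undirected, unweighted network stored as a dictionary of neighbor sets; the clustering coefficient used is the standard local definition (fraction of a node's neighbor pairs that are themselves linked), and pairs of nodes with no connecting path are simply skipped, which is a simplifying assumption.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors of `node` that are linked to each other."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest distances in unweighted graphs.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy network: a triangle (a, b, c) with a tail node d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(clustering_coefficient(toy, "a"), clustering_coefficient(toy, "c"))
print(characteristic_path_length(toy))
```

A small-world network combines a high average clustering coefficient with a short characteristic path length.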


Creating Small-World Networks

Watts DJ, Strogatz SH. Collective dynamics of 'small-world' networks. Nature. 1998 Jun 4;393(6684):440-2.
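
The rewiring procedure described by Watts and Strogatz starts from a ring lattice and randomly rewires links. The following is a simplified sketch of that idea, assuming n nodes each linked to its k nearest ring neighbors (k even, n larger than k) and a rewiring probability called `beta` here; the function and parameter names are illustrative.

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Ring lattice with k neighbors per node, each link rewired with prob. beta."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Build the regular ring lattice: link each node to k/2 neighbors per side.
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            j = (i + offset) % n
            adj[i].add(j)
            adj[j].add(i)
    # Visit every lattice link once and rewire its far end with probability beta.
    for offset in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + offset) % n
            if j in adj[i] and rng.random() < beta:
                # Avoid self-loops and duplicate links.
                candidates = [t for t in range(n) if t != i and t not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


lattice = watts_strogatz(20, 4, beta=0.0, seed=0)      # fully regular
small_world = watts_strogatz(20, 4, beta=0.1, seed=0)  # a few shortcuts
```

With beta = 0 the lattice stays regular, and with beta = 1 it approaches a random graph; the small-world regime lies at small intermediate values, where a few shortcuts shorten paths while clustering stays high.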

Real Networks are Scale Free


Barabasi and Albert. Science 286, 509 (1999)

Barabási, Albert and colleagues found that many real networks, including the Internet and the WWW, are scale-free. This means that the connectivity distribution of nodes fits a power law.
Barabási's group analyzed databases of metabolic networks in lower organisms and the protein-protein interaction map of the yeast proteome inferred from high-throughput yeast two-hybrid screens. All were shown to have a scale-free connectivity distribution.

Jeong et al. Nature 407, 651 (2000)

Jeong et al. Nature 411, 41 (2001)

Erdős-Rényi random networks vs. Barabási-Albert scale-free networks

Barabasi, Physics World, July 2001

Creating Scale-Free Networks

Barabasi and Albert. Science 286, 509 (1999)
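
The growth model of Barabási and Albert adds nodes one at a time and links each new node preferentially to nodes that already have many links. A minimal sketch of that idea follows; starting from a small fully connected core and the names `barabasi_albert`, `n` and `m` are assumptions made for this example.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes; each new node links to m existing nodes."""
    rng = random.Random(seed)
    edges = []
    # `pool` holds one entry per link end, so drawing uniformly from it picks
    # a node with probability proportional to its current degree.
    pool = []
    # Assumed starting point: a fully connected core of m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.append((u, v))
            pool.extend([u, v])
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(pool))  # preferential attachment
        for target in targets:
            edges.append((new, target))
            pool.extend([new, target])
    return edges


edges = barabasi_albert(n=1000, m=2, seed=0)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
# A few early nodes become hubs with degrees far above the typical value.
print(max(degree.values()), sorted(degree.values())[len(degree) // 2])
```

The "rich get richer" rule is what produces the hubs and the heavy-tailed connectivity distribution, in contrast to the narrow degree spread of the random graph model.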


The Importance of Hubs

H. Jeong, S. P. Mason, A.-L. Barabási and Z. N. Oltvai. Lethality and centrality in protein networks. Nature 411, 41-42 (2001)
Albert R, Jeong H, Barabási A-L. Error and attack tolerance of complex networks. Nature 2000, 406(6794):378-382.

Creating Scale-Free Networks using Duplication-Divergence Growth

The network grows by copying a node with its links, then some links are deleted with probability p, and a link is formed between the copied node and the new node with probability q.
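
The growth rule above can be sketched in a few lines. This is a simplified reading of the slide text, not the exact published model: here each inherited link is dropped from the new copy only, and the network is assumed to start from two linked nodes. The function name and seed network are illustrative.

```python
import random


def duplication_divergence(n, p, q, seed=None):
    """Grow a network to n nodes by node duplication followed by link loss."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed network: two linked nodes
    while len(adj) < n:
        parent = rng.choice(list(adj))  # node to be copied
        new = len(adj)
        # Duplication: the copy inherits the parent's links.
        # Divergence: each inherited link is deleted with probability p.
        inherited = {nb for nb in adj[parent] if rng.random() >= p}
        adj[new] = set(inherited)
        for neighbor in inherited:
            adj[neighbor].add(new)
        # Link the copied node and the new node with probability q.
        if rng.random() < q:
            adj[new].add(parent)
            adj[parent].add(new)
    return adj


network = duplication_divergence(n=500, p=0.6, q=0.3, seed=1)
degrees = sorted((len(neighbors) for neighbors in network.values()), reverse=True)
print(degrees[:5])  # the most connected nodes
```

Because highly connected nodes are more likely to be neighbors of a duplicated node, they tend to gain links faster, which is how this mechanism can produce hubs without an explicit preferential-attachment rule.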

Vázquez et al. Complexus 1:1 (2003)

Creating Geometric Random Networks

Throwing a bunch of buttons in N dimensions and connecting buttons if they are close in Euclidean space (geometric distance between nodes)
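
A geometric random network can be sketched by placing points uniformly at random in a unit cube and linking pairs that fall within a distance threshold. The unit cube, the `radius` threshold and the function name are assumptions chosen for illustration.

```python
import math
import random
from itertools import combinations


def geometric_random_graph(n, radius, dim=2, seed=None):
    """Place n points in the unit cube and link pairs closer than `radius`."""
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
    edges = [
        (i, j)
        for i, j in combinations(range(n), 2)
        if math.dist(points[i], points[j]) <= radius  # Euclidean distance
    ]
    return points, edges


points, edges = geometric_random_graph(n=100, radius=0.15, dim=3, seed=4)
print(len(edges))
```

Unlike the purely random model, nearby nodes here tend to share neighbors, so the resulting networks show local clustering.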

Przulj et al. Bioinformatics. 2004 20:3508


Network Motifs are Recurring Patterns of Connectivity
Motifs are those circuits that are statistically more prevalent in real networks than in randomized networks
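
The comparison against randomized networks can be sketched for a single circuit type. The example below counts feed-forward loops (x->y, y->z, x->z) in a directed network and compares the count with degree-preserving shuffles obtained by swapping link targets; the choice of this circuit, the number of swaps and the z-score summary are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def count_ffl(edges):
    """Count feed-forward loops: x->y, y->z and x->z."""
    out = {}
    for u, v in set(edges):
        out.setdefault(u, set()).add(v)
    count = 0
    for x, targets in out.items():
        for y in targets:
            for z in out.get(y, ()):
                if z != x and z in targets:
                    count += 1
    return count


def shuffle_edges(edges, swaps, rng):
    """Swap link targets at random, keeping every node's in- and out-degree."""
    edges = list(edges)
    present = set(edges)
    for _ in range(swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        # Skip swaps that would create a self-loop or a duplicate link.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def motif_z_score(edges, n_random=100, seed=0):
    """Compare the real count with counts in shuffled versions of the network."""
    rng = random.Random(seed)
    real = count_ffl(edges)
    randomized = [
        count_ffl(shuffle_edges(edges, 10 * len(edges), rng))
        for _ in range(n_random)
    ]
    spread = pstdev(randomized)
    z = (real - mean(randomized)) / spread if spread else float("nan")
    return real, mean(randomized), z


# Hypothetical toy regulatory network.
toy = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
print(motif_z_score(toy))
```

A circuit would be called a motif when its count in the real network is consistently higher than in the shuffled controls, for example by a large positive z-score.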

Milo et al. Science, 298, 824 (2002)



Graphlets: motifs in undirected networks

Evolutionary conservation of motif constituents in the yeast protein interaction network
S. Wuchty et al. Nature Genetics 35, 176-179 (2003)

Considering Protein Structure of Hubs

Hub proteins are either multi-site or single-site

Kim et al. Science 314, 1938 (2006)

Bow-Tie Structure of Signaling Networks

Oda and Kitano. Molecular Systems Biology 2:2006.0015 (2006)



Hierarchical Organization of Pathways from Ligands to Effectors

Ma'ayan et al. Phys Rev E Stat Nonlin Soft Matter Phys. 2006 73:061912
Power-law distribution of branched pathways
A topology common for systems that need to make discrete decisions based on a continuous complex state of the environment

General Topological Properties of Biomolecular Networks
A - power-law connectivity distribution
B - party hubs and date hubs
C - multi-site and single-site hubs
D - power-law distribution of branched pathways
E - bow-tie structure of signaling pathways
F - bifans, the most common motifs
G - negative feedback loops at the membrane
H - monotone system topology
I - nesting of positive feedback loops

Ma'ayan A. J Biol Chem. 2009 284(9):5451

Ma'ayan et al. PNAS 105:19235 (2008)

MacArthur, Sanchez-Garcia and Ma'ayan, Phys. Rev. Lett. 104, 168701 (2010)

Summary
Real networks are small world and scale free
Simple algorithms can recreate the structure of real networks
Shuffled networks are created for statistical control
Network motifs and graphlets define the topology at the microscopic level
Real biological regulatory networks have date and party hubs; hubs are either multi-site or single-site; pathway branching follows a power law; signaling networks display a bow-tie structure; bifans are highly enriched; feedback loops are depleted and nested to provide dynamical stability.

Lecture 3 Making predictions using network analysis


Making Predictions based on Network Topology

Proteins close to each other in the interactome network are also likely to share GO terms
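
One simple way to turn this observation into a prediction is a neighbor vote: an unannotated protein is assigned the terms most common among its direct interaction partners. The sketch below illustrates that generic idea only; the protein names, the annotation labels and the function name are hypothetical.

```python
from collections import Counter


def predict_terms(adj, annotations, protein, top=1):
    """Suggest terms for `protein` by majority vote of its direct neighbors."""
    votes = Counter()
    for neighbor in adj.get(protein, ()):
        # Every annotated neighbor votes once for each of its terms.
        votes.update(annotations.get(neighbor, ()))
    return [term for term, _ in votes.most_common(top)]


# Hypothetical toy interactome and annotations.
interactions = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1"},
}
known_terms = {
    "P2": {"kinase activity"},
    "P3": {"kinase activity", "nucleus"},
    "P4": {"transport"},
}
print(predict_terms(interactions, known_terms, "P1"))
```

More elaborate schemes weight votes by network distance or use global optimization, but the underlying assumption is the same as on the slide.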

Sharan et al. Molecular Systems Biology 3:88 (2007)

Making Predictions based on Network Topology

Albert and Albert used the SUGGEST algorithm, which is used to organize products in a supermarket, to predict protein-protein interactions based on known protein-protein interactions

Making Predictions based on Network Topology

Completing defective cliques can be used to predict protein interactions

Yu et al. Bioinformatics 22, 7 (2006)

How can we use prior knowledge networks for analyzing multivariate experimental results?

Computational Modeling + Experiments (High-content)

Low-hanging fruit hypotheses

Induction of Neurite Outgrowth

The Goal is to Better Understand Initial Cell Signaling Activation of Transcription Factors After HU-210 Stimulation of CB1R Receptors

Govek et al. Genes & Dev. 19:1 (2005)

Study the Process of Cell Differentiation

Protein-DNA Arrays: Measuring Transcription Factor Activation

[Figure: protein-DNA array at 20 min; labels: DMSO, AP-2, RAR, CREB, MYB, STAT3, PAX6]

23 TFs increase binding to DNA after 20 minutes:
TFAP2A, CEBPA, NFYA, MYB, CREB1, NR3C1, STAT3, SMAD3, SMAD4, STAT4, THRA, THRB, VDR, GATA2, STAT1, PAX6, XBP1, NR1I2, HOXD8, HOXD9, HOXD10, RUNX2, HIVEP1

Validated factors with gel-shift assays
Bromberg KD, Ma'ayan A, Neves SR, Iyengar R. Science. 2008 May 16;320(5878):903-9.

[Figure: Genes2Networks workflow. Labels: signal, transcription factor, consensus promoter sequence; list of TFs; background datasets (including Vidal and Stelzl) passing through a filter and an integrator to give an unfiltered dataset and a filtered dataset; Genes2Networks; output subnetwork; significant intermediates. Counts shown in the figure: 29,317; 18,675; 7,241; 6,149; 4,242; 3,155; 3,121; 1,418; 1,059; 242]

Berger SI, Posner JM, Ma'ayan A. BMC Bioinformatics. 2007 Oct 4;8:372.

The Genes2Networks Algorithm


Inputs
- Large-scale mammalian protein-protein interaction network
- Seed list of proteins which are nodes in the background network

Algorithm
Step 1: Find all shortest paths for all pairs of nodes from the seed list
Step 2: Combine all links and nodes from all found shortest paths to form a subnetwork
Step 3: Add all missing links that directly connect any pair of nodes from the subnetwork using interactions from the background network
Step 4: Rank intermediate nodes (nodes that are not from the seed list) based on the proportion of links in the created subnetwork vs. total links in the background network using a binomial proportion test

Output
- Subnetwork connecting the seed nodes
- Table with ranked intermediate proteins
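
The four steps can be sketched on a toy background network. This is an illustrative reading of the slide, not the published implementation: the background network is assumed to be undirected and stored as a dictionary of neighbor sets, and the z-score formula is one plausible form of a binomial proportion test (a node's share of links in the subnetwork against its share in the background). All function and variable names are hypothetical.

```python
import math
from collections import deque


def nodes_on_shortest_paths(adj, source, targets):
    """Step 1: nodes lying on any shortest path from `source` to `targets`."""
    dist, preds = {source: 0}, {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                preds[neighbor] = [node]
                queue.append(neighbor)
            elif dist[neighbor] == dist[node] + 1:
                preds[neighbor].append(node)  # another equally short route
    found, stack = set(), [t for t in targets if t in dist]
    while stack:  # walk back from each target along all shortest paths
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(preds[node])
    return found


def genes2networks_sketch(adj, seeds):
    seeds = [s for s in seeds if s in adj]
    nodes = set(seeds)
    for seed in seeds:
        nodes |= nodes_on_shortest_paths(adj, seed, seeds)
    # Steps 2-3: path links plus all other background links among the
    # collected nodes, i.e. the subgraph induced by those nodes.
    sub = {n: adj[n] & nodes for n in nodes}
    links_sub = sum(len(v) for v in sub.values()) / 2
    links_bg = sum(len(v) for v in adj.values()) / 2
    # Step 4: rank intermediates by an assumed binomial proportion z-score.
    ranking = []
    for node in nodes - set(seeds):
        p_sub = len(sub[node]) / links_sub
        p_bg = len(adj[node]) / links_bg
        z = 0.0
        if 0 < p_bg < 1:
            z = (p_sub - p_bg) / math.sqrt(p_bg * (1 - p_bg) / links_sub)
        ranking.append((node, z))
    ranking.sort(key=lambda item: item[1], reverse=True)
    return sub, ranking


# Hypothetical background network and seed list.
background = {
    "A": {"X", "Y"}, "B": {"X", "Z"}, "X": {"A", "B", "Q"},
    "Y": {"A", "Z"}, "Z": {"Y", "B"}, "Q": {"X"},
}
subnetwork, ranked = genes2networks_sketch(background, ["A", "B"])
print(sorted(subnetwork), ranked)
```

Intermediates with many background links (promiscuous hubs) get a lower score than intermediates whose links are concentrated in the subnetwork, which is the purpose of the ranking step.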

Berger SI, Posner JM, Ma'ayan A. BMC Bioinformatics. 2007 Oct 4;8:372.

Genes2Networks Web Interface
- Hash function for fast loading of the datasets
- Implementation of AJAX allows changing the page without reloading
- GraphViz, Overlib, and PerlMagick library utilization

Berger SI, Posner JM, Ma'ayan A. BMC Bioinformatics. 2007 Oct 4;8:372.

http://actin.pharm.mssm.edu/genes2networks

Network Connecting Activated Factors

Bromberg KD, Ma'ayan A, Neves SR, Iyengar R. Science. 2008 May 16;320(5878):903-9.

Making Predictions by Network Analysis

Bromberg KD, Ma'ayan A, Neves SR, Iyengar R. Science. 2008 320(5878):903-9.


Experimental Validation

BRCA1 Blocks Neurite Outgrowth

PI3K-AKT Pathway is Important for Neurite Outgrowth and Regulates Many of the Identified Factors

Bromberg KD, Ma'ayan A, Neves SR, Iyengar R. Science. 2008 320(5878):903-9.

Predicting Disease Genes


Noonan Syndrome

Noonan Syndrome Symptoms
- Heart Defects
- Distinct Facial Features
- Learning Difficulties
- Bruising and Bleeding
- Mild upregulation of the MAPK pathway (gain-of-function mutations)
- Four disease genes were identified in about 60% of patients

Genes2Networks was used to find Additional Genes that may be Mutated in Noonan Syndrome

Use known disease genes to build a network around these genes to identify new genes/nodes that could be additional disease genes
Cordeddu V, Di Schiavi E, Pennacchio LA, Ma'ayan A, et al. Nat Genet. 41:1022 (2009)

Steiner Trees Used to Connect Seed Genes

White and Ma'ayan, 41st ACSSC 2007. IEEE p. 155-159

Steiner Trees Used to Connect Signaling Pathways to Gene Regulation

Huang SS, Fraenkel E. Sci Signal. 2009 2(81):ra40

PluriNet - Connecting Differentially Expressed Genes in Different Stem Cells Using Protein Interactions from Literature

Müller et al. Nature. (2008) 455:401

KEA: kinase-substrate interaction database and web-based system for kinase enrichment analysis

http://amp.pharm.mssm.edu/lib/kea.jsp
Lachmann and Ma'ayan. Bioinformatics 11, 87 (2010)

ChEA: ChIP-chip and ChIP-seq database of protein-DNA interactions and enrichment analysis tool

Unique transcription factors: 118
Publications: 107
Genes: 35,286
ChIP-X assays (ChIP-chip, ChIP-seq, ChIP-PET): >150
Average targets per transcription factor: 1,300
Total interactions: 254,854
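
Enrichment analysis of this kind asks whether an input gene list overlaps a transcription factor's target set more than expected by chance. The sketch below uses a hypergeometric tail probability (equivalent to a one-sided Fisher exact test), which is a common choice; whether the tool uses exactly this statistic is an assumption here, and the gene names and function name are hypothetical.

```python
from math import comb


def enrichment_p_value(input_genes, target_genes, background_size):
    """Probability of an overlap at least this large under random sampling."""
    input_genes, target_genes = set(input_genes), set(target_genes)
    overlap = len(input_genes & target_genes)
    n, k_targets = len(input_genes), len(target_genes)
    total = comb(background_size, n)
    # Sum the hypergeometric probabilities for overlaps >= the observed one.
    tail = sum(
        comb(k_targets, k) * comb(background_size - k_targets, n - k)
        for k in range(overlap, min(n, k_targets) + 1)
    )
    return overlap, tail / total


# Hypothetical example: 3 input genes, all among a factor's 3 targets,
# drawn from a background of 10 genes.
overlap, p = enrichment_p_value(
    ["g1", "g2", "g3"], ["g1", "g2", "g3"], background_size=10
)
print(overlap, p)
```

Repeating the calculation for every transcription factor in the database and sorting by p-value gives a ranked list of candidate regulators for the input genes.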

ChEA works well for determining TFs regulating gene expression changes: Myc was inferred as an effector of Estrogen in MCF7 cells

Lachmann A, Xu H, Krishnan J, Berger SI, Mazloom AR, and Ma'ayan A. ChEA: Transcription Factor Regulation Inferred from Integrating Genome-Wide ChIP-X Experiments. Bioinformatics, 26, 2438-44 (2010)

Summary
Prior knowledge networks can be used to predict the function of proteins, protein interactions and disease genes
Different algorithms can be used to connect seed lists of proteins with known interactions from prior knowledge networks
Network analysis can be used to develop hypotheses for functional experiments by combining high-throughput profiling data with prior knowledge networks

Slides from a lecture in the course Systems Biology: Biomedical Modeling

Citation: A. Ma'ayan, Introduction to network analysis in systems biology. Sci. Signal. 4, tr5 (2011).
