
Basic Molecular Genetic Mechanisms


Structure of Nucleic Acids
Transcription of Protein-Coding Genes and Formation of mRNA
Decoding of mRNA
Stepwise Synthesis of Proteins on Ribosomes
DNA Replication
DNA Repair and Recombination
Viruses
Molecular Genetic Techniques
Analysis of Mutations to Identify and Study Genes
Cloning and Characterization (PCR, ECP)
Using Cloned Fragments to Study Gene Expression
Locating and IDing Human Disease Genes
Inactivating Eukaryotic Genes
Genes, Genomics, and Chromosomes
Eukaryotic Gene Structure
Chromosomal Organization of Genes and Noncoding DNA
Transposable DNA Elements
Organelle DNAs
Genomics
Structural Organization of Eukaryotic Chromosomes
Morphology and Functional Elements of Eukaryotic Chromosomes
Transcriptional Control of Gene Expression
Control of Gene Expression in Bacteria
Overview of Eukaryotic Gene Control
RNA Polymerase II Promoters and General Transcription Factors
Regulatory Sequences and Proteins

Molecular Mechanisms of Transcriptional Repression and Activation


Regulation of Transcription-Factor Activity
Epigenetic Regulation of Transcription
Post-Transcriptional Gene Control
Processing of Eukaryotic Pre-mRNA
Regulation of Pre-mRNA Processing
Transport of mRNA Across the Nuclear Envelope
Cytoplasmic Mechanisms of Post-transcriptional Gene Control
Processing of tRNA & rRNA

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Basic Molecular Genetic Mechanisms
Structure of Nucleic Acids
DNA and RNA are both polymers composed of nucleotides
RNA has more diverse functions (e.g. as a catalyst)
All nucleotides are made of an organic base + a 5-carbon sugar + a phosphate group
Purines = A & G = two fused rings; pyrimidines = C, T, U = a single ring
the 5' end has a phosphate or hydroxyl group on the 5' carbon of its terminal sugar, the 3' end has a hydroxyl
on the 3' carbon of its terminal sugar (sequences are read 5' --> 3')
the bond between nucleotides is a phosphodiester bond (2 phosphoester bonds)
Chargaff's rule: A pairs with T and G pairs with C, so [A]=[T] and [G]=[C] in double-stranded DNA
DNA usually right-handed double helix, normally less compact B form (when water
is removed in lab, it turns into A form)

DNA has hydrogen at the 2' position, making it more chemically stable than RNA, which has a
hydroxyl at the 2' position
DNA denatures (melts) at the temperature Tm, which rises with G-C content
(G-C pairs are more stable, with 3 hydrogen bonds vs 2 for A-T) and with ion concentration
(direct relationship); extreme pH also denatures DNA
denatured DNA may renature, a feature used in hybridization of sequences
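A rough sketch of the G-C effect on melting, using the Wallace rule of thumb for short oligonucleotides (2 degrees C per A/T, 4 degrees C per G/C). The rule and the two example sequences are not from these notes; they are only meant to illustrate the trend, and the rule is not accurate for long DNA or unusual salt conditions.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace rule estimate of Tm (degrees C) for a short oligo."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 14-mers: same length, opposite base composition.
at_rich = "ATATATATATATAT"
gc_rich = "GCGCGCGCGCGCGC"
print(gc_fraction(at_rich), wallace_tm(at_rich))  # 0.0 28
print(gc_fraction(gc_rich), wallace_tm(gc_rich))  # 1.0 56
```

Same length, but the G-C rich oligo is estimated to melt much higher, which is the relationship the note describes.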
topoisomerase I relieves torsional stress by transiently breaking one strand of the DNA
RNA is hydrolyzed into mononucleotides in an alkaline solution
RNA secondary structure: hairpin, stem-loop
RNA tertiary structure: pseudoknot; folded RNAs can act as ribozymes
Transcription of Protein-Coding Genes and Formation of mRNA
a gene is a DNA sequence that specifies synthesis of one polypeptide or functional
RNA
miRNAs = microRNAs, short RNAs that regulate the translation and stability of mRNAs
RNA polymerase transcribes, building the RNA in the 5' --> 3' direction
RNA polymerase binds to a promoter, separates the DNA in a 12-14 bp region called
the transcription bubble, and continues along at a rate of ~1000 nucleotides/minute until the stop site,
where it releases the RNA strand. In prokaryotes only, translation at the 5' end
can begin before transcription of the sequence has finished
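As a minimal sketch of the direction rule: the polymerase reads the template strand 3' --> 5' and builds the complementary RNA 5' --> 3'. The function name and the example template are made up for illustration.

```python
# DNA template base -> RNA base added opposite it
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5'->3') made from a template strand written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5.upper())


print(transcribe("TACGGCATT"))  # AUGCCGUAA
```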
In prokaryotes, an operon containing multiple genes is transcribed from a single
promoter, in eukaryotes each protein-encoding gene has its own promoter
bacterial RNA polymerase has 2 large subunits (beta, beta prime) and 2 smaller alpha
subunits, as well as one omega subunit for structural stability
exons vs introns (introns are common in multicellular eukaryotes, rare in bacteria); operons are
clusters of functionally related genes encoding multiple proteins
In eukaryotes, RNA synthesis creates pre-mRNAs, which are converted to mRNA by RNA
processing: a 5' cap is added to prevent degradation, poly(A)
polymerase adds a poly(A) tail (100-250 nucleotides) at the 3' end to help guide the mRNA
through the cell, and RNA splicing removes the introns
mRNA still retains untranslated regions at the ends
alternative splicing joins different combinations of exons to create multiple proteins from a single gene
(fibronectin is an example)
Decoding of mRNA
translation code degenerate because multiple codons can specify the same amino
acid
AUG is the start codon; UGA, UAG, and UAA are stop codons
the sequence of codons from a start codon to a stop codon is a reading frame
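A small sketch of reading a frame: find the first AUG, then translate codon by codon until a stop codon. It uses the standard genetic code (built compactly from the usual U, C, A, G ordering); the function name and the example mRNA are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_first_orf(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # UGA, UAG, or UAA
            break
        protein.append(aa)
    return "".join(protein)


# Reading frame here: AUG GCU UGG UAA
print(translate_first_orf("GGAUGGCUUGGUAAGC"))  # MAW
```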


occasionally a ribosome can frame-shift to read multiple proteins from one RNA sequence
mitochondria, ciliated protozoans, and Acetabularia (a single-celled plant) have
variations in the genetic code (e.g. reading stop codons as amino acids)
tRNAs are linked to amino acids by aminoacyl-tRNA synthetases (each of the 20
synthetases recognises 1 amino acid and all its cognate tRNAs, i.e. the tRNAs whose anticodons
read the codons for it)
The reaction uses ATP to attach an amino acid to the 2' or 3' hydroxyl at the end of
the acceptor stem in a high-energy bond; the amino acid is said to be activated, and this energy is
used to create the peptide bonds between amino acids
The enzyme recognizes the correct tRNA by the structure of the anticodon
loop & acceptor stem, + particular bases in the tRNA
the enzyme also proofreads, so the total error is 1 in 50000 codons (in E. coli)
the D loop and the TΨCG loop interact with the ribosome to maintain the stability
of the temporary tRNA-mRNA bond
the third position of a codon is the wobble position, which can pair with multiple
types of tRNA anticodons; this means that some tRNAs can recognise multiple
codon triplets that code for the same amino acid, so many cells contain fewer than
the 61 tRNAs that would otherwise be necessary
many tRNAs have inosine (I) in place of adenine in the wobble position, which can
recognise A, C, or U in the mRNA (e.g. CUA, CUC, CUU and UUA, which all code
for Leu, are all recognized by the 3'-GAI-5' anticodon)
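A sketch of third-position wobble, assuming the usual wobble pairing rules (anticodon 5' base C reads G; A reads U; U reads A or G; G reads C or U; I reads A, C, or U). It only models wobble at the third codon position, so for the 3'-GAI-5' anticodon it lists the three CUN codons; the UUA case also needs a nonstandard G-U pair at the first codon position, which this sketch does not model. Names are hypothetical.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Anticodon wobble base (its 5' base) -> codon third-position bases it can read
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "ACU"}


def codons_read(anticodon_5to3: str) -> list[str]:
    """Codons (5'->3') readable by an anticodon written 5'->3'."""
    wobble_base, middle, last = anticodon_5to3
    first = WATSON_CRICK[last]     # pairs with codon position 1
    second = WATSON_CRICK[middle]  # pairs with codon position 2
    return sorted(first + second + third for third in WOBBLE[wobble_base])


# 3'-GAI-5' written 5'->3' is IAG
print(codons_read("IAG"))  # ['CUA', 'CUC', 'CUU']
```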
Stepwise Synthesis of Proteins on Ribosomes
rRNA is the most abundant RNA in the cell; ribosomes work at 3-5 amino acids per second
up to 3 hours to make the largest proteins (e.g. titin)
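A quick arithmetic check of the "up to 3 hours" figure. The titin length used here (~34,350 amino acids) is an assumed approximate value that is not given in these notes; the rates are the 3-5 amino acids per second from above.

```python
TITIN_LENGTH_AA = 34_350  # assumed approximate length, not from the notes


def synthesis_hours(length_aa: int, aa_per_second: float) -> float:
    """Time for one ribosome to make a protein at a constant rate."""
    return length_aa / aa_per_second / 3600


print(round(synthesis_hours(TITIN_LENGTH_AA, 3), 1))  # 3.2
print(round(synthesis_hours(TITIN_LENGTH_AA, 5), 1))  # 1.9
```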
ribosomes are made of 3 RNA molecules in prokaryotes (pro), 4 RNA molecules in eukaryotes
(euk)
Large subunit contains 1 molecule of large rRNA (23S pro, 28S euk), 1 molecule of 5S
rRNA (S = Svedberg/sedimentation units), and 1 molecule of 5.8S rRNA in vertebrates (total
50S pro, 60S euk)
Small subunit contains 1 molecule of small rRNA (16S pro, 18S euk)
total ribosome is 70S in bacteria (30S small subunit, 50S large subunit), 80S in
eukaryotes (40S small subunit, 60S large subunit)
plants and yeasts can be larger

a specific initiator tRNA, tRNAiMet, binds to the P site on the small subunit to initiate
translation (it reads the AUG start codon)
eukaryotic translation initiation factors (eIFs) mediate initiation:
1. a 40S subunit from a ribosome finished with translation binds eIFs 1, 1A, and 3; together with
the ternary complex from step 2 this forms the 43S preinitiation complex
2. eIF2 bound to GTP binds Met-tRNAiMet (the ternary complex)
3. an eIF4 complex binds to the mRNA to activate it (5' cap and poly(A) tail)
4. the preinitiation complex binds to the eIF4/mRNA complex
5. eIF4 works as a helicase to unwind the RNA secondary structure as the
complex scans along the mRNA
6. The AUG start codon is recognized, causing the GTP on eIF2 to be hydrolyzed
to GDP (a sort of proofreading switch to confirm that the start codon has been found),
creating a 48S initiation complex
7. The large (60S) subunit joins the small subunit, over the mRNA
8. When this occurs correctly, the remaining GTP (bound to eIF5B) is hydrolyzed to GDP as a
proofreading switch, the eIFs are released, and the full 80S ribosome is formed
Some mRNAs have an internal ribosome entry site (IRES), an RNA
structure that interacts with eIFs to bind the 40S subunit, which is then joined by
the 60S subunit.
Elongation Factors (EFs) guide the translocation of ribosomes along
the mRNA sequence
the peptidyltransferase reaction is catalyzed by the large rRNA itself (not a
protein), and GTP hydrolysis by the EFs is again used as the switch between steps
A, P, E sites
to terminate translation, a Release Factor (eRF1) binds directly to the A site,
recognises the stop codon, and works with eRF3-GTP to cleave the completed
protein from the tRNA in the P site.
the protein ABCE1 uses ATP to separate the ribosomal subunits and release the mRNA and RFs,
and the IFs come marching in again to form the 43S preinitiation complex

DNA Replication
eukaryotic replication was first studied with the SV40 virus
this virus uses only 1 viral protein (large T-antigen) to replicate its DNA; the rest come
from the host cell

DNA Repair and Recombination


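DNA polymerase inserts about 1 incorrect nucleotide per 10^4 polymerized; with proofreading and repair the
observed mutation rate is far lower (about 1 in 10^9 in bacteria)
proofreading comes from the 3' --> 5' exonuclease activity of the polymerase (a newly added nucleotide is moved to the
exonuclease site (a polymerase domain) and removed unless it forms a base pair with the template DNA)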
the bond between a purine and deoxyribose is prone to hydrolysis!
mitochondria and peroxisomes create hydroxyl radicals and superoxide, which also
damage dna
point mutations are a change in 1 base pair
nonsense mutations are new stop codons in the wrong place
missense mutations are a change in the amino acid sequence
silent mutations don't change the amino acid sequence (changes between synonymous codons)
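The three kinds of point mutation above can be told apart by comparing the amino acid before and after the change. A minimal sketch using the standard genetic code, with codons written as RNA; the function name and example codons are chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_point_mutation(codon: str, mutant: str) -> str:
    """Classify a single-codon change as silent, nonsense, or missense."""
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


print(classify_point_mutation("CUA", "CUC"))  # silent   (Leu -> Leu)
print(classify_point_mutation("UGG", "UGA"))  # nonsense (Trp -> stop)
print(classify_point_mutation("GAG", "GUG"))  # missense (Glu -> Val)
```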
deamination is one of the more frequent mutations: C-->U (or 5-methyl-C-->T, common in humans)
excision-repair systems excise damaged DNA and repair it by referencing the
complementary strand as a template
in the case of a G-T mismatch, the repair system knows to automatically replace T
with C
(because it must have been caused by deamination)
this all has to happen before replication or the error wouldn't be recognized
depurination is the loss of a guanine or adenine base through hydrolysis of its
bond to deoxyribose, which forms abasic sites, which generate mutations
mismatch excision repair corrects mismatches after replication
nucleotide excision repair fixes chemically modified bases that distort the DNA
shape locally,
including thymine-thymine dimers, in which adjacent thymines bond covalently to each other,
warping the DNA (caused by UV light)
excision repair uses helicase, polymerase, ligase, as normal
X-rays, gamma radiation and anticancer drugs lead to double-strand breaks that often
rejoin incorrectly
nonhomologous end-joining (NHEJ) is the predominant way to repair these
breaks, although a few base pairs are lost at the joining site
sometimes, ends from 2 different chromosomes are joined together (a translocation), which can create
cancer-causing hybrid genes
homologous recombination was found to be very important when a strong
correlation between mutations in its genes and cancer was discovered
it is an important repair mechanism
it repairs a break using the homologous sequence on a separate DNA molecule, the same process as genetic
recombination, an important source of genetic diversity
replication fork collapse, caused by a nick in the phosphodiester backbone, can be
fatal to the cell (destroys the genetic data)
it's repaired by strand invasion: the broken end base-pairs with the complementary strand of the intact DNA
branch migration is when the cell uses ATP to extend the hybrid zone away from
the break
branch = where the target DNA strand crosses from the intact DNA to its broken
complement
a Holliday structure is then formed, as the broken-off strand base pairs with the
bottom, intact strand, crossing over the hybrid strand
the strands are cut at the crossover point and the breaks are ligated (this makes it so
the 2 original strands of DNA, as well as the newly replicated DNA, are all preserved)
Viruses
there are RNA and DNA viruses: RNA replicates in cytoplasm, DNA replicates in
nucleus
viruses can encode between 4 and 200 proteins
the infectious particle is a virion
the host range is the set of cells that a virus can infect (usually pretty narrow, one
phylum max)
phage
vesicular stomatitis virus has a wide host range (insects and many mammals)
polio and many other viruses only infect specific cell types (intestine for polio)
HIV infects lymphocytes and glial cells
the capsid is the protein coat around the nucleic acid of a virus
it's built from many copies of a small number of distinct proteins, to minimize the nucleic acid needed
to encode it
capsid + nucleic acid = nucleocapsid
helical nucleocapsids are rodlike tubes with the nucleic acid in a helical groove (e.g.
tobacco mosaic virus)
icosahedral nucleocapsids are roughly spherical shells with 20 faces (each an equilateral
triangle)
some use grooves between capsid subunits to interact with host cells, some use
fibers extending from the surface
many bacteriophages have an icosahedral head and a rod-shaped tail

many viruses have an envelope of a phospholipid bilayer + a few glycoproteins


plaque assays can find how many infectious viral particles are in a sample
it's done by culturing a diluted sample of viral particles on host cells and then counting the
number of lesions (plaques) that develop
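The plaque count is usually converted to a titer in plaque-forming units (PFU) per mL by correcting for dilution and plated volume. A minimal sketch of that standard calculation; the numbers below are made-up example values.

```python
def titer_pfu_per_ml(plaques: int, dilution: float, volume_ml: float) -> float:
    """PFU/mL of the original sample.

    dilution  -- fraction of the original concentration that was plated (e.g. 1e-6)
    volume_ml -- volume of diluted sample spread on the host cells
    """
    return plaques / (dilution * volume_ml)


# Hypothetical plate: 50 plaques from 0.1 mL of a 10^-6 dilution
print(f"{titer_pfu_per_ml(50, 1e-6, 0.1):.2e} PFU/mL")  # 5.00e+08 PFU/mL
```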
lytic cycle:
adsorption: binding of virion surface proteins to receptors on the host cell
penetration, replication, assembly, release
temperate phages establish a nonlytic association that doesn't kill the cell
prophage = integrated viral DNA
this is lysogeny
must switch to the lytic cycle at some point to get out of the cell again--this is
induction
retroviruses, reverse transcriptase
cancer-causing genes in retroviruses

Molecular Genetic Techniques


Analysis of Mutations to Identify and Study Genes
mutation analysis reveals the genes required for a process, the order in which genes act
in the process, and whether the encoded proteins interact with each other
allele
wild type=non-mutated, standard gene
phenotype/genotype
mutagen
<bunch of heredity stuff...>
conditional mutations are used to isolate mutants; the most common type is
temperature-sensitive mutations, which can be isolated in bacteria and lower
eukaryotes but not in warm-blooded eukaryotes. These are mutations whose proteins function
at one temperature, but denature at another (when a normal
protein would be stable throughout that temperature range)
the nonpermissive temperature is the one at which the mutant phenotype is observed (permissive is the
opposite)
mutations are used to find the order of protein function (with mutations defective
at a certain point in the process)
genetic suppression occurs when a mutation in the structure of one protein is
matched by a mutation in an interacting protein, such that the
process involving the proteins is unimpeded iff both mutations occur (this helps
determine if 2 proteins interact)
synthetic lethality is the opposite (i.e. the process fails iff both mutations
occur)
mutations can be used to map genes by tracking instances of crossing over (when
you get 0 or both mutations in the recombinant chromosome), as fewer
recombinations occur if the genes are closer together (discovered by A. Sturtevant)
1 genetic map unit (1 centimorgan) is the distance between 2 positions that results in a 1/100
recombination rate
loci are unlinked if the recombinant rate = parental rate (i.e. 50% recombinants)
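The map-unit definition turns into a one-line calculation: percent recombinant progeny = distance in map units. A small sketch with made-up progeny counts; the function names and the tolerance used for "unlinked" are assumptions.

```python
def map_distance_cm(recombinants: int, total_progeny: int) -> float:
    """Genetic distance in map units (centimorgans) = percent recombinants."""
    return 100 * recombinants / total_progeny


def looks_unlinked(recombinants: int, total_progeny: int, tolerance: float = 2.0) -> bool:
    """Unlinked loci give about 50% recombinants (recombinant rate = parental rate)."""
    return abs(map_distance_cm(recombinants, total_progeny) - 50) <= tolerance


print(map_distance_cm(17, 200))   # 8.5
print(looks_unlinked(17, 200))    # False
print(looks_unlinked(101, 200))   # True
```

Note that recombination frequency underestimates distance for far-apart loci because of double crossovers, so this is only a good estimate for closely linked genes.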
Cloning and Characterization (PCR, ECP)
recombinant DNA: DNA joined from different sources
vector: a DNA molecule that can replicate inside a host cell (commonly E. coli), combined with a DNA fragment (the one you want
to replicate)
DNA is cut using restriction enzymes, which recognize restriction sites (usually
palindromic) on DNA
the bacteria that produce restriction enzymes also produce modification enzymes,
which add methyl groups to native DNA to prevent restriction enzymes (REs) from
cutting at that point
sticky ends are produced by staggered cuts in the double helix (leaving single-stranded
tails), blunt/flush ends are produced when the enzyme cuts across both strands
at the same place
a given RE will always cut its recognition sequence wherever it occurs, giving a reproducible set of fragments; sites are spaced on average 4^n bases
apart (n = length of restriction site)
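A sketch of the 4^n estimate and of locating sites in a sequence. The 4^n spacing assumes all four bases are equally likely at every position; EcoRI's site GAATTC is used as the example, and the test sequence is made up.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def expected_spacing(site: str) -> int:
    """Average distance between sites, assuming random sequence with equal base frequencies."""
    return 4 ** len(site)


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    reverse_complement = "".join(COMPLEMENT[b] for b in reversed(site))
    return reverse_complement == site


def find_sites(seq: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of the site."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


ECORI = "GAATTC"
print(expected_spacing(ECORI))                      # 4096
print(is_palindromic(ECORI))                        # True
print(find_sites("AAGAATTCGGCCGAATTCTT", ECORI))    # [2, 12]
```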
ligase can easily join 2 complementary sticky ends with covalent phosphodiester
bonds, ligase from bacteriophage T4 can inefficiently join blunt ends
SmaI & AluI REs produce blunt ends
plasmids are rings of extrachromosomal DNA found in bacteria and lower eukaryotes; E.
coli ones (cut down to 1-3 kb) are often used as cloning vectors
plasmids have a replication origin (ORI), a marker that can be selected (e.g. drug
resistance) and a place to insert DNA
once a host cell starts replicating a plasmid at the ORI, it will continue replicating
the rest of the plasmid, including inserted DNA
in transformation, ~1/10000 E. coli cells mixed with modified plasmids will take up
a plasmid (these are then selected for with the marker and left to reproduce)
fragments from a few bp up to ~10000 bp can be inserted into plasmid vectors
vector versatility is increased with polylinkers, which are synthetic sequences containing
several different restriction sites, meaning that the vector can be used with DNA
sequences cut with multiple REs.
Bacterial Artificial Chromosomes (BACs) are used to clone long (millions bp)
sequences; one type uses an ORI derived from the F factor
DNA libraries are collections of DNA molecules each cloned into vectors; the
cloned set of all sequences in a genome is a genomic library
because large genomes contain too many introns, complementary DNA (cDNA)
libraries, which store DNA copies of mRNAs, are used for higher eukaryotes
poly(A) tails are used to pick out mRNA from the cell (by hybridizing to oligo-dT, a short run of thymidylates)
the mRNA is then copied into cDNA with reverse transcriptase (thx HIV)
This is then methylated to prevent cleavage, ligated to an EcoRI linker with T4
ligase, and inserted into an E. coli vector
differences in transcription rates mean that the # of occurrences of a gene in a cDNA
library is variable (libraries thus contain millions of individual recombinant clones)
libraries are screened either with oligonucleotide probes (~20 nucleotides) that hybridize to the selected
clone, or by finding genes based on the proteins they encode
probes use hybridization: denature the library replica and add the (fluorescent)
probe, then renature it and wash away the excess probe, scan sample for
fluorescent objects (ie the hybrid dna)
gel electrophoresis:
for sequences of 10-2000 bp, use acrylamide gels; 2-20 kb need agarose gels
subcloning is rearranging parts of genes (e.g. changing out a promoter)
PCR is what it is
denature and then add in synthetic oligonucleotides (primers) in excess
100 bp DNA fragments can be sequenced by PCR-like synthesis using DNA
polymerase and fluorescently labeled chain-terminating nucleotides
Using Cloned Fragments to Study Gene Expression
Southern blotting is used to find a specific DNA fragment: first use gel
electrophoresis to separate restriction fragments of the genome by length, then denature the DNA, transfer it to a
membrane, and add a labeled hybridization probe.
Northern blotting: Southern blotting with RNA
in situ hybridization is used to preserve the relative location of mRNA
DNA microarrays are thousands of DNA sequences attached to a slide
1500 sequences/cm^2
transfection introduces cloned genes into animal cells
electroporation is the application of electric shock to open the pores of a cell to
DNA
transient transfection is when a vector with a viral replication origin is used to make a lot of plasmid copies
quickly, but these aren't distributed to all daughter cells (thus transient)
stable transfection uses a vector with a selectable marker; cells in
which it integrates into the genome are then selected for
retroviral expression systems: lentivirus systems can be used to speed up
transfection
can flag proteins with GFP (green fluorescent protein) or with an epitope tag, a short
peptide recognized by an antibody

Inactivating Eukaryotic Genes


the genome of the yeast S. cerevisiae can be easily changed
disruption constructs are ersatz sequences of DNA spliced into the genome
(created with PCR), to see if the gene it replaces is vital to cell function
this method found that 4500/6000 yeast genes not essential
can also use a conditional promoter: e.g. the GAL1 promoter in yeast only permits
transcription when galactose is present, which allows researchers to control expression of
that gene
gene knockout
RNA interference (RNAi) uses double-strand rna to block expression of
complementary rna (see chapter 8)

Genes, Genomics, and Chromosomes


Eukaryotic Gene Structure

bacterial mRNAs are sometimes polycistronic (a cistron is a sequence encoding
one polypeptide), but most eukaryotic mRNAs are monocistronic: each mRNA
molecule encodes 1 protein
as a result, bacterial translation can start at multiple points on the mRNA
thus, bacterial transcription units are distinct from genes (transcription units are
single operons)
simple transcription units are processed to create an mRNA for 1 protein
90% of human transcription units are complex (the transcript can be processed in more than 1
way, all results monocistronic)
mutations in control regions affect all of the resulting mRNAs, mutations in exons affect only the mRNAs containing that exon
the various proteins encoded from different expressions of a gene are called
isoforms
there are solitary and duplicated genes
duplicated genes create a gene family: similar but non-identical forms of a
gene located close together on the chromosome, created by unequal
crossing-over; they evolve separately in a beneficial manner
pseudogenes: gene duplications that became non-functional
heavily used genes (like those producing rRNAs) have multiple identical copies
other genes code for functional RNAs:
small nuclear RNAs, which function in RNA splicing
small nucleolar RNAs: help rRNA processing in nucleolus
micro RNAs (miRNAs) regulate the translation and stability of mRNAs

Chromosomal Organization of Genes and Noncoding DNA


No direct correlation between genome size and complexity--Amoeba dubia has 200
times more DNA/cell than humans
selective pressure on commonly used genes to reduce intron size
1/3 of DNA is transcribed (but 95% of this is introns), and the rest is DNA between genes and
repeated DNA sequences
microbes have fewer introns because the cost of DNA synthesis is proportionally
greater for them
repetitious DNA: multiple copies of DNA sequences
simple-sequence DNA is 6% of the genome and consists of tandem repeats of short
sequences. When the repeat unit is 1-13 bp, these are called microsatellites
created by backward slippage during replication
interspersed repeats are much longer (more info)
microsatellites cause many genetic diseases (by creating coding sequences that
code for bad proteins)
extended repeats can also occur in non-coding sequences, where they can form
long RNA hairpins that sequester the proteins that are supposed to regulate
splicing
repeats with 14-100 bp units are minisatellites
today, satellites are used for DNA fingerprinting, by using PCR to amplify
tandem repeats, which gives different results for every person (other than identical
twins)
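The fingerprinting idea comes down to counting how many times a short motif is repeated in tandem at a given locus. A toy sketch: the motif, the flanking sequences, and the two "people" are invented, and real assays measure PCR product length rather than scanning sequence.

```python
import re


def longest_tandem_run(seq: str, motif: str) -> int:
    """Largest number of back-to-back copies of the motif found in the sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical alleles at one microsatellite locus (CAG repeats)
person_a = "GGT" + "CAG" * 5 + "TTA"
person_b = "GGT" + "CAG" * 8 + "TTA"
print(longest_tandem_run(person_a, "CAG"))  # 5
print(longest_tandem_run(person_b, "CAG"))  # 8
```

Comparing repeat counts across several such loci is what makes the combined profile distinctive.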

Transposable DNA Elements


Interspersed repeats are also known as moderately repeated DNA
they can also move around the genome in a process called transposition; they
seem to exist only to maintain themselves (these mobile elements are called transposons)
retrotransposons copy to RNA then back to DNA; DNA transposons cut out their sequence
and move somewhere else.
retroviruses may have evolved from these transposons
can aid evolution by transposing sequences around them too (mutation)
transposons are moved (rarely, so they don't disrupt essential genes) by
transposase
retrotrans
bacterial insertion sequences (IS elements) have an inverted repeat at either end, flanked by a direct repeat of the
target site at either end--the wild-type target has it once, but the inserted IS has it on both sides
activator (Ac) elements are highly correlated with reversible mutations
dissociation (Ds) elements are correlated with mutations that don't reverse
themselves, except in the presence of the 1st class
activator elements are equivalent to IS elements
dissociation elements are like IS elements with damaged transposase--they transpose only in
the presence of an activator element (i.e. its transposase)
retrotransposons can be divided into those with long terminal repeats (LTRs) and those without
LTR retrotransposons make up 8% of human DNA
because they code for all retroviral proteins, and have LTRs like integrated
retroviruses, they're called retrovirus-like elements
endogenous retroviruses are the most common, along with lots of isolated LTRs
most common mammalian transposons are nonviral (non-LTR) retrotransposons
long interspersed elements (LINEs) are 6 kbp long
three types of LINEs, L1-L3: only L1 still transposes
in total 21% of DNA
a LINE has direct repeats, a region with a lot of A & T, then ORF (open reading frame) 1,
encoding an RNA-binding protein, then ORF2, a long region encoding
reverse transcriptase and a DNA endonuclease.
Short Interspersed Elements (SINEs) are 13% of DNA
100-400 bp
still have an A-T rich sequence at the end like LINEs
short because they don't encode protein; they rely on the reverse transcriptase from
LINEs
many SINEs are Alu elements, evolved from an RNA in the signal recognition
particle, which targets polypeptides to the ER
1 in 8 individuals have a new non-LTR retrotransposition, 60% SINE (90% Alu),
40% LINE L1
retrotranspositions can come from processed mRNA, too, creating pseudogenes
transpositions are a crucial source of mutations--duplications can evolve separately to
perform mutually beneficial effects.
recombination between mobile elements in separate genes can generate totally
new genes--called exon shuffling
with DNA transposons it happens when an exon is flanked by 2 transposons and
the whole thing gets transposed
with retrotransposons it happens when the LINE poly(A) signal is too weak, and
transcription continues through a downstream exon (and the whole thing is
retrotransposed)
transposons are used for gene therapy (to insert a gene) (Sleeping Beauty
transposon)

Organelle DNAs
mitochondrial DNA (mtDNA) is inherited cytoplasmically--as the sperm has less
cytoplasm, most human mitochondria hail from the mother
the mitochondrion has its own rRNA, but most of the proteins it needs are imported
from the cytosol (the mitochondrion usually produces only a few subunits that
are assembled into multimers with imported parts from rIKEA).
in animals and protozoa there are few introns (mitochondria need to reproduce
frequently and quickly), but in plants there are many (as a result plant mtDNAs can
be up to 2 Mb, vs 6 kb in the smallest mtDNA (protozoan causing malaria), and
a typical 16 kb in animals)
mitochondrial genes have moved from the mitochondria to the nucleus through
RNA intermediates
mtDNA is usually circular, but some organisms have it linear, and in Trypanosoma it
is a network of interlocked maxicircles and minicircles
the mitochondria has a severe trade deficit vis a vis the cytosol--- big rRNA units
lobbying for increased tariffs
mtDNA encodes proteins differently in animals/protozoa
mutations in mtDNA are a big part of animal aging
chloroplasts also have their own organelle DNAs, but less diverse because they were
integrated into eukaryotic cells after mitochondria
ribosomes in mitochondria are like bacterial ribosomes in their sensitivity to
chloramphenicol and resistance to cycloheximide

Structural Organization of Eukaryotic Chromosomes


histones are abundant proteins that order chromosomal DNA; these + the DNA are
chromatin: 1/2 DNA, 1/2 protein by mass
histone proteins have + charged amino acids--easily interact with negatively
charged phosphate groups in DNA
chromatin isolated at low ionic strength forms a beads-on-a-string structure (beads are
nucleosomes, histone and DNA groups, linked by free DNA)
if isolated at normal cellular ionic concentration, it forms a more condensed fiber
nuclease digestion can erode the linker DNA and leave just the nucleosomes
histone chaperones assemble newly synthesized DNA into nucleosomes
histone sequences and chromatin structure are highly conserved between species
the histone code: histone tails are sites of post-translational modifications that change chromatin
function
histone acetylation/methylation
1 X chromosome in most female cells is highly condensed so that X-chromosome genes
are expressed at the same rate in males and females (Barr body)
replication origins, a centromere sequence, and 2 telomere sequences are needed for a
chromosome
telomerase completes telomeres during DNA synthesis; otherwise they would be constantly
shortened
autonomously replicating sequences (ARSs) act as replication origins in yeast

Transcriptional Control of Gene Expression


Control of Gene Expression in Bacteria
sigma factors are necessary to activate prokaryotic genes--protein that enables
binding between RNA polymerase and promoters
lac and trp
phosphorylation and small-molecule ligands can regulate activators and repressors
one alternative sigma factor (sigma 54) in a complex with RNA polymerase can be activated by enhancers far
from the start site
2-component regulatory systems have 1 sensor protein that transfers the
gamma-phosphate of ATP to the response regulator protein
attenuation (with Trp):
the ribosome follows right behind RNA polymerase to translate directly after
transcription--its speed varies directly with the concentration of charged tRNATrp
region 3 of the transcribed leader RNA can pair with either region 2 or region 4--if region 2 is in
the ribosome (because the ribosome has moved quickly), 3-4 will pair, forming a
hairpin and stopping transcription; otherwise 2-3 will pair and transcription
continues (because the polymerase has continued past)
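The attenuation decision above is really a two-way switch, which can be written out as a tiny sketch. The function name and the boolean input are simplifications made up for illustration; the real outcome is probabilistic and depends on ribosome timing.

```python
def trp_leader_outcome(charged_trna_trp_abundant: bool) -> str:
    """Which leader hairpin forms, and what happens to transcription."""
    if charged_trna_trp_abundant:
        # Ribosome moves quickly and covers region 2, so region 3 pairs with region 4.
        return "3-4 hairpin: transcription terminates"
    # Ribosome stalls before region 2, so region 2 is free to pair with region 3.
    return "2-3 hairpin: transcription continues"


print(trp_leader_outcome(True))   # 3-4 hairpin: transcription terminates
print(trp_leader_outcome(False))  # 2-3 hairpin: transcription continues
```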
attenuation also occurs through riboswitches--folded RNA structures that
bind small molecules and, when the molecule is present at sufficient
concentration, form a termination hairpin

Overview of Eukaryotic Gene Control


gene control is more permanent, serves the whole organism
can analyze gene control regions with reporter genes such as luciferase and GFP
3 RNA polymerases in eukaryotes
#1 transcribes pre-rRNA
#2 transcribes all protein-coding genes & most snRNAs (RNA splicers) & siRNAs
#3 transcribes tRNAs, 5S rRNA and assorted small RNAs like the signal-recognition
particle RNA and one snRNA (an RNA splicer)
all 3 can be distinguished by their net charges (they elute at different salt concentrations in ion-exchange chromatography)
plants also have RNA polymerases 4 & 5, which synthesize siRNAs
RNA polymerase has a similar core structure in all organisms
RNA polymerase 2 has a carboxyl-terminal domain (CTD), a 7 amino acid sequence
repeated min. 10 times--critical for viability
CTD becomes phosphorylated during transcription

RNA Polymerase II Promoters and General Transcription Factors


transcription start sites can be found by identifying the DNA sequence that matches the 5' end (under the 5'
cap) of the corresponding mRNA
TATA boxes are promoter elements similar to those in E. coli, ~30 bp upstream of the start
site.
also initiators, much closer to (overlapping) the transcription start site
the genome has very few CG sequences because deamination of methylated C turns it to T, so most CG
sequences are in:
CpG island promoters, which initiate transcription in both directions; transcription in the antisense
direction stops .5-1 kb from the start site
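One common way to quantify CG depletion is the observed/expected CpG ratio: (number of CG dinucleotides x sequence length) / (number of C x number of G). This measure is not from these notes, and the two short sequences are invented; a ratio well below 1 means CG is depleted, while CpG islands score much higher.

```python
def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio for a DNA sequence."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = seq.count("CG")
    return cpg_count * len(seq) / (c_count * g_count)


island_like = "CGCGCGCGAT"   # hypothetical CpG-rich stretch
depleted = "CAGTCAGTCA"      # has C and G but no CG dinucleotide
print(cpg_observed_expected(island_like))  # 2.5
print(cpg_observed_expected(depleted))     # 0.0
```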
general transcription factors (GTFs) are proteins required for RNA polymerase to initiate transcription
at most promoters
they position the polymerase at the start site, help separate the DNA strands, and form the
preinitiation complex, which is assembled as follows:
TBP (a subunit of the general transcription factor TFIID) binds to the TATA box and bends the DNA; TFIIB then binds, which
lets Pol II bind
TFIIH works as a helicase, then most GTFs release and transcription begins
TAF subunits of TFIID can bind ~30 bp downstream, at a downstream promoter
element (DPE)

Regulatory Sequences and Proteins


promoter-proximal elements are upstream control elements that must be located close
to the promoter
70% of eukaryote genes are promoted by CpG islands
yeast has upstream activating sequences (UASs) which work like enhancers in
higher eukaryotes
DNA-binding domains of eukaryotic transcription factors can have zinc finger,
homeodomain, or basic helix-loop-helix structures
enhanceosomes are the multiprotein complexes of activators bound to enhancers

Molecular Mechanisms of Transcriptional Repression and Activation


histone tails can be modified to change relative condensation of
chromatin--changes ability to transcribe
activators/repressors interact with a large protein complex (the mediator)
this regulates transcription preinitiation complexes
heterochromatin is more condensed, and thus less active, than euchromatin
activation and repression domains

Regulation of Transcription-Factor Activity


nuclear receptors are transcription factors regulated by lipid-soluble hormones
response elements are the DNA sequences that bind nuclear receptors
heterodimeric nuclear receptors in the nucleus repress transcription when bound
to cognate sites (by directing histone deacetylation), but when the hormone ligand
binds, it starts the preinitiation complex
homodimeric steroid hormone receptors are normally in cytoplasm (trapped by
inhibitor proteins), but when liganded they activate transcription
example is heat shock genes: when there is a heat shock, the heat-shock
transcription factor activates, stimulates the polymerase, and brings in more
polymerase to get the reaction done more quickly--before the shock the
polymerase is paused mid transcription for faster response
polymerase can transcribe at different rates at different times.

Epigenetic Regulation of Transcription


Methylation and acetylation of histones are the primary means of epigenetic
regulation
methylation of CpG sequences in CpG island promoters in mammals creates binding
sites for methyl-binding proteins that associate with histone deacetylases--this induces
transcriptional repression
polycomb complexes maintain gene repression
Other Eukaryotic Transcription Systems:
Pol I transcription (in the nucleolus) is highly regulated to correspond with cell growth: ribosomes need to be created when new cells are created...
the complex NoRC repositions a nucleosome over the Pol I transcription start site, blocking it (the DNA is then methylated)
Pol III is unusual because its tRNA and 5S rRNA genes have internal promoter regions--A box and B box for tRNA, C box for 5S rRNA
other Pol III genes coding for stable RNAs have upstream promoters
mitochondrial RNA polymerase is encoded in nuclear DNA; mtDNA promoter sequences are A-rich

chloroplasts have 2 RNA polymerases--a plastid polymerase with multiple subunits like bacteria (core still encoded in chloroplast DNA), and a nuclear-encoded bacteriophage-like polymerase that transcribes the genes for some of the bacterial-like polymerase subunits
transcription regulated by sigma factors responding to light and metabolic stress

Post-Transcriptional Gene Control (look and see if this has it all)
Processing of Eukaryotic Pre-mRNA
5' cap and poly(A) tail shield the mRNA from the exonucleases that break down excised introns: 5' exoribonucleases digest unprotected RNA
nuclear pre-mRNAs and mRNAs are always bound by proteins, forming heterogeneous ribonucleoprotein (hnRNP) complexes
5' cap is a 7-methylguanylate linked 5'-to-5' to the first nucleotide; riboses of the first nucleotides are also methylated
capping enzyme binds the phosphorylated CTD of Pol II, so only Pol II transcripts are capped--this distinguishes mRNA from rRNA and tRNA
because 1 CTD can bind to multiple proteins at the same time, transcription can be efficiently coupled with splicing, + CTD coordinates transcription with processing and ensures that the processing machinery is in the right place as the transcript emerges
elongation starts very slowly because of NELF (negative elongation factor); NELF dissociates once the 5' cap is put on
hnRNP proteins prevent RNA secondary structure formation, aid in splicing and transport
3' end of introns is pyrimidine-rich (polypyrimidine tract)
RNA recognition motif (RRM) is the most common RNA-binding domain, highly
conserved across species
in short transcription units, splicing usually follows 3' cleavage and polyadenylation; in longer ones splicing can start before transcription is finished
splice sites can be found by comparing the genomic DNA to the cDNA made from the mRNA
two transesterification reactions swap phosphodiester bonds to splice out introns: the G at the 5' end of the intron is joined to an A inside the intron (the branch point A), and the resulting loop (lariat) is cut out of the mRNA as the exons are joined (introns start with GU and end with AG)
rarely the intron starts with AU and ends with AC; these are spliced using 4 rare snRNAs
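
A toy sketch of locating introns by comparing genomic DNA with the matching cDNA, assuming the canonical boundaries (GU...AG in RNA, so GT...AG on the DNA coding strand) and assuming the genomic string starts exactly where the cDNA starts. The function name `find_introns` and the `min_intron` parameter are made up for illustration; real aligners handle mismatches and long sequences.

```python
def find_introns(genomic, cdna, min_intron=4):
    """Return a list of (start, end) intron intervals in genomic coordinates
    (0-based, end-exclusive), or None if the cDNA cannot be explained."""

    def search(g, c):
        if c == len(cdna):            # every cDNA base is accounted for
            return []
        if g >= len(genomic):
            return None
        # Option 1: the next genomic base is exonic and matches the cDNA.
        if genomic[g] == cdna[c]:
            rest = search(g + 1, c + 1)
            if rest is not None:
                return rest
        # Option 2: an intron starts here (GT) and ends at some later AG.
        if c > 0 and genomic.startswith("GT", g):
            ag = genomic.find("AG", g + 2)
            while ag != -1:
                intron_end = ag + 2
                if intron_end - g >= min_intron:
                    rest = search(intron_end, c)
                    if rest is not None:
                        return [(g, intron_end)] + rest
                ag = genomic.find("AG", ag + 1)
        return None

    return search(0, 0)


genomic = "ATGGCC" + "GTAAGTTTTCAG" + "TTCTAA"   # exon 1 + intron + exon 2
cdna = "ATGGCC" + "TTCTAA"
print(find_introns(genomic, cdna))                # [(6, 18)]
```

The search is recursive and backtracks, so it is only meant for short example sequences.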

splicing uses five snRNAs (U1, U2, U4, U5, U6; named U because they're rich in U) and other proteins assembled on a pre-mRNA to form a spliceosome complex, about the size of a ribosome
exon-junction complex then formed, with an RNA export factor (REF) to guide the complete mRNA out of the nucleus and enzymes to quality-control and break down bad splicing
some protozoans, and C. elegans, use trans-splicing, where exons from separate pre-mRNAs are joined together to form the final mRNA
the exact location of splice sites is determined by SR proteins interacting with
exonic splicing enhancers to form a cross-exon recognition complex
about 15% of single-base mutations that cause genetic disease, including ones behind spinal muscular atrophy, act through poor exon definition leading to mis-splicing
some introns are self-splicing: group I introns are in protozoan nuclear rRNA genes, and group II introns are in organelle (mitochondria and chloroplast) genes
snRNAs may have evolved from self-splicing RNAs, which could have accelerated evolution by allowing for creative splicing and facilitating exon shuffling (because this frees up intron sequences to be anything without threatening cell viability)
AAUAAA acts as a poly(A) signal upstream of the cleavage site, then a G/U-rich sequence downstream
signaling proteins bind to AAUAAA and the G/U-rich sequence to create the cleavage/polyadenylation complex; the complex only starts cleavage when poly(A) polymerase (PAP) joins it, so that the new 3' end gets its tail before degradation starts
PAP starts adenylation slowly, but poly(A) binding protein comes in to speed it up; poly(A) binding proteins also accompany the mRNA into the cytoplasm
exonucleases linked in an exosome degrade excised introns
5' cap protected by nuclear cap-binding complex
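
A small sketch of scanning a transcript for the AAUAAA poly(A) signal followed by a G/U-rich stretch. The 50-nt window and the 60% G/U cutoff are illustrative assumptions, not values from these notes, and `find_polya_signals` is a made-up helper name.

```python
def find_polya_signals(rna, window=50, min_gu_fraction=0.6):
    """Return [(position of AAUAAA, True if the downstream window is G/U-rich)]."""
    rna = rna.upper().replace("T", "U")      # accept DNA-style input too
    hits = []
    start = rna.find("AAUAAA")
    while start != -1:
        downstream = rna[start + 6:start + 6 + window]
        gu = sum(base in "GU" for base in downstream)
        fraction = gu / len(downstream) if downstream else 0.0
        hits.append((start, fraction >= min_gu_fraction))
        start = rna.find("AAUAAA", start + 1)
    return hits


print(find_polya_signals("CCCAAUAAACCCCCGUGUGUUGGUUGU"))   # [(3, True)]
```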

Regulation of Pre-mRNA Processing


RNA-binding proteins made from other genes can regulate splicing
in Drosophila, Sxl prevents splicing (repressor), Tra promotes splicing (activator)
RNA binding sites for protein splicing repressors are called exonic splicing silencers

RNA editing post-transcription in organelle RNAs: short sequences encoded elsewhere are applied to change the exon sequence at certain sites--potential for use in drugs

Transport of mRNA Across the Nuclear Envelope
nuclear pore complexes (NPCs) are symmetrical structures built from multiple copies of nucleoporin proteins; FG-nucleoporins make the pore semi-permeable--their random-coil domains with FG (Phe-Gly) repeats limit diffusion
proteins <60 kDa can diffuse through, bigger molecules (ie RNPs) must be accompanied by special transport proteins that interact with the FG-repeats; in the case of mRNPs the mRNP exporter does this
as the RNP is transported through the NPC, mRNP remodeling occurs, where the
proteins are exchanged out

Cytoplasmic Mechanisms of Post-transcriptional Gene Control


miRNAs bind to 3' untranslated regions of mRNAs and repress translation
1 miRNA can inhibit multiple mRNAs because base pairing need not be perfect
they are cut out of longer transcripts (these are called pri-miRNAs), some of which come from introns of pre-mRNAs
form RNA-induced silencing complexes (RISCs)
multiple RISCs are necessary to inhibit an mRNA
RISCs cause the mRNPs to associate with P bodies, which are large cytoplasmic domains that are sites of RNA degradation and lack translation factors
RNA interference (RNAi) uses small RNA sequences to degrade mRNAs, accomplished with siRNAs (related process to miRNAs, both form RISCs)
siRNAs must have perfect base pairing, unlike miRNAs
the siRNA-containing RISC cleaves the mRNA and leaves it for degradation
siRNAs are used as defense against RNA viruses and transposons
siRNAs produced in just a few cells can be amplified by RNA-dependent RNA polymerases (RNA replicase-like proteins) and spread to other cells; plants transfer them through plasmodesmata
siRNA knockdown is the use of synthetic siRNA to knock down expression of
specific genes
cytoplasmic polyadenylation is critical to gene expression in embryos--egg cells have many stored mRNAs with only a short poly(A) tail; cytoplasmic polyadenylation lengthens it at the appropriate time to touch off translation
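
A sketch contrasting the two pairing rules noted above: siRNA-style targeting needs a fully complementary site, while miRNA-style targeting tolerates mismatches. The imperfect case is modeled here by requiring only a short "seed" (nucleotides 2-8 of the small RNA) to pair; that seed rule is an assumption brought in for illustration, and the guide sequence and helper names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Sequence that would base-pair with `rna`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_all(text, pattern):
    """All start positions of `pattern` in `text` (overlaps allowed)."""
    positions = []
    i = text.find(pattern)
    while i != -1:
        positions.append(i)
        i = text.find(pattern, i + 1)
    return positions


def perfect_sites(small_rna, mrna):
    """siRNA-like rule: the whole small RNA must pair with the target."""
    return find_all(mrna, reverse_complement(small_rna))


def seed_sites(small_rna, mrna):
    """miRNA-like rule (assumed): only nucleotides 2-8 must pair."""
    return find_all(mrna, reverse_complement(small_rna[1:8]))


guide = "UGAGGUAGUAGGUUGUAUAGUU"                     # hypothetical 22-nt guide
partial_target = "AAAA" + reverse_complement(guide[1:8]) + "AAAA"
print(perfect_sites(guide, partial_target))          # [] : no siRNA-style site
print(seed_sites(guide, partial_target))             # [4] : miRNA-style site
```

Because the seed is short, the same small RNA finds sites in many different mRNAs, which is one way to picture a single miRNA inhibiting multiple targets.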

mRNAs degraded in cytoplasm in 2 ways:


deadenylation-dependent pathway: poly(A) tail shortens until the mRNA is decapped and the exonucleases can destroy it
deadenylation-independent pathway: removes 5' cap without first shortening the tail
rate of deadenylation inversely correlated with translation initiation frequency because frequently translated mRNA may still have protective initiation factors at the 5' end
mRNA surveillance avoids the translation of bad molecules, examples: recognition and degradation in nucleus, impossibility of exporting spliceosome-bound pre-mRNA, nonsense-mediated decay (NMD)--recognizes and degrades mRNAs with a premature stop codon (a stop codon upstream of a splice junction)
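
A sketch of the usual way a nonsense-mediated decay target is flagged: the first in-frame stop codon lies upstream of the last exon-exon junction. The 50-nt minimum distance is a commonly quoted rule of thumb and an assumption here, as are the function names and the toy exon lengths.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_stop_index(mrna, cds_start=0):
    """Index of the first in-frame stop codon, or None if there is none."""
    for i in range(cds_start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


def is_nmd_candidate(mrna, exon_lengths, cds_start=0, min_distance=50):
    """True if the stop codon ends at least `min_distance` nt upstream of
    the last exon-exon junction (assumed rule of thumb)."""
    stop = first_stop_index(mrna, cds_start)
    if stop is None or len(exon_lengths) < 2:
        return False
    last_junction = sum(exon_lengths[:-1])     # position where the last exon begins
    return last_junction - (stop + 3) >= min_distance


normal = "AUG" + "GCU" * 20 + "GCU" * 5 + "UAA"          # stop in the last exon
mutant = "AUG" + "UAA" + "GCU" * 30 + "GCU" * 5 + "UAA"  # early stop in exon 1
print(is_nmd_candidate(normal, [63, 18]))   # False
print(is_nmd_candidate(mutant, [96, 18]))   # True
```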

also sequence-specific RNA-binding regulators--when they bind to the 5' UTR, the ribosome can't translate the mRNA
used to regulate iron storage--when there's too little iron the binding protein binds to the 5' UTR of ferritin mRNA and stops translation
specific binding proteins can also bind an mRNA to block recognition of its degradation sequences
PAP

Processing of rRNA & tRNA
1 large precursor rRNA undergoes cleavage, exonucleolytic digestion, and base modifications to get the various mature rRNAs (all in nucleolus)
snoRNAs base-pair with pre-rRNA to direct its modification and cleavage
snRNAs and self-splicing introns are ribozymes that can catalyze transesterification splicing reactions
tRNA undergoes splicing and modification too
RNA molecules are associated with proteins at all times
nuclear bodies are special nuclear domains with high concentrations of specific proteins and RNAs that carry out particular functions
Cajal Bodies assemble RNP complexes
Nuclear Speckles are storage areas for snRNPs and proteins
nucleoli create ribosomes and ribonucleoprotein complexes
