
Lecture 36-37 Genomics

1. determine the complete nucleotide sequence of organisms
2.

-inside each of our cells there are TWO genomes (one per chromosome of each pair), but only one set is the actual genome
-idea: if we know the genome sequence, we know everything b/c we have all the info >> oversimplified

5. map-based approach
-start with long DNA
-break the DNA into pieces
-clone the inserts and mark them with physical genetic markers
-most common physical marker was the number of tandem two-nucleotide CA repeats at a chromosomal site
-you can add or lose one of these CA repeats, so there are lots of alleles of this
-can determine the site of the CA repeats and, using PCR, find out the length
-later can find the genetic sites, assemble by comparing overlaps, etc., and assemble the whole thing
>> map-based approach, kinda cumbersome

6. SHOTGUN sequencing
-start with a BAC clone
-break it up into small fragments
-sequence the fragments and let a computer put the sequence together, 10-100 kb at a time

7. human genome
-fewer genes than ppl thought
-very little of the sequence is exon
-human genome sequence was done w/ Sanger DNA sequencing
-two groups actually competed to find this out
-one group said that you can use the shotgun method to find the sequence, but the other said that there were too many repeated sequences for this to work
-both groups eventually got the answer
-now, the shotgun method is used

8. whole genome shotgun sequencing
-extract DNA
-make small fragments
-generate a library
-clone those and sequence the fragments
-a computer puts the sequences back together again
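The "computer puts the sequences back together" step can be sketched as a greedy overlap merge. This is only a toy illustration, not how real assemblers work: the function names `overlap` and `greedy_assemble` and the `min_len` cutoff are made up here, reads are assumed error-free and all from one strand, and repeated sequences (the objection raised in item 7) would break it.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest suffix/prefix overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left that overlaps
        # Join the two sequences, writing the shared part only once.
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping "reads" from a made-up 15-nt sequence.
print(greedy_assemble(["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]))
```

With these three reads the merge recovers the single sequence ATGGCGTACGTTAGC; no physical markers or ordered clones are needed, only overlaps.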

-whole genome shotgun is a lot faster b/c no physical markers or concern over overlapping clones

9. finding genes has two problems
  1. large amounts of non-gene DNA
  2. genes are split by introns, so a single exon need not have its own initiation or stop codon >> you find lots of small ORFs

C-value - amount of DNA in a cell's (haploid) nucleus

10. C-value paradox
-some organisms (frog) have more DNA than humans
-lots of non-coding DNA in organisms
-what is all this non-genic sequence?
  -introns
  -non-functional repeated sequence
  -unique sequence, but unknown, not conserved

11. rough breakdown of the genome
-20% gene sequence (only a small part of that is protein coding)
-30% non-repetitive sequence that is not introns or exons
-50% repeated sequences that come up many times in our genome
-many of the repeats are transposons, which are mobile genetic elements
  -found in prok and euk (in all organisms)
  -can be put in one place and can move to another place
  -code for proteins that enable their transposition

14. by knowing the cDNA, can know the gene sequence
-another really important factor in finding genes is that genes are highly conserved
-e.g. genes of wheat and human are very similar

15. when human genes were originally characterized, the combination of all of these made annotation of the genome possible
-what people learn from looking at different genomes is that the number of genes necessary to support life is surprisingly modest
-plants have more genes than humans!
-although there are only 25000 human genes compared to 27000 in a flowering plant, humans have a lot more alternative RNA processing, so the human genome can be considered more "complex"
>> there are lots of
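The ORF problem from item 9 can be made concrete with a naive scan for ATG...stop stretches. This is a minimal sketch under stated assumptions: `find_orfs` and `min_codons` are hypothetical names, only the forward strand is scanned, and no attempt is made to handle introns, which is exactly why a scan like this turns up lots of small ORFs in a real genome.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) positions of ATG...stop ORFs in all three forward frames."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it has enough codons before the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


seq = "CCATGAAATTTTAGGG"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```

On the made-up sequence above the only hit is ATGAAATTTTAG at positions 2 to 14; cDNA evidence and conservation (items 14 and 15) are what separate real genes from chance hits like this.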

16. distribution of human gene functions:
-nucleic acid binding proteins
  -dark blue: DNA and RNA polymerases
  -light blue: transcription factors, 1800 genes whose products bind to genes in a particular way, help transcription and regulate expression of our genes
-green: signal transduction; lots are protein kinases; we can identify them by signature amino acids, but we're still studying what they do
-lots of proteins of unknown function!! (large blue portion)
-one interesting statistic: 70% of the genes resemble those in other organisms
  -high degree of conservation over evolutionary history
-for medical purposes, 61% of genes are similar to Drosophila genes
-40% of these genes have proteins that are similar

17. diagram of how much of the human genes are shared with other organisms
-essential for life (DNA, RNA polymerase, etc.)
-eukaryotes only (not in bacteria: cell compartments, multicellular development, etc.)
-vertebrates and animals (movement, eating, etc.)
-vertebrates only (complexity associated with vertebrates)
-one way of looking at the distribution of the human genes
-what do you do after you have the genes? the object is to get the function; from the gene sequences, we can guess the function of the proteins

18. reverse genetics
-start with sequences since we have the genes, and go back in the process: go from gene to the mutation (reverse genetics)
-one crucial resource is mutations with no gene activity: generate null mutations
-how do you go about doing this?
gene KO (knockout)
-involves homologous recombination, using identical sister chromatid copies
-replace the good copy of a gene with a mutant copy in which the coding section of the gene is replaced with a selectable marker
-start with a fragment of DNA with homologous DNA at the two ends and the replacement DNA in the middle
-replacement of the deleted gene, strand replacement
-yeast is really easy to use for homologous recombination
-hard in mammals, b/c if we introduce a fragment of DNA it will be placed in a random place, as the cell uses the non-homologous end joining repair mechanism, so it is difficult to sort through all of the non-homologous insertions >> slow and inefficient

19. RNA interference (RNAi): involves double stranded RNA
-something that can be used by investigators
-demonstrated that u can use double stranded RNA specific for one gene to knock down activity of that gene
-originally thought that putting in anti-sense RNA would inactivate gene function in some way
-eventually found out that it was double stranded RNA that was responsible
-how does double stranded RNA silence gene function? similar to miRNA; two ways
  -start with small double stranded RNA molecules, chemically synthesized
    -introduce them into cells with the complementary target mRNA
    -get complete Watson-Crick base pairing
    -the target mRNA gets cleaved
  -another approach is to use short hairpin RNA, which is subject to the same processing as miRNA and creates a small RNA
    -good thing about short hairpin RNA is that u don't have to chemically make it, but can use vectors that make the small double stranded molecules that get to the mRNA
-you don't modify the genome sequence, but knock down gene function by degrading the mRNA
-much easier than inactivating the chromosomal gene

20. two ways
  1. analyze mRNA expression: analyze mRNA levels (where they are, etc.)
  2. use chromatin immunoprecipitation: look at how mRNAs are made (population) and analyze the population of transcription factors and general transcription machinery

21. microarray
-way to analyze mRNA amounts on a support chip
-single stranded DNA that is specific for a gene (e.g. 1 of 150 genes) is put in one spot
-take the mRNA population and label it by labeling cDNA; see how much can bind to the single stranded DNA on the chip, detected by how much of the label is visible
-amount of bound label is proportional to the amount of mRNA
-compare two different cell populations

-have normal cells and tumor cells; how does transcription change in tumor cells?
-something in transcription must make the change
  1. isolate mRNA
  2. make cDNA reflective of the mRNA
  3. label normal cell cDNA with green label and tumor cell cDNA with red label
  4. mix the two cDNAs and let them bind to the spots on the array
  5. look at the ratio of green to red >> if the ratio is the same, assume same expression, etc.
-yellow spot: RNA expressed at the same level
-green spot: RNA repressed in tumor cells
-red spot: RNA overexpressed in tumor cells
-let the computer do this process
-lets u look at genes that differ in their level of expression in cells
-there are lots of different questions you can ask, such as how does a drug affect expression (e.g. an anti-arthritic drug)

22. using chromatin immunoprecipitation
-look for where along the genome the sequence-specific DNA binding proteins are bound
-have an antibody that is specific to a protein or to a modification of the protein
-p53 (transcription factor activated in response to DNA damage): important cancer protein, as it protects the genome
-binds upstream of sequences

23. how does it work
-have preinitiation complex
-try to identify where all of these green sites are in the genome (where does p53 bind?)

24. steps
  1. treat cells with formaldehyde to stabilize the whole structure, as it cross-links the proteins to the DNA
  2. break the cells open and sonicate, to break the chromatin into pieces
  3. immunoprecipitate with a protein-specific antibody (recognizes epitopes) and use it to precipitate out all of the proteins bound by the antibody
  4. reverse the cross-links by heat treatment and deproteinate the immunoprecipitated material
  5. detect all the DNA that was associated with p53
  6. find where in the genome the protein is located
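Going back to the two-color array comparison (green label for normal cells, red label for tumor cells), the "let the computer do this" step might look roughly like the sketch below. The function name `classify_spot` and the 2-fold cutoff are assumptions for illustration only; the notes do not say what threshold is used, and real analyses also normalize the two channels first.

```python
import math


def classify_spot(green: float, red: float, fold: float = 2.0) -> str:
    """Classify one array spot from its green (normal) and red (tumor) intensities."""
    if green <= 0 or red <= 0:
        return "no signal"
    log_ratio = math.log2(red / green)  # > 0 means more tumor (red) signal
    threshold = math.log2(fold)         # assumed cutoff, 2-fold by default
    if log_ratio >= threshold:
        return "red: overexpressed in tumor"
    if log_ratio <= -threshold:
        return "green: repressed in tumor"
    return "yellow: same level"


# Made-up intensities for three spots: (green, red)
for name, (g, r) in {"geneA": (100, 100), "geneB": (100, 400), "geneC": (400, 100)}.items():
    print(name, classify_spot(g, r))
```

Using the log of the ratio makes a 4-fold increase and a 4-fold decrease equally far from zero, which is why ratios are usually compared on a log scale.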

25. visual rep
-fragmented chromatin is exposed to antibody; only proteins bound to the antibody will be precipitated
-get only the DNA fragments that were associated with p53
-you can use PCR or arrays to see where the DNA hybridizes
-more straightforward to simply sequence the segments

26. differences:
-for Sanger sequencing, take fragments, put them into a vector, clone them and sequence one fragment at a time >> low throughput technology
-next-generation (NG) sequencing: you have a bunch of fragments, tag the two ends of the fragments, have a large array of sequences >> very high throughput
-another technology: Solexa; uses the basic strategy of tagging fragments, arraying them and sequencing them; the important thing is that you get shorter runs, but u can do a lot of fragments at a time (80 million) >> VERY high throughput
-use a cDNA library and count how many times you get that sequence

28. you can use it to tell where in the DNA the chromatin modifications are

29. protein analysis, goals of this genomics approach
  1. what proteins are in a cell and what form are they in?
  2. where are they localized?
  3. protein interactions?
-each spot is a protein in a cell
-u can cut them out and identify them using mass spectrometry
-a proteome can be very cell type specific and environment (condition) specific, depending on where the cells are grown

30. know where the proteins are in the cell (mitochondrial, membrane, etc.)
-use gene fusions to detect
-have a protein and fuse it to a protein that is easy to detect (fluorescent)
-GFP and RFP are usually used; ask where the protein localizes

-GFP is a workhorse in cell bio! Nobel Prize!
-the protein folds up with the small GFP as part of it
-tether it by direct translation (a translational fusion) and look at where the protein goes

31. new kind of biology: study molecular structure
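The read-counting idea from the sequencing notes (use a cDNA library and count how many times you get each sequence) can be sketched as below. This is a toy under stated assumptions: `count_reads` and the two "transcripts" are invented for the example, reads are matched by exact substring search rather than a real aligner, and reads that match more than one transcript are simply skipped.

```python
from collections import Counter


def count_reads(reads: list[str], transcripts: dict[str, str]) -> Counter:
    """Count reads that match exactly one reference transcript (exact substring match)."""
    counts = Counter({name: 0 for name in transcripts})
    for read in reads:
        hits = [name for name, seq in transcripts.items() if read in seq]
        if len(hits) == 1:  # ignore reads that are ambiguous or match nothing
            counts[hits[0]] += 1
    return counts


# Made-up reference sequences and short reads.
transcripts = {"geneA": "ATGGCGTACGTTAGC", "geneB": "ATGCCCTTTGGGAAA"}
reads = ["GCGTAC", "CGTTAG", "CCCTTT", "ATG", "TTTTTT"]
print(count_reads(reads, transcripts))
```

More reads for a transcript means more of that mRNA was in the starting population, which is the same question the microarray asks but answered by counting sequences instead of measuring label.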
