
CS 563

Instructor: A. Tetteh
Lecture 1
September 10, 2018
Recommended reading:
Hartl and Clark, Chapter 2

Assigned reading:
Fan, J.-B., Chee, M. S. and Gunderson, K. L. 2006. Highly parallel genomic assays. Nat. Rev.
Genet. 7: 632-644.
Sharp, A. J., Cheng, Z. and Eichler, E. E. 2006. Structural variation in the human genome. Annu.
Rev. Genomics Hum. Genet. 7: 407-442.

Goal
The goal of studying population genetics is to understand the factors that give rise to variation in
populations, that is, the distribution of allele frequencies and how they change in time and space.

Population genetics is the study of the distribution of allele frequencies and interaction of alleles
and genes in populations and how the frequencies change under the influence of the four main
evolutionary forces leading to adaptation and speciation. The four main evolutionary processes
are:
• natural selection
• genetic drift
• mutation
• gene flow

The theory of population genetics also encompasses other factors that alter the distribution of
allele frequencies and so shape the patterning of variation within and between populations and
species: recombination, mating systems, population size, patterns of migration, segregation,
transposition of mobile elements, population subdivision and population structure.

The discipline of population genetics was founded by Sewall G. Wright, J. B. S. Haldane and R.
A. Fisher, who also laid the foundations for the related discipline of quantitative genetics.
[Photographs: Sewall Wright, J.B.S. Haldane, R.A. Fisher]
Sewall Wright: December 21, 1889 - March 3, 1988, an American geneticist known for his
influential work on evolutionary theory and on path analysis.
J.B.S. Haldane: November 5, 1892 - December 1, 1964, a British geneticist and evolutionary
biologist credited with a central role in the development of neo-Darwinian thinking.
R.A. Fisher: February 17, 1890 - July 29, 1962, an English statistician who made major
contributions to statistics, evolutionary biology and genetics.

What is a population?
A population of a species is typically divided into subpopulations (also called local populations
or demes). The collection of subpopulations that together form a single large population is
called a metapopulation. Within a metapopulation, individuals may be separated by time, space,
or some social structure, so that the subpopulations are not continuous and panmixia (random
mating) may not exist. As a result, different areas of the metapopulation can have distinct gene
frequencies.

Population genetics generally deals with genetic variants that segregate as Mendelian factors,
such as variation in DNA sequences or mutations with major phenotypic effects. Population
geneticists seek to describe the patterning of variation within and between populations,
increasingly on a whole genome scale, and to infer from these observations what demographic
and evolutionary forces have acted historically to generate the observed patterns, by comparing
the observations with predictions of theoretical models.

Quantitative genetics is concerned with subtle differences in phenotypes between individuals.
Examples are differences in aspects of morphology (such as height and weight), physiology
(such as blood pressure), behavior, and susceptibility to common diseases. Quantitative genetics
considers the variation between individuals that can be readily observed as phenotypes that are
typically continuously distributed in populations. This continuous phenotypic variation is caused
by the joint segregation of multiple genes affecting the trait, as well as variation caused by the
effects of the environment on expression of the alleles. Quantitative geneticists are concerned
with the inheritance of phenotypic measurements, which integrates population genetic
principles with the rules of Mendelian inheritance applied to multiple loci.

Historically, quantitative geneticists have sought to describe resemblance between related
individuals, and then use this information to predict the population response to selection and
alternative mating systems. Most traits important for adaptive evolution and increased yield of
agriculturally important crops and animals are quantitative traits, making the understanding of
quantitative genetics central to evolutionary genetics and plant and animal breeding.

Alleles are alternative forms of a gene. A gene is a DNA sequence that codes for a protein or a
functional RNA such as ribosomal RNA. Genes are the particulate factors that are transmitted
from parents to offspring. DNA is a polynucleotide.

Chemical nature of polynucleotides

                     DNA consists of      RNA consists of
Nitrogenous bases    adenine (A)          adenine (A)
                     cytosine (C)         cytosine (C)
                     guanine (G)          guanine (G)
                     thymine (T)          uracil (U)
Phosphate            phosphoric acid      phosphoric acid
Sugar                deoxyribose          ribose

[Figures: bases; sugars of nucleic acids]

Nomenclature
Bases: no sugar or phosphate
Nucleosides = sugar + base
Nucleotides = sugar + phosphate group + base

[Figures: a nucleoside; nucleotides; a trinucleotide]

Alleles differ from one another in the sequence of DNA at the chromosomal locus.
Variation in DNA sequences may arise due to mutations.

Mendelian genetics
In 1865 Gregor Mendel, an Augustinian monk in Brünn (now Brno, Czech Republic), reported his
findings on the inheritance of seven different traits in the garden pea. Mendel was the first to
describe how hereditary factors (now called genes) are transmitted between generations.

[Photograph: Gregor Mendel, 1822-1884]

Different versions of a gene are called alleles. Alleles differ from one another in the sequence
of DNA at the chromosomal locus. One allele can be dominant over the other, recessive allele.
Mendel performed a monohybrid cross (a cross involving a single trait) between green-seeded and
yellow-seeded garden peas. All progeny in the first filial (F1) generation were yellow seeded,
demonstrating that yellow color was dominant and green color was recessive. When the F1 progeny
were selfed to produce the F2, some green-seeded peas reappeared. Mendel concluded that the
allele for green seed must have been preserved in the F1 generation even though it did not affect
the seed color. Each parent carried two copies of the gene, i.e. the parents were diploid for that
trait. Homozygotes had two copies of the same allele; heterozygotes had one copy of each allele.
Gametes carried only one copy of the gene, i.e. they were haploid.

In Mendelian genetics, the progeny could be assigned to discrete classes, either green or yellow,
with no intermediate phenotype. The ratio of yellow-seeded to green-seeded plants in the F2 was
3:1, as expected for a qualitative trait controlled by a single major gene.
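The 3:1 ratio can be checked by enumerating every pairing of parental gametes. The sketch below is only an illustration: the allele symbols Y (yellow, dominant) and y (green, recessive) and the function names are assumptions made for this example, not notation from the lecture.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Enumerate offspring genotypes from two single-locus genotypes such as 'Yy'."""
    offspring = Counter()
    # Each parent contributes one allele per gamete; all pairings are equally likely.
    for a, b in product(parent1, parent2):
        # Sort so that 'yY' and 'Yy' are counted as the same genotype.
        offspring["".join(sorted(a + b))] += 1
    return offspring

def phenotype_counts(genotypes, dominant="Y"):
    """Collapse genotype counts into phenotype classes (assumes complete dominance)."""
    counts = Counter()
    for genotype, n in genotypes.items():
        counts["yellow" if dominant in genotype else "green"] += n
    return counts

f1 = cross("YY", "yy")   # pure-breeding yellow x pure-breeding green
f2 = cross("Yy", "Yy")   # selfing the F1
print(dict(f1))                    # {'Yy': 4}
print(dict(f2))                    # {'YY': 1, 'Yy': 2, 'yy': 1}
print(dict(phenotype_counts(f2)))  # {'yellow': 3, 'green': 1}
```

The 1:2:1 genotype ratio collapses to a 3:1 phenotype ratio because the heterozygote is indistinguishable from the dominant homozygote.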

Law of segregation
Homologous chromosomes separate during the production of gametes, so that in a heterozygous
plant half of the gametes carry one allele and half carry the other.
A dihybrid cross, that is, a cross between plants differing in two traits, produced progeny whose
traits segregated independently of one another, giving F2 phenotypes in the ratio 9:3:3:1.

Traits of interest here are plant height (alleles T (tall) and t (short)) and seed color (alleles
Y (yellow) and y (green)).

This phenomenon is described by the law of independent assortment:

Chromosomes from different homologous chromosome pairs separate independently from one
another during the production of gametes.
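Independent assortment can be illustrated by building the gametes of a double heterozygote and pairing them in all possible ways. This is a minimal sketch under stated assumptions: the two loci are unlinked, dominance is complete, uppercase letters denote dominant alleles, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """genotype is a tuple of per-locus allele pairs, e.g. ('Tt', 'Yy')."""
    # Independent assortment: one allele is drawn from each locus independently.
    return ["".join(combo) for combo in product(*genotype)]

def f2_phenotypes(genotype):
    """Count F2 phenotype classes from selfing an individual with this genotype."""
    counts = Counter()
    for g1, g2 in product(gametes(genotype), repeat=2):
        # A locus shows the dominant phenotype if either gamete carries
        # the uppercase (dominant) allele.
        phenotype = tuple(
            "dominant" if a.isupper() or b.isupper() else "recessive"
            for a, b in zip(g1, g2)
        )
        counts[phenotype] += 1
    return counts

ratio = f2_phenotypes(("Tt", "Yy"))
for phenotype, n in sorted(ratio.items(), key=lambda kv: -kv[1]):
    print(phenotype, n)
```

With four gamete types (TY, Ty, tY, ty) there are 16 equally likely pairings, which fall into the four phenotype classes in the proportions 9:3:3:1.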

Population genetics is the application of Mendel's laws and other genetic principles to the study
of variation within and between populations of species.
Mendelian traits are characterized by the inheritance of a single major gene; phenotypes can be
classified into discrete categories such as tall and short. Mendelian traits are little influenced
by the environment, and their inheritance conforms to the expected segregation ratios in the F2
and backcross generations.

[Figure: two discrete classes of phenotypes (Mendelian genetics)]

However, most traits are more complex than Mendelian traits. They are controlled by the action
and interaction of many minor genes (polygenic traits), exhibit continuous variation to show a
wide range of phenotypes, and are strongly influenced by environmental factors. These are
quantitative traits.

Francis Galton (1822-1911) was the pioneer of quantitative genetics. Quantitative genetics is the
study of continuous variation where phenotypes of organisms are measured on a quantitative
scale rather than discrete classes. Examples of quantitative traits are:
Height
Weight
Skin color, etc.

[Photograph: Francis Galton (1822-1911)]

Continuous variation is caused by the joint segregation of many genes, each having a small
effect on the phenotype. The segregation of any one gene is obscured by the segregation of the
other genes affecting the trait, so individual genes cannot be identified by their segregation
ratios in the F2 and BC1 (first backcross) generations. Genetic segregation is further obscured
by environmental effects. The distribution of the phenotypes conforms closely to a normal
distribution.
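A simple additive model shows why many small-effect genes give a bell-shaped distribution. The sketch below rests on assumptions not stated in the lecture: every locus has two alleles ("+" and "-") of equal additive effect, there is no dominance, linkage or environmental effect, and the F2 comes from a cross between two inbred lines, so each allele copy is "+" with probability 1/2. Under those assumptions the phenotype score (the number of "+" alleles) follows a binomial distribution.

```python
from math import comb

def f2_score_distribution(n_loci):
    """Probability of each phenotype score (count of '+' alleles) in the F2."""
    n_alleles = 2 * n_loci          # diploid: two allele copies per locus
    total = 2 ** n_alleles          # number of equally likely allele combinations
    return [comb(n_alleles, k) / total for k in range(n_alleles + 1)]

def text_histogram(dist, width=40):
    """Render the distribution as rows of '#' scaled to the most common class."""
    peak = max(dist)
    return "\n".join(
        f"{score:2d} " + "#" * round(width * p / peak)
        for score, p in enumerate(dist)
    )

for n_loci in (1, 3, 10):
    print(f"{n_loci} locus/loci: {2 * n_loci + 1} phenotype classes")
    print(text_histogram(f2_score_distribution(n_loci)))
```

With one locus there are three classes in the proportions 1:2:1. With ten loci there are 21 classes whose frequencies already trace a bell curve, and environmental variation would blur the remaining steps between classes.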

[Figure: a graph showing a normal distribution]

Quantitative geneticists study the inheritance of individual differences in phenotypic
measurements. Though Mendelian ratios cannot be applied, the inheritance of quantitative traits
depends on genes subject to the same laws of transmission displayed by qualitative differences.
Quantitative genetics is therefore an extension of Mendelian genetics.
Because ratios cannot be observed, single progenies do not provide enough information.
The unit of study is made up of larger groups of individuals: populations of many progenies. The
trait to be measured is given a score, not classified into discrete groups.
Most traits of importance to agriculture are quantitative.

Although many quantitative genetic predictions can be made without understanding the
underlying genetic details, advances in molecular genetics have facilitated identifying gene
regions harboring variants affecting quantitative traits, and in some cases cloning the relevant
loci. Emerging technologies for assessing genome-wide population genetic variation for
thousands of individuals (by microarray analysis) bring the promise of using population genetic
principles to rapidly map genes and variants affecting quantitative traits, including susceptibility
to common diseases, in many organisms.

Sources of variation
Mutation
Recombination
Migration
Transposable elements

Mutation
Mutation is the process by which the nucleotide sequence of a gene changes, as a result of
point mutations or chromosomal rearrangements such as duplication, inversion or translocation.
In general, mutational changes that affect function are deleterious. Mutations in somatic cells
can lead to cell death or cancer, but they are not inherited. Mutations in the germline are
passed on to the next generation and, when they enhance survival, are the ultimate source of
evolutionary change. These are the mutations relevant to population and quantitative genetics.
Mutations that segregate in populations are called polymorphisms.

Mutations occur at random and can vary in their effect. They may be neutral with no phenotypic
expression, or cause variations to an individual's phenotype, which may range from small-scale
to large-scale. It is important that not too many mutations occur in a single DNA molecule at a
time. For a cell or an organism to be able to evolve through time, the base sequence of its DNA
must be capable of change.

Sources of mutation
Mutations arise under the following circumstances:
- Substitution of an incorrect base during DNA replication
- Accidental insertion or deletion of a base in the daughter strand during replication
- An inefficient DNA repair mechanism that does not faithfully correct errors
- DNA damage by a mutagenic agent

There are two principal mechanisms of mutation: (1) chemical alteration of a base (by a chemical
or radiation) that gives it new hydrogen-bonding properties and thus causes a different base to
be incorporated upon replication; and (2) an error during replication, in which an incorrect
base is incorporated or a base is inserted or deleted. In either case the new sequence must
persist so that progeny cells will have the new sequence (i.e. the change must be heritable).

To ensure cell survival, the mutation rate must be kept low. The following mechanisms keep
mutation rates low:
- The hydrophobic core formed by the DNA bases in the interior of the double helix reduces
their accessibility to attacking molecules
- The cell has evolved several repair mechanisms for correcting alterations or replication
errors
- The repair systems are not completely efficient, so mutations occur at a rate that is very
low, but useful in an evolutionary sense

Kinds of Mutation
1. POINT MUTATION
This is a change in only a single base pair from the wild type.
A point mutation may be:
• Base substitution
• Base insertion
• Base deletion

MUTATIONS CAN HAVE A RANGE OF EFFECTS ON DNA SEQUENCES


(a) Point mutations
Point mutations are changes of one nucleotide to another. Point mutations are called transitions
when the change is a purine for a purine (A↔G) or a pyrimidine for a pyrimidine (C↔T).
Transition mutations are more common than transversions.
Point mutations may also be transversion mutations, which result from the change of a purine to
a pyrimidine or vice versa (A/G ↔ C/T). Point mutations give rise to single nucleotide
polymorphisms, or SNPs (pronounced 'snips').
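The transition/transversion rule is easy to express in code. The following is a small illustrative sketch; the function name and the error handling are assumptions made for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        return "no change"
    # Transition: both bases belong to the same chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
    print(f"{ref}->{alt}: {classify_substitution(ref, alt)}")
```

Of the twelve possible single-base changes, four are transitions and eight are transversions, so the observation that transitions are more common is not what equal rates for every change would predict.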

Polymorphisms
DNA polymorphisms are DNA sequences that vary between two related genomes. They are
usually not located within a gene. SNPs are considered the ultimate in molecular markers.
Polymorphisms don't have to be associated with a restriction site or a specific PCR primer.
Techniques to follow SNPs are still evolving. In certain applications, one can simultaneously
follow thousands of polymorphisms in a single experiment.

Consequences of point mutation in terms of the amino acid sequence affected

• Silent mutation – there is no change in the amino acid translated, owing to the degeneracy of
the genetic code
• Missense mutation (amino acid substitution) – an amino acid in the wild type is replaced
by a different amino acid in the mutant
• Nonsense mutation – a codon for an amino acid is replaced with a stop codon (TGA, TAA or
TAG in DNA; UGA, UAA or UAG in mRNA). This results in truncation of the protein (premature
stop)
Example:
Wild type:  AUA GGA UAC ACA CCA… mRNA
            Ile Gly Tyr Thr Pro

Mutant:     AUA GGA UAA ACA CCA… mRNA
            Ile Gly Stop

The reading frame following the nonsense mutation is still the same, unlike with a frameshift
mutation.
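The silent, missense and nonsense categories can be reproduced by translating codons before and after a change. The sketch below builds the standard genetic code from a compact 64-letter string (bases ordered U, C, A, G; "*" marks a stop codon); the function names are assumptions made for this illustration.

```python
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(mrna):
    """Translate an mRNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def classify_point_mutation(wild_codon, mutant_codon):
    """Return 'silent', 'missense' or 'nonsense' for a single-codon change."""
    wild, mutant = CODON_TABLE[wild_codon], CODON_TABLE[mutant_codon]
    if wild == mutant:
        return "silent"
    return "nonsense" if mutant == "*" else "missense"

print(translate("AUAGGAUACACACCA"))           # IGYTP (Ile Gly Tyr Thr Pro)
print(translate("AUAGGAUAAACACCA"))           # IG    (Ile Gly, then stop)
print(classify_point_mutation("UAC", "UAA"))  # nonsense
print(classify_point_mutation("UAC", "UAU"))  # silent
print(classify_point_mutation("UAC", "UCC"))  # missense
```

The two sequences translated here are the wild-type and mutant mRNAs from the example above, written without spaces.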

2. INSERTION OR DELETION MUTATIONS


Insertion and deletion mutations occur when one or more nucleotides are added (insertions)
or removed (deletions), and give rise to indel polymorphisms, or copy number variation
(CNV). The most common category of indel polymorphism involves small numbers of
bases, but entire genes, gene regions and even whole chromosomes can be duplicated or
deleted.

Generally these variations are not used for mapping.

Other types of mutations lead to sequences that are repeated in the genome. The repeated
sequences can be arranged tandemly in one location, or dispersed throughout the genome.
Examples of tandemly repeated sequences are microsatellites (also called simple sequence
repeats, or SSRs) and minisatellites. Microsatellites are simple sequences of two, three or four
nucleotides that are repeated 10-100 times. The number of repeats can vary substantially among
individuals, leading to polymorphism in repeat copy number. Minisatellites, also called VNTRs
(variable number of tandem repeats), are similar to microsatellites in that a core sequence is
tandemly repeated many times at a single location, but the repeated sequence is more
complicated, containing 10-100 base pairs. The high degree of polymorphism in the numbers of
microsatellite and minisatellite repeats makes them ideal for mapping genes in pedigrees and for
individual identification.

Microsatellite markers
• Microsatellite markers were developed in 1989
• Dinucleotide repeat: CA CA CA CA
• Trinucleotide repeat: CAG CAG CAG CAG
• Tetranucleotide repeat: TAGC TAGC TAGC TAGC
• The number of repeats can vary substantially among individuals, leading to
polymorphism in repeat copy number
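Repeat copy number at a microsatellite can be scored by counting consecutive copies of the motif. The sketch below uses a regular expression for this; the flanking sequences and the two example alleles are made up for illustration.

```python
import re

def longest_repeat_run(sequence, motif):
    """Return the largest number of consecutive copies of motif in sequence."""
    # (?:motif)+ matches one or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)

# Two hypothetical alleles of a CA dinucleotide repeat with identical flanks.
allele_a = "GGT" + "CA" * 12 + "TTG"
allele_b = "GGT" + "CA" * 17 + "TTG"

print(longest_repeat_run(allele_a, "CA"))  # 12
print(longest_repeat_run(allele_b, "CA"))  # 17
```

Two individuals carrying these alleles would be distinguished by repeat number alone (12 versus 17 copies), which is the kind of polymorphism that makes microsatellites useful as markers.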

3. TRANSPOSABLE ELEMENTS
Dispersed repetitive sequences are typically transposable elements (TEs), or selfish genetic
elements, which can replicate by jumping to different genomic locations. Barbara McClintock
was the first scientist to predict that transposable elements, mobile pieces of the genetic material
(DNA), were present in eukaryotic genomes.
• She performed her work on corn and specifically followed seed color phenotypes.
• Later, other TEs were found in Drosophila, yeast, and bacteria.

[Photograph: Barbara McClintock, 1902-1992, 1983 Nobel Laureate in Physiology or Medicine]

There are many different families of TEs, or simply transposons, distinguished by their size,
structure and mechanism of transposition. TEs are broadly classified as retrotransposons
(Class I transposable elements) or DNA transposons (Class II transposable elements).

Retrotransposons
These transposable elements replicate themselves in a genome via transposition through an RNA
intermediate. Retrotransposons are abundant in plants, where they are often a principal
component of nuclear DNA. For example, they make up 49-78% of the maize genome, compared with
about 42% of the human genome.

The replication and transposition processes occur rapidly, increasing the copy number of the
elements and thereby potentially increasing genome size. The following steps occur:
1. Retrotransposons copy themselves into RNA
2. The RNA is converted back to DNA by a reverse transcriptase, which the retrotransposon
encodes
3. The DNA is integrated back into the genome.

To avoid deleterious effects of transposition, a regulatory mechanism made up of host-encoded
factors controls the event.
DNA (encodes a reverse transcriptase) → RNA intermediate → Replicated sequence

Types of retrotransposons
There are two sub-types:
(A) LTR retrotransposons, which have a long terminal repeat (LTR) at each end; the DNA at the
insertion site is duplicated. The LTRs range from ~100 bp to over 5 kb in size. These elements
encode reverse transcriptase and are similar to retroviruses.
LTR retrotransposons are in turn divided into two broad categories, which differ in the size of
the duplication:
1. copia elements, which generate 5-bp direct duplications
2. gypsy elements in Drosophila, which create 4-bp direct duplications
• Molecular analysis has determined that many of the classical Drosophila mutations
are the result of transposon insertion.
• Insertion does not always eliminate the function of the gene product, but rather
changes its function enough to result in another phenotype.

(B) Non-LTR retrotransposons

Non-LTR retrotransposons are widespread in eukaryotic genomes. They consist of two sub-types,
long interspersed elements (LINEs, > 5 kb in size) and short interspersed elements (SINEs,
< 500 bp in size), and can occur in high copy numbers, up to 250,000 in some plant species.
LINEs encode reverse transcriptase, lack LTRs, and are transcribed by RNA polymerase II. SINEs
do not encode reverse transcriptase and are transcribed by RNA polymerase III.

DNA transposons (Class II transposable elements)

A DNA transposon moves by a cut-and-paste transposition mechanism without the involvement of an
RNA intermediate. Transposition is catalyzed by transposase enzymes. Some transposases bind
non-specifically to any target site in DNA, whereas others bind to specific DNA sequence
targets. The transposase makes a staggered cut at the target site, resulting in single-strand
5' or 3' DNA overhangs (sticky ends). It also cuts out the DNA transposon, which is then ligated
into the new target site; this process involves a DNA polymerase that fills in the gaps and a
DNA ligase that closes the sugar-phosphate backbone. This results in duplication of the target
site. The insertion sites of DNA transposons may be identified by short direct repeats (created
by the staggered cut in the target DNA and filling in by DNA polymerase) followed by a series of
inverted repeats important for TE excision by transposase. Cut-and-paste TEs may be duplicated
if their transposition takes place during S phase of the cell cycle, when the donor site has
already been replicated but the target site has not. Such duplications at the target site can
result in gene duplication, which plays an important role in evolution.
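The origin of the short direct repeats flanking an insertion can be sketched with plain strings. This is a deliberately simplified model under stated assumptions: the genome is a single-strand string, the staggered cut spans `overhang` bases starting at `position`, and gap filling is represented by copying those bases to both sides of the element. The function name and the example sequences are hypothetical.

```python
def insert_transposon(genome, position, element, overhang):
    """Insert element at a staggered cut; return the sequence after gap repair."""
    if overhang < 0 or not 0 <= position <= len(genome) - overhang:
        raise ValueError("cut site must lie within the genome")
    # Bases between the two offset nicks end up single-stranded on each side.
    target_site = genome[position:position + overhang]
    left = genome[:position]
    right = genome[position + overhang:]
    # Filling in both single-strand gaps leaves a copy of the target site
    # on each flank of the element (the target-site duplication).
    return left + target_site + element + target_site + right

genome = "AAACGTACTTT"
result = insert_transposon(genome, position=3, element="<TE>", overhang=5)
print(result)  # AAACGTAC<TE>CGTACTTT
```

The five-base target site CGTAC, present once in the original sequence, appears as a direct repeat on both sides of the inserted element, which is the signature used to recognize insertion sites.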

Examples of DNA transposons include the Ac/Ds elements of maize, the first transposable
element system to be described, and the P element of Drosophila, which has been harnessed as
an important transformation vector. The genome copy number of TEs can vary from fewer than 10
to over one million, as for the Alu TE (a SINE) in humans. There can be polymorphism for
both the total number of transposons and individual insertion sites. Segmental duplications
(also called low copy repeats) are also interspersed sequences, consisting of highly homologous
sequences of 1-400 kb located in multiple genomic regions.

In addition to changes in state and copy number, mutations can also result in a change in gene
order. Inversions occur when a segment of the genome within a chromosome is reversed, and can
be polymorphic in natural populations. Translocations occur when a segment of the genome is
deleted from one region of the genome and re-inserted in another – either on the same or a
different chromosome. Translocations may involve a single DNA segment, or be reciprocal
translocations of two different segments.

Effects of Mutation
The consequences of mutations depend on where they occur. Eukaryotic protein-coding genes
are typically split into several exons separated by non-coding introns, with non-coding 3' and 5'
regulatory sequences. Adjacent genes may be separated by stretches of sequence with no known
function, or genes may be nested in the same or opposite orientation within other genes.

Because of the degeneracy of the genetic code, point mutations in coding regions can be silent
(or synonymous) if the mutation does not change the amino acid encoded by the codon.
Non-synonymous mutations change the codon either to one for a different amino acid (missense
mutations) or to a stop codon, leading to a truncated protein (nonsense mutations). Insertions
and deletions in coding regions can alter the reading frame, which can lead to a non-functional
protein (frameshift mutation), or produce a potentially altered protein if the
insertion/deletion is an in-frame multiple of three bases. Similarly, TE insertions in coding
regions can result in non-functional proteins or proteins with novel function. Mutations in
regulatory regions could alter the timing, efficiency or tissue-specific patterns of gene
expression, or alter the number and size of transcripts by changing splice sites.
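The difference between a frameshift and an in-frame deletion can be seen by splitting a coding sequence into codons before and after removing bases. The sketch below reuses the short mRNA from the nonsense-mutation example earlier in these notes; the helper names are assumptions made for illustration.

```python
def codons(sequence):
    """Split a sequence into complete codons, ignoring any trailing 1-2 bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

def delete(sequence, start, length):
    """Remove `length` bases beginning at index `start`."""
    return sequence[:start] + sequence[start + length:]

def shifts_frame(net_length_change):
    """An indel shifts the reading frame unless its length is a multiple of three."""
    return net_length_change % 3 != 0

coding = "AUAGGAUACACACCA"
print(codons(coding))                # ['AUA', 'GGA', 'UAC', 'ACA', 'CCA']
print(codons(delete(coding, 3, 1)))  # ['AUA', 'GAU', 'ACA', 'CAC']
print(codons(delete(coding, 3, 3)))  # ['AUA', 'UAC', 'ACA', 'CCA']
```

Deleting one base changes every codon downstream of the deletion, whereas deleting three bases removes a single codon (GGA) and leaves the rest of the reading frame intact.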

In population genetic models, the only mutational effect that matters is the effect on reproductive
fitness. Mutations may have no effect on fitness and be selectively neutral; have slight to strong
deleterious effects and cause reduced fitness or even lethality; or be advantageous and favored
by natural selection. In quantitative genetic models, the effect of the mutation on the measured
value of the trait, as well as the effect on fitness, matters. The trait can be an external measure of
the organism's phenotype (counts, linear dimensions, weights, performance indices) or an
intermediate endophenotype (transcript, metabolite or protein abundance).

Recombination
Recombination is another source of variation in a population. It occurs when two homologous
chromosomes exchange some of their genetic material, producing two chromosomes that are
genetically distinct from the parental chromosomes. Recombination increases the amount of
genetic diversity in the population by generating new combinations of the alleles present at
different genetic loci.

The further apart two points are on the chromosome, the more recombination there is between
them. Because the amount of recombination increases with distance along the chromosome, we can
obtain relative positions for loci on a genetic map.

Recombination, r, is quantified by the ratio of the number of recombinant gametes to the total
number of gametes produced by one generation of meiosis. If r = 0.5, one-half of the gametes
produced in each meiosis are recombinant, and one half are non-recombinant, as expected by
Mendelian segregation of chromosomes. For example, in a cross of individuals containing two
unlinked mutations at two different loci, with free recombination (r = 0.5) all four gamete types
will be produced by the F1 progeny. With complete linkage (for example, two different mutations
of the same nucleotide), recombination cannot separate the mutations and only two gamete types
will be produced by the F1 progeny. Thus, one can estimate r by counting the number of
recombinant and non-recombinant gametes in a dihybrid cross.
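Estimating r from gamete counts is a one-line calculation. In the sketch below the gamete counts are invented for illustration (they are not data from the lecture), and the function name and the way parental types are specified are assumptions.

```python
def recombination_fraction(gamete_counts, parental_types):
    """Estimate r as recombinant gametes divided by all gametes scored."""
    total = sum(gamete_counts.values())
    if total == 0:
        raise ValueError("no gametes were scored")
    recombinant = sum(
        n for gamete, n in gamete_counts.items() if gamete not in parental_types
    )
    return recombinant / total

# Hypothetical counts from an F1 whose parental gamete types were AB and ab.
counts = {"AB": 430, "ab": 420, "Ab": 80, "aB": 70}
r = recombination_fraction(counts, parental_types={"AB", "ab"})
print(r)  # 0.15
```

Equal counts of all four gamete types would give r = 0.5 (free recombination), and the absence of the two recombinant types would give r = 0 (complete linkage).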

There are two metrics of the distance between genetic variants. One is the physical distance,
measured in base pairs; the other is the recombination distance, measured in centiMorgans (cM).
For recombination fractions r < 0.1, the relationship between recombination fraction and
recombination distance is approximately linear; i.e. r = 0.1 corresponds to about 10 cM. For
recombination fractions r > 0.1, unobserved double cross-overs mean the relationship between the
real recombination distance and the observed fraction of recombinants is not linear. Several
mapping functions have been derived to account for this nonlinearity, the best known of which is
Haldane's mapping function: m = -(1/2) ln(1 - 2r), where m is the true map distance in Morgans.

Recombination is not constant across the genome of any species, or between species. For
example, in Drosophila, recombination only occurs in females, and is lower at the ends than the
middle of chromosomes. In humans, recombination is highly heterogeneous across the genome,
with 'hot' and 'cold' spots. In other words, the relationship between r and physical distance is
not constant. Thus, we need to consider the recombination landscape as well as mutation in models
predicting the fate of new mutations in populations.

Recombination is common between homologous chromosomes. However, when there are repeat
sequences in the genome, recombination can occur between non-homologous repeats, such as
between tandemly repeated sequences or genes, or between interspersed repeats such as TEs and
segmental duplications. Non-homologous recombination between tandem repeats leads to
increases and decreases in copy number of the repeats, accounting for the high variance in copy
number of microsatellites and minisatellites. Non-homologous recombination between segmental
duplications or between TEs of the same family can lead to duplications, deletions and
inversions.

Detecting Genetic Variation


The most straightforward method for identifying genetic variation is direct DNA sequencing.
The current gold standard is dye-terminator sequencing, producing the familiar four-color
sequence reads of ~500 bp from a single reaction. However, this method is costly and not
high-throughput. Currently there are several high-throughput, massively parallel DNA sequencing
technologies in development and in production (for example Illumina's Solexa, 454 Life
Sciences, ABI's SOLiD, and Perlegen's sequencing by hybridization technology). All are much less
costly than dye-terminator sequencing, but are less accurate. Nevertheless, in the near future
high-accuracy direct sequencing of whole genomes of large numbers of individuals will be
possible for reasonable cost. In fact, the X Prize Foundation established the Archon X Prize in
October 2006, intending to award $10 million to "the first Team that can build a device and use it
to sequence 100 human genomes within 10 days or less, with an accuracy of no more than one
error in every 100,000 bases sequenced, with sequences accurately covering at least 98% of the
genome, and at a recurring cost of no more than $10,000 (US) per genome."

For many applications we do not need to know complete genome sequences, but do require large
numbers of SNP genotypes for large numbers of individuals. Again, there are several
competing methods for high-throughput SNP genotyping, including Illumina's GoldenGate
assay, Affymetrix oligonucleotide microarray-based SNP chips, and pyrosequencing (a
re-sequencing by synthesis method). Lower-throughput methods include identifying restriction
fragment length polymorphisms (RFLPs). With this simple method one amplifies a genomic
region of interest using PCR, digests the sample with a restriction endonuclease, and performs a
gel assay to deduce the sizes of the fragments. Two fragments will be visualized if the
restriction site is present in the sample, but there will be only one larger fragment if there is
a SNP that alters the restriction site. Much of the population genetics literature from the
1960s-1980s was based on differences in the mobility of enzyme variants (allozymes) when
subjected to gel electrophoresis. This method only detects SNPs in coding regions that alter the
charge of the protein.
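The RFLP logic described above (two fragments when the restriction site is present, one larger fragment when a SNP destroys it) can be mimicked with an in-silico digest. The sketch below uses the EcoRI recognition site GAATTC, which is cut after the first G; the amplicon sequences are invented for illustration, the function name is an assumption, and only one DNA strand is modelled.

```python
def digest(amplicon, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting at every occurrence of site."""
    fragments = []
    start = 0
    position = amplicon.find(site)
    while position != -1:
        cut = position + cut_offset      # cut falls inside the recognition site
        fragments.append(cut - start)
        start = cut
        position = amplicon.find(site, position + 1)
    fragments.append(len(amplicon) - start)
    return fragments

# Hypothetical 24-bp amplicons: one with the site intact, one with a SNP (A -> G).
site_present = "A" * 10 + "GAATTC" + "T" * 8
site_altered = "A" * 10 + "GAGTTC" + "T" * 8

print(digest(site_present))  # [11, 13]
print(digest(site_altered))  # [24]
```

On a gel the first sample would show two bands (11 and 13 bases) and the second a single 24-base band, so the banding pattern reveals the genotype at the SNP.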

The importance of copy number variation is becoming increasingly apparent. The most common
population method for detecting CNV is by hybridization of genomic DNA to cDNA or
oligonucleotide microarrays. Tiling arrays, which represent the entire genome, are particularly
informative.

Cytological methods are used to detect chromosomal aberrations and rearrangements.


Individual chromosomes can be identified microscopically by their size, position of the
centromere, and characteristic pattern of bands when stained with various chemicals. One can
then detect gross differences in chromosome number as well as deletions, insertions, inversions
and translocations by comparing the karyotype (chromosome complement) of an individual to
the standard karyotype of the species. More precise analyses are possible using fluorescence in
situ hybridization (FISH).

Detecting Phenotypic Variation
The first step to assess variation in phenotypes is to define the trait or traits of interest, and the
second is to devise an assay to obtain a measure of the trait. For example, human height can
be measured in centimeters, mouse weight in grams, and Drosophila bristle number by a hair
count. Aspects of cognitive performance can be assessed by the amount of time it takes to learn
or to complete a task. Many traits can be simply scored as present or absent; or, in the case of
human diseases, affected or not affected. It is very important to define the trait precisely. “Body
size” could refer to a linear measure of stature, weight, or a combined function of the two, and
will change as the individual ages or as food intake or another aspect of the environment is
altered. Precise definitions are particularly important in studies of human disease and psychiatric
disorders, where it is possible that not all patients are diagnosed using the same criteria.

One can also measure variation in molecular phenotypes, such as transcript, protein and
metabolite abundance. There are several commercially available platforms for quantifying
transcript abundance on a genome-wide scale for organisms with complete genome sequences.
More generally, quantitative PCR techniques can be used to assess the relative abundance of
any transcript of interest. However, the abundance of transcripts, proteins and metabolites is
highly dynamic and changes with stage of development/age, tissue, and the environment to
which an individual is exposed, so precise definition of the conditions under which the
measurements are taken is critical.
