Recombination Frequency

Chapter 4

Lecture structure

  • Polymorphic markers and linkage analysis
  • Gene mapping: linkage analysis

Overview

  • An important step in understanding the basis of an inherited disease is to locate the gene(s) responsible for the disease.
  • This chapter provides an overview of the techniques that have been used to map and clone thousands of human genes.

Polymorphic markers and linkage analysis

  • A prerequisite for successful linkage analysis is the availability of a large number of highly polymorphic markers dispersed throughout the genome.
  • There are several classes of polymorphic markers; over 20,000 individual examples of these polymorphic markers at known locations have been identified and are available for linkage studies.
  • Collectively, these markers provide a marker map of each chromosome, so when one wishes to map the position of a gene—often one involved in a disease process—the gene’s position can be located relative to one or more markers by recombination mapping.
  • These high-density marker maps allow genome-wide screening for mapping genes.

  • Types of DNA Polymorphisms
    • RFLPs (restriction fragment length polymorphisms)
    • VNTRs (variable number of tandem repeats)
    • STRPs (short tandem repeat polymorphisms, or microsatellites)
    • SNPs (single nucleotide polymorphisms)

  • RFLPs (restriction fragment length polymorphisms)
  • Restriction endonuclease (RE) sites, which are palindromic sequences recognized by specific restriction endonucleases, can serve as 2-allele markers.
  • A specific site may be present in some individuals (allele 1) and absent in others (allele 2), producing different-sized restriction fragments that can be visualized on a Southern blot.

  • Alternatively, the area containing the restriction site can be amplified by PCR (polymerase chain reaction) and the PCR product treated with the restriction enzyme to determine whether
    • one fragment is produced (RE site absent) or
    • two fragments are produced (RE site present).
  • The PCR product(s) can be separated by agarose gel electrophoresis and visualized directly without using a blot.
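
The one-fragment versus two-fragment readout can be sketched in a few lines of Python. This is only an illustration: the amplicon sequences are invented, and EcoRI (recognition site GAATTC, cutting after the first G) is assumed as the example enzyme; the slides do not name a specific enzyme.

```python
# Illustrative sketch only: the sequences are made up, and EcoRI is an
# assumed example enzyme (recognizes GAATTC, cuts between G and AATTC).
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # number of bases of the site left on the upstream fragment


def digest(amplicon: str, site: str = ECORI_SITE,
           cut_offset: int = ECORI_CUT_OFFSET) -> list[str]:
    """Cut the amplicon at the first occurrence of the site, if any."""
    pos = amplicon.find(site)
    if pos == -1:
        return [amplicon]          # site absent: one uncut fragment
    cut = pos + cut_offset
    return [amplicon[:cut], amplicon[cut:]]  # site present: two fragments


def fragment_lengths(amplicon: str) -> list[int]:
    """Fragment sizes in bases, as they would be read off a gel."""
    return sorted(len(fragment) for fragment in digest(amplicon))


# Hypothetical PCR products for the two alleles (differ by one base in the site).
site_present = "ATGCCGTA" + "GAATTC" + "TTAGGCA"
site_absent = "ATGCCGTA" + "GAGTTC" + "TTAGGCA"

print(fragment_lengths(site_present))  # two bands
print(fragment_lengths(site_absent))   # one band
```

A heterozygote would carry both products, so all three band sizes would appear in the same lane.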

  • VNTRs (variable number of tandem repeats)
  • These polymorphisms are the result of varying numbers of minisatellite repeats in a specific region of a chromosome.
  • The minisatellite repeat units typically range in size from 20 to 70 bases each.
  • The repeat is flanked on both sides by a restriction site, and variation in the number of repeats produces restriction fragments of varying size.
  • VNTRs are used infrequently in current genetic mapping, as they tend to cluster near the ends of chromosomes.

  • STRPs (short tandem repeat polymorphisms, or microsatellites)
  • Short tandem repeat polymorphisms are repetitive sequences in which the repeated unit is generally 2–6 bases long.
  • An example of a dinucleotide repeat (CA/TG) is shown below.
  • These markers have many alleles in the population, with each different repeat length at a locus representing a different allele.
  • STRPs can be amplified by PCR with primers designed to flank the repeat block.
  • Variation in the number of repeats produces PCR products of varying length, which can then be visualized by agarose gel electrophoresis.
  • STRPs are distributed throughout the chromosomes, making them very useful in mapping genes.
  • STRPs are used in paternity testing and in forensic cases, but these sequences can also be used in gene mapping.
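
As a rough sketch of why repeat number translates into PCR product length, the following Python counts the longest tandem run of a repeat unit and converts a repeat count into a product size. The sequence, the `flank_bp` value of 40, and both function names are hypothetical choices made for illustration.

```python
import re


def count_repeats(sequence: str, unit: str = "CA") -> int:
    """Length (in repeat units) of the longest tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def pcr_product_length(repeat_count: int, unit: str = "CA",
                       flank_bp: int = 40) -> int:
    """Product size = flanking sequence (primers included) + repeat block.

    flank_bp is an assumed, illustrative value.
    """
    return flank_bp + repeat_count * len(unit)


# Two hypothetical alleles at one dinucleotide-repeat locus.
allele_12 = "GGT" + "CA" * 12 + "TTG"
allele_15 = "GGT" + "CA" * 15 + "TTG"

for allele in (allele_12, allele_15):
    n = count_repeats(allele)
    print(n, "repeats ->", pcr_product_length(n), "bp product")
```

Each additional CA repeat adds 2 bp to the product, which is why alleles at such a locus differ by small, discrete size steps.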

  • SNPs (single nucleotide polymorphisms)
  • SNPs are nucleotide positions in the human genome at which only two nucleotides (for example, C or G) are found.
  • These occur on average about once in every 1,000 base pairs (bp) and, like STRPs, are very useful in mapping genes.
  • Unlike STRPs, which have multiple alleles at each locus, SNPs are usually two-allele markers (either G or C in the above example).
  • These can be typed by PCR amplification and identification by sequencing or through the use of probes on DNA chips, a process which can be automated.

Gene mapping: linkage analysis

  • Crossing Over, Recombination, and Linkage
  • The first step in gene mapping is to establish linkage with a known polymorphic marker (one with at least two alleles in the population).
  • This can be done by recombination mapping to determine whether the gene is near a particular marker.
  • Multiple markers on different chromosomes are used to establish linkage (or the lack of it).
  • Recombination mapping is based on crossing over during meiosis, the type of cell division that produces haploid ova and sperm.
  • During prophase I of meiosis, homologous chromosomes line up and occasionally exchange portions of their DNA. This process is termed crossing over.

Process of Crossing Over Between Homologous Chromosomes

  • When a crossover event occurs between two loci, G and M, the resulting chromosomes may contain a new combination of alleles at loci G and M.
  • When a new combination occurs, the crossover has produced a recombination.
  • Because crossover events occur more or less randomly across chromosomes, loci that are located farther apart are more likely to experience an intervening crossover and thus a recombination of alleles.
  • Recombination frequency provides a means of assessing the distance between loci on chromosomes, a key goal of gene mapping.

  • If the gene of interest (with alleles G1 and G2) and the marker (with alleles M1 and M2) are on different chromosomes, the alleles will remain together in an egg or a sperm only about 50% of the time.
  • They are unlinked.

  • If the gene and the marker are on the same chromosome but are far apart, the alleles will remain together about 50% of the time.
  • The larger distance between the gene and the marker allows multiple crossovers to occur between the alleles during prophase I of meiosis.
  • An odd number of crossovers separates G1 from M1 (recombination), whereas an even number of crossovers places the alleles together on the same chromosome (no recombination).
  • The gene and marker are again defined as unlinked.
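
The odd/even argument can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the number of crossovers between the two loci on a transmitted chromosome follows a Poisson distribution with a given mean (the assumption behind Haldane's map function; it is not stated in the slides). Recombination frequency is then the probability of an odd number of crossovers, which rises with distance but levels off at 50%.

```python
import math


def prob_odd_crossovers(mean_crossovers: float, max_k: int = 60) -> float:
    """P(odd number of crossovers) under an assumed Poisson model.

    Sums the Poisson probabilities for k = 1, 3, 5, ... below max_k.
    """
    total = 0.0
    for k in range(1, max_k, 2):
        total += (math.exp(-mean_crossovers)
                  * mean_crossovers ** k / math.factorial(k))
    return total


def haldane_recombination(mean_crossovers: float) -> float:
    """Closed form of the same quantity: (1 - e^(-2m)) / 2."""
    return 0.5 * (1.0 - math.exp(-2.0 * mean_crossovers))


# Loci that are close (few crossovers) versus far apart (many crossovers).
for m in (0.01, 0.05, 0.2, 0.5, 1.0, 2.0, 5.0):
    print(f"mean crossovers = {m:4.2f}  "
          f"recombination frequency = {prob_odd_crossovers(m):.4f}")
```

For closely spaced loci the recombination frequency is nearly equal to the mean number of crossovers, while for distant loci it approaches but never exceeds 0.5, which is why distant loci on the same chromosome behave as unlinked.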

  • If the gene and the marker are close together on the same chromosome, a crossover between the two alleles is much less likely to occur.
  • Therefore, G1 and M1 are likely to remain on the same chromosome more than 50% of the time.
  • In other words, they show less than 50% recombination. The gene and the marker are now defined as linked.

  • Recombination Frequencies and Gene Mapping
  • The closer together two linked loci are—for instance, a gene and a marker—the lower the recombination frequency will be between them.
  • Therefore, recombination frequency can be used to estimate proximity between a gene and a linked marker.
  • The following example of a family with neurofibromatosis type 1, an autosomal dominant disorder with complete penetrance, illustrates the concept of recombination frequency. Some members of the family have the disease-producing allele of the gene (indicated by phenotype in the pedigree) whose location is to be determined. Other individuals have the normal allele of the gene.
  • Each individual has also been typed for his or her allele(s) of a 2-allele marker (1 or 2).
  • Three steps are involved in determining whether linkage exists and, if so, estimating the distance between the gene and the known marker.

1. Establish linkage phase between the disease-producing allele of the gene and an allele of the marker in the family.

2. Determine if linkage exists between the 2 alleles.

3. If linkage exists, estimate the recombination frequency.

Pedigree for Neurofibromatosis Type 1

  1. Linkage Phase.
    • The pedigree indicates that the grandmother (I-2) had the disease-producing allele (A) of the gene, which she passed to her daughter (II-2).
    • Is it also possible to determine which allele of the marker was passed from the grandmother to her daughter? Yes, allele 1.
    • If linkage is present, the disease-producing allele (A) is linked to allele 1 of the marker.
    • We can then designate the daughter’s 2 haplotypes as AM1/aM2, indicating the chromosomes from her mother/father respectively.

A haplotype is the combination of alleles on a single chromosome. Individual II-2 has two haplotypes, AM1 and aM2, depicted below, where A and a are alleles of the gene causing the disease. The marker has alleles 1 and 2.

  2. Determine If Linkage Exists.

  • Are the gene and the marker actually linked as we hypothesize?
  • Looking at the children in generation III (each representing a meiotic event in their mother, II-2), we would expect a child who inherited marker allele 1 from the mother to have the disease.
  • The children who inherited allele 2 from the mother should not have the disease.
  • Examination of the 6 children’s haplotypes shows that this assumption is true in all but one case (III-6).
  • Because the AM1 and aM2 haplotypes remain together more than 50% of the time (or, conversely, are separated by recombination less than 50% of the time), the data support our hypothesis of linkage.

  3. Estimate the Recombination Frequency.

Out of 6 children, there is only one recombinant (III-6).

The estimated recombination frequency in this family is 1/6, or 17%.
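
The counting step can be sketched as follows. The genotype table is a hypothetical stand-in for the pedigree figure: only the facts that there are 6 children, that the mother's phase is AM1/aM2, and that III-6 is the single recombinant come from the text. The specific alleles assigned to each child, including the pattern shown for III-6, are placeholders.

```python
# Hypothetical data: (marker allele inherited from mother II-2, affected?).
# Only "6 children, one recombinant (III-6)" is taken from the text.
children = {
    "III-1": (1, True),
    "III-2": (2, False),
    "III-3": (1, True),
    "III-4": (2, False),
    "III-5": (1, True),
    "III-6": (1, False),  # placeholder pattern for the recombinant child
}


def is_recombinant(maternal_marker_allele: int, affected: bool) -> bool:
    """With phase AM1/aM2, marker allele 1 should travel with the disease."""
    expected_affected = (maternal_marker_allele == 1)
    return affected != expected_affected


recombinants = [name for name, (allele, affected) in children.items()
                if is_recombinant(allele, affected)]
theta_hat = len(recombinants) / len(children)

print("recombinants:", recombinants)
print(f"estimated recombination frequency: {theta_hat:.2f}")
```

Each child counts as one informative meiosis in the mother, so the estimate is simply recombinants divided by meioses scored.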

Recombination frequencies can be related to physical distance by the centimorgan (cM)

  • The recombination frequency provides a measure of genetic distance between any pair of linked loci.
  • This distance is expressed in centimorgans.
  • One centimorgan corresponds to a recombination frequency of 1%.
  • For example, if two loci show a recombination frequency of 2%, they are said to be 2 centimorgans apart.
  • Physically, 1 cM is approximately equal to 1 million base pairs of DNA (1 Mb).
  • This relationship is only approximate, however, because crossover frequencies are somewhat different throughout the genome, e.g., they are less common near centromeres and more common near telomeres.

  • Determining Recombination Frequency Accurately: LOD Scores
  • In the previous example, a very small population (6 children, or 6 meiotic events) was used to determine and calculate linkage, allowing only a very rough estimate of linkage distance.
  • In fact, there is some small chance that the gene and the marker are not actually linked at all and the data were obtained by chance.
  • We could be more confident that our conclusions were correct if we had used a much larger population.
  • Because families don’t have 100 or 200 children, the next best approach is to combine data from different families with this same disease to increase the number of meioses examined.
  • These data can be combined by using LOD (log of the odds) calculations.

  • A LOD (log of the odds) score, calculated by computer, compares the probability (P) that the data resulted from actual linkage with a recombination frequency of theta (θ) versus the probability that the gene and the marker are unlinked (θ = 50%) and that the data were obtained by chance alone.
  • In the example calculation:
  • In practice, because 17% might not be the correct number, the computer calculates these probabilities assuming a variety of recombination frequencies from θ = 0 (gene and marker are in the same location) to θ = 0.5 (gene and marker are unlinked).

  • The “odds of linkage” at a given recombination frequency (θ) is the probability of the family data at that θ divided by the probability of the same data if the gene and the marker are unlinked (θ = 0.5).
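
For the six-child family described earlier (6 meioses, 1 recombinant), a LOD score at a given θ can be sketched as below. This assumes the simplest case, in which linkage phase is known and every meiosis is informative, so the likelihood is θ raised to the number of recombinants times (1 − θ) raised to the number of nonrecombinants. The function name `lod_score` is illustrative; real linkage software handles unknown phase, penetrance, and missing data.

```python
import math


def lod_score(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """log10 of P(data | theta) / P(data | theta = 0.5), phase-known case."""
    n_nonrecombinants = n_meioses - n_recombinants
    likelihood_linked = (theta ** n_recombinants
                         * (1.0 - theta) ** n_nonrecombinants)
    likelihood_unlinked = 0.5 ** n_meioses
    if likelihood_linked == 0.0:
        return float("-inf")  # data impossible at this theta
    return math.log10(likelihood_linked / likelihood_unlinked)


# Scan a range of theta values for the 6-meiosis, 1-recombinant family.
for theta in (0.01, 0.05, 0.10, 1 / 6, 0.20, 0.30, 0.40, 0.50):
    print(f"theta = {theta:.3f}   LOD = {lod_score(theta, 6, 1):+.3f}")
```

The score peaks near θ = 1/6, the observed recombination fraction, but stays well below 3, which reflects how little information six meioses carry.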

  • If data from multiple families are combined, the numbers can be added by using the log10 of these odds.
  • This equation need not be memorized.
  • These calculations are done by computer and are displayed as a LOD table that gives the LOD score for each recombination frequency, θ.
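
A minimal sketch of how such a table could be built: because the families are independent, their odds multiply, so their log10 odds (LOD scores) add at each θ. The three families below are invented for illustration, and the phase-known likelihood is again an assumption rather than the method used by any particular linkage program.

```python
import math


def lod_score(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """Phase-known LOD score for one family at a given theta."""
    likelihood_linked = (theta ** n_recombinants
                         * (1.0 - theta) ** (n_meioses - n_recombinants))
    if likelihood_linked == 0.0:
        return float("-inf")
    return math.log10(likelihood_linked / 0.5 ** n_meioses)


# Hypothetical families: (informative meioses, recombinants observed).
families = [(6, 1), (8, 1), (10, 1)]
thetas = [0.01, 0.05, 0.10, 0.20, 0.30, 0.40]

# LOD table: sum the per-family LOD scores at each theta.
lod_table = {
    theta: sum(lod_score(theta, n, k) for n, k in families)
    for theta in thetas
}
best_theta = max(lod_table, key=lod_table.get)

for theta, lod in lod_table.items():
    print(f"theta = {theta:.2f}   combined LOD = {lod:+.2f}")
print("highest LOD at theta =", best_theta)
```

In this made-up data set no single family reaches a LOD of 3 on its own, but the combined score does, which is the point of pooling families.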

LOD Scores for a Gene and a Marker

When interpreting LOD scores, the following rules apply:

  • LOD score > 3.00 shows statistical evidence of linkage. (It is 1,000 times more likely that the gene and the marker are linked at that distance than unlinked.)
  • LOD score < –2.00 shows statistical evidence of nonlinkage. (It is 100 times more likely that the gene and the marker are unlinked than linked at that distance.)
  • LOD score between –2.00 and 3.00 is indeterminate.
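
These thresholds translate directly into a small helper. The function names are illustrative; the cutoffs and the odds interpretation (10 raised to the LOD score) follow the rules above.

```python
def interpret_lod(score: float) -> str:
    """Classify a LOD score using the conventional thresholds."""
    if score > 3.0:
        return "linkage"
    if score < -2.0:
        return "nonlinkage"
    return "indeterminate"


def odds_in_favor_of_linkage(score: float) -> float:
    """A LOD score is a log10 odds ratio, so the odds are 10 ** score."""
    return 10.0 ** score


for score in (3.5, 1.2, -0.4, -2.6):
    print(f"LOD {score:+.1f}: {interpret_lod(score)} "
          f"(odds of linkage {odds_in_favor_of_linkage(score):.3g} : 1)")
```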

  • An examination of the table below shows that in only one case is there convincing evidence for linkage and that score has a recombination frequency of 0.10.
  • Therefore, the most likely distance between the gene and the marker is a recombination frequency of 10%, or 10 cM.
  • If no LOD score on the table is >3.00, the data may be suggestive of linkage, but results from additional families with the disease would need to be gathered.

LOD Scores for a Gene and a Marker

Gene mapping by linkage analysis serves several important functions:

  • It can define the approximate location of a disease-causing gene.
  • Linked markers can be used along with family pedigree information for genetic testing. In practice, markers that are useful for genetic testing must show less than 1% recombination with the gene involved (be <1 cM distant from the gene).
  • Linkage analysis can identify locus heterogeneity.

  • Review Questions

  • Select the ONE best answer.

1. A family with an autosomal dominant disorder is typed for a 2-allele marker, which is closely linked to the disease locus. Based on the individuals in Generation III, what is the recombination rate between the disease locus and the marker locus?

A. 0

B. 0.25

C. 0.50

D. 0.75

E. 1.0

F. The marker is uninformative

2. A man who has alkaptonuria marries a woman who has hereditary sucrose intolerance. Both are autosomal recessive diseases and both map to 3q with a distance of 10 cM separating the two loci.

What is the chance they will have a child with alkaptonuria and sucrose intolerance?

A. 0%

B. 12.5%

C. 25%

D. 50%

E. 100%

3. In a family study following an autosomal dominant trait through 3 generations, two loci are compared for their potential linkage to the disease locus. In the following 3-generation pedigree, shaded symbols indicate the presence of the disease phenotype, and the expression of ABO blood type and MN alleles are shown beneath each individual symbol.

Which of the following conclusions can be made about the linkage of the disease allele, ABO blood group locus, and MN locus?

A. The ABO and MN alleles are linked, but assort independently from the disease allele

B. The ABO, MN, and disease alleles all assort independently

C. The disease allele is linked to the ABO locus

D. The disease allele is linked to the ABO and MN loci

E. The disease allele is linked to the MN locus