PopaPhDMain

Published on: Mar 4, 2016
Source: www.slideshare.net


Transcripts - PopaPhDMain

  • 1. Order no.: Year 2011. THESIS presented at the UNIVERSITE CLAUDE BERNARD - LYON I for the degree of DOCTORATE (decree of 7 August 2006) by Alexandra Mariela POPA. The evolution of recombination and genomic structures: a modeling approach. Thesis supervisor: Christian GAUTIER. Thesis co-supervisor: Dominique MOUCHIROUD. JURY: Laurent DURET, President of the jury; Christian GAUTIER, Supervisor; Sylvain GLEMIN, Reviewer; Christine MEZARD, Reviewer; Dominique MOUCHIROUD, Co-supervisor; Matthew WEBSTER, Reviewer
  • 2. UNIVERSITE CLAUDE BERNARD - LYON 1
    President of the University: Mr. A. Bonmartin
    Vice-President of the Board of Directors: Prof. G. Annat
    Vice-President of the Council for Studies and University Life: Prof. D. Simon
    Vice-President of the Scientific Council: Prof. J-F. Mornex
    Secretary General: Mr. G. Gay
    HEALTH COMPONENTS
    Faculté de Médecine Lyon Est – Claude Bernard. Director: Prof. J. Etienne
    Faculté de Médecine et de Maïeutique Lyon Sud – Charles Mérieux. Director: Prof. F-N. Gilly
    UFR d’Odontologie. Director: Prof. D. Bourgeois
    Institut des Sciences Pharmaceutiques et Biologiques. Director: Prof. F. Locher
    Institut des Sciences et Techniques de la Réadaptation. Director: Prof. Y. Matillon
    Département de formation et Centre de Recherche en Biologie Humaine. Director: Prof. P. Farge
    SCIENCE AND TECHNOLOGY COMPONENTS AND DEPARTMENTS
    Faculté des Sciences et Technologies. Director: Prof. F. Gieres
    Département Biologie. Director: Prof. F. Fleury
    Département Chimie Biochimie. Director: Prof. H. Parrot
    Département GEP. Director: Mr. N. Siauve
    Département Informatique. Director: Prof. S. Akkouche
    Département Mathématiques. Director: Prof. A. Goldman
    Département Mécanique. Director: Prof. H. Ben Hadid
    Département Physique. Director: Ms. S. Fleck
    Département Sciences de la Terre. Director: Prof. I. Daniel
    UFR Sciences et Techniques des Activités Physiques et Sportives. Director: Mr. C. Collignon
    Observatoire de Lyon. Director: Mr. B. Guiderdoni
    Ecole Polytechnique Universitaire de Lyon 1. Director: Mr. P. Fournier
    Ecole Supérieure de Chimie Physique Electronique. Director: Mr. G. Pignault
    Institut Universitaire de Technologie de Lyon 1. Director: Prof. C. Coulet
    Institut de Science Financière et d'Assurances. Director: Prof. J-C. Augros
    Institut Universitaire de Formation des Maîtres. Director: Mr. R. Bernard
  • 3. Abstract
    Meiotic recombination plays several critical roles in molecular evolution. First, recombination represents a key step in the production and transmission of gametes during meiosis. Second, recombination facilitates the impact of natural selection by shuffling genomic sequences. Furthermore, the action of certain repair mechanisms during recombination affects the frequencies of alleles in populations via biased gene conversion. Lately, the numerous advancements in the study of recombination have unraveled the complexity of this process regarding both its mechanisms and evolution. The main aim of this thesis is to analyze the relationships between the different causes, characteristics, and effects of recombination from an evolutionary perspective. First, we developed a model based on the control mechanisms of meiosis and inter-crossover interference. We further used this model to compare the recombination strategies in multiple vertebrates and invertebrates, as well as between sexes. Second, we studied the impact of the sex-specific localization of recombination hotspots on the evolution of the GC content for several vertebrates. Last, we built a population genetics model to analyze the impact of recombination on the frequency of deleterious mutations in the human population.
    Summary
    Meiotic recombination plays a dual role: it is an evolutionary driver, contributing to the creation of the genetic diversity on which natural selection acts, and it serves as a control in the production of gametes during meiosis. In addition, in association with certain repair mechanisms, recombination alters allele frequencies within populations through biased gene conversion. Knowledge of how this process works has grown considerably in recent years, revealing a process that is complex in both its functioning and its evolution.
    The general theme of this thesis is the analysis, in an evolutionary context, of the relationships between the different roles and functional characteristics of recombination. A model of recombination that accounts for constraints linked to the control of meiosis and for the phenomenon of interference allowed a comparison among vertebrate and invertebrate species, as well as between sexes. We also showed the impact of the sex-specific localization of recombination hotspots on the evolution of the GC content of the genomes of several vertebrates. Finally, we propose a population genetics model to analyze the impact of recombination on the frequency of deleterious mutations in human populations. We hope that this thesis will contribute to the interdisciplinary study of recombination, both within biology and, through modeling, in its connections with computer science and mathematics.
  • 5. Notations
    General abbreviations
    A: Adenine
    bp: base pair
    C: Cytosine
    CpG: A dinucleotide CG, p standing for a phosphate link
    DNA: Deoxyribonucleic acid
    G: Guanine
    Gb: Giga base
    kb: kilo base
    Mb: Mega base
    Myr: Million years
    Ne: effective population size
    SNP: single nucleotide polymorphism
    T: Thymine
    TSS: Transcription Start Site
    Meiosis and recombination-related abbreviations
    CE: Central Element (referring to the SC)
    cM: centimorgan
    CO: Crossover
    COI: CO Interference
    COR: Crossover Rate
    dHJ: double Holliday Junction
    DSB: Double-Strand Break
    DSBh: Double-Strand Break hotspot
    DSBR: Double-Strand Break Repair model
    dsDNA: double-stranded DNA
    F1: First generation of offspring in a crossing experiment
    F2: Second generation of offspring in a crossing experiment
    F/M: Female/Male ratio
    HapMap1, 2, and 3: the 1st, 2nd, and 3rd phases of the HapMap Project, respectively
    HJ: Holliday Junction
    HR: Homologous Recombination
  • 6. HS: Heterogeneous Stock (for mouse populations)
    LD: Linkage Disequilibrium
    LE: Lateral Element (referring to the SC)
    NAHR: Nonallelic Homologous Recombination
    NCO: Non-crossover
    NCOR: Non-crossover Rate
    NE: Nuclear Envelope
    NHEJ: Nonhomologous End Joining
    PC: Pairing Centres
    rDNA: ribosomal DNA
    RI: Recombinant Inbred lines (in a crossing experiment)
    SC: Synaptonemal Complex
    SDSA: Synthesis-Dependent Strand-Annealing model
    SEI: Single End Invasion
    ssDNA: single-stranded DNA
    TF: Transverse Filaments (referring to the SC)
    Mathematical symbols
    C: coefficient of coincidence
    C3: three-point coefficient of coincidence
    D: the difference between the frequency of a two-locus haplotype and the product of the frequencies of its component alleles, divided by the most extreme value possible given the marginal allele frequencies; a measure of LD
    g: genetic distance
    I: identity matrix
    m: in the counting models, the number of NCO events that separate two consecutive COs; a measure of the strength of interference
    P: physical length (Mb) of an interval or chromosome
    p: the fraction of COs that are not subject to interference under the two-pathway model of Housworth and Stahl (2003)
    Q: the substitution matrix along a branch of a phylogenetic tree
    R: frequency of recombinants among the offspring
    r²: correlation of alleles at different loci; a measure of LD
    y: the mean number of DSB events in the counting model of Foss et al. (1993)
    #: Number
    χ²: Chi-square distribution
    Γ: Gamma distribution
    λ: the rate parameter for Γ
    ν: the shape parameter for Γ
  • 7. σ0: uniform basal tensile stress in the mechanical stress model of Kleckner et al. (2004)
    ⊗: tensor product of matrices
    BGC abbreviations
    BGC: Biased Gene Conversion
    BER: base excision repair
    gBGC: GC Biased Gene Conversion
    GC*: equilibrium or stationary GC-content
    MMR: mismatch repair
    Other abbreviations
    AIC: Akaike Information Criterion
    BIC: Bayesian Information Criterion
    CEPH: Centre d’Étude du Polymorphisme Humain
    CI: Confidence Interval
    DAF: Derived Allele Frequency
    DT: Distance to Telomeres
    HGMD: Human Gene Mutation Database
    H-W test: Hotelling-Williams t-test
    L: Likelihood
    LCR: Low Copy Repeat
    LDT: Log Distance to Telomeres
    LINE: Long Interspersed Nuclear Element
    LOD: Logarithm of Odds
    MHC: Major Histocompatibility Complex
    PAR: Pseudoautosomal Region
    PCR: Polymerase Chain Reaction
    RE: Repetitive Element
    SINE: Short Interspersed Nuclear Element
    TE: Transposable Element
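The two LD measures listed under "Mathematical symbols" (the normalized D and r²) can be illustrated with a minimal Python sketch. It assumes two bi-allelic loci and uses the standard textbook formulas; the function name `ld_measures` and its argument names are hypothetical and do not come from the thesis. The normalized value is returned with its sign, so complete repulsion gives -1.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D_normalized, r2) for two bi-allelic loci.

    p_ab : frequency of the haplotype carrying allele A at the first locus
           and allele B at the second locus
    p_a  : marginal frequency of allele A
    p_b  : marginal frequency of allele B
    (Hypothetical helper for illustration only.)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")

    # Raw disequilibrium: haplotype frequency minus the product of the
    # frequencies of its component alleles.
    d = p_ab - p_a * p_b

    # Most extreme value D can take given the marginal allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_norm = d / d_max if d_max > 0 else 0.0

    # Squared correlation of alleles at the two loci.
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_norm, r2


if __name__ == "__main__":
    # Complete association: the AB haplotype is as frequent as each allele.
    print(ld_measures(0.5, 0.5, 0.5))
    # Linkage equilibrium: the haplotype frequency equals the product.
    print(ld_measures(0.25, 0.5, 0.5))
```

Both measures are zero at linkage equilibrium; they differ in how they scale with allele frequencies, which is why the two are often reported together.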
  • 9. Definitions
    Information assembled from Stumpf and McVean (2003); Arnheim et al. (2007); Lynch (2007); Paigen and Petkov (2010); DB-NCBI.
    Allele: One of the variant forms of a DNA sequence at a particular locus, or location, on a chromosome.
    Backcross: Crossing experiment in which individuals in the first generation are crossed back with one or both of their parents to obtain the second generation of offspring.
    Bouquet formation: The clustering of telomeres together on the nuclear membrane early in meiosis.
    Centimorgan: Unit of genetic distance. Two markers are 1 centimorgan apart when they lie close enough to one another that 1% of the meiotic products exhibit a crossover between them (in a single generation).
    Chiasmata: A chiasma (plural chiasmata) is the cytologically visible physical connection between homologous chromatids during meiosis that corresponds to the sites of genetic crossing over.
    Chromatid: The product of chromosome replication in meiosis I. Chromatids are distinguished from chromosomes by the fact that the two daughter chromatids of one chromosome remain attached at their centromeres through the meiosis I cell division.
    Crossover (CO): Recombination product consisting of a reciprocal exchange of DNA sequences, usually between a pair of homologous chromosomes.
    Cytokinesis: The division of the cytoplasm between two daughter cells following nuclear division.
    Diploid: Having two gene copies at a genetic locus, as in virtually all animals and land plants.
    Double-strand break (DSB): Cleavage of both strands of a DNA molecule at a specific site.
  • 10. Effective population size (Ne): The size of an ideal population (identical individuals, random mating, no overlapping generations) that would experience the same amount of genetic drift as the real population, given its demographic and structural features. It determines the rate of change in the composition of a population caused by genetic drift.
    Bottleneck: A temporary marked reduction in population size.
    Equilibrium GC-content (GC*): A statistic summarizing the substitution matrix. It is the GC-content reached by a sequence under a constant substitution pattern: $GC^* = \frac{AT \to GC}{AT \to GC + GC \to AT}$, where AT→GC and GC→AT denote the respective substitution rates.
    F2 intercrosses: Crossing experiment in which the F2 mapping population is produced by intercrossing F1 individuals.
    Four-gamete test: If all four possible gametes are observed for two bi-allelic loci, this test infers that a recombination event must have occurred between them (under an infinite sites mutation model).
    Gene conversion: The process by which one participant in a recombination event is converted to the sequence of the partner participant. It occurs during almost all recombination events, but is not necessarily associated with a crossover.
    Genetic distance: Distance between DNA markers on a chromosome, measured as the amount of crossing over between them. A genetic map is an ordered list of markers along the chromosome and the inter-marker genetic distances.
    Genetic drift: The change in the frequency of a gene variant (allele) in a population due to random sampling.
    Genetic interference: The presence of a recombinational event in one region that affects the occurrence of recombinational events in adjacent regions. Positive interference, which is seen in eukaryotes, reduces the probability of using nearby hotspots in the same meiosis and causes a more even spacing of crossovers than would occur by chance.
    Genotyping: The process by which DNA is analyzed to determine which genetic variant (allele) is present for a certain marker.
    Haploid: Having a single gene copy at a genetic locus, as in all prokaryotes, germ cells, and some unicellular eukaryotes.
    Haplotype: A set of genetic markers that are present on a single chromosome and that show complete or nearly complete linkage disequilibrium; that is, they are inherited through generations without being changed by crossing over or other recombination mechanisms.
    Hardy-Weinberg equilibrium: Both allele and genotype frequencies in a randomly mating population remain constant across generations, unless specific disturbing influences are introduced.
  • 11. Holliday junction: The point at which the strands of the two dsDNA molecules exchange partners as an intermediate step in crossing over.
    Infinite sites mutation model: A model that assumes that there are an infinite number of nucleotide sites and consequently that each new mutation occurs at a different locus.
    Linkage disequilibrium (LD): The nonrandom association of alleles at two or more loci.
    Mutational load: The reduction in the mean fitness of a population that follows the accumulation of mutations.
    Non-crossover (NCO): Recombination product consisting of the swap of a small DNA segment.
    Panmixia: Random mating.
    Physical distance: Distance between DNA markers on a chromosome, measured in the number of nucleotide base pairs. A physical map is an ordered list of markers along a chromosome and the inter-marker physical distances.
    Polymerase chain reaction (PCR): A technique that amplifies a specific region of DNA as defined by two primer sequences. It is very useful because it generates many copies of one specific genetic material and thus requires only very small amounts of DNA as starting material. PCR is a three-stage process: the DNA is denatured (made single-stranded), the primers then bind, or anneal, to their complementary sequences, and finally the primers are extended by the addition of nucleotides complementary to those on the template sequences. This process is repeated multiple times. The end result is amplification of the sequence between and including the primer sequences.
    Positive selection: A process by which natural selection favors a single beneficial genotype over other genotypes and may drive this genotype to a high frequency in a population.
    Pseudoautosomal: A region on a sex chromosome that is homologous between the X and Y chromosomes. Successful meiosis in males requires a crossover in this region.
    Recombinant inbred (RI) lines: Crossing experiment in which inbred recombinant lines are obtained from an F1 generation resulting from a cross between parents homozygous at every locus.
    Recombination: Exchange of DNA sequence information within or between chromosomes.
    Recombination nodules: The early, visible manifestations of sites of chiasmata and crossovers. They are recognized by immunochemical staining, typically for the proteins of late recombination nodules.
    Single-nucleotide polymorphisms (SNPs): Single-nucleotide differences that distinguish the chromosomes of two individuals or strains. There are millions of SNPs in mammalian genomes, and they have become the preferred markers for genetic studies.
  • 12. Synaptonemal complex: A linear protein complex that forms the backbone of each chromatid during prophase I of meiosis and promotes genetic recombination. The DNA of the chromatid is attached to the complex in long loops. The name is derived from the word synapsis, which has been used to describe chromatid pairing.
    Three-point coefficient of coincidence (C3): The coefficient of coincidence calculated in a pair of adjacent intervals.
    Zinc finger: A protein loop in which cysteine or cysteine-histidine residues coordinate a zinc ion to form the base of the loop. Three of the amino acids in the loop cooperate to recognize three base pairs of DNA, and a tandem array of zinc fingers can show considerable DNA-binding specificity.
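Two of the definitions above, the four-gamete test and the equilibrium GC-content, lend themselves to a short worked example. The Python sketch below is illustrative only: the function names, the string encoding of haplotypes, and the example rates are assumptions made here, not taken from the thesis.

```python
def four_gamete_test(haplotypes, i, j):
    """Return True if all four gametes are observed at sites i and j.

    haplotypes : sequence of equal-length strings, one per sampled
                 chromosome, each character being the allele at a site
                 (sites are assumed bi-allelic, e.g. '0'/'1').
    Under an infinite sites mutation model, observing all four allele
    combinations implies at least one recombination event between the
    two sites.
    """
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def equilibrium_gc(at_to_gc, gc_to_at):
    """Return GC* = (AT->GC) / (AT->GC + GC->AT) from two substitution rates."""
    total = at_to_gc + gc_to_at
    if total <= 0:
        raise ValueError("at least one substitution rate must be positive")
    return at_to_gc / total


if __name__ == "__main__":
    sample = ["00", "01", "10", "11"]
    print(four_gamete_test(sample, 0, 1))      # all four gametes present
    print(four_gamete_test(sample[:3], 0, 1))  # only three gametes present
    # Hypothetical rates: GC->AT three times as frequent as AT->GC.
    print(equilibrium_gc(1.0, 3.0))
```

The four-gamete test only detects recombination; it gives a lower bound on the number of events and says nothing about their rate.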
  • 13. Contents
    Preamble
    Introduction
    I Molecular mechanisms of recombination
      I.1 Meiosis
        I.1.1 The phases of meiosis
        I.1.2 Pairing of homologs during prophase I
        I.1.3 Double strand break (DSB) dependent pairing and the Synaptonemal Complex (SC)
        I.1.4 Molecular mechanisms of recombination
        I.1.5 Postsynaptic phase
      I.2 Recombination
        I.2.1 Distribution of recombination events
          I.2.1.1 DSB distribution
          I.2.1.2 CO and NCO distribution
        I.2.2 Non-allelic homologous recombination (NAHR)
        I.2.3 Interference between recombination products
        I.2.4 Differences in recombination
          I.2.4.1 Differences among species
          I.2.4.2 Differences among sexes and age classes
          I.2.4.3 Differences among individuals of the same species
      I.3 Biased gene conversion (BGC)
        I.3.1 Meiotic drive
        I.3.2 The molecular mechanism of GC biased gene conversion
        I.3.3 Genomic evidence for gBGC
        I.3.4 Impact of gBGC on the genomic landscape: isochores
      I.4 Conclusion
    II Methods for studying recombination
      II.1 Detecting and measuring recombination
        II.1.1 Genetic markers
        II.1.2 Genetic maps
          II.1.2.1 Ordering markers
          II.1.2.2 Calculating genetic distances
          II.1.2.3 Sex-averaged and sex-specific genetic maps
  • 14. II.1.3 Linkage Disequilibrium
          II.1.3.1 Quantifying LD and recombination
          II.1.3.2 HapMap Project
          II.1.3.3 Potential biases in estimating recombination with LD
        II.1.4 Sperm-typing
        II.1.5 Gene conversion rates
      II.2 Modeling the distribution of recombination events
        II.2.1 Counting model
        II.2.2 Mechanical Stress Model
        II.2.3 Polymerization model
        II.2.4 Recombination and karyotype
      II.3 The impact of biased gene conversion on the nucleotide composition
        II.3.1 Equilibrium GC-content
          II.3.1.1 Estimating parameters
        II.3.2 Theoretical gBGC model
        II.3.3 Our model on the effect of gBGC on the frequency of deleterious mutations in human populations
      II.4 Conclusion
    III Karyotype and recombination pattern
      III.1 Introduction
      III.2 Methods and data
        III.2.1 Modeling the influence of karyotype on the recombination pattern
        III.2.2 Fitting models
          III.2.2.1 Linear and non-linear least squares
          III.2.2.2 Confidence interval
          III.2.2.3 Comparing and grouping species
        III.2.3 Data
          III.2.3.1 Sex-averaged maps
          III.2.3.2 Sex-specific vertebrate genetic maps
      III.3 Inter-species differences in CO number and distribution
        III.3.1 Estimates of the sex-averaged CO interference length and rate of COs
          III.3.1.1 Vertebrate parameter values
          III.3.1.2 Invertebrate parameter values
          III.3.1.3 Examining the interference parameter
          III.3.1.4 Resemblance among species
        III.3.2 Heterochiasmy in vertebrates
          III.3.2.1 Parameter values
          III.3.2.2 Comparing male and female
      III.4 Conclusion
    IV Sex-specific impact of recombination on the nucleotide composition
      IV.1 Introduction
      IV.2 Materials and Methods
      IV.3 Recombination, nucleotide composition, sex and chromosome localization
  • 15. IV.3.1 Sex-specific impact in vertebrates
        IV.3.2 Chromosome localization
        IV.3.3 Quantifying the impact on GC*
        IV.3.4 The particular case of the dog
        IV.3.5 Cause-effect implications
        IV.3.6 Reviewing the hypothesis of sex-specific impact
      IV.4 Discussing the methodology
        IV.4.1 Using TEs
        IV.4.2 Window length
      IV.5 Conclusion
    V Conclusions and Perspectives
    Bibliography
    Additional Material
      A Proteins involved in meiosis
      B Human recombination hotspots analyzed by sperm-typing
      C Correlations between distance to telomeres, GC*, and sex-specific COR
      D Opossum correlation windows smaller and larger than 20 Mb
  • 17. Pr´eambule D`es le passage `a l’agriculture et `a l’´elevage des animaux, les hommes ont entam´e les premi`eres exp´eriences g´en´etiques, en ´etudiant et manipulant la transmission des caract`eres `a la descendance. Mais ce n’est qu’au 19`eme si`ecle qu’un support g´en´etique a ´et´e identifi´e pour les caract`eres, support que Gregor Mendel a nomm´e “facteurs h´er´editaires”. Il a d´ecouvert que l’expression d’un caract`ere chez un individu est r´egul´ee par une paire de facteurs, un provenant du p`ere et le deuxi`eme de la m`ere. Une des lois ´enonc´ee par Mendel affirme que les diff´erents caract`eres sont h´erit´es ind´ependamment les uns des autres. Des exp´eriences ult´erieures ont remis en question la disjonction ind´ependante des caract`eres. Ces patrons h´er´editaires inhabituels, quand certains caract`eres s´egr`egent ensemble plus souvent qu’attendu, a donn´e la d´efinition de la liaison g´en´etique. Thomas Morgan a associ´e la liaison entre les facteurs `a leur appartenance `a un mˆeme chromosome et a rapport´e la force de cette liaison `a la distance qui s´epare les facteurs. Toutefois, certains facteurs montrent des niveaux diff´erents de liaison : entre s´egr´egation ind´ependante et liaison compl`ete. Morgan a sugg´er´e que la liaison entre des facteurs appartenant `a un mˆeme chromosome peut ˆetre bris´ee par la recombinaison lors de la m´eiose, `a travers les chiasmata. Les chiasmata sont les sites visibles de l’´echange de mat´eriel g´en´etique entre les chromosomes des deux parents, nomm´e crossover (CO). En brisant la liaison g´en´etique entre les g`enes, les COs remanient le mat´eriel g´en´etique et g´en`erent, ainsi, des nouvelles combinaisons entre les diff´erents variants des g`enes. La recombinaison fait donc partie des quatre forces fondamentales qui influencent l’´evolution des esp`eces. 
Selection explains the adaptation of species over generations through the spread of traits that favor survival and reproduction. Mutation is the main source of the variation on which selection acts. Recombination assorts genetic variation and is an important source of innovation. Genetic drift causes allele frequencies in a population to deviate, independently of the other evolutionary forces.
“Evolution is a population genetic process governed by four fundamental forces, which jointly dictate the relative abilities of genotypic variants to expand throughout a species. Darwin articulated a clear but informal description of one of those forces, selection (including natural and sexual selection), whose central role in the evolution of complex phenotypic traits is universally accepted, and for which an elaborate formal theory in terms of change in genotypic frequencies now exists. The remaining three evolutionary forces, however, are non-adaptive in the sense that they are not a function of the fitness properties of individuals: mutation (broadly including insertions, deletions, and duplications) is the fundamental source of variation on which natural selection acts; recombination (including crossing-over and gene conversion) assorts variation within and among chromosomes; and random genetic drift ensures that gene frequencies will deviate a bit from generation to generation independently of other forces. Given the century of theoretical and empirical work devoted to the study of evolution, the only logical conclusion is that these four broad classes of mechanisms are, in fact, the only fundamental forces of evolution. Their relative intensity, directionality, and variation over time define the way in which evolution proceeds in a particular context.” (Lynch, 2007)
  • 18. Objectives and outline of the thesis
This thesis aims to analyze, in an evolutionary context, the mechanisms of recombination and their impact on genomes. It addresses the problem of quantifying recombination-related differences between species, from the molecular basis of the phenomenon to the more general estimation of a model. The impact of recombination on the pattern of nucleotide substitutions and on allele frequencies in the population is also studied. The thesis is structured in two main parts. The first part, composed of chapters I and II, reviews the existing techniques and approaches for the study of recombination and biased gene conversion. The second part, chapters III, IV, and V, presents new approaches aimed at improving our understanding of the evolutionary mechanisms of recombination and of genomic structures.
Part one
In the first chapter, section I.1, we present the meiotic molecular mechanisms underlying recombination. During meiosis, homologous chromosomes pair along their length. This pairing is independent of recombination in some species such as C. elegans or D. melanogaster (Gerton and Hawley, 2005; Zickler, 2006). 
However, for the majority of species, complete pairing of homologs requires the formation of double-strand breaks (reviewed in Joyce and McKim (2007)). Double-strand breaks have been identified as the precursors of recombination events (Szostak et al., 1983). Several models have been proposed to explain the transition from double-strand breaks to their repair as crossovers (COs) or non-crossovers (NCOs) (Szostak et al., 1983; Allers and Lichten, 2001b; Constantinou et al., 2002; Wu and Hickson, 2003). Section I.2 provides an overview of the genomic factors controlling the production of recombination events, and more particularly of COs. The emergence of high-resolution techniques for the study of recombination has made it possible to analyze the distribution of these events along chromosomes in a few model species, such as human and yeast. We now know that recombination takes place in restricted regions (a few kb) of the genome called recombination hotspots (Jeffreys et al., 2004; Myers et al., 2005). Moreover, recombination is not randomly distributed along chromosomes, because of interference both between double-strand breaks (Anderson et al., 2001; de Boer et al., 2006) and between recombination events (Bishop and Zickler, 2004). At the level of chromosomal regions, recombination is located mainly near telomeres and is reduced in genes and next to the centromere (Myers et al., 2005; Mancera et al., 2008; Paigen et al., 2008).
  • 19. Recently, the Prdm9 gene, a major determinant of recombination hotspots, was identified in human and mouse (Myers et al., 2009; Baudat et al., 2009). In human, the zinc-finger protein encoded by this gene binds to a degenerate 13-nucleotide motif that is specific to 40% of the recombination hotspots in this species (Myers et al., 2008). While major advances have been made in our understanding of the mechanisms of recombination, these studies in a few model species have revealed important differences in this process, not only between species, but also between sexes and between individuals of the same population. Thus, the karyotype (number and length of chromosomes), as well as the demographic history and the evolution of recombination-related proteins, appear to be important factors in explaining the differences between species (section I.2.4.1). The difference in recombination between sexes, termed heterochiasmy, affects not only the number of COs but also their distribution along chromosomes (Shifman et al., 2006; Broman et al., 1998; Kong et al., 2002; Paigen et al., 2008; Wong et al., 2010) (table I.3). Inter-individual variability, for its part, appears to be closely linked to the alleles of the Prdm9 gene carried by the individuals (Cheung et al., 2007; Baudat et al., 2009; Berg et al., 2010). The experiments carried out since Morgan's work reveal a dual role of recombination. First, recombination plays an essential role in the progression of meiosis, by ensuring the proper segregation of homologs, which explains why it is tightly regulated. 
Second, recombination is a rapidly evolving process, leading to strong differences even within a single population. Another evolutionary role of recombination is to shape the genomic landscape at the nucleotide level. In section I.3, we present how recombination influences the production and evolution of isochores (long regions of the genome characterized by a homogeneous GC-content) through biased gene conversion (for a review, see Duret and Galtier (2009)). The particular features of isochores, as well as their association with other genomic characteristics, are presented in this chapter. All these advances in the study of recombination have been made possible by major technological progress. These technological breakthroughs have facilitated the acquisition of large amounts of high-resolution data. In chapter II, section II.1, we describe the main techniques for measuring recombination: genetic maps, linkage disequilibrium maps, and sperm-typing analysis. Genetic maps are the oldest tool for the study of recombination, having first been introduced by Sturtevant (1913). They are based on tracking the transmission of genetic markers within families. Genetic maps are, for the moment, the only means of quantifying COs at the genome scale in both sexes (Lynn et al., 2004; Cheung et al., 2007). However, they depend on the size of the family studied, as well as on the number of representative meioses, which often results in low-resolution maps, particularly in eukaryotes (Arnheim et al., 2003). The study of linkage disequilibrium within a population has enabled the study of historical recombination events (Lewontin and Kojima, 1960). 
Despite the limitations of this technique for the study of heterochiasmy, the large number of individuals studied ensures a high resolution of COs at the genome scale (Myers et al., 2005).
  • 20. Linkage disequilibrium serves as a guide for the local identification of potential recombination hotspots, which can then be studied at very high resolution in the male germline by sperm-typing (Li et al., 1988). The acquisition of recombination data is only the first step in the study of these mechanisms. In section II.2 we describe some of the main models for the study of the distribution of COs. These models focus mainly on modeling the distances between COs. The first model, the counting model, assumes that two COs are separated by a certain number of NCOs (Foss et al., 1993). The distance between two COs therefore follows a Γ (gamma) distribution whose parameter, estimated from the genetic length of chromosomes, describes the strength of interference. The mechanical stress model, for its part, models the occurrence of COs by taking into account the physical phenomena that generate tension along chromosomes during meiosis (Kleckner et al., 2004). The last section of this chapter, II.3, presents models developed for inferring substitution patterns under the influence of recombination through biased gene conversion. Arndt et al. (2003) used maximum likelihood methods to infer the substitution pattern in a species, based either on a triple alignment between closely related species or on the alignment between the present-day sequence and its ancestral counterpart. Using these substitution values in the neutral sequences of the human genome, Duret and Arndt (2008) proposed a model quantifying the effect of biased gene conversion on this pattern. 
At the end of this chapter, in section II.3.3, we present our simulation results on the effect of biased gene conversion on the frequency of deleterious alleles in the human population (Necşulea et al., 2011). We show that biased gene conversion can counteract selection and lead to the maintenance of deleterious mutations at high frequencies in populations.
Part two
As mentioned previously, recombination is a highly dynamic process, leading to multiple differences between species, sexes, and individuals. In order to characterize inter-species differences in recombination, we developed a new model based on genetic maps, detailed in chapter III. This model relates the total genetic length of chromosomes (representing the total number of COs) to their physical length. Important biological notions about the recombination process are taken into account in the construction of this model: the requirement of one obligatory CO per pair of homologs to ensure their proper segregation, and the strength of interference between COs. Fitting this model to the data yields estimates of two recombination parameters: the rate of production of supplementary COs per Mb and the mean strength of interference per species, defined as the physical distance between consecutive COs. Since the model involves an analysis at the global level of the karyotype, it can be fitted even to low-resolution genetic maps, thus allowing the study of many species. In chapter III, we show that this model fits the 24 vertebrates and invertebrates analyzed well, even in cases that cannot be explained by a simple linear model.
  • 21. Inter-CO distances have been studied in only a few species. Our estimates of the interference parameters in these species agree with the values obtained by those studies, showing the great potential of our model for studying interference. The estimates obtained for new species provide original data on the distribution of COs. In addition, we used the predicted values of the CO rate per Mb to compare species with one another and determine which ones resemble each other. Species with similar parameters may also share similar recombination processes and protein complexes. In order to study heterochiasmy, we fitted the model to the male and female genetic maps of 6 vertebrates. As expected, the sex with the smaller inter-CO distance also has more COs, which, moreover, are more uniformly distributed. For 4 of the 6 vertebrates, these trends also result in a higher rate of CO production per Mb. In contrast, for the opossum, both parameters are higher in the female than in the male. Does this result from the low resolution of the genetic map for this species, or does it reflect a distinct behavior in the female? This deserves further analysis. Our results, together with new data (Elferink et al., 2010), call into question the lack of heterochiasmy previously accepted in chicken. The analysis of the causes of heterochiasmy motivated our second study, presented in chapter IV. The presence and direction of heterochiasmy vary between species. Moreover, we know that recombination has an important impact on nucleotide sequences through biased gene conversion. 
Until now, studies in human have shown that male recombination is the main factor in the evolution of GC-content (Webster et al., 2005; Duret and Arndt, 2008). In chapter IV, we analyze the impact of sex on the GC/recombination relationship in 5 vertebrates. Our results show that the stronger effect of the male does not hold for all species. Even in human, this effect is mainly driven by regions close to telomeres, which mainly contain male recombination hotspots. These results show an important impact of high recombination rates on nucleotide composition, independently of sex. The difference between sexes in the location and intensity of recombination hotspots is the important factor behind the differential impact of sex on the GC/recombination relationship according to chromosomal location. In addition, we studied the impact of the substitution pattern on the evolution of GC-content. Over short time scales, such as the human-chimpanzee divergence, the current GC-content of sequences is very different from the GC-content expected at equilibrium (Meunier and Duret, 2004; Duret and Arndt, 2008). In chapter IV, we show that over longer time scales, the GC-content of neutral regions, which are subject to mutation, biased gene conversion, and genetic drift, is close to equilibrium. We propose a hypothesis for these apparently contradictory results. First, recombination hotspots are highly dynamic, as indicated by the lack of conservation between closely related species such as human and chimpanzee (Ptak et al., 2005; Winckler et al., 2005). Second, some chromosomal regions, such as those close to telomeres, maintain a high density of recombination hotspots in a majority of species.
  • 22. These observations indicate that, even if GC-content fluctuates under the pressure of mutational biases and of gene conversion, in the long term the two biases attenuate each other. The results presented in this thesis bring new elements for understanding the reciprocal influence between karyotype, sex, recombination, and nucleotide composition. However, as indicated by the title of the section from which the preceding quotation is taken: “Nothing in evolution makes sense except in the light of population genetics” (Lynch, 2007). In accordance with this principle, the work presented in this thesis is part of a larger project that aims to integrate the new information on the recombination process into a model describing its evolutionary impact in populations.
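As a rough illustration of what GC-content "at equilibrium" means in the summary above, the following Python sketch uses a deliberately simplified two-state model in which AT sites become GC at rate u and GC sites become AT at rate v. This is an assumption made for illustration only: it is not the maximum-likelihood substitution model of Arndt et al. (2003) referred to in this thesis, and the function names and rate values are hypothetical. Under this simplified model the equilibrium GC-content is u/(u+v), and the current GC-content approaches it exponentially, which is one way to see why sequences can still be far from equilibrium over short time scales.

```python
import math


def equilibrium_gc(u: float, v: float) -> float:
    """Equilibrium GC-content of a two-state model (illustrative assumption).

    u: AT->GC substitution rate, v: GC->AT substitution rate (same time unit).
    """
    if u < 0 or v < 0 or u + v == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return u / (u + v)


def gc_after(gc0: float, u: float, v: float, t: float) -> float:
    """GC-content after time t, starting from gc0.

    Solves d(GC)/dt = u * (1 - GC) - v * GC, whose solution relaxes
    exponentially towards the equilibrium value at rate (u + v).
    """
    gc_star = equilibrium_gc(u, v)
    return gc_star + (gc0 - gc_star) * math.exp(-(u + v) * t)


if __name__ == "__main__":
    # Hypothetical rates, chosen only to show the behaviour of the model.
    u, v = 1.0, 3.0
    for t in (0.0, 0.1, 1.0, 10.0):
        print(t, round(gc_after(0.5, u, v, t), 4))
```

In this sketch, a change in u or v (for instance after a shift in hotspot location) moves the equilibrium value immediately, while the observed GC-content follows only slowly.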
  • 23. Introduction
When humankind first started practicing agriculture and animal breeding, it also initiated the first genetic experiments, by studying and influencing the transmission of traits to offspring. It was not until the 1800s that traits were found to have a discrete material basis, which Gregor Mendel called “factors”. Mendel discovered that the factors for a trait come in pairs, one from the father and one from the mother. One of the laws stated by Mendel is that different factors are passed on to the offspring independently of one another. Subsequent experiments revealed important deviations from this law of independent assortment. The notion of linkage arose when unusual patterns of inheritance were observed between certain factors, with certain traits segregating together more often than expected. Thomas Morgan attributed the linkage between factors to their location on the same chromosome, and related the strength of this linkage to the distance separating them. However, despite being located on the same chromosome and separated by short physical distances, certain factors showed incomplete linkage. Morgan suggested that breaks in the linkage between factors on the same chromosome were the consequence of recombination, through chiasmata observed during meiosis. Chiasmata are the visible sites of the exchange of genetic material between the chromosomes from the two parents, an exchange also termed crossover. By breaking the linkage between genes, crossovers reshuffle the genetic material and thus create new combinations of gene variants. Hence, recombination creates variation and represents a powerful source of innovation.
In chapter I, section I.1, we offer a detailed description of the molecular mechanisms underlying recombination. Section I.2 presents our current understanding of the genomic features generating and controlling recombination, and particularly crossovers. 
High-resolution studies in a few organisms, such as human and yeast, have provided valuable information on the distribution of recombination events along chromosomes. However, important differences have been observed in their localization and frequency, not only between species, but also between sexes and individuals of the same species. The results obtained since the work of Morgan describe a dual role of recombination. First, recombination plays an essential role in the progress of meiosis, and thus it is highly regulated. Second, recombination is perceived as a highly dynamic process. Another important evolutionary role of recombination consists in influencing genomic sequences at the nucleotide level. In section I.3, we describe how recombination can generate isochore structures (long regions of relatively homogeneous GC-content) through the mechanism of biased gene conversion. The characteristics of these structures, as well as their correlation with other genomic features, are also provided in this section.
  • 24. All these advances have been made possible by major technological progress. These technological breakthroughs have facilitated the acquisition of large amounts of high-quality data. In chapter II, section II.1, we present the main genetic methods used to study recombination: linkage and LD maps, as well as sperm-typing. As the data on recombination increased considerably, a new need emerged: models to describe them. Section II.2 describes some of the main models of the distribution of crossovers. These models mainly focus on the distance separating two consecutive crossovers, as these events are not distributed randomly but interfere with each other. The last section of this chapter, II.3, deals with models for analyzing and quantifying the impact of recombination on nucleotide changes, through its influence on the substitution pattern. Also in this section (II.3.3), we present our analysis of the influence of biased gene conversion on the frequencies of alleles in a population. Notably, we focus on modeling the role played by biased gene conversion in the maintenance of deleterious alleles in a population.
As previously mentioned, recombination is a highly dynamic process, as multiple differences can be observed between species and sexes. However, these differences could be analyzed in only a few species, and the models describing the distribution of crossovers characterize only a subset of these species. In order to understand the evolutionary role of recombination, it is important to describe its mechanism at a much larger scale. In chapter III, we make use of the availability of low-resolution genetic maps in a wide variety of species to model the distribution of crossovers. The model we propose takes into account the constraint of one obligatory crossover per pair of homologs in order to ensure their correct segregation. 
This model is characterized by two parameters, which represent the rate at which supplementary crossovers are produced and the strength of interference. The estimation of these parameters in 24 vertebrates and invertebrates yields important information on their role in creating differences among species and sexes.
The differences in recombination between sexes (heterochiasmy) have been found to account for a differential impact of sex on the nucleotide composition. Thus, in human, male rather than female recombination seems to correlate better with the GC-content of sequences. In chapter IV, we investigate this relation in four additional vertebrates. We compare the differential impact of heterochiasmy with the localization along the chromosomes. This analysis allows us to understand the role played by sex in the relation between recombination and nucleotide composition, under biased gene conversion.
A summary of all our results is provided in chapter V. In view of the results presented in this thesis, we further discuss the future leads they offer for improving our understanding of the evolution of recombination and its impact on the nucleotide landscape of genomes.
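To make the notion of interference mentioned above more concrete, the Python sketch below draws distances between consecutive crossovers from a gamma distribution, the description associated with the counting model (Foss et al., 1993). It is a minimal illustration under stated assumptions, not the model developed in chapter III: the function names, the mean distance of 50 (arbitrary units), and the shape values are hypothetical. A shape of 1 gives exponentially distributed distances (no interference); larger shapes give more evenly spaced crossovers, which shows up as a smaller coefficient of variation.

```python
import random
import statistics
from typing import List


def simulate_inter_co_distances(shape: float, mean_distance: float,
                                n: int, seed: int = 0) -> List[float]:
    """Draw n inter-crossover distances from a gamma distribution.

    shape: interference strength in this sketch (1 = no interference).
    mean_distance: mean distance between consecutive crossovers
                   (arbitrary units, hypothetical value).
    """
    rng = random.Random(seed)
    scale = mean_distance / shape  # keeps the mean fixed when shape changes
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def coefficient_of_variation(values: List[float]) -> float:
    """Standard deviation divided by the mean; lower means more even spacing."""
    return statistics.pstdev(values) / statistics.mean(values)


if __name__ == "__main__":
    for shape in (1.0, 4.0, 16.0):
        distances = simulate_inter_co_distances(shape, 50.0, 20000)
        print(shape, round(coefficient_of_variation(distances), 3))
```

For a gamma distribution the coefficient of variation is 1/sqrt(shape), so the simulated values should decrease as the shape increases while the mean distance stays the same.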
  • 25. Chapter I Molecular mechanisms of recombination
This chapter provides the necessary basis to understand the molecular mechanisms of recombination and its impact on the genome. The first section briefly describes the phases of meiosis, with a detailed presentation of the recombination mechanism in the second section. The third section characterizes the process of biased gene conversion and its implications for the isochore structures.
I.1 Meiosis
Most sexually-reproducing species have diploid cells, i.e. they have two copies of each chromosome, one from each parent. When, in turn, such an individual reproduces, it transmits only half of its genetic material to the offspring, through specialized cells termed gametes. An essential step in the sexual reproduction of species is the generation of haploid gametes from diploid cells, which prevents the doubling of the genetic material with each generation. The reduction in ploidy is achieved through a special type of cellular division, called meiosis. The specialized diploid cells in ovaries and testes (germinal cells) contain two copies of each chromosome (paternal and maternal), also known as homologs. Meiosis is preceded by the replication of DNA in germinal cells, which duplicates the chromosomes. At the end of this phase each chromosome consists of two sister chromatids linked at the level of the centromere. Two cell divisions follow, which halve the number of chromosomes in the gametes, thus resulting in four haploid cells.
I.1.1 The phases of meiosis
The first meiotic division is particularly long, representing more than 90% of the total time of meiosis. It is also known as reductional division since it produces two haploid cells. The passage from a diploid number of chromosomes to a haploid stage is done in four phases: prophase, metaphase, anaphase and telophase, as shown in figure I.1. 
Two important events, specific to meiosis, take place during prophase I: the pairing of homologous chromosomes and recombination. In turn, prophase I is divided into five phases: leptotene, zygotene, pachytene, diplotene, and diakinesis.
  • 26. Of the wide range of proteins acting during prophase I, some are mentioned hereafter, and a detailed description can be found in the additional table A. At metaphase I, the paired chromosomes become attached to the meiotic spindle and line up. The chromosomes are maximally condensed and the chiasmata (the points of contact between homologs) are visible. The resolution of the chiasmata takes place during anaphase I, when the two replicated homologs (each still consisting of two sister chromatids) separate and are pulled to opposite poles. The chromosomes reach the poles of the meiotic spindle during telophase I and the cell divides, resulting in two daughter cells, each inheriting either the maternal or the paternal homolog of each pair. Each daughter cell thus contains half the number of chromosomes, each of which consists of a pair of sister chromatids closely attached at the level of the centromere. The actual formation of gametes takes place during the second meiotic division, also known as equational division. The transition between the two meiotic divisions takes place rapidly, during a short interphase period, with no DNA replication. The nuclear envelope (NE) of each daughter cell breaks down in prophase II, and a new meiotic spindle forms. In metaphase II, single condensed chromosomes, as opposed to homologous pairs of chromosomes in metaphase I, line up on the spindle. The two sister chromatids making up each chromosome are separated at the centromere during anaphase II. They segregate to opposite poles of the cell, thus generating two haploid nuclei in which each chromosome consists of a single chromatid. At telophase II, the nuclear envelope of each of the four cells is formed, producing four gametes, each with a haploid set of chromosomes. 
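The bookkeeping of cells, chromosomes and chromatids through the two divisions described above can be summarized in a short Python sketch. It is only an illustrative tally under the textbook description given in this section; the class and function names are hypothetical, and the human diploid number (46) is used as an example input.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    cells: int
    chromosomes_per_cell: int
    chromatids_per_chromosome: int

    @property
    def chromatids_per_cell(self) -> int:
        return self.chromosomes_per_cell * self.chromatids_per_chromosome


def meiosis_stages(diploid_number: int) -> List[Stage]:
    """Tally one germinal cell through DNA replication and both divisions."""
    if diploid_number <= 0 or diploid_number % 2 != 0:
        raise ValueError("diploid number must be a positive even integer")
    n = diploid_number // 2  # haploid number
    return [
        Stage("before replication", 1, 2 * n, 1),
        Stage("after replication", 1, 2 * n, 2),  # sister chromatids joined
        Stage("after meiosis I", 2, n, 2),        # homologs separated
        Stage("after meiosis II", 4, n, 1),       # sister chromatids separated
    ]


if __name__ == "__main__":
    for stage in meiosis_stages(46):
        print(stage.name, stage.cells, stage.chromosomes_per_cell,
              stage.chromatids_per_cell)
```

The tally makes the distinction between the two divisions explicit: meiosis I halves the number of chromosomes per cell, whereas meiosis II halves the number of chromatids per chromosome.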
I.1.2 Pairing of homologs during prophase I
The pairing of chromosomes starts at leptotene, as the homologs overcome spatial separation, going from complete dissociation to co-alignment. In order to achieve this long-range alignment, homologous chromosomes must find and recognize each other. In a few organisms, physical contact between homologs may be established prior to meiosis (reviewed in McKee (2004); Zickler (2006)). This phenomenon is encountered in Dipterans, and it is especially necessary for the initiation of meiotic association in Drosophila males, which lack both recombination and a synaptonemal complex (SC) (Vazquez et al., 2002). The spatial association of homologs has also been reported in somatic cells during mitotic interphase, when chromosomes occupy distinct territories according to their length and gene density, but this association is infrequent and seems to occur randomly (Cremer and Cremer, 2001; Mora et al., 2006). However, premeiotic interactions are far from being a universal feature. Even when this type of interaction is present, it is difficult to assess its influence on the pairing of chromosomes during meiosis. Prior to pairing, chromosome ends are linked to the cytoskeleton network through the inner and outer nuclear membrane complex proteins SUN/KASH (Tzur et al., 2006). Figure I.2 depicts the attachment between the microtubules in the cytoplasm and the chromosomes inside the nucleus through specific nuclear envelope (NE) proteins. These NE bridges allow cytoplasmic forces to induce chromosome movement inside the nucleus (Penkner et al., 2009). The motion of chromosomes is thought to help the pairing of homologs by creating opportunities for encounter, but most importantly by disrupting nonhomologous associations (reviewed in Koszul and Kleckner (2009)).
  • 27. [Figure I.1 labels: Meiosis I (prophase I, metaphase I, anaphase I, telophase I and cytokinesis); Meiosis II (prophase II, metaphase II, anaphase II, telophase II and cytokinesis); two homologous chromosomes; spindle; centrosomes]
Figure I.1: The representation of the two meiotic divisions: Meiosis I and II. Each meiotic division is further classified into four phases: prophase, metaphase, anaphase and telophase. At the end of meiosis, four gametes are produced, each with a haploid set of chromosomes. Prophase I is characterized by the exchange of genetic material between homologous chromosomes, also known as crossovers (CO). During metaphase, chromosomes attach to the spindle formed between centrosomes. The genetic material, homologs for Meiosis I and sister chromatids for Meiosis II, segregates to opposite poles during anaphase. Telophase results in the reconstruction of the NE around the segregated homologous chromosomes or sister chromatids, for the first and second meiotic divisions respectively. Cytokinesis results in the division of the cytoplasm in order to form two daughter cells.
At late leptotene the chromosomes migrate into a specific meiotic organization called the “bouquet” arrangement (Zickler and Kleckner, 1998; Scherthan, 2001). The “chromosomal bouquet” is a conserved feature of eukaryotes, characterized by the telomeres being anchored to the NE and the chromosomes being clustered within a delimited volume of the nucleus (Zickler, 2006). Although the role of the “bouquet” configuration in the pairing between homologs is not well defined, it has been suggested that the clustering of chromosomes in a limited area, as well as their rapid movement in and out of the “bouquet”, are essential for the resolution of chromosome entanglements as well as the prevention of nonhomologous contacts (Zickler, 2006). Chromosome dynamics during meiosis is indeed essential for pairing, but the question remains as to how homologs recognize each other at very long distances. 
Recombinases (such as Rad51) are known to facilitate homology recognition, but at a local scale, when the interacting molecules are already aligned (Rao et al., 1995; Barzel and Kupiec, 2008).
  • 28. Figure I.2: Attachment to the nuclear envelope promotes chromosome movements and homologous attachments. (a) A SUN/KASH domain complex bridges the nuclear envelope (NE), connecting with dynein on the cytoplasmic face. Chromosomes attach via telomeres or specialized pairing center sequences to the NE complex. In C. elegans, SUN-1 phosphorylation (black circle on SUN protein with attached chromosome) early in zygotene is required for subsequent events. (b) The SUN/KASH/chromosomal foci cluster together and mature into large patches, as additional phosphorylation of SUN-1 is observed. In patches, dynein-mediated forces stress chromosomes, leading to detachment of non-homologous attachments and synapsis of homologous ones. (c) As homologs fully synapse and execute DSB repair, SUN-1 phosphorylation status again changes, leading to dispersal of chromosomes into pachytene morphology. From Yanowitz (2010).
Moreover, in many organisms homologous pairing is independent of recombination (Gerton and Hawley, 2005; Zickler, 2006). Multiple recombination-independent mechanisms for homology search have been proposed. The clustering of chromosomes could be attained through specific cis-acting pairing centres (PC). In C. elegans, homologue-recognition regions (HRR) have been found on each chromosome, and are essential to the local stabilization of pairing and the initiation of SC polymerization (McKim et al., 1988; MacQueen et al., 2005). Highly transcribed ribosomal DNA (rDNA) regions in D. melanogaster also act as PC between the X and Y chromosomes (McKee, 1996). The pericentric heterochromatic regions could also act as pairing sites between chromosomes in S. cerevisiae and D. melanogaster (Kemp et al., 2004; Dernburg et al., 1996). 
Another mechanism for long-range pairing is based
on the observation that during meiosis, chromosomes pair only when transcriptionally active (Cook, 1997). DNA regions under active transcription form loops attached to specialized transcription factories. Multiple homologous loops may share the same transcription factory, allowing a transient binding between DNA sequences and the subsequent pairing of homologs (Xu and Cook, 2008). Although considered less probable, a model of direct DNA-DNA contacts has also been proposed, based on long-range attractive interactions between double-stranded DNA molecules (Danilowicz et al., 2009). These interactions result from the spatial modulation of charge distribution in DNA helices (Kornyshev and Wynveen, 2009), even in protein-free conditions.

I.1.3 Double strand break (DSB) dependent pairing and the Synaptonemal Complex (SC)

For the majority of species, full homologous pairing seems to be intimately linked to the initiation of recombination via double-strand breaks (DSB) (reviewed in Joyce and McKim (2007)). However, knock-out mutants for the proteins responsible for DSB formation in Caenorhabditis elegans and in Drosophila females can still build a synaptonemal complex (SC) structure and establish inter-homolog synapsis (Dernburg et al., 1998; McKim et al., 1998). The SC is a well-conserved tripartite proteinaceous structure that consists of two lateral elements (LE) and a central element (CE), connected together by transverse filaments (TF), with the two homologous chromatids arranged in loops around the corresponding LE (Schmekel and Daneholt, 1995) (figure I.3). The chromosome axes begin to assemble as short fragments at leptotene, as a result of the incorporation of cohesins (e.g. Rec8) and axial proteins, such as SCP2 and SCP3 in mammals (Eijpe et al., 2003). These fragments then fuse to form full-length LEs as part of the SC (Schalk et al., 1998).
Also at leptotene, DSBs are induced on the chromatin loops through the action of the evolutionarily conserved endonuclease Spo11 (Keeney et al., 1997; Blat et al., 2002; Keeney and Neale, 2006). The Mre11 protein complex then removes Spo11 from the DNA ends and goes on to degrade the DSB ends in the 5’ to 3’ direction (Borde and Cobb, 2009). Although DSBs occur on chromatin loops, it has been proposed that the sequence containing the DSB and the chromosome axis become spatially associated via DNA/protein recombination complexes (Blat et al., 2002) (figure I.4). The sites of DSBs have been observed to form local bridges of 400 nanometers (nm) between the homologous chromosome axes (Tessé et al., 2003). The exact mechanism of DSB-mediated alignment is not fully understood; nevertheless, a complex of proteins involved in the assembly of the inter-axis bridges has been identified (Storlazzi et al., 2010). Strand exchange proteins, such as Rad51 and Dmc1, form nucleoprotein filaments by binding the resulting single-stranded DNA (ssDNA) and catalyze homologous strand invasion (Shinohara et al., 1992; Kagawa and Kurumizaka, 2010). At zygotene, a small subset of the DSB bridges, namely those that have matured into axial associations and will later form crossovers (CO), also become sites where the SC develops (Page and Hawley, 2004). An overview of the recombination and SC processes is shown in figure I.5. Concurrently with the initiation of the CE of the SC, the 3’ ssDNA invades the homologous double-stranded DNA (dsDNA) through a process called single-end invasion (SEI) (Hunter and Kleckner, 2001). Homology is recognized between the two sequences through sequential cycles of binding, sampling and release of the dsDNA (Neale and Keeney, 2006). The proteins responsible for CE nucleation (the SCP1 protein in mammals and the ZMM proteins in yeast) polymerize along the homologs, leading to the full assembly of the SC at mid-pachytene (Meuwissen et al., 1997; Zickler, 2006). The stable connection between homologs via the SC is called synapsis (Zickler and Kleckner, 1998).

Figure I.3: Model of the synaptonemal complex structure. The lateral element (LE) comprises cohesins (Rec8/C(2)M/SYN1, STAG3/Rec11, SMC1-β and SMC3) (blue ovals), the structural proteins SCP2 and SCP3 and the HORMA-domain proteins Hop1/HIM3/Asy1 (all other LE proteins - green ovals). The transverse filaments (TF) are formed by the proteins Zip1/SCP1/C(3)G/SYP1 (shown also at the bottom). Adapted from Castro and Lorca (2005), originally adapted from Page and Hawley (2004).

I.1.4 Molecular mechanisms of recombination

Spontaneous DSBs arise frequently and, without a correct repair mechanism, would be highly deleterious, leading to chromosome mis-segregation, rearrangements or apoptosis. The repair of a DNA break can proceed either by non-homologous end joining (NHEJ) or by homologous recombination (HR), which uses a DNA template (Haber, 2000). NHEJ is widely used in mammalian mitotic cells and consists of directly ligating the broken ends of the DNA (Weterings and van Gent, 2004). The process requires little or no homology and is very error-prone (Lieber et al., 2003). The repair of DSBs generated during meiosis exhibits low levels of NHEJ in mammals and is mainly performed through HR (Goedecke et al., 1999; Haber, 2000).
HR uses a template DNA sequence, which can be the sister chromatid, the homologous chromosome or an ectopic sequence, in order to rebuild the missing DNA. The use of the homologous chromosome as a template is preferred during meiosis, as it is essential for the accurate segregation of homologs at the end of meiosis I (Schwacha and Kleckner, 1997). Hereafter, HR will refer to the recombination process that takes place between homologous chromatids during meiosis.

Figure I.4: Possible architecture of the DNA/protein recombination complexes mediating homolog pairing. (I) One DSB end (lower red arrowhead) interacts with a homologous chromatin loop, thereby initiating the assembly of a protein complex containing at least four post-DSB recombination proteins (for details of the proteins see Storlazzi et al. (2010)). The other DSB end (upper red arrowhead) associates with the axis of the DSB “donor” chromosome. (II) The complex formed with the partner chromosome in (I) becomes axis-associated, thereby bringing donor and recipient chromosomes into closer proximity, with asymmetric evolution of the recipient chromosome complex. (III) The chromosome axes are separated by a distance of 400 nm. From Storlazzi et al. (2010).

The repair through HR yields two types of final products: crossovers (CO) and non-crossovers (NCO). While a CO involves a large exchange of genetic material between the homologous chromosomes, an NCO is a highly local event that results in the swap of only a small DNA segment.

The mechanisms leading to these two recombination products are not yet fully understood, but all the models for HR are based on the formation of a single-end invasion (SEI) intermediate. One of the first models to account for the production of both COs and NCOs, the double-strand break repair (DSBR) model (Szostak et al., 1983), is based on the resolution of a cross-stranded structure, the Holliday junction (HJ) (Holliday, 1964) (figure I.6). Following the SEI, the loop formed by the separation of the strands of the homologous dsDNA (also called the D-loop) is enlarged through new DNA synthesis and captures the opposing free 5’ end.
Ligation of the two ends, together with gap repair of the missing DNA on the sister chromatid, completes the formation of a second recombination intermediate, the double HJ (dHJ). Endonucleases resolve the dHJ by introducing symmetric nicks in strands of the same polarity, which are then ligated. If, as in figure I.6, the cuts (arrows) are made on the two sides of the dHJ, thus affecting all four strands, a CO is produced. Two cuts on the same side of the dHJ, affecting only two homologous strands out of the four, produce an NCO. Many predictions of the DSBR model have been confirmed, starting with the observation of dHJ intermediates deduced from 2D gel analysis (Schwacha and Kleckner, 1995; Allers and Lichten, 2001a). Recently, two long-awaited eukaryotic dHJ resolvases have been identified: GEN1/Yen1 (Sharples, 2001; Ip et al., 2008; Bailly et al., 2010) and the SLX4/BTBD12/MUS312/Him-18 complex (Fekairi et al., 2009; Saito et al., 2009; Svendsen et al., 2009).

Figure I.5: Model of the parallel between recombination and synaptonemal complex (SC) formation and timing in yeast. At the beginning of leptotene, DSBs appear on the chromatin loops along the chromosomes. At zygotene, bridges are formed between the axes of the two homologous chromosomes at the sites of DSBs by single-end invasion (SEI) and the formation of D-loops. DSBs can be resolved either as crossovers (CO) or non-crossovers (NCO). The sites of future CO resolution recruit proteins such as Zip1, which is a component of the central element (CE) of the SC. At mid-pachytene, the polymerization of Zip1 results in the full assembly of the SC. The resolution of the recombination intermediates, double Holliday junctions (dHJ), yields CO recombination products. From Shinohara et al. (2008).

Figure I.6: Homologous recombination. Summary of our current understanding of recombination pathways that are initiated by a DNA double-strand break (DSB) and which lead to gene conversion with or without crossover. First, the ends of the DSB are cut, producing single stranded DNA that recruits the recombination protein RAD51. The assembly of a RAD51 nucleoprotein filament leads to interactions with homologous duplex DNA and strand invasion. This process is known as single-end invasion (SEI). In some pathways for recombination (centre), SEI is followed by the capture of the second DNA end. This intermediate can proceed to form double Holliday junctions, and any remaining gaps might be filled by new DNA synthesis. The resulting Holliday junctions might then serve as the substrate for a classic Holliday-junction-resolution reaction or be dissociated by the combined actions of BLM (Bloom’s syndrome protein) and topoisomerase IIIα (Topo III). The BLM-Topo-III reaction primarily leads to the formation of non-crossover products, as mutations in BLM cause an increase in crossover formation. Recombinants can also form by a MUS81-dependent pathway that does not involve Holliday-junction formation (right). Similarly, DSBs can be repaired by synthesis-dependent strand annealing (SDSA) (left) to produce non-crossovers. Adapted from Liu and West (2004).

Despite the advantage of offering an integrated view of the process generating CO and
NCO, the DSBR model does not account for all the biological observations, especially regarding the production of NCOs (reviewed in McMahill et al. (2007)). The resolution of the dHJ as an NCO is expected to generate heteroduplex DNA to the left of the DSB in one of the chromatids and to the right in the other chromatid. However, several studies have found that in the majority of cases the heteroduplex is present in only one of the chromatids, and even when two tracts of heteroduplex were found, they were located on the same chromatid (Allers and Lichten, 2001a; Merker et al., 2003; Jessop et al., 2005). Additionally, knock-out mutations of proteins involved in CO production drastically reduce the number of COs but have no influence on the production of NCOs (reviewed in Bishop and Zickler (2004)). The current hypothesis seems to be that the majority of physically observed HJs are processed into COs (Allers and Lichten, 2001a). These observations, as well as the discovery of additional protein complexes involved in CO/NCO production, have led to the description of alternative pathways, as represented in figure I.6.

An alternative pathway of dHJ resolution involves the helicase BLM. BLM, together with RMI1 and TOPIII, forms a protein complex (BLM*) that catalyzes the dissolution of the dHJ and generates NCOs as the final products by preventing the exchange of flanking sequences (Wu and Hickson, 2003, 2006). Another pathway, which acts on the HJ and leads exclusively to the production of COs, involves the Mus81-Eme1 protein complex (Mus81*) (Constantinou et al., 2002). In Schizosaccharomyces pombe, the majority of COs, if not all, depend on this pathway (Boddy et al., 2001). Mus81-null mouse mutants were first thought to be viable, but it was recently demonstrated that they are also subject to severe meiotic defects (Holloway et al., 2008). Studies in S.
cerevisiae and Arabidopsis thaliana have pointed to the particular role of Mus81 in generating interference-independent COs (de los Santos et al., 2003; Berchowitz et al., 2007). Although the mechanism involving Mus81* is not yet clear, the preferred hypothesis is an HJ cleavage activity, which has been observed in vitro (Cromie et al., 2006; Taylor and McGowan, 2008). It has been suggested that Mus81* acts on D-loops before their full maturation into dHJs, by making two cuts on the opposing strands of the homologous chromatid, transforming the four-way branched structure into two linear products (Gaillard et al., 2003; Osman et al., 2003). The linear products are then processed by DNA synthesis and ligation, resulting in COs.

Apart from the DSBR and BLM* models, most NCOs result from synthesis-dependent strand annealing (SDSA), without the formation of an HJ (Allers and Lichten, 2001b). Following the SEI and the extension of the invading end past the site of the DSB, the D-loop is disrupted. The displaced DNA strand then anneals with its complementary ssDNA on the other side of the DSB. DNA synthesis and nick ligation complete the process, resulting in an NCO (McMahill et al., 2007). Intermediates of the SDSA pathway have been detected in S. cerevisiae meiotic cells (McMahill et al., 2007).

The current view is that multiple pathways may be used for the formation of recombination products. Moreover, these pathways are not completely independent, as there is evidence of cross-talk among them. SLX4 and Mus81 interact, and it has been proposed that the SLX1-SLX4 complex may be part of the Mus81* pathway as well, with SLX1 making the initial nick of the HJ and Mus81 cutting the nicked HJ generated by the second end capture (Svendsen and Harper, 2010). Also, BLM is known to interact with Mus81 in
somatic cells and with MLH1, representing a possible bridge between the DSBR and Mus81 pathways (Holloway et al., 2008, 2010).

I.1.5 Postsynaptic phase

By mid-late pachytene, the mature CO products are observed cytologically at chiasmata sites (Hunter and Kleckner, 2001; Guillon et al., 2005). From late pachytene to diplotene, the SC is disassembled as its CE proteins dissociate from the chromosome arms (Tsubouchi and Roeder, 2005). Following the dissociation of the SC, chiasmata become visible. The homologous chromosomes, still attached at the centromere as well as at chiasmata sites, prepare to attach to the meiotic spindle upon entry into metaphase I (Zickler and Kleckner, 1999; Zickler, 2006).

Review articles for this sub-chapter: Liu and West (2004); Zickler (2006); Ding et al. (2010); Storlazzi et al. (2010); Székvölgyi and Nicolas (2010); Yanowitz (2010)

I.2 Recombination

I.2.1 Distribution of recombination events

I.2.1.1 DSB distribution

Are recombination events (DSBs, COs and NCOs) evenly distributed along the chromosomes? What makes a genomic region likely to host some of these products? These open questions have lately benefited from advances in microarray and cytological technologies, especially in yeast. DSBs have thus been found to cluster in small regions, called DSB hotspots (DSBh), separated by long regions with few or no DSB events, called DSB coldspots (de Massy et al., 1995; Lichten and Goldman, 1995; Baudat and Nicolas, 1997; Petes, 2001). Hotspots are regions with a higher fraction of events than their surrounding environment. In S. cerevisiae, DSBs may occur at many sites within regions of a few hundred base pairs (bp) (de Massy et al., 1995; Liu et al., 1995; Xu and Kleckner, 1995).
Mutations at the putative DNA-binding surface of Spo11 have been shown to affect the distribution of DSBs, but the effect is weak and no specific sequence motif has been found to explain the existence of DSBh (Liu et al., 1995; Murakami and Nicolas, 2009). Despite the lack of specificity in Spo11 binding, some epigenetic features have been found to correlate with the distribution of DSBh. DSBs are preferentially initiated in chromatin loops rather than in the chromatin bound to chromosome axes (Blat et al., 2002). However, not all chromatin loops contain a DSB, since the distance between DSBs exceeds the average size of the DNA loops (Gerton et al., 2000). Local chromatin accessibility is another important factor in the initiation of DSBs, since DSBh are preferentially located in nuclease-hypersensitive regions (Ohta et al., 1994; Wu and Lichten, 1994; Berchowitz et al., 2009). Histone modifications associated with active chromatin, especially H3 lysine 4 trimethylation, also mark recombination initiation sites in S. pombe (Yamada et al., 2004), S. cerevisiae (Borde et al., 2009), and mouse (Buard et al., 2009).
In S. cerevisiae, two chromosomal landmarks are considered cold DSB regions: the chromosome ends (also known as telomeres) (Su et al., 2000; Blitzblau et al., 2007; Buhler et al., 2007) and ribosomal DNA (rDNA) (Petes and Botstein, 1977; Blitzblau et al., 2007). It was postulated that DSB initiation sites avoid highly repetitive DNA, as breaks there could lead to non-homologous interactions between chromosomes and loss of rDNA repeats (Barton et al., 2003). Although the first 20 kb of the chromosome ends are cold, the following 30 kb are hot, suggesting that telomeres promote strong recombination activity in adjacent regions (Blitzblau et al., 2007; Buhler et al., 2007; Barton et al., 2008). At first, centromeric regions were also considered cold regions (Gerton et al., 2000; Borde et al., 2004). However, important DSB hotspots have been found in the pericentromeric region of S. cerevisiae (Blitzblau et al., 2007; Buhler et al., 2007). Interestingly, pericentromeric sequences also have an open chromatin structure (Berchowitz et al., 2009).

At the sequence level, DSBs form preferentially in intergenic regions (Baudat and Nicolas, 1997; Gerton et al., 2000; Cromie et al., 2007). In S. cerevisiae, most recombination initiation sites occur in the vicinity of transcription promoters (Baudat and Nicolas, 1997; Gerton et al., 2000). Some DSB hotspots have been found to require the presence of transcription factors; however, the level of transcription does not seem to affect the frequency of DSB events (Gerton et al., 2000). In S. pombe, an association has been reported between recombination hotspots and long non-coding RNA loci, which was proposed to result from the role of these loci in binding factors, such as transcription factors, and thus in remodeling the chromatin (Wahls et al., 2008).

The existence of hot and cold DSB regions results in the non-random distribution of DSB events.
This distribution is related to the observed phenomenon of interference between recombination initiation sites. Positive interference (simply termed interference hereafter) implies the existence of an inhibition zone around each event, which prevents the formation of additional recombination events nearby. Although DSB interference is detected in the studies mentioned previously, it is certainly underestimated, as DSB mapping techniques report the combined results of thousands of independent meioses (Berchowitz and Copenhaver, 2010). Cytological evidence of DSB interference includes the observed distances between early recombination nodules (structures associated with the SC) in plants (Anderson et al., 2001), as well as between early MSH4 foci in mouse (de Boer et al., 2006). Both early nodules and MSH4 foci are associated with Rad51/Dmc1 and are considered representative of DSB sites (Zickler and Kleckner, 1999). Another indication of competition between DSBs comes from the deletion of DSB hotspots, which stimulated the formation of DSBs at adjacent sites (Wu and Lichten, 1995). Conversely, the insertion of a DSB hotspot reduces DSB activity in the neighboring hotspots (Wu and Lichten, 1995; Fan et al., 1997; Robine et al., 2007). The existence of interference suggests that even though the distribution of DSBs varies from one meiosis to another, their number varies little: in yeast, for example, it ranges between 150 and 170 events per meiosis (Buhler et al., 2007; Robine et al., 2007).

I.2.1.2 CO and NCO distribution

Figure I.7 depicts the distribution of DSB and recombination rates along chromosome III in yeast. In addition to the hotspot organization of DSBs, the recombination products, COs and NCOs, are also non-randomly distributed. Moreover, some DSBh seem more favorable to NCOs while others preferentially host COs, suggesting higher levels of interference (Mancera et al., 2008) (figure I.8).

Techniques such as genetic mapping, linkage disequilibrium and sperm-typing (see chapter II.1 for details) have permitted the extensive study of CO distribution in a wide variety of species. In humans, a majority of CO events (60%) fall within known CO hotspots (COh) (Coop et al., 2008), and 60-70% of these known COh are hosted within 10% of the genome (Myers et al., 2005). A CO hotspot is a region 1-2 kb wide (Jeffreys et al., 2004), surrounded by CO coldspots that are on average 50-100 kb long (Myers et al., 2005). Despite an evolutionarily conserved length (Mancera et al., 2008; Wu et al., 2010), COh display a wide variety of intensities (Arnheim et al., 2007; Wu et al., 2010). In mouse, the CO frequency associated with COh ranges from 0.0027% to 1.1% (Wu et al., 2010), for a genome-wide average CO rate (COR) of 0.5 cM/Mb (Cox et al., 2009). In human too, the high-resolution characterization of recombination hotspots through sperm-typing (additional table B) indicates a wide variety in COh intensity, for a genome-wide average COR of 1.1 cM/Mb (Kong et al., 2002).

Figure I.7: Comparison of DSB and recombination rates along chromosome III in yeast. [Plot axes: position along the chromosome (kb) against DSB ratio and event counts; labelled loci: HIS4, ARE1-IMG1, THR4, LEU2-CEN3.] DSB smoothed fluorescence ratios in a SK1 strain (dmc1Δ, grey) are compared with recombination event counts in a S288c/YJM789 hybrid strain (blue), after adjusting the latter for varying intermarker interval size. Peak locations largely agree despite distinct strain backgrounds, although some fine-scale differences exist. Previously known hotspots are indicated by red segments.
From Mancera et al. (2008).

The CO frequency in the centromeric and pericentromeric regions is very low in both human and yeast (Myers et al., 2005; Chen et al., 2008; Mancera et al., 2008). Experiments in yeast suggest that this reduction might be the consequence of a low number of recombination initiation events in this region (Gerton et al., 2000; Borde et al., 2004). However, as previously mentioned, the pericentromeric regions are not completely devoid of DSBh. A reason for the reduction in recombination may be that COs close to centromeres can interfere with meiotic chromosome segregation (Rockmill et al., 2006). It has been postulated that the CO pathway is inhibited and that repair might favor the sister rather than the homologous chromatid, which results in a reduction of the number of COs in this region (Blitzblau et al., 2007; Chen et al., 2008). This inhibition is not due to a more compact chromatin configuration, as the chromatin structure in the pericentromeric area has been found to be open (Berchowitz et al., 2009). Moreover, recent studies in the maize genome have shown the existence of NCO events at centromeres (Shi et al., 2010).

Figure I.8: Crossover and non-crossover rates along chromosome I in yeast. [Plot axes: position (kb) against counts.] Crossover (CO, blue) and non-crossover (NCO, green) counts, adjusted for varying intermarker interval size. The black circle represents the centromere. From Mancera et al. (2008).

The distribution of CO and NCO events at telomeres is more ambiguous. It has been suggested that recombination events are depleted at telomeres, as they could generate high rates of non-homologous recombination due to the high density of repeat elements in this region (Barton et al., 2003). However, the ambiguity may also simply result from the difficulty of studying repetitive sequences (Mancera et al., 2008). In mammals, recombination rates are strongly increased in the regions adjacent to chromosome ends (Myers et al., 2005; Paigen et al., 2008), while in S. cerevisiae the landscape is patchy, with some chromosomes showing no events long before the telomeres, while others show strong activity near telomeres (Mancera et al., 2008).

Although DSBs in S. cerevisiae are associated with promoter regions, only 25% of the CO hotspots overlap a promoter (Mancera et al., 2008). In humans too, CORs are low near the transcription start site (TSS), but start increasing 10 kb away from the TSS (Coop et al., 2008). A lower COR near genes seems to be a general feature of mammals and plants (Drouaud et al., 2006; Kauppi et al., 2007; Paigen et al., 2008; Wu et al., 2010).
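As a rough worked example of the mouse hotspot intensities quoted earlier in this section, the sketch below converts a per-hotspot CO frequency into a local rate in cM/Mb and compares it with the genome-wide average. It assumes that 1% recombination corresponds to 1 cM over such short intervals and that a hotspot spans 2 kb (the upper end of the 1-2 kb range given above). Both assumptions and the function names are illustrative and are not taken from the cited studies.

```python
# Rough conversion of a hotspot CO frequency into a local rate (cM/Mb).
# Assumptions (illustrative, not from the cited studies):
#   - 1% recombination = 1 cM over short intervals
#   - hotspot width of 2 kb


def local_rate_cm_per_mb(co_frequency_percent: float, width_kb: float) -> float:
    """Local CO rate in cM/Mb for a hotspot of the given width."""
    width_mb = width_kb / 1000.0
    return co_frequency_percent / width_mb


def fold_over_average(local_rate: float, genome_average: float) -> float:
    """How many times hotter the hotspot is than the genome-wide average."""
    return local_rate / genome_average


MOUSE_AVERAGE_COR = 0.5   # cM/Mb, genome-wide average quoted in the text
HOTSPOT_WIDTH_KB = 2.0    # assumed hotspot width

# Weakest and strongest mouse hotspot frequencies quoted in the text (%).
for frequency in (0.0027, 1.1):
    rate = local_rate_cm_per_mb(frequency, HOTSPOT_WIDTH_KB)
    fold = fold_over_average(rate, MOUSE_AVERAGE_COR)
    print(f"{frequency}% -> {rate:.2f} cM/Mb ({fold:.1f}x the average)")
```

Under these assumptions, the weakest hotspot is only about 2.7 times hotter than the genome-wide average, whereas the strongest reaches roughly 550 cM/Mb, about 1100 times the average. A narrower assumed width would raise both figures proportionally.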
Moreover, the use of 3.1 million human single nucleotide polymorphisms (SNPs) has revealed an asymmetry in the distribution of recombination around genes, as illustrated in figure I.9, with regions 3’ of transcribed domains having more CO activity than the 5’ regions (International HapMap Consortium, 2007).

A degenerate 13-mer sequence motif (CCNCCNTNNCCNC) has been identified in association with human CO hotspots (Myers et al., 2008) (figure I.10). LD studies, as well as sperm-typing analyses (Webb et al., 2008), have found the motif in 40% of the CO hotspots. The function of this motif in humans has been associated with the binding of the zinc-finger protein PRDM9 (Myers et al., 2009). The binding sequence of PRDM9 is an exact match of the 13-mer motif, with a degeneracy at positions
3, 6, 8, 9, and 12 and no degeneracy at the remaining 8 positions. An independent study (Baudat et al., 2009) of recombination activity in mouse found that the gene coding for PRDM9 is located at the double-strand break control 1 (Dsbc1) genetic locus, which controls the activity and distribution of recombination hotspots in this species (Grey et al., 2009). In addition to the zinc fingers, PRDM9 also contains a SET-methyltransferase domain, involved in the trimethylation of lysine 4 of histone H3 (H3K4me3) (Baudat et al., 2009). H3K4me3 is associated with the initiation of recombination in both S. cerevisiae and mouse (Borde et al., 2009; Buard et al., 2009). Despite the association between PRDM9 and hotspot activity, the exact mechanism of this interaction is not yet understood.

In both mouse and human, the sequencing of the Prdm9 gene has revealed the existence of multiple alleles in a population, resulting in variants with variable numbers of zinc fingers (Parvanov et al., 2009; Berg et al., 2010). CO activity and hotspot distribution are highly dependent on the Prdm9 alleles carried by individuals. In humans, the allele associated with the 13-mer motif is termed A. It controls the association of the motif with recombination hotspots in different genetic backgrounds: repeat and non-repeat DNA, male and female, as well as in the generation of ectopic recombination (Myers et al., 2008). While the PRDM9 variant encoded by allele A is responsible for the recognition of the 13-mer motif, it has been found to trigger recombination even at hotspots lacking the motif (Berg et al., 2010). On the other hand, other PRDM9 variants, which do not recognize the motif, generate high levels of recombination at hotspots containing the motif (Berg et al., 2010). These results imply that PRDM9 might explain more than the 40% of hotspots that contain the motif (McVean and Myers, 2010).
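To make the structure of the 13-mer motif concrete, the following minimal sketch scans a DNA string for literal matches to CCNCCNTNNCCNC, treating N as any base. It is only a pattern match on sequence and says nothing about actual PRDM9 binding. The helper names and the example sequence are made up for illustration, and only the Python standard library is assumed.

```python
import re
from typing import List

# Degenerate 13-mer hotspot motif quoted in the text; N stands for any base.
MOTIF = "CCNCCNTNNCCNC"

# Complement table used to scan the opposite strand.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def degenerate_positions(motif: str = MOTIF) -> List[int]:
    """1-based positions of the motif that accept any base."""
    return [i for i, base in enumerate(motif, start=1) if base == "N"]


def motif_to_regex(motif: str = MOTIF) -> str:
    """Translate the motif into a regular expression over A, C, G, T."""
    return "".join("[ACGT]" if base == "N" else base for base in motif)


def reverse_complement(sequence: str) -> str:
    """Reverse complement of an upper- or lower-case DNA string."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def find_motif(sequence: str, motif: str = MOTIF) -> List[int]:
    """0-based start positions of motif matches on the given strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [match.start() for match in pattern.finditer(sequence.upper())]


example = "TTCCACCATAGCCTCAA"   # made-up sequence containing one match

print(degenerate_positions())                    # [3, 6, 8, 9, 12]
print(find_motif(example))                       # [2]
print(find_motif(reverse_complement(example)))   # []
```

The degenerate positions recovered from the motif string agree with those listed in the text, leaving 8 fixed positions. Scanning both the sequence and its reverse complement is a design choice made here because the motif is not palindromic.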
It may be that PRDM9 binds even diverged motifs, while additional flanking sequences stabilize the binding (Myers et al., 2008). Additionally, the H3K4me3 activity of PRDM9 might allow the recruitment of the recombination protein complex containing Spo11 (Baudat et al., 2010).

Finally, the comparison of Prdm9 sequences in multiple metazoans has revealed that Prdm9 is evolving at an accelerated rate (Oliver et al., 2009). A particularly high divergence rate characterizes this gene between human and chimpanzee. Given this rapid evolution, it is not surprising that the binding sequence of the chimpanzee PRDM9 differs from the 13-mer CO hotspot motif found in humans (Myers et al., 2009; Oliver et al., 2009). If PRDM9 is indeed an attribute of CO hotspots in all species, its species-specific analysis should reveal different sequence motifs associated with hotspot activity.

Figure I.9: The recombination rate around genes in human. The blue line indicates the mean. Grey lines indicate the quartiles of the distribution. Values were calculated separately 5’ from the transcription start site (the first dotted line) and 3’ from the transcription end site (third dotted line) and were joined at the median midpoint position of the transcription unit (central dotted line). Note the sharp drop in recombination rate within the transcription unit, the local increase around the transcription start site and the broad decrease away from the 3’ end of genes. Adapted from International HapMap Consortium (2007).

Figure I.10: Recombination rates around hotspot motif. Estimated HapMap Phase II recombination rate across the 40 kb surrounding 16 human THE1 elements (red line) and 6 L2 elements (blue line), each containing a conserved exact match to the 13-bp core motif. Rates are smoothed using a 2 kb sliding window slid in 50 bp increments, averaged across elements. Horizontal dashed line: the human average recombination rate of 1.1 cM/Mb. Vertical dotted line: the center of the repeat. Adapted from Myers et al. (2009).

I.2.2 Non-allelic homologous recombination (NAHR)

Recombination can also take place between non-allelic sequences situated at different genomic locations. This non-allelic homologous recombination (NAHR), also known as ectopic recombination, occurs mainly between repeats. Low copy repeats (LCRs), which result from the duplication of sequences a few hundred kb long and display high sequence similarity, are preferred NAHR sites (Bailey and Eichler, 2006). Studies of LCRs have demonstrated that allelic recombination and NAHR are similar processes (reviewed in Sasaki et al. (2010)). NAHR is initiated by DSBs and is localized in hotspots of 1-2 kb inside the LCRs. The degenerate 13-mer motif associated with allelic recombination is also indicative of NAHR hotspots (Myers et al., 2008).

Recombination between non-allelic sequences results in genomic rearrangements, such as deletions, duplications, inversions or isodicentric chromosome formation (reviewed in Sasaki et al. (2010)) (figure I.11). These rearrangements can induce genomic disorders. In human, the regions involved in two such disorders have been found to contain the degenerate CO hotspot motif (Berg et al., 2010). The study of the effect of Prdm9 alleles on the frequency of rearrangements in these regions has revealed that this gene, associated with allelic recombination, is also characteristic of NAHR hotspots (Berg et al., 2010).

I.2.3 Interference between recombination products

Keeping homologs together during the reductional division of meiosis is essential for their correct segregation (figure I.1). This role is fulfilled by COs, as they first initiate the formation of the SC and then keep the homologous chromosomes connected during their migration to the poles (Roeder, 1997). Accordingly, the majority of species have been observed to have at least one CO per pair of homologs, also known as the obligate CO (de Villena and Sapienza, 2001). A control mechanism, known as CO homeostasis, promotes the formation of CO events at the expense of NCOs in order to ensure the obligate CO (Martini et al., 2006). Moreover, COs are not randomly distributed along the chromosomes. Instead, they are subject to interference, which ensures a more uniform distribution and reduces the risk of non-disjunction (Bishop and Zickler, 2004).

The construction of the first genetic map in Drosophila allowed the first observation of interference between adjacent COs (Sturtevant, 1913). Despite the validation of CO interference (COI) in a vast majority of species (table I.1), the genetic mechanisms underlying this phenomenon are poorly understood. At first, it was thought that polymerization around the initiation sites of the SC prevented CO formation in adjacent regions (Maguires, 1988).
However, the sites of SC initiation, which are also the sites of future COs, exhibit interference long before the assembly of the SC (Fung et al., 2004). Moreover, mutants for the spo16 gene in S. cerevisiae, which show defects in the extension of the SC, exhibit normal interference (Shinohara et al., 2008). Mouse cells with a defective SC also show normal interference levels (de Boer et al., 2006). The modern view on COI is that it does not depend on the formation of the SC, and is probably established very close to the transition between DSB and SEI formation (Hunter and Kleckner, 2001; Bishop and Zickler, 2004). Recent studies support the idea that interference takes place at the time when Msh4-Msh5 complexes stabilize SEIs (Shinohara et al., 2008) (figure I.5). In S. cerevisiae, mutants of the tid1 gene, which codes for the Tid1 protein involved in the regulation of strand invasion, have normal levels of COs but greatly decreased interference (Shinohara et al., 2003). Dissociation of the strand invasion events, regulated by RTEL1, promotes COI by preventing adjacent DSBs from being repaired through the DSBR pathway and by generating NCOs instead, possibly through SDSA (Barber et al., 2008; Youds et al., 2010).
The study of COI is further complicated by the existence of two types of COs: interfering and non-interfering. The first type is generated through the DSBR pathway (Msh4-Msh5 COs), while the second uses the Mus81 pathway, as described in section I.1.4. The distribution of these two kinds of COs varies widely between species, from S. pombe and Aspergillus nidulans with no interference, to C. elegans with complete interference (table
  • 42. I.1). While models have been proposed and proteins have been identified for the two pathways, the mechanisms creating interference in one type of CO but not in the other are still unknown.
Figure I.11: Genome rearrangement by non-allelic homologous recombination. Crossover recombination between repeated DNA sequences at non-allelic positions can generate a deletion, a duplication, an inversion or an isodicentric chromosome. Depicted here are six chromosomal outcomes of non-allelic homologous recombination (NAHR) between repeats located on the same chromosome, with two orientations of repeats relative to one another (direct or inverted) for each of three types of interactions (between homologues, between sister chromatids or in the same chromatid). Homologous chromosomes are shown in blue and red, and sister chromatids are depicted in the same colour (homologous chromosomes are not shown in the schematics depicting inter-sister chromatid or intrachromatid exchanges for simplicity). Low-copy repeats (LCRs) are shown as white and black arrowheads. From Sasaki et al. (2010).
Two hypotheses have been advanced regarding the difference between the interfering and non-interfering COs (reviewed in Berchowitz and Copenhaver (2010)). The “toolbox hypothesis” postulates that the Mus81-Eme1 and Msh4-Msh5 protein complexes are both recruited to all DSB sites (Berchowitz and Copenhaver, 2010). The majority of DSBs are then repaired through the DSBR pathway, and only a subset of recombination intermediates (the aberrant ones) is later resolved by the Mus81-Eme1 protein complex. The idea that both protein complexes are recruited to the recombination initiation sites comes from the observed co-localization of AtMUS81 with AtRAD51 and AtMSH4 foci at leptotene in A. thaliana (Higgins et al., 2008). Moreover, mutants of the mus81 gene in S.
cerevisiae show an accumulation of aberrant recombination intermediates (single HJ, intersister and multichromatid molecules), which prevent the correct segregation of homologs, so that nuclei fail to divide (Jessop and Lichten, 2008; Oh et al., 2008).
The “two-phase hypothesis”, proposed by Getz et al. (2008), supposes that COs
  • 43. are specialized, with one class of COs contributing to the “pairing” of homologs, while the other ensures their “disjunction”. The “pairing” COs occur early during meiosis and are non-interfering, as shown by the ndj1 mutants, which have a delay in pairing and also a decrease in interference (Conrad et al., 1997). Additionally, the lack of non-interfering, “pairing” COs in C. elegans and Drosophila might have favoured the emergence of alternative, CO-independent pairing mechanisms between homologs (table I.1). However, if the “pairing” COs are the ones dependent on MUS81, a few inconsistencies arise: why has no pairing defect been observed in mus81 mutants, and how can this be reconciled with the timing of MUS81, which has been reported to act late on recombination intermediates (Jessop and Lichten, 2008; Oh et al., 2008)? It is possible that the “pairing” COs represent a new pathway of DSB repair, independent of both Mus81 and Msh4-Msh5. This possibility is supported by the existence, in A. thaliana, of an average of 0.85 chiasmata per cell that are explained by neither AtMSH4 nor AtMUS81 (Higgins et al., 2008). The “disjunction” COs are thought to occur late during meiosis, to be subject to interference and to depend on MSH4.
NCO events that are not associated with COs do not seem to interfere with one another in S. cerevisiae (Malkova et al., 2004; Mancera et al., 2008). However, the influence between COs and adjacent NCOs is more ambiguous. If interference between COs is generated by the adjacent recombination intermediates being resolved as NCOs, a negative interference is expected between the two recombination products, that is, NCOs should lie closer to COs than expected by chance. Biological observations of the CO-NCO distances have yielded contradictory results. While studies of discrete intervals, associated with precise loci, have found no or negative interference between COs and NCOs (Malkova et al., 2004; Getz et al., 2008), the genome-wide study of recombination products, in S.
cerevisiae, found that this same distance is 13.1 kb larger than expected by chance (Mancera et al., 2008). This might imply that, at the genome scale, NCOs inhibit the formation of COs in their vicinity. Moreover, Berchowitz and Copenhaver (2010) propose that the discordance between the two types of biological results might reflect the existence of two classes of NCOs, interfering and non-interfering, as for the COs, with the single-locus studies having found only the non-interfering class.
Review articles for this sub-chapter: Berchowitz and Copenhaver (2010); Martinez-Perez and Colaiácovo (2009); Yanowitz (2010); Székvölgyi and Nicolas (2010)
I.2.4 Differences in recombination
The particular distribution of recombination events and the constraint of an obligate CO suggest that recombination is under strong control. Misplaced or too few recombination products can generate gametes and offspring with an abnormal number of chromosomes (aneuploidy) (Baker et al., 1976; Hunt and Hassold, 2002; Petronczki et al., 2003). Furthermore, recombination plays an important adaptive role by breaking up and reshuffling chromosome segments, thus producing novel multilocus haplotypes that serve as potential selective alternatives for adaptive evolution (Otto and Lenormand, 2002; Marais and Charlesworth, 2003). However, recombination rates display high levels of variation among individuals and species. Unraveling the mechanisms generating this variation is fundamental for the understanding of genome evolution.
