Multiple Loci Selection with Multi-Way Epistasis in Coalescence with Recombination
Abstract
1. Introduction
2. Materials and Methods
2.1. The Coalescent Simulator
2.1.1. Selection Scenarios
2.1.2. EpiSimRA: Multiple Loci Selection & Multiway Epistasis
Algorithm 1 EpiSimRA
Coalescence Event
Recombination Event
2.2. The Forward Simulator
2.2.1. Simulating the “Book of Populations”
2.2.2. Modeling Multiway Epistasis
2.2.3. Tracing the ARG
3. Results
3.1. Comparison Study
3.2. Evaluating Epistatic Scenarios
4. Discussion
5. Conclusions
6. Patents
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
ARG | Ancestral Recombination Graph
CDF | Cumulative Distribution Function
EpiSimRA | Epistatic simulations based on Random Graph Algorithms
fwd-EpiSimRA | forward EpiSimRA
KS | Kolmogorov-Smirnov
MRCA | Most Recent Common Ancestor
SNP | Single Nucleotide Polymorphism
TMRCA | Time to Most Recent Common Ancestor
WF | Wright-Fisher
Appendix A. The Forward Simulator
Appendix A.1. Choosing Parents
Transfer of Genetic Material
Appendix A.2. Tracing the ARG from the Book of Populations
Appendix A.3. Simulating the Book of Populations with Selection and Two-Way Epistasis
ALGORITHM:
1. Initialization:
2. If f is set, randomly select an individual among the N individuals and a site along g that underwent mutation. Select an allele at random and set f to 1.
3. Loop: for each generation t, up to the last generation G,
4. Loop: for each individual i in the (t−1)th generation,
5. Compute the fitness of individual i, where any group k of loci may contain a single locus under selection, for which the selection coefficient s is defined by user input. A group may also contain a locus interacting with another locus in two-way epistasis. In this case s is populated from a matrix formed over all possible alleles at each locus, whose entries give the selection coefficient at allele j in individual i’s chromosome.
6. Select parents for each child in the tth generation based on the fitness of the individuals in the (t−1)th generation.
7. End
8. For each child i in the tth generation, compute the scaled recombination rate and draw a random value.
9. Use this value together with the scaled recombination rate to decide whether a recombination event occurs.
10. If there is no recombination event: randomly pick a chromosome from the parent and assign its genetic material to the child.
11. Else: randomly pick a crossover index. Take the genetic material before the crossover index from the first chromosome of the parent and the material after it from the second, combine them, and assign the result to the child.
12. In the child’s genetic material, randomly select locations along the chromosome length g for mutation according to the Poisson distribution and the scaled mutation rate. Change the allele at each selected location to a different base chosen at random. For example, if the allele was A, change it randomly to one of the other bases (C, G or T).
13. Update the chromosomes of the current generation with the new genetic information obtained from the previous generation and continue until the last generation, G.
14. End
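The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the fwd-EpiSimRA implementation: the multiplicative fitness form, the diploid two-parent scheme, and the names `fitness`, `make_gamete`, `next_generation`, `rec_prob` and `mut_rate` are all hypothetical choices made for the example.

```python
import math
import random

BASES = "ACGT"

def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's method (fine for small means)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def fitness(ind, single, pairs):
    """Multiplicative fitness (an assumed form, not taken from the paper).

    single: {locus: {allele: s}} for loci under selection.
    pairs:  {(locus_a, locus_b): {(allele_a, allele_b): s}} for two-way epistasis.
    """
    w = 1.0
    for chrom in ind:
        for locus, table in single.items():
            w *= 1.0 + table.get(chrom[locus], 0.0)
        for (a, b), table in pairs.items():
            w *= 1.0 + table.get((chrom[a], chrom[b]), 0.0)
    return w

def make_gamete(parent, rec_prob, mut_rate, rng):
    """Copy or recombine the parent's two chromosomes, then mutate."""
    g = len(parent[0])
    first, second = rng.sample(parent, 2)       # random order of the two chromosomes
    if rng.random() < rec_prob:                 # recombination event
        c = rng.randrange(1, g)                 # crossover index
        seq = list(first[:c] + second[c:])
    else:                                       # no recombination: plain copy
        seq = list(first)
    for _ in range(poisson(mut_rate * g, rng)): # Poisson number of mutated sites
        site = rng.randrange(g)
        seq[site] = rng.choice([b for b in BASES if b != seq[site]])
    return "".join(seq)

def next_generation(pop, single, pairs, rec_prob, mut_rate, rng):
    """Build generation t from generation t-1 with fitness-weighted parents."""
    weights = [fitness(ind, single, pairs) for ind in pop]
    children = []
    for _ in pop:
        mother, father = rng.choices(pop, weights=weights, k=2)
        children.append((make_gamete(mother, rec_prob, mut_rate, rng),
                         make_gamete(father, rec_prob, mut_rate, rng)))
    return children

# Usage: 50 individuals, 20 sites, one selected locus and one interacting pair.
rng = random.Random(0)
pop = [("A" * 20, "A" * 20) for _ in range(50)]
single = {5: {"G": 0.3}}
pairs = {(2, 9): {("C", "T"): 0.1}}
for _ in range(10):
    pop = next_generation(pop, single, pairs, rec_prob=0.2, mut_rate=0.01, rng=rng)
```

Keeping the selection coefficients in lookup tables keyed by allele (or allele pair) mirrors the matrix described in step 5 and lets additional interacting groups be added without changing the loop.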
Appendix A.4. Tracing the ARG from the Book of Populations
ALGORITHM:
1. Initialization:
2. Loop: for each generation t, going backwards from the last generation,
3. Identify each chromosome from the previous generation which contributed to each chromosome in the current generation, following the book of populations.
4. Check whether multiple children in the tth generation share the same parent in the previous generation.
5. Iterate and count the number of active samples in each generation.
6. Until the active samples converge.
7. Compute the height of the GMRCA from the height of convergence.
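As a rough illustration of the backward pass, the sketch below assumes the book of populations is stored as a list `parents`, where `parents[t][i]` holds the indices of the chromosomes in generation t−1 that contributed to chromosome i of generation t (one index for a plain copy, two for a recombinant). That storage layout and the function name `trace_back` are assumptions for the example. The sketch tracks whole chromosomes only and ignores which segments each ancestor carries.

```python
def trace_back(parents, samples):
    """Follow sampled chromosomes backwards until one ancestor remains.

    Returns (height, counts): height is the number of generations traced
    until convergence (None if the lineages never converge), and counts
    lists the number of active samples at each step.
    """
    active = set(samples)
    counts = [len(active)]
    for height, t in enumerate(range(len(parents) - 1, 0, -1), start=1):
        # Children sharing a parent collapse into one entry of the set.
        active = {p for child in active for p in parents[t][child]}
        counts.append(len(active))
        if len(active) == 1:
            return height, counts
    return None, counts

# Usage: four chromosomes per generation, three recorded generations.
parents = [
    None,                            # generation 0: founders, no parents
    [(0,), (0,), (1,), (2,)],        # generation 1
    [(0, 1), (1,), (2,), (3,)],      # generation 2: chromosome 0 is recombinant
    [(0,), (0,), (1,), (1,)],        # generation 3 (the sampled generation)
]
height, counts = trace_back(parents, [0, 1, 2, 3])
print(height, counts)  # 3 [4, 2, 2, 1]
```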
Appendix A.5. Experiments and Comparison Study
m | p-Value | Test Statistic | Scenario markers (columns include 3 Interacting Loci)
---|---|---|---
10 | 0.1400 | 0.16 |
20 | 0.4431 | 0.12 |
30 | 0.3439 | 0.13 | × × × ×
40 | 0.9995 | 0.05 |
10 | 0.6766 | 0.08 |
20 | 0.7942 | 0.08 |
30 | 0.6766 | 0.10 | × × ×
40 | 0.5750 | 0.11 |
10 | 0.9921 | 0.06 |
20 | 0.5560 | 0.11 |
30 | 0.7942 | 0.09 | × ×
40 | 0.8938 | 0.08 |
10 | 0.8938 | 0.08 |
20 | 0.9995 | 0.05 |
30 | 0.9710 | 0.06 | ×
40 | 0.7942 | 0.09 |
10 | 0.3439 | 0.13 |
20 | 0.7942 | 0.08 |
30 | 0.6766 | 0.10 | ×
40 | 0.5576 | 0.11 |
10 | 0.9610 | 0.07 |
20 | 0.9610 | 0.07 |
30 | 0.3556 | 0.13 |
40 | 0.6766 | 0.10 |
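For readers who want to see how a p-value and test statistic of this kind can be obtained, here is a minimal two-sample Kolmogorov-Smirnov sketch. It assumes the two inputs are samples of some summary quantity produced by two simulators; which quantity was compared is not stated in this appendix. The function name `ks_two_sample` is hypothetical, and the p-value uses the plain asymptotic Kolmogorov series, so it is only approximate for small samples.

```python
import bisect
import math

def ks_two_sample(x, y):
    """Return (D, p): the largest gap between the two empirical CDFs and
    an asymptotic p-value for the hypothesis of a common distribution."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / n   # empirical CDF of x at v
        fy = bisect.bisect_right(ys, v) / m   # empirical CDF of y at v
        d = max(d, abs(fx - fy))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return d, 1.0                         # series is numerically 1 here
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))

# Usage: completely separated samples give the maximal statistic.
d, p = ks_two_sample([1, 2, 3], [4, 5, 6])
```

At the conventional 0.05 threshold, none of the p-values listed in the table would reject the hypothesis that the compared samples share a distribution.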

Parameters | Example Values | User-Specified Units | Units in bp for the Algorithm | Scaling Factor
---|---|---|---|---
g (segment length) | 25; 75 | Kb | bp |
m (extant units) | 10; 20; 30; 40 | - | - | × 1
N (population size) | 100; 200; 500; 1000 | - | - | × 1
I (length of genetic material) | 1000 | bp | 1 bp | × 1
rates/generation | | | |
r (recombination rate) | 1 | bp/gen | bp/gen |
SNP mutation rate | 1.5 | mut/bp/gen | × 1 mut/bp/gen |
selection, epistasis parameters | | | |
fitness | 0.3 | - | | × 1
epistasis | 0.1, 0.15 | - | - | × 1
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bose, A.; Utro, F.; Platt, D.E.; Parida, L. Multiple Loci Selection with Multi-Way Epistasis in Coalescence with Recombination. Algorithms 2021, 14, 136. https://doi.org/10.3390/a14050136