2021-02-05
When?
Friday, 5 February 2021
10.00 - 11.00 h
Where?
https://stockholmuniversity.zoom.us/j/62208705509
Which paper are we discussing?
At the next non-model organisms journal club I will give a presentation titled "Genomics New Clothes" on the methodological challenges of working with high-dimensional genomics data. In particular, I will concentrate on the curse of dimensionality, which severely constrains any statistical analysis of genetic variation data and contributes to the reproducibility crisis in population genetics, ancient DNA, precision medicine and other genomics-dominated research areas. As one manifestation of the problem, we will discuss some misleading interpretations of genetic distances on a PCA plot, as well as the lack of correlation structure and the ultra-low variance (comparable to the noise level) explained by the leading principal components computed on genetic variation data. To balance my generally concerned and critical view of the prospects of genomics research, I will suggest possible ways to address the challenges of high-dimensional data analysis.
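As a rough illustration of the "variance explained" point, here is a minimal sketch. It is not taken from the presentation or the notebook. It runs PCA on a purely random genotype matrix with far more SNPs than samples and reports the fraction of variance captured by the leading components. The sample size, SNP count, allele frequency of 0.5 and the helper name `pca_variance_explained` are all assumptions chosen for illustration.

```python
import numpy as np


def pca_variance_explained(genotypes):
    """Return the fraction of total variance explained by each principal component.

    `genotypes` is a (samples x SNPs) matrix of allele counts.
    """
    # Center each SNP (column) so PCA describes variation around the mean.
    centered = genotypes - genotypes.mean(axis=0)
    # Squared singular values of the centered matrix are proportional to
    # the eigenvalues of the sample covariance matrix.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2
    return eigenvalues / eigenvalues.sum()


# Hypothetical setting: 100 individuals, 10,000 independent SNPs, no population
# structure at all (every genotype is an independent Binomial(2, 0.5) draw).
rng = np.random.default_rng(0)
n_samples, n_snps = 100, 10_000
noise_genotypes = rng.binomial(2, 0.5, size=(n_samples, n_snps)).astype(float)

explained = pca_variance_explained(noise_genotypes)
print("PC1 variance explained:", explained[0])
print("PC2 variance explained:", explained[1])
print("Uniform baseline 1/(n-1):", 1.0 / (n_samples - 1))
```

In this structure-free setting every component explains roughly 1/(n-1) of the variance, so this gives a noise baseline against which the percentages reported for leading PCs on real data can be compared.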
Here is some suggested reading to prepare for the discussion, which is meant as a follow-up to the discussion at the Advisory program meeting.
- Novembre, John, and Matthew Stephens. 2008. ‘Interpreting Principal Component Analyses of Spatial Population Genetic Variation’. Nature Genetics 40 (5): 646–49. https://doi.org/10.1038/ng.139.
- Reich, David, Alkes L. Price, and Nick Patterson. 2008. ‘Principal Component Analysis of Genetic Data’. Nature Genetics 40 (5): 491–92. https://doi.org/10.1038/ng0508-491.
- McVean, Gil. 2009. ‘A Genealogical Interpretation of Principal Components Analysis’. PLOS Genetics 5 (10): e1000686. https://doi.org/10.1371/journal.pgen.1000686.
- François, Olivier, and Flora Jay. 2020. ‘Factor Analysis of Ancient Population Genomic Samples’. Nature Communications 11 (1): 4661. https://doi.org/10.1038/s41467-020-18335-6.
- House, Geoffrey L., and Matthew W. Hahn. 2018. ‘Evaluating Methods to Visualize Patterns of Genetic Differentiation on a Landscape’. Molecular Ecology Resources 18 (3): 448–60. https://doi.org/10.1111/1755-0998.12747.
- Björklund, Mats. 2019. ‘Be Careful with Your Principal Components’. Evolution 73 (10): 2151–58. https://doi.org/10.1111/evo.13835.
- Patterson, Nick, Alkes L. Price, and David Reich. 2006. ‘Population Structure and Eigenanalysis’. PLOS Genetics 2 (12): e190. https://doi.org/10.1371/journal.pgen.0020190.
Who is presenting?
Nikolay
Presentation: GenomicsNewClothes.ppt
Notebook: GenomicsNewClothes.html, GenomicsNewClothes.ipynb