Friday, 5 February 2021
10.00 - 11.00 h
https://stockholmuniversity.zoom.us/j/62208705509
At the next non-model organisms journal club I will give a presentation titled "Genomics New Clothes" on the methodological challenges of working with high-dimensional genomics data. In particular, I will concentrate on the Curse of Dimensionality, which severely constrains any statistical analysis of genetic variation data and contributes to the reproducibility crisis in population genetics, ancient DNA, precision medicine and other genomics-dominated research areas. As manifestations of the problem, we will discuss some misleading interpretations of genetic distances on a PCA plot, as well as the lack of correlation structure and the ultra-low variance (comparable to the noise level) explained by the leading principal components computed on genetic variation data. To balance my generally concerned and critical view of the prospects of genomics research, I will suggest possible ways to combat the challenges of high-dimensional data analysis.
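For a feel of the problem before the talk, here is a minimal sketch of one well-known symptom of the Curse of Dimensionality: pairwise distances between random points become nearly indistinguishable as the number of dimensions grows. It is not taken from the presentation or the notebook; the function name `distance_contrast`, the uniform random data and the chosen sample sizes are assumptions made purely for illustration, and real genotype data will behave differently in detail.

```python
import math
import random


def distance_contrast(n_points, n_dims, seed=0):
    """Relative spread (max - min) / min of all pairwise Euclidean
    distances between points drawn uniformly from the unit hypercube.

    Illustrative helper only (hypothetical name, synthetic data).
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    dists = [
        math.dist(points[i], points[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


# The contrast shrinks as the dimension grows: "near" and "far"
# neighbours end up at almost the same distance.
for n_dims in (2, 20, 200, 2000):
    print(n_dims, round(distance_contrast(50, n_dims), 3))
```

With many more variables than samples, which is the usual situation for genetic variation data, this loss of contrast is one reason to be cautious when reading distances between samples as meaningful.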
Here is some suggested reading in preparation for the discussion, which is meant as a follow-up to the one at the Advisory program meeting.
Björklund, Mats. 2019. ‘Be Careful with Your Principal Components’. Evolution 73 (10): 2151–58. https://doi.org/10.1111/evo.13835.
Nikolay
Presentation: GenomicsNewClothes.ppt
Notebook: GenomicsNewClothes.html, GenomicsNewClothes.ipynb