GenWin

GenWin is an R package that defines window or bin boundaries for the analysis of genomic data. Boundaries are placed at the inflection points of a cubic smoothing spline fitted to the raw data. Along with defining boundaries, the package provides a technique for evaluating results obtained from unequally sized windows. Applications are particularly pertinent for, though not limited to, genome scans for selection based on variability between populations (e.g. using Wright’s fixation index, Fst, which measures variability in subpopulations relative to the total population).

GenWin is available on CRAN, the Comprehensive R Archive Network.
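The idea of placing boundaries at spline inflection points can be sketched in base R. This is only an illustration of the principle, not GenWin's own code or interface: the positions and statistic are simulated, and all variable names are made up for the example.

```r
# Sketch only: base-R illustration of spline inflection points, not GenWin's code.
set.seed(1)
pos  <- seq(1, 1000, by = 5)                           # hypothetical marker positions
stat <- sin(pos / 80) + rnorm(length(pos), sd = 0.2)   # hypothetical per-marker statistic

fit  <- smooth.spline(pos, stat)                       # cubic smoothing spline
grid <- seq(min(pos), max(pos), length.out = 2000)
d2   <- predict(fit, grid, deriv = 2)$y                # second derivative on a fine grid

# An inflection point lies wherever the second derivative changes sign.
flip <- which(diff(sign(d2)) != 0)
boundaries <- (grid[flip] + grid[flip + 1]) / 2

# Assign each marker to the window between consecutive boundaries,
# then summarize the statistic within each (unequally sized) window.
window <- findInterval(pos, boundaries) + 1
window_means <- tapply(stat, window, mean)
```

The package itself can be installed from CRAN in the usual way with `install.packages("GenWin")`; see its documentation for the actual functions and arguments.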



Ghat

G-hat to identify selected complex traits

The G-hat R function can be used to identify complex traits that have been subjected to selection. It does this by relating allele frequency change to SNP effect estimates for every genotyped SNP. See our paper for details.
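As a rough illustration of what "relating allele frequency change to SNP effect estimates" can look like, the sketch below sums the product of the two across SNPs and compares it to a naive permutation null. The sum-of-products form, the simulated data, and the permutation scheme are all assumptions made for this example; the actual statistic and its significance test are described in the paper and implemented in the G-hat function.

```r
# Sketch only: a sum-of-products summary, assumed here for illustration.
set.seed(42)
n_snp       <- 500
effects     <- rnorm(n_snp, sd = 0.1)        # hypothetical SNP effect estimates
freq_before <- runif(n_snp, 0.1, 0.9)        # hypothetical starting allele frequencies
freq_after  <- pmin(pmax(freq_before + rnorm(n_snp, sd = 0.02), 0), 1)
delta       <- freq_after - freq_before      # allele frequency change per SNP

g_stat <- sum(delta * effects)

# Naive null: shuffle effects across SNPs. This ignores linkage disequilibrium,
# so it is not a substitute for the test described in the paper.
null    <- replicate(1000, sum(delta * sample(effects)))
p_value <- mean(abs(null) >= abs(g_stat))
```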








D'2_IS

Ohta.D.Stats

The ohtadstats R package, a work of former lab member Paul Petrowski, calculates Tomoko Ohta’s partitioning of linkage disequilibrium, termed D-statistics, for pairs of loci. The package is written so that it can be scaled up to a genome-wide test by applying the function repeatedly across pairs of loci in a genotype table. See our Heredity paper for an example of this package in action.




Useful Scripts

DriftSimulator.R is an R function for conducting simulations of genetic drift at a single locus. Initial frequency, number of generations, and population demographics can all be manipulated, and plotting is simple. Documentation is in the header of the file. Load into R with “source()”, or by copy-pasting the text of the script.

DriftSimulatorWithBottlenecks.R works like DriftSimulator.R above, simulating genetic drift at a single locus, but also lets the user specify a bottleneck event. Documentation is in the header of the file. Load into R with “source()”, or by copy-pasting the text of the script.
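For readers unfamiliar with drift simulations, the core of a single-locus simulation is repeated binomial sampling of gene copies. The sketch below assumes a diploid Wright-Fisher model; `simulate_drift` and its arguments are hypothetical names for this example and are not the interface of either script above.

```r
# Hypothetical helper, not the interface of DriftSimulator.R.
simulate_drift <- function(p0, n_gen, pop_size) {
  pop_size <- rep_len(pop_size, n_gen)   # accept a constant or a per-generation size
  freq <- numeric(n_gen + 1)
  freq[1] <- p0
  for (g in seq_len(n_gen)) {
    # Draw 2N gene copies from the previous generation's allele frequency.
    copies <- rbinom(1, size = 2 * pop_size[g], prob = freq[g])
    freq[g + 1] <- copies / (2 * pop_size[g])
  }
  freq
}

set.seed(7)
traj <- simulate_drift(p0 = 0.5, n_gen = 100, pop_size = 50)
plot(0:100, traj, type = "l", ylim = c(0, 1),
     xlab = "Generation", ylab = "Allele frequency")
```

Because the population size can vary by generation in this sketch, a bottleneck can be imitated with something like `pop_size = c(rep(500, 40), rep(10, 5), rep(500, 55))`.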

VectorFst.R is a simple R function that can be used to calculate locus-by-locus \(F_{ST}\) values from allele frequency data. Basic documentation is included in the header of the file. Load into R with “source()”, or by copy-pasting the text of the script.
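A locus-by-locus calculation of this kind can be sketched as follows, assuming biallelic loci, equally weighted populations, and the textbook definition \(F_{ST} = (H_T - H_S)/H_T\). The function name and input layout are hypothetical, and VectorFst.R may use a different estimator or interface.

```r
# Hypothetical helper; VectorFst.R may use a different estimator or interface.
fst_per_locus <- function(freqs) {
  # freqs: matrix of allele frequencies, loci in rows and populations in columns
  p_bar    <- rowMeans(freqs)
  h_total  <- 2 * p_bar * (1 - p_bar)               # expected heterozygosity, pooled
  h_within <- rowMeans(2 * freqs * (1 - freqs))     # mean within-population heterozygosity
  fst <- (h_total - h_within) / h_total
  fst[h_total == 0] <- NA                           # undefined at monomorphic loci
  fst
}

freqs <- rbind(c(0.5, 0.5), c(0.2, 0.8), c(0, 1))
fst_per_locus(freqs)
# approximately 0.00 0.36 1.00
```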

ModifiedRogersDistanceFunction.R is a basic function for calculating the modified Rogers’ genetic distance between individuals. The calculation is simple, but I’m not aware of other implementations in R. Apply it to a data frame with individuals in rows and markers in columns. There should be two columns per marker (one column for each allele), coded as 0, 1, or 2.
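To show the calculation, here is a minimal sketch that assumes the input layout described above and the usual definition of the distance: the Euclidean distance between allele frequency vectors, divided by the square root of twice the number of markers. The function name and the toy data are hypothetical, and the script's own interface may differ.

```r
# Hypothetical helper; the lab script's interface may differ.
modified_rogers <- function(geno) {
  freq <- as.matrix(geno) / 2            # allele counts (0, 1, 2) to within-individual frequencies
  n_markers <- ncol(freq) / 2            # two allele columns per marker
  # Euclidean distance between individuals, scaled to lie between 0 and 1.
  as.matrix(dist(freq)) / sqrt(2 * n_markers)
}

geno <- data.frame(
  m1_a = c(2, 0, 1), m1_b = c(0, 2, 1),
  m2_a = c(2, 0, 2), m2_b = c(0, 2, 0),
  row.names = c("ind1", "ind2", "ind3")
)
d <- modified_rogers(geno)
d
```

In this toy data set, `ind1` and `ind2` are opposite homozygotes at both markers, so their distance is 1, the maximum.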