freq calculates the relative frequency of each genotype class (usually "A", "H", "B" or "0", "1", "2") for every marker or individual in a genotype matrix. The function transposes the matrix when needed so that either markers or individuals are processed, computes the relative frequency of each genotype, and assigns a frequency of zero to genotype categories that do not occur. Proportions are calculated from non-missing values only, so results may be biased when the proportion of missing data is high. If your data has a high proportion of missingness, run filter_missing_geno first.
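The computation described above can be approximated in base R. The following is a minimal sketch, not the package's actual implementation: the helper name freq_sketch and its levels argument are hypothetical, and it assumes missing calls are coded as NA.

```r
# Hypothetical helper illustrating the idea; NOT the package's own code.
# Assumes a numeric genotype matrix (markers in rows, individuals in
# columns) with missing calls coded as NA.
freq_sketch <- function(x, levels = c(0, 1, 2), by = "markers") {
  # Transpose so the units being summarised are always in rows
  if (by == "individuals") x <- t(x)

  out <- t(apply(x, 1, function(g) {
    g <- g[!is.na(g)]  # proportions use non-missing values only
    # factor() with fixed levels keeps absent categories as zero counts
    as.numeric(table(factor(g, levels = levels))) / length(g)
  }))

  colnames(out) <- levels
  as.data.frame(out)
}

# Small matrix with one missing call in marker M1
g <- matrix(c(0, 1, NA,
              1, 1, 2),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("M1", "M2"), c("Ind1", "Ind2", "Ind3")))

res <- freq_sketch(g)
# M1 has two non-missing calls (0 and 1), so its row is 0.5, 0.5, 0
res
```

Because the denominator is the number of non-missing calls, a marker with many missing values is summarised from very few observations, which is why filtering on missingness beforehand is recommended. In this sketch a row with no observed calls yields NaN.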

Usage

freq(x, input_format = "numeric", by = "markers")

Arguments

x

A genotype matrix where markers are rows and individuals are columns.

input_format

Character. Specifies the genotype format. Options:

  • "numeric" (default): Uses 0, 1, and 2 as genotype categories.

  • "genotype": Uses "A", "H", and "B" as genotype categories.

by

Character. Specifies whether to calculate genotype frequencies by "markers" (rows) or "individuals" (columns). Default is "markers".

Value

A data frame where rows correspond to markers or individuals, and columns correspond to the genotype categories. Values represent relative genotype frequencies, calculated based on non-missing values.

Examples

# Example genotype matrix (numeric format)
geno_matrix <- matrix(c(0,1,2,1,0,2,1,2,0,1,1,1,2,0,2),
                      nrow = 5, ncol = 3,
                      dimnames = list(c("M1", "M2", "M3", "M4", "M5"),
                                      c("Ind1", "Ind2", "Ind3")))

# Compute genotype frequency by markers
freq(geno_matrix, input_format = "numeric", by = "markers")
#>            0         1         2
#> M1 0.3333333 0.3333333 0.3333333
#> M2 0.0000000 1.0000000 0.0000000
#> M3 0.0000000 0.0000000 1.0000000
#> M4 0.6666667 0.3333333 0.0000000
#> M5 0.3333333 0.3333333 0.3333333

# Compute genotype frequency by individuals
freq(geno_matrix, input_format = "numeric", by = "individuals")
#>        0   1   2
#> Ind1 0.4 0.4 0.2
#> Ind2 0.2 0.4 0.4
#> Ind3 0.2 0.4 0.4