freq calculates the relative frequency of genotype classes (usually "A", "H", "B" or "0", "1", "2") for each individual or marker
in a genotype matrix. The function transposes the matrix to process individuals
or markers as needed, computes the relative frequency of each genotype, and
sets the frequency of any genotype category that does not occur to zero.
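The steps described above can be sketched in base R. This is only an illustration, not the package's implementation: rel_freq_sketch is a hypothetical helper, and treating NA as the missing-value code is an assumption.

```r
# Hypothetical helper for illustration; not the code used by freq().
rel_freq_sketch <- function(x, levels = c(0, 1, 2),
                            by = c("markers", "individuals")) {
  by <- match.arg(by)
  # Transpose so that the units being summarised are always in rows
  if (by == "individuals") x <- t(x)
  out <- t(apply(x, 1, function(g) {
    g <- g[!is.na(g)]                            # keep non-missing calls only
    counts <- table(factor(g, levels = levels))  # absent classes count as 0
    as.numeric(counts) / length(g)               # proportion of non-missing calls
  }))
  colnames(out) <- as.character(levels)
  as.data.frame(out)
}

# Small matrix with missing calls (assumed to be coded as NA)
geno <- matrix(c(0, 1, NA, 1,
                 2, 2, 2, NA),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("M1", "M2"), paste0("Ind", 1:4)))

res <- rel_freq_sketch(geno)                          # one row per marker
res_ind <- rel_freq_sketch(geno, by = "individuals")  # one row per individual
```

In this sketch M1 has three non-missing calls, so its proportions are 1/3, 2/3, and 0. For "A"/"H"/"B" data the same helper could be called with levels = c("A", "H", "B").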
Proportions are calculated based on non-missing values, so results may be biased when the proportion of missing data is high. If your data has a high proportion of missing values, run filter_missing_geno first.
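Before computing frequencies, it can help to check how much data is missing. The following base R sketch assumes missing calls are coded as NA; the 0.5 cutoff is an arbitrary value chosen for illustration, not a package default.

```r
# Toy genotype matrix: markers in rows, individuals in columns
geno <- matrix(c(0, NA, 2, 1,
                 NA, NA, 1, NA,
                 2, 2, 0, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("M1", "M2", "M3"), paste0("Ind", 1:4)))

# Proportion of missing calls per marker (rows) and per individual (columns)
miss_by_marker <- rowMeans(is.na(geno))
miss_by_ind <- colMeans(is.na(geno))

# Flag markers above an illustrative cutoff (assumption, not a package default)
threshold <- 0.5
flagged <- names(miss_by_marker)[miss_by_marker > threshold]
```

Here only M2 is flagged, because three of its four calls are missing. Markers or individuals flagged this way are candidates for filtering before frequencies are computed.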
Arguments
- x
A genotype matrix where markers are rows and individuals are columns.
- input_format
Character. Specifies the genotype format. Options:
"numeric" (default): Uses 0, 1, and 2 as genotype categories.
"genotype": Uses "A", "H", and "B" as genotype categories.
- by
Character. Specifies whether to calculate genotype frequencies by "markers" (rows) or "individuals" (columns). Default is "markers".
Value
A data frame where rows correspond to markers or individuals, and columns correspond to the genotype categories. Values represent relative genotype frequencies, calculated based on non-missing values.
Examples
# Example genotype matrix (numeric format)
geno_matrix <- matrix(c(0,1,2,1,0,2,1,2,0,1,1,1,2,0,2),
                      nrow = 5, ncol = 3,
                      dimnames = list(c("M1", "M2", "M3", "M4", "M5"),
                                      c("Ind1", "Ind2", "Ind3")))
# Compute genotype frequency by markers
freq(geno_matrix, input_format = "numeric", by = "markers")
#>            0         1         2
#> M1 0.3333333 0.3333333 0.3333333
#> M2 0.0000000 1.0000000 0.0000000
#> M3 0.0000000 0.0000000 1.0000000
#> M4 0.6666667 0.3333333 0.0000000
#> M5 0.3333333 0.3333333 0.3333333
# Compute genotype frequency by individuals
freq(geno_matrix, input_format = "numeric", by = "individuals")
#>        0   1   2
#> Ind1 0.4 0.4 0.2
#> Ind2 0.2 0.4 0.4
#> Ind3 0.2 0.4 0.4