recode is the heart of the package. This function phases genotype marker data against two parental references (parent1 and parent2), assigning each marker call according to parental allele inheritance.
Optionally, it can code phased markers into numeric (0, 1, 2) or character ("A", "B", "H") formats.
Numeric coding is recommended for downstream analyses.
Usage
recode(
  geno,
  parent1,
  parent2,
  numeric_output = TRUE,
  handle_het_markers = FALSE,
  het_marker_types = NULL
)
Arguments
- geno
A genotype matrix or data frame where markers are rows and individuals are columns.
- parent1
Character. The name of the column representing the first parent.
- parent2
Character. The name of the column representing the second parent.
- numeric_output
Logical. If TRUE, converts phased markers to numeric dosage values (A = 0, H = 1, B = 2). Default is TRUE.
- handle_het_markers
Logical. If TRUE, allows heterozygous parent markers to be included. Default is FALSE.
- het_marker_types
Character vector. Specifies which heterozygous markers to keep when handle_het_markers = TRUE. Options include "AxH", "HxB", "HxA", "BxH". Default is NULL, meaning all homozygous markers are kept. See details.
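To make the het_marker_types labels concrete, here is a minimal sketch of how a marker could be labelled by its parental cross type. It assumes that each label reads parent1 x parent2 and that raw dosages 0, 1, 2 correspond to A, H, B; the helper label_cross_type is hypothetical and is not part of the package.

```r
# Hypothetical helper (not a package function): label each marker by its
# parental cross type. Assumption: dosage 0/1/2 maps to A/H/B and the label
# is written as parent1 x parent2.
label_cross_type <- function(p1, p2) {
  codes <- c("A", "H", "B")
  paste0(codes[p1 + 1], "x", codes[p2 + 1])
}

p1 <- c(0, 1, 1, 2, 0)  # parent1 dosages for five markers
p2 <- c(1, 2, 0, 1, 2)  # parent2 dosages for the same markers

cross <- label_cross_type(p1, p2)
cross
# "AxH" "HxB" "HxA" "BxH" "AxB"

# Markers that a selection such as c("AxH", "HxB") would single out
keep <- cross %in% c("AxH", "HxB")
```

Under these assumptions, only the first two markers match the requested types; the last marker ("AxB") is an ordinary homozygous polymorphic marker.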
Value
A data frame containing phased genotype markers where:
- "0" represents alleles inherited from parent1.
- "2" represents alleles inherited from parent2.
- "1" represents heterozygous alleles.
If numeric_output = FALSE, 0, 2, 1 are replaced by "A", "B", "H" respectively. numeric_output = TRUE is recommended.
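The correspondence between the numeric and character codings can be illustrated with a short base R sketch. The data frame phased and its column names are made up for illustration, and the lookup below only mirrors the documented mapping (0 = "A", 1 = "H", 2 = "B"); it is not the package's internal code.

```r
# Illustrative phased dosages (hypothetical data), including a missing call
phased <- data.frame(Ind1 = c(0, 1, 2), Ind2 = c(2, 2, NA))

# Lookup vector: position 1 = dosage 0, position 2 = dosage 1, position 3 = dosage 2
codes <- c("A", "H", "B")

# Convert every column; NA dosages stay NA
as_char <- as.data.frame(
  lapply(phased, function(x) codes[x + 1]),
  stringsAsFactors = FALSE
)
as_char
```

Keeping the numeric form and converting only for display is one way to follow the recommendation above, since dosage values can be used directly in arithmetic.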
Details
- Drops markers where either parent has NA.
- Removes non-polymorphic markers (markers where both parents have the same genotype).
- If handle_het_markers = FALSE, retains only markers where the parents are homozygous for opposite alleles (P1 = 0 & P2 = 2, or P1 = 2 & P2 = 0).
- If handle_het_markers = TRUE, allows heterozygous markers to be kept, optionally restricted to the types listed in het_marker_types.
- Ensures that parent1 is always 0 and parent2 is always 2 for standardization.
- Returns a numeric matrix if numeric_output = TRUE, otherwise returns phased "A", "B", "H" values.
More details on the heterozygous F2 marker types ("AxH", "HxB", "HxA", "BxH") are in Braun et al. (2017).
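The filtering and standardization steps listed above can be sketched in base R for the default case (handle_het_markers = FALSE). This is only an illustration under stated assumptions: the input is a data frame of 0/1/2 dosages, and the function name phase_sketch is hypothetical. It is not the package's source code.

```r
# Hypothetical sketch of the default path; not the package implementation.
phase_sketch <- function(geno, parent1, parent2) {
  p1 <- geno[[parent1]]
  p2 <- geno[[parent2]]

  # Keep markers where both parents are known and homozygous for opposite
  # alleles. This also drops non-polymorphic markers and NA parents.
  keep <- !is.na(p1) & !is.na(p2) &
    ((p1 == 0 & p2 == 2) | (p1 == 2 & p2 == 0))
  out <- geno[keep, , drop = FALSE]

  # On rows where parent1 carries 2, swap dosages (0 <-> 2, 1 stays 1)
  # so that parent1 is always 0 and parent2 is always 2.
  flip <- out[[parent1]] == 2
  out[] <- lapply(out, function(x) ifelse(flip, 2 - x, x))
  out
}

geno <- data.frame(
  Ind1    = c(0, 1, 2, 0),
  Parent1 = c(0, 2, 2, NA),  # row 3 is non-polymorphic, row 4 has a missing parent
  Parent2 = c(2, 0, 2, 2)
)

phased <- phase_sketch(geno, "Parent1", "Parent2")
phased
```

Only the first two markers survive the filter, and the second one is flipped so that Parent1 reads 0 and Parent2 reads 2.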
Note
This function was refined with assistance from ChatGPT to improve clarity, efficiency, and visualization formatting. Extensive testing was performed by the author to verify outputs.
Examples
# Example genotype data: rows are markers, columns are individuals.
# Despite their names, Marker1 and Marker2 are the non-parent individuals;
# Parent1 and Parent2 hold the parental dosages.
geno_data <- data.frame(
  Marker1 = c(0, 1, 2, 0, 2),
  Marker2 = c(2, 0, 2, 1, 0),
  Parent1 = c(0, 2, 2, 0, 2),
  Parent2 = c(2, 0, 0, 2, 0)
)
# Recode genotype markers (default: numeric output)
phased_geno <- recode(geno_data, "Parent1", "Parent2")
print(phased_geno)
#>   Marker1 Marker2 Parent1 Parent2
#> 1       0       2       0       2
#> 2       1       2       0       2
#> 3       0       0       0       2
#> 4       0       1       0       2
#> 5       0       2       0       2
# Recode genotype markers with heterozygous marker handling
# (geno_data has no heterozygous parent calls, so the result matches the default run)
phased_geno_het <- recode(geno_data, "Parent1", "Parent2",
                          handle_het_markers = TRUE,
                          het_marker_types = c("AxH", "HxB"))
print(phased_geno_het)
#>   Marker1 Marker2 Parent1 Parent2
#> 1       0       2       0       2
#> 2       1       2       0       2
#> 3       0       0       0       2
#> 4       0       1       0       2
#> 5       0       2       0       2