
This function performs haplotype reconstruction using a Hidden Markov Model (HMM). It applies the Viterbi algorithm to infer the most likely sequence of true genotypic states, accounting for genotyping errors and missing data.

Usage

haplotype_reconstruction(geno_matrix, error_rate = 0.05, r = 0.01)

Arguments

geno_matrix

A numeric genotype matrix where:

  • Rows represent genetic markers.

  • Columns represent progeny (individuals).

  • Values are expected to be 0, 1, 2, or NA for missing data.
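The expected input layout can be checked before calling the function. The helper below is a minimal sketch in base R; check_geno_matrix is a hypothetical name used only for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): verifies that the input
# is a matrix whose entries are 0, 1, 2, or NA.
check_geno_matrix <- function(geno) {
  if (!is.matrix(geno)) {
    stop("geno must be a matrix (markers in rows, individuals in columns)")
  }
  # Flag any non-missing entry that is not one of the allowed dosages
  bad <- !is.na(geno) & !(geno %in% c(0, 1, 2))
  if (any(bad)) {
    stop(sum(bad), " value(s) are not 0, 1, 2, or NA")
  }
  invisible(TRUE)
}

# Usage: passes silently for a valid matrix
geno_ok <- matrix(c(0, 1, NA, 2, 0, 1), nrow = 3, ncol = 2)
check_geno_matrix(geno_ok)
```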

error_rate

Numeric. The assumed genotyping error rate. Default is 0.05 (5% error rate).

r

Numeric. The recombination rate used to build the transition probability matrix (T.mat). Default is 0.01. To estimate r from the data, use the MLEL function from MapRtools with the adjacent argument set to TRUE:

  result <- MLEL(geno = geno_matrix, pop.type = "f2", LOD = FALSE, adjacent = TRUE)

In this case, geno_matrix should contain markers from a single chromosome only. r can then be estimated as mean(result$value, na.rm = TRUE).

Value

A genotype matrix with inferred haplotypes based on HMM correction.

Details

  • Estimates the missing data rate in the input genotype matrix.

  • Initializes an HMM using:

    • Three hidden states ("0", "1", "2").

    • Four observable symbols ("0", "1", "2", "NA" for missing data).

    • Transition probabilities generated via T.mat(r = r), using the user-defined recombination rate.

    • Emission probabilities computed using E.mat(error = error_rate, missing = missing).
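The steps above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: f2_transition, f2_emission, viterbi_path, and reconstruct_sketch are hypothetical names, the matrices use a common F2 parameterisation that may differ from what T.mat and E.mat actually return, and the starting probabilities (1/4, 1/2, 1/4) are assumed from the expected F2 segregation ratio.

```r
# Assumed F2 transition matrix between adjacent markers (rows sum to 1).
# Hypothetical stand-in for T.mat(); the real helper may differ.
f2_transition <- function(r) {
  matrix(c((1 - r)^2,   2 * r * (1 - r), r^2,
           r * (1 - r), (1 - r)^2 + r^2, r * (1 - r),
           r^2,         2 * r * (1 - r), (1 - r)^2),
         nrow = 3, byrow = TRUE,
         dimnames = list(c("0", "1", "2"), c("0", "1", "2")))
}

# Assumed emission matrix: the correct call has probability 1 - error,
# each wrong call error / 2, all scaled by (1 - missing); "NA" gets missing.
# Hypothetical stand-in for E.mat().
f2_emission <- function(error, missing) {
  E <- matrix(error / 2, nrow = 3, ncol = 3)
  diag(E) <- 1 - error
  E <- cbind(E * (1 - missing), missing)
  dimnames(E) <- list(c("0", "1", "2"), c("0", "1", "2", "NA"))
  E
}

# Log-space Viterbi decoding for one individual (one column of the matrix).
viterbi_path <- function(obs, Tm, Em, start = c(0.25, 0.5, 0.25)) {
  sym <- ifelse(is.na(obs), "NA", as.character(obs))
  n <- length(sym)
  K <- nrow(Tm)
  logT <- log(Tm)
  score <- matrix(-Inf, K, n)  # best log-probability ending in each state
  back <- matrix(0L, K, n)     # back-pointers to the best previous state
  score[, 1] <- log(start) + log(Em[, sym[1]])
  for (t in seq_len(n)[-1]) {
    for (k in seq_len(K)) {
      cand <- score[, t - 1] + logT[, k]
      back[k, t] <- which.max(cand)
      score[k, t] <- max(cand) + log(Em[k, sym[t]])
    }
  }
  # Trace back from the best final state
  path <- integer(n)
  path[n] <- which.max(score[, n])
  for (t in rev(seq_len(n - 1))) path[t] <- back[path[t + 1], t + 1]
  path - 1L  # state index 1..3 -> dosage 0..2
}

reconstruct_sketch <- function(geno, error_rate = 0.05, r = 0.01) {
  missing <- mean(is.na(geno))  # missing data rate estimated from the input
  Tm <- f2_transition(r)
  Em <- f2_emission(error_rate, missing)
  out <- apply(geno, 2, viterbi_path, Tm = Tm, Em = Em)
  matrix(out, nrow = nrow(geno), dimnames = dimnames(geno))
}

# Usage: one isolated discordant call and one missing call
toy <- matrix(c(0, 0, 0, 0, 2, 0, 0, 0, 0, 0,
                2, 2, 2, NA, 2, 2, 2, 2, 2, 2), ncol = 2)
reconstruct_sketch(toy)
```

In this toy input, the isolated 2 in the first column is surrounded by 0 calls, so under these assumed matrices the most likely path keeps state 0 throughout; the NA in the second column is filled from its neighbours in the same way.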

Note

Works only for F2 populations derived from experimental crosses. Based on the notes for the 615 Genetic Mapping class taught by Prof. Jeffrey Endelman, Spring 2021.

Examples

if (FALSE) { # \dontrun{
# Example genotype matrix with missing values
geno_data <- matrix(c(0, 1, NA, 2, 0, 1, NA, 2, NA, NA, 0, 1),
                    nrow = 4, ncol = 3,
                    dimnames = list(c("Marker1", "Marker2", "Marker3", "Marker4"),
                                    c("Ind1", "Ind2", "Ind3")))

# Perform haplotype reconstruction with default parameters
reconstructed_geno <- haplotype_reconstruction(geno_data)
print(reconstructed_geno)

# Perform haplotype reconstruction with a modified recombination rate
reconstructed_geno_custom_r <- haplotype_reconstruction(geno_data, error_rate = 0.05, r = 0.02)
print(reconstructed_geno_custom_r)
} # }