
trim_LG is an interactive function that helps the user filter the markers on a specified chromosome by linkage group (LG) assignment, using LOD score thresholds. It lets the user choose the thresholds interactively, visualize haplotype frequency before and after filtering, and remove outliers. It is inspired by functions in Professor Jeffrey B. Endelman's MapRtools package, available on GitHub, and provides a simple interface for trimming LGs.

Usage

trim_LG(
  chromosome,
  map,
  geno,
  pop_type = "F2",
  drop_outliers = TRUE,
  n_cores = NULL
)

Arguments

chromosome

Character. The chromosome ID to be processed.

map

A data frame containing marker map information with at least the following columns:

  • "marker": Marker names.

  • "chrom": Chromosome identifier.

  • "position": Physical position of markers.

geno

A genotype matrix where:

  • Rows represent genetic markers.

  • Columns represent individuals.

  • Values represent genotype calls. A binned genotype matrix is preferred; binning can be performed with MapRtools::LDbin().

pop_type

Character. The population type used for linkage estimation. Accepted values are "DH", "BC", "F2", "S1", "RIL.self", and "RIL.sib", following MapRtools::MLEL(). Default is "F2".

drop_outliers

Logical. If TRUE, removes markers identified as outliers based on haplotype frequency. Default is TRUE.

n_cores

Integer. The number of CPU cores to use for linkage estimation. If NULL, the function uses one fewer than the number of available cores.
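The n_cores default described above can be sketched as follows. This is a minimal illustration, not trim_LG's actual code: the helper name resolve_cores is hypothetical, and the use of parallel::detectCores() to count cores is an assumption about how "available" cores are determined.

```r
# Hypothetical helper illustrating the documented default for n_cores.
# Assumption: available cores are counted with parallel::detectCores().
resolve_cores <- function(n_cores = NULL) {
  if (is.null(n_cores)) {
    available <- parallel::detectCores()
    # detectCores() can return NA on some platforms; fall back to one core.
    if (is.na(available)) available <- 1L
    # Leave one core free, but never go below one.
    n_cores <- max(1L, available - 1L)
  }
  as.integer(n_cores)
}

resolve_cores()   # one fewer than the detected cores, never below 1
resolve_cores(2)  # an explicit value is used as given
```

Passing an explicit n_cores is usually the safer choice on shared machines, where the detected core count may exceed what a single job should use.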

Value

A list containing:

  • "trimmed_genotype": The filtered genotype matrix.

  • "final_map": The updated marker map after filtering.

  • "initial_haplo_plot": A ggplot2 object showing the initial haplotype plot.

  • "filtered_freq_plot": A ggplot2 object showing the haplotype plot after filtering.

  • "final_freq_plot": A ggplot2 object showing the haplotype frequency after removing outliers.

  • "starting_LOD": A record of the user-specified initial LOD threshold.

  • "ending_LOD": A record of the user-specified final LOD threshold.

  • "step": A record of the LOD sequence step size.

  • "selected_LOD_thresh": The final LOD threshold used for filtering markers.

  • "outliers": The names of the markers removed as outliers, as a character vector.

Details

  • Computes LOD scores for marker pairs using MapRtools::MLEL(), parallelized with multiple cores.

  • Displays an interactive "candy stripe" plot for visualizing linkage groups using MapRtools::LG().

  • Asks the user to define a LOD range (start, end, and step) and a final LOD threshold for marker filtering.

  • If drop_outliers = TRUE, removes markers identified as outliers based on haplotype frequency.
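The LOD range requested in the steps above is recorded in the returned list as "starting_LOD", "ending_LOD", and "step". A minimal sketch of how three such values define a grid of candidate thresholds, assuming an evenly spaced sequence built with seq(); the numbers are illustrative only, and in trim_LG they are entered interactively.

```r
# Illustrative values only; not defaults of trim_LG.
starting_LOD <- 2
ending_LOD   <- 10
step         <- 2

# Evenly spaced candidate thresholds from start to end (an assumption
# about how the range and step are combined).
lod_grid <- seq(from = starting_LOD, to = ending_LOD, by = step)
lod_grid
```

A smaller step gives a finer grid to compare in the plot, at the cost of more candidate thresholds to inspect.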

Note

Requires a previous installation of MapRtools.
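One way to confirm that MapRtools is present before calling trim_LG is a requireNamespace() check. This is a sketch; the variable name has_maprtools is hypothetical, and how MapRtools is installed is left to its own documentation.

```r
# Check for MapRtools without attaching it; returns TRUE or FALSE.
has_maprtools <- requireNamespace("MapRtools", quietly = TRUE)

if (!has_maprtools) {
  # Installation steps are not covered here; see the MapRtools repository.
  message("MapRtools is not installed; install it before running trim_LG().")
}
```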

Examples

if (FALSE) { # \dontrun{
# Example dataset (user should provide actual data)
map_data <- data.frame(
  marker = c("M1", "M2", "M3", "M4"),
  chrom = c("CHR1", "CHR1", "CHR1", "CHR1"),
  position = c(10, 20, 30, 40)
)
geno_matrix <- matrix(sample(0:1, 16, replace = TRUE),
                      nrow = 4, ncol = 4,
                      dimnames = list(c("M1", "M2", "M3", "M4"),
                                      c("Ind1", "Ind2", "Ind3", "Ind4")))

# Run trimming function (with user input required for LOD selection)
result <- trim_LG(chromosome = "CHR1",
                  map = map_data,
                  geno = geno_matrix,
                  pop_type = "F2",
                  drop_outliers = TRUE,
                  n_cores = 2)

# Access trimmed genotype matrix
print(result$trimmed_genotype)
} # }
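Before running the interactive session, it can help to confirm that the inputs have the shape described under Arguments. The sketch below is not part of trim_LG: it uses base R only and assumes, as in the example above, that the row names of geno are the marker names found in map$marker.

```r
# Toy inputs mirroring the example above (placeholder genotype values).
map_data <- data.frame(
  marker = c("M1", "M2", "M3", "M4"),
  chrom = rep("CHR1", 4),
  position = c(10, 20, 30, 40)
)
geno_matrix <- matrix(0L, nrow = 4, ncol = 4,
                      dimnames = list(map_data$marker,
                                      paste0("Ind", 1:4)))

# 1. The map must contain the three documented columns.
required_cols <- c("marker", "chrom", "position")
missing_cols <- setdiff(required_cols, names(map_data))

# 2. The requested chromosome must have markers in the map.
chrom_markers <- map_data$marker[map_data$chrom == "CHR1"]

# 3. Every marker on that chromosome should be a row of the genotype matrix
#    (assumption: geno row names are marker names).
unmatched <- setdiff(chrom_markers, rownames(geno_matrix))

stopifnot(length(missing_cols) == 0,
          length(chrom_markers) > 0,
          length(unmatched) == 0)
```

If any check fails, stopifnot() reports the first failing condition, which points to the column or marker set that needs fixing.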