Sequator Download Repack Site
If you work with next-generation sequencing (NGS) data, particularly RNA-seq, you know the nightmare of batch effects. You run your experiment, get your counts, but when you cluster the samples, they separate by date of extraction or sequencing run rather than by treatment group.
Enter the tool whose name is often misspelled as "Sequator" in searches. This powerful tool, specifically its Surrogate Variable Analysis component (the sva package for R), helps you estimate and correct for hidden batch effects when you don’t know what the confounding variables are.
Step 1: Install and Load sva
The sva package is distributed through Bioconductor, so it is installed with BiocManager.

    # Install BiocManager (if you don't have it)
    if (!require("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install("sva")

    # Load the library
    library(sva)

Step 2: Prepare the Data and Models
sva needs a normalized expression matrix plus two design matrices: a full model containing the variable you care about, and a null model without it.

    # Assuming 'counts' is your expression matrix
    # Assuming 'coldata' has columns: sample, condition, batch_known
    library(edgeR)
    lcpm <- cpm(counts, log=TRUE)

    # Model for your biological question
    mod <- model.matrix(~ condition, data=coldata)

    # Null model
    mod0 <- model.matrix(~ 1, data=coldata)

Step 3: Run the Estimation
Now you run the core function to estimate the number of hidden batch effects.

    # Estimate number of surrogate variables (SVs)
    n.sv <- num.sv(lcpm, mod, method="leek")
    print(paste("Estimated surrogate variables:", n.sv))

    # Estimate the surrogate variables themselves
    svobj <- sva(lcpm, mod, mod0, n.sv=n.sv)
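Once the surrogate variables are estimated, a common next step is to add them as extra covariates to the design matrix so that downstream models adjust for them. The sketch below uses base R only and is an illustration rather than part of the original walkthrough. The small `coldata` table and the random `sv` matrix are hypothetical stand-ins so the block runs on its own; in a real analysis you would use your own `coldata` and take `sv` from `svobj$sv`.

```r
# Hypothetical stand-ins so this sketch runs without sva or real data.
# In practice: 'coldata' is your sample table and 'sv' is svobj$sv.
coldata <- data.frame(
  sample    = paste0("s", 1:6),
  condition = factor(rep(c("control", "treated"), each = 3))
)

# Same full model as in the walkthrough: intercept + condition
mod <- model.matrix(~ condition, data = coldata)

set.seed(1)
sv <- matrix(rnorm(6 * 2), nrow = 6, ncol = 2)  # placeholder for svobj$sv

# Give the surrogate variable columns readable names
colnames(sv) <- paste0("SV", seq_len(ncol(sv)))

# Append the surrogate variables to the design matrix
mod_sv <- cbind(mod, sv)

print(colnames(mod_sv))
```

The augmented matrix `mod_sv` can then be supplied as the design to a linear-model tool (for example, as the `design` argument of `limma::lmFit`), so the condition effect is estimated while the surrogate variables are adjusted for.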