estimateJunctionSeqSizeFactors {JunctionSeq}    R Documentation

Estimate Size Factors

Description

Estimate size factors, which are scaling factors used as "offsets" by the statistical model to make the different samples comparable. This is necessary because the different samples may have been sequenced to slightly different depths. Additionally, the presence of differentially expressed genes may make the apparent sequencing depth of many other genes differ between samples.

This function uses the "geometric" size factor normalization method, which is identical to the one used by DESeq, DESeq2, DEXSeq, and the default method used by CuffDiff.
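As a rough illustration of the "geometric" method, the sketch below computes median-of-ratios size factors in base R on a made-up count matrix. The function name geometricSizeFactors and the toy data are assumptions for illustration only; this is not the JunctionSeq source, and details such as which features enter the calculation may differ.

```r
# Toy count matrix: rows are features, columns are samples (made-up numbers).
counts <- matrix(c( 10,  20,  40,
                   100, 200, 400,
                     5,  10,  20,
                     0,   3,   7),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("f", 1:4), c("s1", "s2", "s3")))

# Hypothetical helper: median-of-ratios ("geometric") size factors.
geometricSizeFactors <- function(counts) {
  # Log of each feature's geometric mean across samples.
  # A feature with any zero count gets -Inf and is excluded below.
  logGeoMeans <- rowMeans(log(counts))
  usable <- is.finite(logGeoMeans)
  # Per sample: median ratio of its counts to the feature geometric means.
  apply(counts, 2, function(cnts) {
    exp(median((log(cnts) - logGeoMeans)[usable]))
  })
}

sf <- geometricSizeFactors(counts)

# Dividing each column by its size factor puts the samples on a common scale.
normalized <- sweep(counts, 2, sf, "/")
```

In this toy data the first three features differ between samples only by depth (1 : 2 : 4), so after normalization their counts are equal across samples.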

This function is called internally by the runJunctionSeqAnalyses function, and thus for most purposes users should not need to call this function directly. It may be useful to advanced users performing non-standard analyses.

Usage

  estimateJunctionSeqSizeFactors(jscs, 
          method.sizeFactors = c("byGenes","byCountbins"), 
          replicateDEXSeqBehavior.useRawBaseMean = FALSE, 
          calcAltSF = TRUE, 
          verbose = FALSE);
  
  writeSizeFactors(jscs, file);

Arguments

jscs

A JunctionSeqCountSet. Usually initially created by readJunctionSeqCounts. For writeSizeFactors, size factors must already be set, usually by calling estimateJunctionSeqSizeFactors.

method.sizeFactors

Character string. Determines the method used to calculate normalization size factors. By default ("byGenes") JunctionSeq uses gene-level expression. As an alternative ("byCountbins"), feature-level counts can be used, as they are in DEXSeq. In practice the difference is almost always negligible.

This option can be used to apply alternative methodologies or implementations. It is intended for advanced users who have strong opinions about the underlying statistical methodologies.
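The toy sketch below contrasts size factors computed from gene-level counts with size factors computed from feature-level counts. Everything in it is an assumption for illustration: the helper medianRatioSF, the made-up counts, and in particular the choice to form gene-level totals by summing features, which is not necessarily how JunctionSeq obtains its gene-level counts.

```r
# Hypothetical helper: median-of-ratios size factors for a count matrix.
medianRatioSF <- function(m) {
  logGeo <- rowMeans(log(m))
  ok <- is.finite(logGeo)
  apply(m, 2, function(x) exp(median((log(x) - logGeo)[ok])))
}

# Made-up feature-level counts (exons/junctions) for two genes, two samples.
featureCounts <- matrix(c( 8,  16,
                          12,  24,
                          30,  60,
                          20,  40,
                          50, 100),
                        ncol = 2, byrow = TRUE,
                        dimnames = list(c("g1:E1", "g1:J1", "g2:E1", "g2:E2", "g2:J1"),
                                        c("s1", "s2")))
gene <- c("g1", "g1", "g2", "g2", "g2")

# Assumed aggregation for this sketch: sum the features of each gene.
geneCounts <- rowsum(featureCounts, group = gene)

sfByFeature <- medianRatioSF(featureCounts)  # feature-level, in the spirit of "byCountbins"
sfByGene    <- medianRatioSF(geneCounts)     # gene-level, in the spirit of "byGenes"
```

Here sample s2 is exactly twice as deep as s1 for every feature, so both approaches agree by construction. With real data the two sets of size factors would usually differ slightly.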

replicateDEXSeqBehavior.useRawBaseMean

USED ONLY FOR INTERNAL TESTING! NOT INTENDED FOR ACTUAL USE!

This variable activates an alternative mode in which a (very minor) bug in DEXSeq v1.14.0 and earlier is replicated. If TRUE, the baseMean and baseVar variables will be computed using raw counts rather than normalized counts. This is used for internal tests in which DEXSeq functionality is replicated precisely and the results are compared against equivalent DEXSeq results. Without this option the results would differ slightly (generally by less than one hundredth of a percent).

USED ONLY FOR INTERNAL TESTING! NOT INTENDED FOR ACTUAL USE!
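To make the raw-versus-normalized distinction concrete, the base R sketch below computes a per-feature mean and variance both ways on made-up counts. The variable names and numbers are assumptions for illustration and do not reproduce the JunctionSeq or DEXSeq internals; the size factors are deliberately extreme so that the gap is visible.

```r
# Made-up counts: rows are features, columns are samples.
counts <- matrix(c(10,  40,
                   30, 120),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("f1", "f2"), c("s1", "s2")))

# Assumed, already-estimated size factors for the two samples.
sizeFactors <- c(s1 = 0.5, s2 = 2)

# Normalized counts: each column divided by its sample's size factor.
normalized <- sweep(counts, 2, sizeFactors, "/")

# Mean and variance from normalized counts (the usual behavior).
baseMeanNormalized <- rowMeans(normalized)
baseVarNormalized  <- apply(normalized, 1, var)

# Mean and variance from raw counts (what the option replicates).
baseMeanRaw <- rowMeans(counts)
baseVarRaw  <- apply(counts, 1, var)
```

With these numbers the normalized means are 20 and 60 with zero variance, while the raw means are 25 and 75 with large variances that reflect only the depth difference.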

calcAltSF

Logical. Determines whether both types of size factor calculations should be generated and placed in the jscs@altSizeFactors slot.

verbose

Logical. If TRUE, send debugging and progress messages to the console / stdout.

file

Path of the file to which the size factor table is written.

...

If using the (deprecated) estimateSizeFactors command, use the same syntax as above.

Value

A JunctionSeqCountSet, with size factors included.

Examples

data(exampleDataSet,package="JctSeqData");
jscs <- estimateJunctionSeqSizeFactors(jscs);

## Not run: 
##D ########################################
##D #Set up example data:
##D decoder.file <- system.file(
##D                   "extdata/annoFiles/decoder.bySample.txt",
##D                   package="JctSeqData");
##D decoder <- read.table(decoder.file,
##D                   header=TRUE,
##D                   stringsAsFactors=FALSE);
##D gff.file <- system.file(
##D             "extdata/cts/withNovel.forJunctionSeq.gff.gz",
##D             package="JctSeqData");
##D countFiles <- system.file(paste0("extdata/cts/",
##D      decoder$sample.ID,
##D      "/QC.spliceJunctionAndExonCounts.withNovel.forJunctionSeq.txt.gz"),
##D      package="JctSeqData");
##D ########################################
##D #Advanced Analysis:
##D 
##D #Make a "design" dataframe:
##D design <- data.frame(condition = factor(decoder$group.ID));
##D #Read the QoRTs counts.
##D jscs <- readJunctionSeqCounts(countfiles = countFiles,
##D            samplenames = decoder$sample.ID,
##D            design = design,
##D            flat.gff.file = gff.file
##D );
##D #Generate the size factors and load them into the JunctionSeqCountSet:
##D jscs <- estimateJunctionSeqSizeFactors(jscs);
##D #Estimate feature-specific dispersions:
##D jscs <- estimateJunctionSeqDispersions(jscs);
##D #Fit dispersion function and estimate MAP dispersion:
##D jscs <- fitJunctionSeqDispersionFunction(jscs);
##D #Test for differential usage:
##D jscs <- testForDiffUsage(jscs);
##D #Estimate effect sizes and expression estimates:
##D jscs <- estimateEffectSizes(jscs);
##D 
## End(Not run)

[Package JunctionSeq version 1.5.4 Index]