read.qc.results.data {QoRTs} | R Documentation
Creates a QoRT_QC_Results object using a set of QC result data files.
read.qc.results.data(infile.dir, decoder, decoder.files, calc.DESeq2 = FALSE, calc.edgeR = FALSE, debugMode, autodetectMissingSamples = FALSE, skip.files = c())
completeAndCheckDecoder(decoder, decoder.files)
infile.dir
REQUIRED. The base file directory where all the QC results data is stored.
decoder
A character vector or data.frame containing the decoder information. See details below.
decoder.files
Character vector. Either one or two character strings. Either decoder.files OR decoder must be set, never both. See details below.
calc.DESeq2
Logical. If TRUE, this function will attempt to load the DESeq2 package. If the DESeq2 package is found, it will then calculate DESeq2's geometric normalization factors (also known as "size factors") for each replicate and for each sample.
calc.edgeR
Logical. If TRUE, this function will attempt to load the edgeR package. If the edgeR package is found, it will then calculate all of edgeR's normalization factors (also known as "size factors") for each replicate and for each sample.
debugMode
Logical. If TRUE, debugging data will be printed to the console.
autodetectMissingSamples
Logical. If TRUE, automatically drop replicates for which the QC files cannot be found. If FALSE (the default), the function throws an error when a replicate's QC files are missing.
skip.files
Character vector of QC data file titles to skip. This can be useful for performing a faster data load when only querying a subset of the available QC metrics. See examples below.
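Before choosing a value for autodetectMissingSamples, it can help to see which replicates would be affected. The following is a minimal sketch in base R, not the check that QoRTs itself performs. It assumes a layout of one subdirectory per replicate under infile.dir, each containing a QC.summary.txt file; the directory name qorts_demo and the replicate IDs are made up for illustration.

```r
# Hypothetical layout for illustration: one subdirectory per replicate,
# each expected to hold a QC.summary.txt file.
infile.dir <- file.path(tempdir(), "qorts_demo")
unique.ID <- c("SAMP1_RG1", "SAMP2_RG1", "SAMP3_RG1")

# Create QC output for the first two replicates only;
# the third replicate is deliberately left missing.
for (id in unique.ID[1:2]) {
  dir.create(file.path(infile.dir, id), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(infile.dir, id, "QC.summary.txt"))
}

# Flag replicates whose summary file cannot be found.
summary.path <- file.path(infile.dir, unique.ID, "QC.summary.txt")
missing.IDs <- unique.ID[!file.exists(summary.path)]
print(missing.IDs)
```

If missing.IDs is non-empty, either fix the paths in the decoder or set autodetectMissingSamples = TRUE to drop those replicates.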
read.qc.results.data reads in a full QoRTs dataset of multiple QoRTs QC runs and compiles them into a QoRT_QC_Results object.
completeAndCheckDecoder simply reads a decoder and "fills in" all missing parameters, returning a data.frame.
The "decoder" is used to describe each replicate/sample. The standard decoder is a data frame that has one row per replicate, with the following columns:
unique.ID: The base identifier for the individual replicate. Must be unique. lanebam.ID is a synonym.
lane.ID: (OPTIONAL) The identifier for the lane, run, or batch. The default is "UNKNOWN".
group.ID: (OPTIONAL) The identifier for the biological condition for the given replicate. The default is "UNKNOWN".
sample.ID: (OPTIONAL) The identifier for the specific biological replicate to which the replicate belongs. Note that this is distinct from "unique.ID" because in many RNA-Seq studies each "sample" can have multiple technical replicates, as multiple sequencing runs may be needed to acquire sufficient reads for analysis. By default, it is assumed that each replicate comes from a different sample, and sample.ID is set equal to unique.ID.
qc.data.dir: (OPTIONAL) This column indicates the subdirectory in which the replicate's QC data was written. If this column is missing, it is assumed to be equal to the unique.ID.
input.read.pair.count: (OPTIONAL) This column contains the number of read-pairs (or just reads, for single-end data), before alignment. This is used later to calculate mapping rate.
multi.mapped.read.pair.count: (OPTIONAL) This column contains the number of read-pairs (or just reads, for single-end data) that were multi-mapped. This must be included for multi-mapping rate to be calculated.
All the columns except for unique.ID are optional. The decoder can even be supplied as a simple character vector, which is assumed to contain the unique.ID values. All the other variables will then be set to their default values.
Alternatively, the decoder can be supplied as a file given by the decoder.files parameter.
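The default rules listed above can be sketched in base R. The helper below, fill.decoder.defaults, is hypothetical and written only for illustration. It is not the implementation of completeAndCheckDecoder and covers only the defaults documented on this page (sample.ID, lane.ID, group.ID, and qc.data.dir).

```r
# Illustrative helper (not part of QoRTs): apply the documented defaults
# to a decoder given as a character vector or a data.frame.
fill.decoder.defaults <- function(decoder) {
  # A plain character vector is treated as the list of unique.IDs.
  if (is.character(decoder)) {
    decoder <- data.frame(unique.ID = decoder, stringsAsFactors = FALSE)
  }
  stopifnot("unique.ID" %in% names(decoder))
  stopifnot(anyDuplicated(decoder$unique.ID) == 0)

  # Fill each optional column only when it is absent.
  if (!("sample.ID" %in% names(decoder)))   decoder$sample.ID   <- decoder$unique.ID
  if (!("lane.ID" %in% names(decoder)))     decoder$lane.ID     <- "UNKNOWN"
  if (!("group.ID" %in% names(decoder)))    decoder$group.ID    <- "UNKNOWN"
  if (!("qc.data.dir" %in% names(decoder))) decoder$qc.data.dir <- decoder$unique.ID
  decoder
}

dec <- fill.decoder.defaults(c("SAMP1", "SAMP2"))
print(dec)
```

Columns that are already present are left untouched, so a partially specified data.frame keeps whatever the user supplied.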
Dual Decoder: Optionally, two decoders can be supplied. In this case the first decoder should be the technical-replicate decoder and the second should be the biological-replicate decoder. The technical-replicate decoder should have one row per unique.ID, with the following columns:
unique.ID: The base identifier for the individual replicate. Must be unique!
lane.ID: (OPTIONAL) The identifier for the lane, run, or batch.
sample.ID: (OPTIONAL) The identifier for the specific biological replicate to which the replicate belongs. Note that this is distinct from "unique.ID" because in many RNA-Seq studies each "sample" can have multiple technical replicates, as multiple sequencing runs may be needed to acquire sufficient reads for analysis.
qc.data.dir: (OPTIONAL) This column indicates the subdirectory in which the replicate's QC data was written. If this column is missing, it is assumed to be equal to the unique.ID. Must be unique!
input.read.pair.count: (OPTIONAL) This column contains the number of input reads, before alignment. This is used later to calculate mapping rate.
multi.mapped.read.pair.count: (OPTIONAL) This column contains the number of reads that were multi-mapped. This must be included for multi-mapping rate to be calculated.
The biological-replicate decoder should have one row per sample.ID, with the following columns:
sample.ID: The identifier for the specific biological replicate to which the replicate belongs. Note that this is distinct from "unique.ID" because in many RNA-Seq studies each "sample" can have multiple technical replicates, as multiple sequencing runs may be needed to acquire sufficient reads for analysis.
group.ID: (OPTIONAL) The identifier for the biological condition for the given replicate.
All decoders are allowed to contain other columns in addition to the ones listed here, so long as their names are distinct. Columns do not need to appear in any particular order, so long as they are named according to the specifications above.
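As a rough illustration of the dual-decoder layout, the sketch below builds a small technical-replicate decoder and a biological-replicate decoder in base R and joins them on sample.ID. The sample names and groups are invented, and the merge only shows how the two tables relate. It is not a description of how QoRTs combines them internally.

```r
# Technical-replicate decoder: one row per unique.ID.
tech.decoder <- data.frame(
  unique.ID = c("SAMP1_RG1", "SAMP1_RG2", "SAMP2_RG1", "SAMP2_RG2"),
  lane.ID   = c("L1", "L2", "L1", "L2"),
  sample.ID = c("SAMP1", "SAMP1", "SAMP2", "SAMP2"),
  stringsAsFactors = FALSE
)

# Biological-replicate decoder: one row per sample.ID.
bio.decoder <- data.frame(
  sample.ID = c("SAMP1", "SAMP2"),
  group.ID  = c("CASE", "CTRL"),
  stringsAsFactors = FALSE
)

# Every technical replicate should map to a known sample.
stopifnot(all(tech.decoder$sample.ID %in% bio.decoder$sample.ID))

# Joining on sample.ID gives the equivalent single decoder.
combined <- merge(tech.decoder, bio.decoder, by = "sample.ID")
print(combined)

# Assumption: decoder files are whitespace-delimited text with a header row,
# as in the example decoder read with read.table(header = TRUE).
tech.file <- tempfile(fileext = ".txt")
bio.file  <- tempfile(fileext = ".txt")
write.table(tech.decoder, tech.file, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(bio.decoder,  bio.file,  sep = "\t", quote = FALSE, row.names = FALSE)
# res <- read.qc.results.data(infile.dir, decoder.files = c(tech.file, bio.file))
```

The final call is left commented out because it requires real QoRTs output in infile.dir. The technical-replicate file is given first and the biological-replicate file second, following the order described above.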
#Load the decoder from the example dataset:
directory <- paste0(system.file("extdata/",
package="QoRTsExampleData",
mustWork=TRUE),"/");
decoder.file <- system.file("extdata/decoder.txt",
package="QoRTsExampleData",
mustWork=TRUE);
decoder.data <- read.table(decoder.file,
header=TRUE,
stringsAsFactors=FALSE);
print(decoder.data);
## sample.ID lane.ID unique.ID qc.data.dir group.ID input.read.pair.count
## 1 SAMP1 L1 SAMP1_RG1 ex/SAMP1_RG1 CASE 465298
## 2 SAMP1 L2 SAMP1_RG2 ex/SAMP1_RG2 CASE 472241
## 3 SAMP1 L3 SAMP1_RG3 ex/SAMP1_RG3 CASE 500691
## 4 SAMP2 L1 SAMP2_RG1 ex/SAMP2_RG1 CASE 461405
## 5 SAMP2 L2 SAMP2_RG2 ex/SAMP2_RG2 CASE 467713
## 6 SAMP2 L3 SAMP2_RG3 ex/SAMP2_RG3 CASE 492322
## 7 SAMP3 L1 SAMP3_RG1 ex/SAMP3_RG1 CASE 485397
## 8 SAMP3 L2 SAMP3_RG2 ex/SAMP3_RG2 CASE 489859
## 9 SAMP3 L3 SAMP3_RG3 ex/SAMP3_RG3 CASE 516906
## 10 SAMP4 L1 SAMP4_RG1 ex/SAMP4_RG1 CTRL 460968
## 11 SAMP4 L2 SAMP4_RG2 ex/SAMP4_RG2 CTRL 468391
## 12 SAMP4 L3 SAMP4_RG3 ex/SAMP4_RG3 CTRL 484530
## 13 SAMP5 L1 SAMP5_RG1 ex/SAMP5_RG1 CTRL 469884
## 14 SAMP5 L2 SAMP5_RG2 ex/SAMP5_RG2 CTRL 475001
## 15 SAMP5 L3 SAMP5_RG3 ex/SAMP5_RG3 CTRL 494213
## 16 SAMP6 L1 SAMP6_RG1 ex/SAMP6_RG1 CTRL 452429
## 17 SAMP6 L2 SAMP6_RG2 ex/SAMP6_RG2 CTRL 458810
## 18 SAMP6 L3 SAMP6_RG3 ex/SAMP6_RG3 CTRL 477751
#This command produces the example dataset used in all the other
# examples:
res <- read.qc.results.data(directory,
decoder = decoder.data,
calc.DESeq2 = TRUE,
calc.edgeR = TRUE);
## column 'qc.data.prefix' not found in the decoder, assuming qc.data.prefix = ""
## Note: no multi.mapped.read.pair.count column found. This column is optional, but without it (depending on how your aligner implements multi-mapping) multi-mapping rates might not be plotted.
## infile.dir = /mnt/nfs/gigantor/ifs/DCEG/Home/hartleys/R/x86_64-pc-linux-gnu-library/3.3/QoRTsExampleData/extdata//
## scalaqc_file = QC.summary.txt
## ..........
## done.
## [time: 2018-09-25 13:13:29],[elapsed: 0.14 secs]
## Autodetected Paired-End mode.
## (File 1 of 43): QC.gc.byPair.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:29],[elapsed: 0.11 secs]
## (File 2 of 43): QC.gc.byRead.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:29],[elapsed: 0.11 secs]
## (File 3 of 43): QC.gc.byRead.vsBaseCt.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:29],[elapsed: 0.58 secs]
## (File 4 of 43): QC.quals.r1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:30],[elapsed: 0.12 secs]
## (File 5 of 43): QC.quals.r2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:30],[elapsed: 0.12 secs]
## (File 6 of 43): QC.cigarOpDistribution.byReadCycle.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:30],[elapsed: 0.23 secs]
## (File 7 of 43): QC.cigarOpDistribution.byReadCycle.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:30],[elapsed: 0.23 secs]
## (File 8 of 43): QC.cigarOpLengths.byOp.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:31],[elapsed: 0.38 secs]
## (File 9 of 43): QC.cigarOpLengths.byOp.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:31],[elapsed: 0.42 secs]
## (File 10 of 43): QC.geneBodyCoverage.by.expression.level.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:31],[elapsed: 0.12 secs]
## (File 11 of 43): QC.geneCounts.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:31],[elapsed: 0.31 secs]
## (File 12 of 43): QC.insert.size.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:32],[elapsed: 0.21 secs]
## (File 13 of 43): QC.NVC.raw.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:32],[elapsed: 0.2 secs]
## (File 14 of 43): QC.NVC.raw.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:32],[elapsed: 0.24 secs]
## (File 15 of 43): QC.NVC.lead.clip.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:34],[elapsed: 1 secs]
## (File 16 of 43): QC.NVC.lead.clip.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:36],[elapsed: 1 secs]
## (File 17 of 43): QC.NVC.tail.clip.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:38],[elapsed: 1 secs]
## (File 18 of 43): QC.NVC.tail.clip.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:39],[elapsed: 1 secs]
## (File 19 of 43): QC.NVC.minus.clipping.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:39],[elapsed: 0.2 secs]
## (File 20 of 43): QC.NVC.minus.clipping.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:40],[elapsed: 0.21 secs]
## (File 21 of 43): QC.chromCount.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:40],[elapsed: 0.11 secs]
## (File 22 of 43): QC.biotypeCounts.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:40],[elapsed: 0.14 secs]
## (File 23 of 43): QC.geneBodyCoverage.byExpr.avgPct.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:40],[elapsed: 0.16 secs]
## (File 24 of 43): QC.overlapCoverage.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:40],[elapsed: 0.13 secs]
## (File 25 of 43): QC.overlapMismatch.byRead.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:40],[elapsed: 0.13 secs]
## (File 26 of 43): QC.overlapMismatch.byScore.txt
## ..........
## done.
## [time: 2018-09-25 13:13:41],[elapsed: 0.47 secs]
## (File 27 of 43): QC.overlapMismatch.byBase.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:41],[elapsed: 0.14 secs]
## (File 28 of 43): QC.overlapMismatch.byScoreAndBP.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:45],[elapsed: 4 secs]
## (File 29 of 43): QC.readLenDist.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:45],[elapsed: 0.12 secs]
## (File 30 of 43): QC.referenceMismatchCounts.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:45],[elapsed: 0.13 secs]
## (File 31 of 43): QC.referenceMismatchRaw.byReadStrand.txt
## ..........
## done.
## [time: 2018-09-25 13:13:46],[elapsed: 0.39 secs]
## (File 32 of 43): QC.referenceMismatch.byScore.txt
## ..........
## done.
## [time: 2018-09-25 13:13:46],[elapsed: 0.13 secs]
## (File 33 of 43): QC.referenceMismatch.byScoreAndBP.txt
## ..........
## done.
## [time: 2018-09-25 13:13:46],[elapsed: 0.26 secs]
## (File 34 of 43): QC.mismatchSizeRates.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:47],[elapsed: 0.58 secs]
## (File 35 of 43): QC.FQ.gc.byRead.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:47],[elapsed: 0.11 secs]
## (File 36 of 43): QC.FQ.gc.byPair.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:47],[elapsed: 0.11 secs]
## (File 37 of 43): QC.FQ.gc.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:47],[elapsed: 0.13 secs]
## (File 38 of 43): QC.FQ.gc.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:47],[elapsed: 0.11 secs]
## (File 39 of 43): QC.FQ.NVC.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:47],[elapsed: 0.15 secs]
## (File 40 of 43): QC.FQ.NVC.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:48],[elapsed: 0.15 secs]
## (File 41 of 43): QC.FQ.quals.r1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:48],[elapsed: 0.15 secs]
## (File 42 of 43): QC.FQ.quals.r2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:48],[elapsed: 0.14 secs]
## (File 43 of 43): QC.FQ.readLenDist.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:13:48],[elapsed: 0.12 secs]
## calculating secondary data:
## Calculating Quality Score Rates...
## done. [time: 2018-09-25 13:13:48],[elapsed: 0 secs]
## Calculating cumulative gene coverage, by replicate...
## done. [time: 2018-09-25 13:13:48],[elapsed: 0.03 secs]
## Calculating cumulative gene coverage, by sample...
## done. [time: 2018-09-25 13:13:48],[elapsed: 0.01 secs]
## Calculating Mapping Rates...
## done. [time: 2018-09-25 13:13:48],[elapsed: 0.03 secs]
## calculating normalization factors, by sample...
## Calculating DESeq2 Normalization Factors (Geometric normalization)...
## Calculating edgeR Normalization Factors (all edgeR normalizations)...
## done. [time: 2018-09-25 13:14:16],[elapsed: 27 secs]
## calculating normalization factors, by replicate...
## Calculating DESeq2 Normalization Factors (Geometric normalization)...
## Calculating edgeR Normalization Factors (all edgeR normalizations)...
## done. [time: 2018-09-25 13:14:16],[elapsed: 0.15 secs]
## calculating normalization factors, by sample/replicate...
## done. [time: 2018-09-25 13:14:16],[elapsed: 0.1 secs]
## Calculating summary stats...
## done. [time: 2018-09-25 13:14:16],[elapsed: 0.5 secs]
## Calculating overlap mismatch-size rates...
## done. [time: 2018-09-25 13:14:18],[elapsed: 1 secs]
## Calculating cumulative overlap mismatch-size rates...
## done. [time: 2018-09-25 13:14:19],[elapsed: 1 secs]
## Calculating overlap coverage Rates...
## done. [time: 2018-09-25 13:14:20],[elapsed: 0.07 secs]
## Calculating overlap coverage Rates By Read...
## done. [time: 2018-09-25 13:14:20],[elapsed: 0.11 secs]
## Calculating read length distribution...
## done. [time: 2018-09-25 13:14:20],[elapsed: 0.07 secs]
## Calculating overlap by AVG score...
## done. [time: 2018-09-25 13:14:21],[elapsed: 0.86 secs]
## Calculating overlap by MIN score...
## done. [time: 2018-09-25 13:14:21],[elapsed: 0.47 secs]
## Adding Min score error to summary tables...
## done. [time: 2018-09-25 13:14:22],[elapsed: 0.59 secs]
## Calculating overlap by R1 score...
## done. [time: 2018-09-25 13:14:22],[elapsed: 0.39 secs]
## Calculating overlap by R2 score...
## done. [time: 2018-09-25 13:14:22],[elapsed: 0.39 secs]
## Calculating referenceMismatchCounts stats...
## done. [time: 2018-09-25 13:14:22],[elapsed: 0.02 secs]
## Calculating referenceMismatch.byScore stats...
## done. [time: 2018-09-25 13:14:23],[elapsed: 0.02 secs]
## Calculating referenceMismatchRaw.byReadStrand stats...
## done. [time: 2018-09-25 13:14:23],[elapsed: 0.89 secs]
## Calculating referenceMismatch.byScoreAndBP stats...
## done. [time: 2018-09-25 13:14:27],[elapsed: 3 secs]
## Calculating summary table...
## done. [time: 2018-09-25 13:14:27],[elapsed: 0.02 secs]
## Calculating overlap mismatch combos...
## Calculating mismatch combo rates:...
## done. [time: 2018-09-25 13:14:27],[elapsed: 0.09 secs]
## Calculating overlapMismatch.byScoreAndBP stats...
## done. [time: 2018-09-25 13:15:56],[elapsed: 89 secs]
## done. [time: 2018-09-25 13:15:56],[elapsed: 89 secs]
## Calculating NVC rates...
## done. [time: 2018-09-25 13:16:01],[elapsed: 5 secs]
## done.
## [time: 2018-09-25 13:16:01],[elapsed: 133 secs]
#Note that DESeq2 and edgeR are required in order to
# calculate the size factors using the options above.
#You can also specify incomplete decoders, and use
# the following command to fill in the defaults:
completeAndCheckDecoder(c("SAMP1","SAMP2",
"SAMP3","SAMP4",
"SAMP5","SAMP6"))
## Simple decoder found, list of sample/unique ID's. Building complete decoder.
## column 'qc.data.dir' not found in the decoder, assuming qc.data.dir = unique.ID
## column 'qc.data.prefix' not found in the decoder, assuming qc.data.prefix = ""
## unique.ID sample.ID lane.ID group.ID qc.data.dir qc.data.prefix
## 1 SAMP1 SAMP1 UNKNOWN UNKNOWN SAMP1
## 2 SAMP2 SAMP2 UNKNOWN UNKNOWN SAMP2
## 3 SAMP3 SAMP3 UNKNOWN UNKNOWN SAMP3
## 4 SAMP4 SAMP4 UNKNOWN UNKNOWN SAMP4
## 5 SAMP5 SAMP5 UNKNOWN UNKNOWN SAMP5
## 6 SAMP6 SAMP6 UNKNOWN UNKNOWN SAMP6
#You don't actually have to use completeAndCheckDecoder;
# you can just pass the incomplete decoder directly to read.qc.results.data.
#For example, to load a small subset of the example data
#(without phenotype data):
res <- read.qc.results.data(paste0(directory,"/ex/"),
decoder = c("SAMP1_RG1","SAMP2_RG1",
"SAMP3_RG1","SAMP4_RG1"));
## Simple decoder found, list of sample/unique ID's. Building complete decoder.
## column 'qc.data.dir' not found in the decoder, assuming qc.data.dir = unique.ID
## column 'qc.data.prefix' not found in the decoder, assuming qc.data.prefix = ""
## Note: no input.read.pair.count column found. This column is optional, but without it mapping rates cannot be calculated.
## Note: no multi.mapped.read.pair.count column found. This column is optional, but without it (depending on how your aligner implements multi-mapping) multi-mapping rates might not be plotted.
## infile.dir = /mnt/nfs/gigantor/ifs/DCEG/Home/hartleys/R/x86_64-pc-linux-gnu-library/3.3/QoRTsExampleData/extdata///ex/
## scalaqc_file = QC.summary.txt
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.03 secs]
## Autodetected Paired-End mode.
## (File 1 of 43): QC.gc.byPair.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.02 secs]
## (File 2 of 43): QC.gc.byRead.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.02 secs]
## (File 3 of 43): QC.gc.byRead.vsBaseCt.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.12 secs]
## (File 4 of 43): QC.quals.r1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.03 secs]
## (File 5 of 43): QC.quals.r2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.03 secs]
## (File 6 of 43): QC.cigarOpDistribution.byReadCycle.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.05 secs]
## (File 7 of 43): QC.cigarOpDistribution.byReadCycle.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.05 secs]
## (File 8 of 43): QC.cigarOpLengths.byOp.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.08 secs]
## (File 9 of 43): QC.cigarOpLengths.byOp.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.08 secs]
## (File 10 of 43): QC.geneBodyCoverage.by.expression.level.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.02 secs]
## (File 11 of 43): QC.geneCounts.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.06 secs]
## (File 12 of 43): QC.insert.size.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.04 secs]
## (File 13 of 43): QC.NVC.raw.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.04 secs]
## (File 14 of 43): QC.NVC.raw.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:02],[elapsed: 0.04 secs]
## (File 15 of 43): QC.NVC.lead.clip.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:03],[elapsed: 0.41 secs]
## (File 16 of 43): QC.NVC.lead.clip.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:03],[elapsed: 0.41 secs]
## (File 17 of 43): QC.NVC.tail.clip.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:03],[elapsed: 0.35 secs]
## (File 18 of 43): QC.NVC.tail.clip.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.35 secs]
## (File 19 of 43): QC.NVC.minus.clipping.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.04 secs]
## (File 20 of 43): QC.NVC.minus.clipping.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.04 secs]
## (File 21 of 43): QC.chromCount.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.02 secs]
## (File 22 of 43): QC.biotypeCounts.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.02 secs]
## (File 23 of 43): QC.geneBodyCoverage.byExpr.avgPct.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.03 secs]
## (File 24 of 43): QC.overlapCoverage.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.03 secs]
## (File 25 of 43): QC.overlapMismatch.byRead.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.03 secs]
## (File 26 of 43): QC.overlapMismatch.byScore.txt
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.09 secs]
## (File 27 of 43): QC.overlapMismatch.byBase.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:04],[elapsed: 0.02 secs]
## (File 28 of 43): QC.overlapMismatch.byScoreAndBP.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.96 secs]
## (File 29 of 43): QC.readLenDist.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.02 secs]
## (File 30 of 43): QC.referenceMismatchCounts.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.03 secs]
## (File 31 of 43): QC.referenceMismatchRaw.byReadStrand.txt
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.08 secs]
## (File 32 of 43): QC.referenceMismatch.byScore.txt
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.02 secs]
## (File 33 of 43): QC.referenceMismatch.byScoreAndBP.txt
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.05 secs]
## (File 34 of 43): QC.mismatchSizeRates.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.12 secs]
## (File 35 of 43): QC.FQ.gc.byRead.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.02 secs]
## (File 36 of 43): QC.FQ.gc.byPair.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.02 secs]
## (File 37 of 43): QC.FQ.gc.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.02 secs]
## (File 38 of 43): QC.FQ.gc.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:05],[elapsed: 0.02 secs]
## (File 39 of 43): QC.FQ.NVC.R1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:06],[elapsed: 0.03 secs]
## (File 40 of 43): QC.FQ.NVC.R2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:06],[elapsed: 0.03 secs]
## (File 41 of 43): QC.FQ.quals.r1.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:06],[elapsed: 0.03 secs]
## (File 42 of 43): QC.FQ.quals.r2.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:06],[elapsed: 0.03 secs]
## (File 43 of 43): QC.FQ.readLenDist.txt.gz
## ....
## done.
## [time: 2018-09-25 13:16:06],[elapsed: 0.02 secs]
## calculating secondary data:
## Calculating Quality Score Rates...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0 secs]
## Calculating cumulative gene coverage, by replicate...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.01 secs]
## Calculating cumulative gene coverage, by sample...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0 secs]
## Calculating Mapping Rates...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.01 secs]
## calculating normalization factors, by sample...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0 secs]
## calculating normalization factors, by replicate...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0 secs]
## calculating normalization factors, by sample/replicate...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0 secs]
## Calculating summary stats...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.1 secs]
## Calculating overlap mismatch-size rates...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.21 secs]
## Calculating cumulative overlap mismatch-size rates...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.26 secs]
## Calculating overlap coverage Rates...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.02 secs]
## Calculating overlap coverage Rates By Read...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.02 secs]
## Calculating read length distribution...
## done. [time: 2018-09-25 13:16:06],[elapsed: 0.02 secs]
## Calculating overlap by AVG score...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0.18 secs]
## Calculating overlap by MIN score...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0.09 secs]
## Adding Min score error to summary tables...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0.17 secs]
## Calculating overlap by R1 score...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0.09 secs]
## Calculating overlap by R2 score...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0.09 secs]
## Calculating referenceMismatchCounts stats...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0 secs]
## Calculating referenceMismatch.byScore stats...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0.01 secs]
## Calculating referenceMismatchRaw.byReadStrand stats...
## done. [time: 2018-09-25 13:16:07],[elapsed: 0.2 secs]
## Calculating referenceMismatch.byScoreAndBP stats...
## done. [time: 2018-09-25 13:16:08],[elapsed: 0.73 secs]
## Calculating summary table...
## done. [time: 2018-09-25 13:16:08],[elapsed: 0 secs]
## Calculating overlap mismatch combos...
## Calculating mismatch combo rates:...
## done. [time: 2018-09-25 13:16:08],[elapsed: 0.02 secs]
## Calculating overlapMismatch.byScoreAndBP stats...
## done. [time: 2018-09-25 13:16:28],[elapsed: 19 secs]
## done. [time: 2018-09-25 13:16:28],[elapsed: 19 secs]
## Calculating NVC rates...
## done. [time: 2018-09-25 13:16:29],[elapsed: 1 secs]
## done.
## [time: 2018-09-25 13:16:29],[elapsed: 23 secs]
#The list of available QC names for use with skip.files:
names(res@qc.data);
## [1] "summary"
## [2] "gc.byPair"
## [3] "gc.byRead"
## [4] "gc.byRead.vsBaseCt"
## [5] "quals.r1"
## [6] "quals.r2"
## [7] "cigarOpDistribution.byReadCycle.R1"
## [8] "cigarOpDistribution.byReadCycle.R2"
## [9] "cigarOpLengths.byOp.R1"
## [10] "cigarOpLengths.byOp.R2"
## [11] "geneBodyCoverage.by.expression.level"
## [12] "geneCounts"
## [13] "insert.size"
## [14] "NVC.raw.R1"
## [15] "NVC.raw.R2"
## [16] "NVC.lead.clip.R1"
## [17] "NVC.lead.clip.R2"
## [18] "NVC.tail.clip.R1"
## [19] "NVC.tail.clip.R2"
## [20] "NVC.minus.clipping.R1"
## [21] "NVC.minus.clipping.R2"
## [22] "chrom.counts"
## [23] "biotype.counts"
## [24] "geneBodyCoverage.pct"
## [25] "overlapCoverage"
## [26] "overlapMismatch.byRead"
## [27] "overlapMismatch.byScore"
## [28] "overlapMismatchCombos"
## [29] "overlapMismatch.byScoreAndBP"
## [30] "readLenDist"
## [31] "referenceMismatchCounts"
## [32] "referenceMismatchRaw.byReadStrand"
## [33] "referenceMismatch.byScore"
## [34] "referenceMismatch.byScoreAndBP"
## [35] "mismatchSizeRates"
## [36] "FQ.gc.byRead"
## [37] "FQ.gc.byPair"
## [38] "FQ.gc.r1"
## [39] "FQ.gc.r2"
## [40] "FQ.NVC.R1"
## [41] "FQ.NVC.R2"
## [42] "FQ.quals.r1"
## [43] "FQ.quals.r2"
## [44] "FQ.readLenDist"
#Skip some of the files using a command like this:
res.quick <- read.qc.results.data(directory,
decoder = decoder.data,
skip.files=c(
"QC.NVC.raw.R1.txt.gz",
"QC.NVC.raw.R2.txt.gz",
"QC.NVC.lead.clip.R1.txt.gz",
"QC.NVC.lead.clip.R2.txt.gz",
"QC.NVC.tail.clip.R1.txt.gz",
"QC.NVC.tail.clip.R2.txt.gz"
));
## column 'qc.data.prefix' not found in the decoder, assuming qc.data.prefix = ""
## Note: no multi.mapped.read.pair.count column found. This column is optional, but without it (depending on how your aligner implements multi-mapping) multi-mapping rates might not be plotted.
## infile.dir = /mnt/nfs/gigantor/ifs/DCEG/Home/hartleys/R/x86_64-pc-linux-gnu-library/3.3/QoRTsExampleData/extdata//
## scalaqc_file = QC.summary.txt
## ..........
## done.
## [time: 2018-09-25 13:16:29],[elapsed: 0.12 secs]
## Autodetected Paired-End mode.
## skipping 6 files.
## ("QC.NVC.raw.R1.txt.gz","QC.NVC.raw.R2.txt.gz","QC.NVC.lead.clip.R1.txt.gz","QC.NVC.lead.clip.R2.txt.gz","QC.NVC.tail.clip.R1.txt.gz","QC.NVC.tail.clip.R2.txt.gz")
## (File 1 of 37): QC.gc.byPair.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:29],[elapsed: 0.09 secs]
## (File 2 of 37): QC.gc.byRead.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:29],[elapsed: 0.08 secs]
## (File 3 of 37): QC.gc.byRead.vsBaseCt.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:30],[elapsed: 0.53 secs]
## (File 4 of 37): QC.quals.r1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:30],[elapsed: 0.1 secs]
## (File 5 of 37): QC.quals.r2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:30],[elapsed: 0.1 secs]
## (File 6 of 37): QC.cigarOpDistribution.byReadCycle.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:30],[elapsed: 0.2 secs]
## (File 7 of 37): QC.cigarOpDistribution.byReadCycle.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:31],[elapsed: 0.2 secs]
## (File 8 of 37): QC.cigarOpLengths.byOp.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:31],[elapsed: 0.36 secs]
## (File 9 of 37): QC.cigarOpLengths.byOp.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:31],[elapsed: 0.36 secs]
## (File 10 of 37): QC.geneBodyCoverage.by.expression.level.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:31],[elapsed: 0.09 secs]
## (File 11 of 37): QC.geneCounts.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:32],[elapsed: 0.29 secs]
## (File 12 of 37): QC.insert.size.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:32],[elapsed: 0.18 secs]
## (File 13 of 37): QC.NVC.minus.clipping.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:32],[elapsed: 0.17 secs]
## (File 14 of 37): QC.NVC.minus.clipping.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:32],[elapsed: 0.17 secs]
## (File 15 of 37): QC.chromCount.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:32],[elapsed: 0.08 secs]
## (File 16 of 37): QC.biotypeCounts.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:32],[elapsed: 0.08 secs]
## (File 17 of 37): QC.geneBodyCoverage.byExpr.avgPct.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:33],[elapsed: 0.15 secs]
## (File 18 of 37): QC.overlapCoverage.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:33],[elapsed: 0.11 secs]
## (File 19 of 37): QC.overlapMismatch.byRead.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:33],[elapsed: 0.1 secs]
## (File 20 of 37): QC.overlapMismatch.byScore.txt
## ..........
## done.
## [time: 2018-09-25 13:16:33],[elapsed: 0.38 secs]
## (File 21 of 37): QC.overlapMismatch.byBase.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:33],[elapsed: 0.1 secs]
## (File 22 of 37): QC.overlapMismatch.byScoreAndBP.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:38],[elapsed: 4 secs]
## (File 23 of 37): QC.readLenDist.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:38],[elapsed: 0.09 secs]
## (File 24 of 37): QC.referenceMismatchCounts.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:38],[elapsed: 0.11 secs]
## (File 25 of 37): QC.referenceMismatchRaw.byReadStrand.txt
## ..........
## done.
## [time: 2018-09-25 13:16:38],[elapsed: 0.36 secs]
## (File 26 of 37): QC.referenceMismatch.byScore.txt
## ..........
## done.
## [time: 2018-09-25 13:16:38],[elapsed: 0.1 secs]
## (File 27 of 37): QC.referenceMismatch.byScoreAndBP.txt
## ..........
## done.
## [time: 2018-09-25 13:16:39],[elapsed: 0.2 secs]
## (File 28 of 37): QC.mismatchSizeRates.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:39],[elapsed: 0.56 secs]
## (File 29 of 37): QC.FQ.gc.byRead.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:39],[elapsed: 0.08 secs]
## (File 30 of 37): QC.FQ.gc.byPair.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:39],[elapsed: 0.09 secs]
## (File 31 of 37): QC.FQ.gc.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:39],[elapsed: 0.08 secs]
## (File 32 of 37): QC.FQ.gc.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:40],[elapsed: 0.08 secs]
## (File 33 of 37): QC.FQ.NVC.R1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:40],[elapsed: 0.12 secs]
## (File 34 of 37): QC.FQ.NVC.R2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:40],[elapsed: 0.13 secs]
## (File 35 of 37): QC.FQ.quals.r1.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:40],[elapsed: 0.1 secs]
## (File 36 of 37): QC.FQ.quals.r2.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:40],[elapsed: 0.1 secs]
## (File 37 of 37): QC.FQ.readLenDist.txt.gz
## ..........
## done.
## [time: 2018-09-25 13:16:40],[elapsed: 0.09 secs]
## calculating secondary data:
## Calculating Quality Score Rates...
## done. [time: 2018-09-25 13:16:40],[elapsed: 0 secs]
## Calculating cumulative gene coverage, by replicate...
## done. [time: 2018-09-25 13:16:40],[elapsed: 0.03 secs]
## Calculating cumulative gene coverage, by sample...
## done. [time: 2018-09-25 13:16:40],[elapsed: 0.01 secs]
## Calculating Mapping Rates...
## done. [time: 2018-09-25 13:16:40],[elapsed: 0.03 secs]
## calculating normalization factors, by sample...
## done. [time: 2018-09-25 13:16:40],[elapsed: 0 secs]
## calculating normalization factors, by replicate...
## done. [time: 2018-09-25 13:16:40],[elapsed: 0 secs]
## calculating normalization factors, by sample/replicate...
## done. [time: 2018-09-25 13:16:40],[elapsed: 0 secs]
## Calculating summary stats...
## done. [time: 2018-09-25 13:16:41],[elapsed: 0.53 secs]
## Calculating overlap mismatch-size rates...
## done. [time: 2018-09-25 13:16:42],[elapsed: 0.95 secs]
## Calculating cumulative overlap mismatch-size rates...
## done. [time: 2018-09-25 13:16:43],[elapsed: 1 secs]
## Calculating overlap coverage Rates...
## done. [time: 2018-09-25 13:16:43],[elapsed: 0.07 secs]
## Calculating overlap coverage Rates By Read...
## done. [time: 2018-09-25 13:16:43],[elapsed: 0.11 secs]
## Calculating read length distribution...
## done. [time: 2018-09-25 13:16:43],[elapsed: 0.07 secs]
## Calculating overlap by AVG score...
## done. [time: 2018-09-25 13:16:44],[elapsed: 0.83 secs]
## Calculating overlap by MIN score...
## done. [time: 2018-09-25 13:16:44],[elapsed: 0.39 secs]
## Adding Min score error to summary tables...
## done. [time: 2018-09-25 13:16:45],[elapsed: 0.62 secs]
## Calculating overlap by R1 score...
## done. [time: 2018-09-25 13:16:45],[elapsed: 0.39 secs]
## Calculating overlap by R2 score...
## done. [time: 2018-09-25 13:16:46],[elapsed: 0.39 secs]
## Calculating referenceMismatchCounts stats...
## done. [time: 2018-09-25 13:16:46],[elapsed: 0.02 secs]
## Calculating referenceMismatch.byScore stats...
## done. [time: 2018-09-25 13:16:46],[elapsed: 0.02 secs]
## Calculating referenceMismatchRaw.byReadStrand stats...
## done. [time: 2018-09-25 13:16:47],[elapsed: 0.89 secs]
## Calculating referenceMismatch.byScoreAndBP stats...
## done. [time: 2018-09-25 13:16:50],[elapsed: 3 secs]
## Calculating summary table...
## done. [time: 2018-09-25 13:16:50],[elapsed: 0.01 secs]
## Calculating overlap mismatch combos...
## Calculating mismatch combo rates:...
## done. [time: 2018-09-25 13:16:50],[elapsed: 0.09 secs]
## Calculating overlapMismatch.byScoreAndBP stats...
## done. [time: 2018-09-25 13:18:20],[elapsed: 89 secs]
## done. [time: 2018-09-25 13:18:20],[elapsed: 90 secs]
## Calculating NVC rates...
## done. [time: 2018-09-25 13:18:20],[elapsed: 0.31 secs]
## done.
## [time: 2018-09-25 13:18:20],[elapsed: 100 secs]
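The skip.files vector used in the last example can also be built programmatically rather than typed out. This is a small base R sketch; the variable names clip.types, reads, and skip are chosen here for illustration.

```r
# Build the six NVC file titles skipped in the example above.
clip.types <- c("raw", "lead.clip", "tail.clip")
reads <- c("R1", "R2")

# rep(..., each = 2) repeats each clip type once per read;
# reads is recycled to alternate R1, R2.
skip <- paste0("QC.NVC.", rep(clip.types, each = 2), ".", reads, ".txt.gz")
print(skip)

# res.quick <- read.qc.results.data(directory,
#                                   decoder = decoder.data,
#                                   skip.files = skip)
```

With these six titles skipped, the log above reports 37 files read instead of 43.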