next up previous contents
Next: Requirements Up: QoRTs Package User Manual Previous: Contents   Contents


Overview

The QoRTs software package is a fast, efficient, and portable multifunction toolkit designed to assist in the analysis, quality control, and data management of RNA-Seq datasets. Its primary function is to aid in the detection and identification of errors, biases, and artifacts produced by paired-end high-throughput RNA-Seq technology. In addition, it can produce count data designed for use with differential expression 1 and differential exon usage tools 2, as well as individual-sample and/or group-summary genome track files suitable for use with the UCSC genome browser (or any compatible browser).

In its primary role as a QC tool it can produce a wide variety of graphs, plots, and tables that allow the data to be visualized in various ways. Data can be compiled and contrasted in multiple ways to allow systematic errors or artifacts to reveal themselves more easily. While it will not directly assign pass/fail status, it is a powerful tool for bioinformaticians to detect and identify features in the data. In (hopefully) most cases, these plots and graphs will not reveal anything other than mixed statistical noise. Next-Gen sequencing technologies have matured to the point where gross systematic errors and batch-specific biases are relatively modest and rare. However: mistakes can still occur, and basing conclusions on flawed data can be disastrous.

Across the field of bioinformatics there are numerous cases where biases, artifacts, and other data quality or bioinformatic issues have called results into question, sometimes resulting in retractions. In many of these cases the problems were only identified after the study came under intense scrutiny when the results were interesting and/or contentious, and the specific issues at fault were generally not well-characterized until afterwards. The primary purpose of QoRTs is to cast a wide net, characterizing the data in as many ways as is feasible so that quality issues that would otherwise be obscured can be recognized and dealt with, even if these issues have not been previously encountered.

The QoRTs package is composed of two parts: a java jar-file (for data processing) and a companion R package (for generating tables, figures, and plots). The java utility is written in the Scala programming language (v2.11.1), however, it has been compiled to java byte-code and does not require an installation of Scala (or any other external libraries) in order to function. The entire QoRTs toolkit can be used on almost any operating system that supports java and R. While not explicitly required, the use of a 64-bit version of java is recommended.

This vignette primarily covers the quality control functionality of QoRTs, and briefly covers the other functions and capabilities.

The most recent release of QoRTs is available on the QoRTs github page (http://github.com/hartleys/QoRTs). Additional help and documentation is available online (http://hartleys.github.io/QoRTs/index.html). A comprehensive walkthrough covering the entire process from post-alignment all the way to differential expression analysis, along with a full example dataset and example output can be found online (https://dl.dropboxusercontent.com/u/103621176/qorts/exData/QoRTsFullExampleData.zip).


next up previous contents
Next: Requirements Up: QoRTs Package User Manual Previous: Contents   Contents
Dr Stephen William Hartley 2015-11-06