R: Plot the cumulative gene diversity curve

makePlot.gene.cdf {QoRTs}

R Documentation

Plot the cumulative gene diversity curve

Description

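Plots the cumulative gene diversity curve: the cumulative percentage of reads as a function of the number of genes, with genes ranked by read count.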

Usage

  makePlot.gene.cdf(plotter, 
                sampleWise = FALSE,
                plot.intercepts = TRUE,
                label.intercepts = FALSE,
                debugMode, 
                rasterize.plotting.area = FALSE, 
                raster.height = 1000, 
                raster.width = 1000,
                singleEndMode,
                plot = TRUE,
                ...)

Arguments

`plotter`	A `QoRT_Plotter` reference object. See `build.plotter`.
`sampleWise`	A logical value indicating whether to compile each sample together across all runs.
`plot.intercepts`	A logical value indicating whether to draw, as dotted lines, the intercepts of the curve at the round numbers on the x-axis.
`label.intercepts`	A logical value indicating whether or not to label the intercepts. No effect if plot.intercepts is FALSE.
`debugMode`	Logical. If TRUE, debugging data will be printed to the console.
`rasterize.plotting.area`	Logical. If TRUE, the plotting area will be written to a temporary PNG file and then drawn to the current device as a raster image. This option is generally preferred for vector devices, because these plots can be very large when drawn in vector format. Note: this requires the `png` package!
`raster.height`	Numeric. If rasterize.plotting.area is TRUE, then this is the height of the plotting area raster image, in pixels.
`raster.width`	Numeric. If rasterize.plotting.area is TRUE, then this is the width of the plotting area raster image, in pixels.
`singleEndMode`	Logical. Determines whether the dataset consists of single-end reads. By default this is determined from the data, so this parameter should never need to be set by the user.
`plot`	Logical. If FALSE, suppress plotting and return `TRUE` if and only if the data is present in the dataset to allow plotting. Useful to automatically filter out missing data plots.
`...`	Arguments to be passed to methods, such as graphical parameters (see `par`).

Details

For each bam-file, this function plots the cumulative gene diversity. The genes in each bam-file are sorted by read count, from highest to lowest. The cumulative percentage of the total reads is then calculated as a function of the number of genes. Intercepts are plotted as well, showing the cumulative percentage for 1 gene, 10 genes, 100 genes, 1000 genes, and 10000 genes.

For example, across all the bam-files, around 50 to 55 percent of the read-pairs were found to map to the top 1000 genes, and around 20 percent of the reads were found in the top 100 genes.

This can be used as an indicator of whether a large proportion of the reads stem from a small number of genes.

Note that this is restricted to reads that map to a single unique gene. Reads that map to more than one gene, or that map to intronic or intergenic areas, are ignored.
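
The calculation described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the QoRTs implementation: the `gene_counts` vector and its values are made up, the counts are assumed to already contain only reads assigned to a single unique gene, and the log-scaled x-axis is a choice made for this sketch.

```r
# Hypothetical per-gene read counts for one bam-file (made-up values).
# Assumed to hold only reads assigned to a single unique gene.
gene_counts <- c(geneA = 500, geneB = 300, geneC = 120, geneD = 50, geneE = 30)

# Sort genes from highest to lowest read count, then accumulate
# and express the running total as a percentage of all reads.
sorted_counts <- sort(gene_counts, decreasing = TRUE)
cum_pct <- 100 * cumsum(sorted_counts) / sum(sorted_counts)

# Intercepts at round gene numbers, limited to those the data can reach.
intercepts <- c(1, 10, 100, 1000, 10000)
intercepts <- intercepts[intercepts <= length(cum_pct)]
intercept_pct <- cum_pct[intercepts]

# Draw the curve (log-scaled x-axis is an assumption of this sketch)
# and mark the intercepts with dotted lines.
plot(seq_along(cum_pct), cum_pct, type = "l", log = "x", ylim = c(0, 100),
     xlab = "Number of genes", ylab = "Cumulative % of reads")
abline(v = intercepts, h = intercept_pct, lty = 3)
```

With these made-up counts the top gene alone accounts for 50 percent of the reads, which is the kind of concentration the curve is meant to expose.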

Examples

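  # Load the example results and build a plotter colored by group
  data(res, package = "QoRTsExampleData")
  plotter <- build.plotter.colorByGroup(res)
  # Draw the cumulative gene diversity curve on the current device
  makePlot.gene.cdf(plotter)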

## Starting: Cumulative Gene Assignment Diversity plot.

## Finished: Cumulative Gene Assignment Diversity plot.[time: 2018-09-25 13:12:31],[elapsed: 0.21 secs]
