stat_aggregate {ggbio}    R Documentation

Generates summaries on the specified windows

Description

Generates summaries on the specified windows

Usage


## S4 method for signature 'GRanges'
stat_aggregate(data, ..., xlab, ylab, main, by, FUN,
                          maxgap=0L, minoverlap=1L,
                          type=c("any", "start", "end", "within", "equal"),
                          select=c("all", "first", "last", "arbitrary"),
                          y = NULL, window = NULL, facets = NULL, 
                          method = c("mean", "median","max",
                                   "min", "sum", "count", "identity"),
                          geom = NULL)


Arguments

data

A GRanges or data.frame object.

...

Arguments passed to the plot function, such as aes() and color.

xlab

Label for the x axis.

ylab

Label for the y axis.

main

Title for plot.

by

An object with 'start', 'end', and 'width' methods. Passed to aggregate.

FUN

The function, found via 'match.fun', to be applied to each window of 'x'. Passed to aggregate.

maxgap, minoverlap

Passed to findOverlaps.

Intervals with a separation of maxgap or less and a minimum of minoverlap overlapping positions, allowing for maxgap, are considered to be overlapping. maxgap should be a scalar, non-negative integer. minoverlap should be a scalar, positive integer.
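
As a minimal sketch of the minoverlap constraint, the snippet below calls findOverlaps from the IRanges package directly rather than through stat_aggregate. The ranges are made up for illustration. maxgap is left at its default because its default value and exact interpretation may differ between Bioconductor releases.

```r
library(IRanges)

## Illustrative ranges (not taken from ggbio): they share positions 4 and 5 only.
query   <- IRanges(start = 1, end = 5)
subject <- IRanges(start = 4, end = 8)

## Requiring at least 2 overlapping positions still reports the pair.
hits_min2 <- findOverlaps(query, subject, minoverlap = 2L)

## Requiring 3 overlapping positions rules the pair out.
hits_min3 <- findOverlaps(query, subject, minoverlap = 3L)

length(hits_min2)
length(hits_min3)
```

The first call should report one hit and the second none, since the two ranges have exactly two positions in common.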

type

Passed to findOverlaps.

By default, any overlap is accepted. By specifying the type parameter, one can select for specific types of overlap. The types correspond to operations in Allen's Interval Algebra (see references). If type is start or end, the intervals are required to have matching starts or ends, respectively. While this operation seems trivial, the naive implementation using outer would be much less efficient. Specifying equal as the type returns the intersection of the start and end matches. If type is within, the query interval must be wholly contained within the subject interval. Note that all matches must additionally satisfy the minoverlap constraint described above.

The maxgap parameter has special meaning with the special overlap types. For start, end, and equal, it specifies the maximum difference in the starts, ends or both, respectively. For within, it is the maximum amount by which the query may be wider than the subject.
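
The overlap types described above can be compared on a small example. This is only a sketch that calls findOverlaps from IRanges directly; the ranges are hypothetical and maxgap is left at its default.

```r
library(IRanges)

## Illustrative ranges (not taken from ggbio).
query   <- IRanges(start = c(2, 5, 5), end = c(4, 8, 12))
subject <- IRanges(start = c(1, 5),    end = c(10, 8))

## Count the hits reported for each overlap type.
overlap_types <- c("any", "start", "end", "within", "equal")
n_hits <- sapply(overlap_types, function(tp) {
  length(findOverlaps(query, subject, type = tp))
})
n_hits
```

For these ranges the counts are 5 for "any", 2 for "start", 1 for "end", 3 for "within" and 1 for "equal": only the second query range matches a subject range exactly, and the third query range extends past both subject ranges, so it is never "within".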

select

Passed to findOverlaps.

When select is "all" (the default), the results are returned as a Hits object. When select is "first", "last", or "arbitrary" the results are returned as an integer vector of length query containing the first, last, or arbitrary overlapping interval in subject, with NA indicating intervals that did not overlap any intervals in subject.

If select is "all", a Hits object is returned. For all other values of select, the return value depends on the drop argument. When select != "all" && !drop, an IntegerList is returned, where each element of the result corresponds to a space in query. When select != "all" && drop, an integer vector is returned containing indices that are offset to align with the unlisted query.
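
A short sketch of how select changes the return value, again calling findOverlaps from IRanges directly on made-up ranges (the drop argument mentioned above is not used here):

```r
library(IRanges)

## Illustrative ranges (not taken from ggbio); the third query overlaps nothing.
query   <- IRanges(start = c(2, 5, 20), end = c(4, 8, 25))
subject <- IRanges(start = c(1, 5),     end = c(10, 8))

## "all" returns a Hits object with one entry per overlapping pair.
all_hits <- findOverlaps(query, subject, select = "all")

## "first" and "last" return one subject index per query range,
## with NA where the query range overlaps no subject range.
first_hit <- findOverlaps(query, subject, select = "first")
last_hit  <- findOverlaps(query, subject, select = "last")

all_hits
first_hit
last_hit
```

Here all_hits holds three pairs, first_hit is 1, 1, NA and last_hit is 1, 2, NA.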

y

A character string naming the variable column to aggregate, equivalent to aes(y = ).

window

Integer value indicating the window size.

facets

Faceting formula to use.

method

Aggregation method to use if FUN is not provided.

geom

The geometric object used to display the data.

Value

A 'Layer'.

Author(s)

Tengfei Yin

Examples

library(ggbio)
library(GenomicRanges)
set.seed(1)
N <- 1000
## ======================================================================
##  simulated GRanges
## ======================================================================
gr <- GRanges(seqnames = 
              sample(c("chr1", "chr2", "chr3"),
                     size = N, replace = TRUE),
              IRanges(
                      start = sample(1:300, size = N, replace = TRUE),
                      width = sample(70:75, size = N, replace = TRUE)),
              strand = sample(c("+", "-", "*"), size = N, 
                replace = TRUE),
              value = rnorm(N, 10, 3), score = rnorm(N, 100, 30),
              sample = sample(c("Normal", "Tumor"), 
                size = N, replace = TRUE),
              pair = sample(letters, size = N, 
                replace = TRUE))


ggplot(gr) + stat_aggregate(aes(y = value))
## or
## ggplot(gr) + stat_aggregate(y = "value")
ggplot(gr) + stat_aggregate(aes(y = value), window = 36)
ggplot(gr) + stat_aggregate(aes(y = value), select = "first")
## Not run: 
## no hits 
ggplot(gr) + stat_aggregate(aes(y = value), select = "first", type = "within")

## End(Not run)
ggplot(gr) + stat_aggregate(window = 30, aes(y = value), fill = "gray40", geom = "histogram")
ggplot(gr) + stat_aggregate(window = 100, fill = "gray40", aes(y = value),
                           method = "max", geom = "histogram")

ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot")
ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot", window = 60)
## now facets need to take place inside stat_* geom_* for an accurate computation
ggplot(gr) + stat_aggregate(aes(y = value), geom = "boxplot", window = 30,
              facets = sample ~ seqnames)
## FIXME:
## autoplot(gr, stat = "aggregate", aes(y = value), window = 36)
## autoplot(gr, stat = "aggregate", geom = "boxplot", aes(y = value), window = 36)

[Package ggbio version 1.16.0 Index]