bioconductor v3.9.0 S4Vectors
The S4Vectors package defines the Vector and List virtual classes
Link to this section Summary
Functions
Annotated class
DataFrame objects
DataTable objects
Matrix for Filter Results
Collection of Filter Rules
List of Hits objects
Hits objects
Comparing and ordering hits
Set operations on Hits objects
LLint vectors
List objects
Common operations on List objects
Pairs objects
Rle objects
Fixed-width running window summaries
Common operations on Rle objects
S4Vectors internals
SimpleList objects
Vector objects
Compare, order, tabulate vector-like objects
Merge vector-like objects
Set operations on vector-like objects
Compute summary statistics of subsets of vector-like objects
Some utility functions to operate on strings
Unlist the list-like columns of a DataFrame object
Some utility functions to operate on integer vectors
Test if a vector-like object is sorted
Apply a function over subsequences of 2 vector-like objects
Display utilities
Divide a vector-like object into groups
Subsetting utilities
Convert between parallel vectors and lists
Link to this section Functions
Annotated_class()
Annotated class
Description
The virtual class Annotated
is used to standardize the
storage of metadata with a subclass.
Details
The Annotated
class supports the storage of global metadata in a
subclass. This is done through the metadata
slot that stores a list
object.
Seealso
The Vector class, which extends Annotated directly.
Author
P. Aboyoun
Examples
showClass("Annotated") # shows (some of) the known subclasses
## If the IRanges package was not already loaded, this will show
## more subclasses:
library(IRanges)
showClass("Annotated")
DataFrame_class()
DataFrame objects
Description
The DataFrame
class extends the DataTable virtual
class and supports the storage of any type of object (with length
and [
methods) as columns.
Details
On the whole, the DataFrame
behaves very similarly to
data.frame
, in terms of construction, subsetting, splitting,
combining, etc. The most notable exception is that the row names are
optional. This means calling rownames(x)
will return
NULL
if there are no row names. Of course, it could return
seq_len(nrow(x))
, but returning NULL
informs, for
example, combination functions that no row names are desired (they are
often a luxury when dealing with large data).
As DataFrame
derives from Vector , it is
possible to set an annotation
string. Also, another
DataFrame
can hold metadata on the columns.
For a class to be supported as a column, it must have length
and [
methods, where [
supports subsetting only by
i
and respects drop=FALSE
. Optionally, a method may be
defined for the showAsCell
generic, which should return a
vector of the same length as the subset of the column passed to
it. This vector is then placed into a data.frame
and converted
to text with format
. Thus, each element of the vector should be
some simple, usually character, representation of the corresponding
element in the column.
Seealso
- DataTable and SimpleList which DataFrame extends directly.
Author
Michael Lawrence
Examples
score <- c(1L, 3L, NA)
counts <- c(10L, 2L, NA)
row.names <- c("one", "two", "three")
df <- DataFrame(score) # single column
df[["score"]]
df <- DataFrame(score, row.names = row.names) #with row names
rownames(df)
df <- DataFrame(vals = score) # explicit naming
df[["vals"]]
# arrays
ary <- array(1:4, c(2,1,2))
sw <- DataFrame(I(ary))
# a data.frame
sw <- DataFrame(swiss)
as.data.frame(sw) # swiss, without row names
# now with row names
sw <- DataFrame(swiss, row.names = rownames(swiss))
as.data.frame(sw) # swiss
# subsetting
sw[] # identity subset
sw[,] # same
sw[NULL] # no columns
sw[,NULL] # no columns
sw[NULL,] # no rows
## select columns
sw[1:3]
sw[,1:3] # same as above
sw[,"Fertility"]
sw[,c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)]
## select rows and columns
sw[4:5, 1:3]
sw[1] # one-column DataFrame
## the same
sw[, 1, drop = FALSE]
sw[, 1] # a (unnamed) vector
sw[[1]] # the same
sw[["Fertility"]]
sw[["Fert"]] # should return 'NULL'
sw[1,] # a one-row DataFrame
sw[1,, drop=TRUE] # a list
## duplicate row, unique row names are created
sw[c(1, 1:2),]
## indexing by row names
sw["Courtelary",]
subsw <- sw[1:5,1:4]
subsw["C",] # partially matches
## row and column names
cn <- paste("X", seq_len(ncol(swiss)), sep = ".")
colnames(sw) <- cn
colnames(sw)
rn <- seq(nrow(sw))
rownames(sw) <- rn
rownames(sw)
## column replacement
df[["counts"]] <- counts
df[["counts"]]
df[[3]] <- score
df[["X"]]
df[[3]] <- NULL # deletion
DataTable_class()
DataTable objects
Description
DataTable is an API only (i.e. virtual class with no slots) for accessing objects with a rectangular shape like DataFrame or DelayedMatrix objects.
Seealso
DataFrame for an implementation that mimics
data.frame
.
Examples
showClass("DataTable") # shows (some of) the known subclasses
library(IRanges)
df <- DataFrame(as.data.frame(UCBAdmissions))
xtabs(Freq ~ Gender + Admit, df)
FilterMatrix_class()
Matrix for Filter Results
Description
A FilterMatrix
object is a matrix meant for storing
the logical output of a set of FilterRules , where
each rule corresponds to a column. The FilterRules
are stored
within the FilterMatrix
object, for the sake of
provenance. In general, a FilterMatrix
behaves like an
ordinary matrix
.
Seealso
evalSeparately
is the typical way to generate this object.FilterRules objects.
Author
Michael Lawrence
FilterRules_class()
Collection of Filter Rules
Description
A FilterRules
object is a collection of filter
rules, which can be either expression
or function
objects. Rules can be disabled/enabled individually, facilitating
experimenting with different combinations of filters.
Details
It is common to split a dataset into subsets during data
analysis. When data is large, however, representing subsets (e.g. by
logical vectors) and storing them as copies might become too costly in
terms of space. The FilterRules
class represents
subsets as lightweight expression
and/or function
objects. Subsets can then be calculated when needed (on the fly). This
avoids copying and storing a large number of subsets. Although it
might take longer to frequently recalculate a subset, it often is a
relatively fast operation and the space savings tend to be more than
worth it when data is large.
Rules may be either expressions or functions. Evaluating an expression
or invoking a function should result in a logical vector. Expressions
are often more convenient, but functions (i.e. closures) are generally
safer and more powerful, because the user can specify the enclosing
environment. If a rule is an expression, it is evaluated inside the
envir
argument to the eval
method (see below). If a
function, it is invoked with envir
as its only
argument. See examples.
Seealso
FilterMatrix objects for storing the logical output of a set of FilterRules objects.
Author
Michael Lawrence
Examples
## constructing a FilterRules instance
## an empty set of filters
filters <- FilterRules()
## as a simple character vector
filts <- c("peaks", "promoters")
filters <- FilterRules(filts)
active(filters) # all TRUE
## with functions and expressions
filts <- list(peaks = expression(peaks), promoters = expression(promoters),
find_eboxes = function(rd) rep(FALSE, nrow(rd)))
filters <- FilterRules(filts, active = FALSE)
active(filters) # all FALSE
## direct, quoted args (character literal parsed)
filters <- FilterRules(under_peaks = peaks, in_promoters = "promoters")
filts <- list(under_peaks = expression(peaks),
in_promoters = expression(promoters))
## specify both exprs and additional args
filters <- FilterRules(filts, diffexp = de)
filts <- c("promoters", "peaks", "introns")
filters <- FilterRules(filts)
## evaluation
df <- DataFrame(peaks = c(TRUE, TRUE, FALSE, FALSE),
promoters = c(TRUE, FALSE, FALSE, TRUE),
introns = c(TRUE, FALSE, FALSE, FALSE))
eval(filters, df)
fm <- evalSeparately(filters, df)
identical(filterRules(fm), filters)
summary(fm)
summary(fm, percent = TRUE)
fm <- evalSeparately(filters, df, serial = TRUE)
## set the active state directly
active(filters) <- FALSE # all FALSE
active(filters) <- TRUE # all TRUE
active(filters) <- c(FALSE, FALSE, TRUE)
active(filters)["promoters"] <- TRUE # use a filter name
## toggle the active state by name or index
active(filters) <- c(NA, 2) # NA's are dropped
active(filters) <- c("peaks", NA)
HitsList_class()
List of Hits objects
Description
The HitsList class stores a set of Hits objects. It's typically
used to represent the result of findOverlaps
on
two IntegerRangesList objects.
Details
Roughly the same set of utilities are provided for HitsList as for Hits :
The as.matrix
method coerces a HitsList object in a
similar way to Hits , except a column is prepended that indicates
which space (or element in the query IntegerRangesList )
to which the row corresponds.
The as.table
method flattens or unlists the list, counts the
number of hits for each query range and outputs the counts as a
table
, which has the same shape as from a single Hits
object.
To transpose a HitsList object x
, so that the subject
and query in each space are interchanged, call t(x)
. This
allows, for example, counting the number of hits for each subject
element using as.table
.
Seealso
findOverlaps
in the IRanges package, which returns a HitsList object when the query and subject are IntegerRangesList objects.
Note
This class is highly experimental. It has not been well tested and may disappear at any time.
Author
Michael Lawrence
Hits_class()
Hits objects
Description
The Hits class is a container for representing a set of hits between a set of left nodes and a set of right nodes . Note that only the hits are stored in the object. No information about the left or right nodes is stored, except their number.
For example, the findOverlaps
function, defined
and documented in the IRanges package, returns the hits between
the query
and subject
arguments in a Hits
object.
Usage
## Constructor functions
Hits(from=integer(0), to=integer(0), nLnode=0L, nRnode=0L, ...,
sort.by.query=FALSE)
SelfHits(from=integer(0), to=integer(0), nnode=0L, ...,
sort.by.query=FALSE)
Arguments
Argument | Description |
---|---|
from, to | 2 integer vectors of the same length. The values in from must be >= 1 and <= nLnode . The values in to must be >= 1 and <= nRnode . |
nLnode, nRnode | Number of left and right nodes. |
... | Metadata columns to set on the Hits object. All the metadata columns must be vector-like objects of the same length as from and to . |
sort.by.query | Should the hits in the returned object be sorted by query? If yes, then a SortedByQueryHits object is returned (SortedByQueryHits is a subclass of Hits). |
nnode | Number of nodes. |
Seealso
Hits-comparison for comparing and ordering hits.
The
findOverlaps
function in the IRanges package which returns SortedByQueryHits object.Hits-examples in the IRanges package, for some examples of Hits object basic manipulation.
setops-methods in the IRanges package, for set operations on Hits objects.
Author
Michael Lawrence and Hervé Pagès
Examples
from <- c(5, 2, 3, 3, 3, 2)
to <- c(11, 15, 5, 4, 5, 11)
id <- letters[1:6]
Hits(from, to, 7, 15, id)
Hits(from, to, 7, 15, id, sort.by.query=TRUE)
## ---------------------------------------------------------------------
## selectHits()
## ---------------------------------------------------------------------
x <- c("a", "b", "a", "c", "d")
table <- c("a", "e", "d", "a", "a", "d")
hits <- findMatches(x, table) # sorts the hits by query
hits
selectHits(hits, select="all") # no-op
selectHits(hits, select="first")
selectHits(hits, select="first", nodup=TRUE)
selectHits(hits, select="last")
selectHits(hits, select="last", nodup=TRUE)
selectHits(hits, select="arbitrary")
selectHits(hits, select="count")
## ---------------------------------------------------------------------
## remapHits()
## ---------------------------------------------------------------------
Lnodes.remapping <- factor(c(a="A", b="B", c="C", d="D")[x],
levels=LETTERS[1:4])
remapHits(hits, Lnodes.remapping=Lnodes.remapping)
## See ?`Hits-examples` in the IRanges package for more examples of basic
## manipulation of Hits objects.
## ---------------------------------------------------------------------
## SelfHits objects
## ---------------------------------------------------------------------
hits2 <- SelfHits(c(2, 3, 3, 3, 3, 3, 4, 4, 4), c(4, 3, 2:4, 2, 2:3, 2), 4)
## Hits 2 and 4 are self hits (from 3rd node to itself):
which(isSelfHit(hits2))
## Hits 4, 6, 7, 8, and 9, are redundant hits:
which(isRedundantHit(hits2))
hits3 <- findMatches(x)
hits3[!isSelfHit(hits3)]
|hits3[!(isSelfHit(hits3) | isRedundantHit(hits3))]|
Hits_comparison()
Comparing and ordering hits
Description
==
, !=
, <=
, >=
, <
, >
,
match()
, %in%
, order()
, sort()
, and
rank()
can be used on Hits objects to compare and order hits.
Note that only the "pcompare"
, "match"
, and "order"
methods are actually defined for Hits objects. This is all what is
needed to make all the other comparing and ordering operations (i.e.
==
, !=
, <=
, >=
, <
, >
,
%in%
, sort()
, and rank()
) work on these objects
(see ?`` for more information about this). ## Usage ```r list(list("pcompare"), list("Hits,Hits"))(x, y) list(list("match"), list("Hits,Hits"))(x, table, nomatch=NA_integer_, incomparables=NULL, method=c("auto", "quick", "hash")) list(list("order"), list("Hits"))(..., na.last=TRUE, decreasing=FALSE, method=c("auto", "shell", "radix")) ``` ## Arguments |Argument |Description| |------------- |----------------| |
x, y, table| Compatible [Hits](#hits) objects, that is, [Hits](#hits) objects with the same subject and query lengths. | |
nomatch| The value to be returned in the case when no match is found. It is coerced to an
integer. | |
incomparables| Not supported. | |
method| For
match: Use a Quicksort-based (
method="quick") or a hash-based (
method="hash") algorithm. The latter tends to give better performance, except maybe for some pathological input that we've not encountered so far. When
method="auto"is specified, the most efficient algorithm will be used, that is, the hash-based algorithm if
length(x) <= 2^29, otherwise the Quicksort-based algorithm. For
order: The
methodargument is ignored. | |
...| One or more [Hits](#hits) objects. The additional [Hits](#hits) objects are used to break ties. | |
na.last| Ignored. | |
decreasing|
TRUEor
FALSE. | ## Details Only hits that belong to [Hits](#hits) objects with same subject and query lengths can be compared. Hits are ordered by query hit first, and then by subject hit. On a [Hits](#hits) object,
order,
sort, and
rankare consistent with this order. list(" ", " ", list(list(), list(" ", " ", list("pcompare(x, y)"), ": ", " Performs element-wise (aka "parallel") comparison of 2 ", list("Hits"), " ", " objects ", list("x"), " and ", list("y"), ", that is, returns an integer vector where ", " the i-th element is less than, equal to, or greater than zero if ", " ", list("x[i]"), " is considered to be respectively less than, equal to, or ", " greater than ", list("y[i]"), ". See ", list("?
", list("Vector-comparison"),
"`"), " for
", " how ", list("x"), " or ", list("y"), " is recycled when the 2 objects don't have the ", " same length. ", " ")), " ", " ", list(list(), list(" ", " ", list("match(x, table, nomatch=NAinteger, method=c("auto", "quick", "hash"))"), ": ", " Returns an integer vector of the length of ", list("x"), ", containing the ", " index of the first matching hit in ", list("table"), " (or ", list("nomatch"), " if ", " there is no matching hit) for each hit in ",
list("x"), ".
", " ")), " ", " ", list(list(), list(" ", " ", list("order(...)"), ": ", " Returns a permutation which rearranges its first argument (a ", list("Hits"), " ", " object) into ascending order, breaking ties by further arguments (also ", " ", list("Hits"), " objects). ", " ")), " ", " ")
Seealso
Hits objects.
Vector-comparison for general information about comparing, ordering, and tabulating vector-like objects.
Author
Hervé Pagès
Examples
## ---------------------------------------------------------------------
## A. ELEMENT-WISE (AKA "PARALLEL") COMPARISON OF 2 Hits OBJECTS
## ---------------------------------------------------------------------
hits <- Hits(c(2, 4, 4, 4, 5, 5), c(3, 1, 3, 2, 3, 2), 6, 3)
hits
pcompare(hits, hits[3])
pcompare(hits[3], hits)
hits == hits[3]
hits != hits[3]
hits >= hits[3]
hits < hits[3]
## ---------------------------------------------------------------------
## B. match(), %in%
## ---------------------------------------------------------------------
table <- hits[-c(1, 3)]
match(hits, table)
hits %in% table
## ---------------------------------------------------------------------
## C. order(), sort(), rank()
## ---------------------------------------------------------------------
order(hits)
sort(hits)
rank(hits)
Hits_setops()
Set operations on Hits objects
Description
Perform set operations on Hits objects.
Details
union(x, y)
, intersect(x, y)
, setdiff(x, y)
, and
setequal(x, y)
work on Hits objects x
and y
only if the objects are compatible Hits objects , that is, if they
have the same subject and query lengths. These operations return respectively
the union, intersection, (asymmetric!) difference, and equality of the
sets of hits in x
and y
.
Value
union
returns a Hits object obtained by appending to x
the hits in y
that are not already in x
.
intersect
returns a Hits object obtained by keeping only
the hits in x
that are also in y
.
setdiff
returns a Hits object obtained by dropping from
x
the hits that are in y
.
setequal
returns TRUE
if x
and y
contain the
same sets of hits and FALSE
otherwise.
union
, intersect
, and setdiff
propagate the names and
metadata columns of their first argument ( x
).
Seealso
Hits objects.
Hits-comparison for comparing and ordering hits.
BiocGenerics::
,BiocGenerics::
, andBiocGenerics::
in the BiocGenerics package for general information about these generic functions.
Author
Hervé Pagès and Michael Lawrence
Examples
x <- Hits(c(2, 4, 4, 4, 5, 5), c(3, 1, 3, 2, 3, 2), 6, 3,
score=11:16)
x
y <- Hits(c(1, 3, 4, 4, 5, 5, 5), c(3, 3, 2, 1, 2, 1, 3), 6, 3,
score=21:27)
y
union(x, y)
union(y, x) # same hits as in union(x, y), but in different order
intersect(x, y)
intersect(y, x) # same hits as in intersect(x, y), but in
# different order
setdiff(x, y)
setdiff(y, x)
setequal(x, y)
LLint_class()
LLint vectors
Description
The LLint class is a container for storing a vector of list("large ", " integers") (i.e. long long int values at the C level).
Usage
LLint(length=0L)
as.LLint(x)
is.LLint(x)
Arguments
Argument | Description |
---|---|
length | A non-negative number (i.e. integer, double, or LLint value) specifying the desired length. |
x | Object to be coerced or tested. |
Details
LLint vectors aim to provide the same functionality as integer vectors in base R but their values are stored as long long int values at the C level vs int values for integer vectors. Note that on Intel platforms long long int values are 64-bit and int values 32-bit only. Therefore LLint vectors can hold values in the +/-9.223e18 range (approximately) vs +/-2.147e9 only for integer vectors.
NAs are supported and the NA_LLint_
constant is predefined for
convenience as as(NA, "LLint")
.
Names are not supported for now.
Coercions from/to logical, integer, double, and character are supported.
Operations from the Arith
, Compare
and
Summary
groups are supported.
More operations coming soon...
Seealso
Author
Hervé Pagès
Examples
## A long long int uses 8 bytes (i.e. 64 bits) in C:
.Machine$sizeof.longlong
## ---------------------------------------------------------------------
## SIMPLE EXAMPLES
## ---------------------------------------------------------------------
LLint()
LLint(10)
as.LLint(3e9)
as.LLint("3000000000")
x <- as.LLint(1:10 * 111111111)
x * x
5 * x # result as vector of doubles (i.e. 'x' coerced to double)
5L * x # result as LLint vector (i.e. 5L coerced to LLint vector)
max(x)
min(x)
range(x)
sum(x)
x <- as.LLint(1:20)
prod(x)
x <- as.LLint(1:21)
prod(x) # result is out of LLint range (+/-9.223e18)
prod(as.numeric(x))
x <- as.LLint(1:75000)
sum(x * x * x) == sum(x) * sum(x)
## Note that max(), min() and range() *always* return an LLint vector
## when called on an LLint vector, even when the vector is empty:
max(LLint()) # NA with no warning
min(LLint()) # NA with no warning
## This differs from how max(), min() and range() behave on an empty
## integer vector:
max(integer()) # -Inf with a warning
min(integer()) # Inf with a warning
## ---------------------------------------------------------------------
## GOING FROM STRINGS TO INTEGERS
## ---------------------------------------------------------------------
## as.integer() behaves like as.integer(as.double()) on a character
## vector. With the following consequence:
s <- "-2.9999999999999999"
as.integer(s) # -3
## as.LLint() converts the string *directly* to LLint, without
## coercing to double first:
as.LLint(s) # decimal part ignored
## ---------------------------------------------------------------------
## GOING FROM DOUBLE-PRECISION VALUES TO INTEGERS AND VICE-VERSA
## ---------------------------------------------------------------------
## Be aware that a double-precision value is not guaranteed to represent
## exactly an integer > 2^53. This can cause some surprises:
2^53 == 2^53 + 1 # TRUE, yep!
## And therefore:
as.LLint(2^53) == as.LLint(2^53 + 1) # also TRUE
## This can be even more disturbing when passing a big literal integer
## value because the R parser will turn it into a double-precision value
## before passing it to as.LLint():
x1 <- as.LLint(9007199254740992) # same as as.LLint(2^53)
x1
x2 <- as.LLint(9007199254740993) # same as as.LLint(2^53 + 1)
x2
x1 == x2 # still TRUE
## However, no precision is lost if a string literal is used instead:
x1 <- as.LLint("9007199254740992")
x1
x2 <- as.LLint("9007199254740993")
x2
x1 == x2 # FALSE
x2 - x1
d1 <- as.double(x1)
d2 <- as.double(x2) # warning!
d1 == d2 # TRUE
## ---------------------------------------------------------------------
## LLint IS IMPLEMENTED AS AN S4 CLASS
## ---------------------------------------------------------------------
class(LLint(10))
typeof(LLint(10)) # S4
storage.mode(LLint(10)) # S4
is.vector(LLint(10)) # FALSE
is.atomic(LLint(10)) # FALSE
## This means that an LLint vector cannot go in an ordinary data
## frame:
data.frame(id=as.LLint(1:5)) # error!
## A DataFrame needs to be used instead:
DataFrame(id=as.LLint(1:5))
## ---------------------------------------------------------------------
## SANITY CHECKS
## ---------------------------------------------------------------------
x <- as.integer(c(0, 1, -1, -3, NA, -99))
y <- as.integer(c(-6, NA, -4:3, 0, 1999, 6:10, NA))
xx <- as.LLint(x)
yy <- as.LLint(y)
## Operations from "Arith" group:
stopifnot(identical(x + y, as.integer(xx + yy)))
stopifnot(identical(as.LLint(y + x), yy + xx))
stopifnot(identical(x - y, as.integer(xx - yy)))
stopifnot(identical(as.LLint(y - x), yy - xx))
stopifnot(identical(x * y, as.integer(xx * yy)))
stopifnot(identical(as.LLint(y * x), yy * xx))
stopifnot(identical(x / y, xx / yy))
stopifnot(identical(y / x, yy / xx))
stopifnot(identical(x %/% y, as.integer(xx %/% yy)))
stopifnot(identical(as.LLint(y %/% x), yy %/% xx))
stopifnot(identical(x %% y, as.integer(xx %% yy)))
stopifnot(identical(as.LLint(y %% x), yy %% xx))
stopifnot(identical(x ^ y, xx ^ yy))
stopifnot(identical(y ^ x, yy ^ xx))
## Operations from "Compare" group:
stopifnot(identical(x == y, xx == yy))
stopifnot(identical(y == x, yy == xx))
stopifnot(identical(x != y, xx != yy))
stopifnot(identical(y != x, yy != xx))
stopifnot(identical(x <= y, xx <= yy))
stopifnot(identical(y <= x, yy <= xx))
stopifnot(identical(x >= y, xx >= yy))
stopifnot(identical(y >= x, yy >= xx))
stopifnot(identical(x < y, xx < yy))
stopifnot(identical(y < x, yy < xx))
stopifnot(identical(x > y, xx > yy))
stopifnot(identical(y > x, yy > xx))
## Operations from "Summary" group:
stopifnot(identical(max(y), as.integer(max(yy))))
stopifnot(identical(max(y, na.rm=TRUE), as.integer(max(yy, na.rm=TRUE))))
stopifnot(identical(min(y), as.integer(min(yy))))
stopifnot(identical(min(y, na.rm=TRUE), as.integer(min(yy, na.rm=TRUE))))
stopifnot(identical(range(y), as.integer(range(yy))))
stopifnot(identical(range(y, na.rm=TRUE), as.integer(range(yy, na.rm=TRUE))))
stopifnot(identical(sum(y), as.integer(sum(yy))))
stopifnot(identical(sum(y, na.rm=TRUE), as.integer(sum(yy, na.rm=TRUE))))
stopifnot(identical(prod(y), as.double(prod(yy))))
stopifnot(identical(prod(y, na.rm=TRUE), as.double(prod(yy, na.rm=TRUE))))
List_class()
List objects
Description
List objects are Vector objects with a "[["
,
elementType
and elementNROWS
method.
The List class serves a similar role as list in base R.
It adds one slot, the elementType
slot, to the two slots shared by
all Vector objects.
The elementType
slot is the preferred location for List
subclasses to store the type of data represented in the sequence. It is
designed to take a character of length 1 representing the class of the
sequence elements. While the List class performs no validity checking
based on elementType
, if a subclass expects elements to be of a
given type, that subclass is expected to perform the necessary validity
checking. For example, the subclass IntegerList (defined
in the IRanges package) has elementType = "integer"
and its
validity method checks if this condition is TRUE.
To be functional, a class that inherits from List must define at least
a "[["
method (in addition to the minimum set of Vector
methods).
Seealso
List-utils for common operations on List objects.
Vector objects for the parent class.
The SimpleList class for a direct extension of the List class.
The CompressedList class defined in the IRanges package for another direct extension of the List class.
The IntegerList , RleList , and IRanges classes and constructors defined in the IRanges package for more examples of concrete List subclasses.
The extractList function defined in the IRanges package for grouping elements of a vector-like object into a list-like object.
Author
P. Aboyoun and H. Pagès
Examples
showClass("List") # shows (some of) the known subclasses
List_utils()
Common operations on List objects
Description
Various functions and methods for looping on List objects, functional programming on List objects, and evaluation of an expression in a List object.
Usage
## Looping on List objects:
## ------------------------
list(list("lapply"), list("List"))(X, FUN, ...)
list(list("sapply"), list("List"))(X, FUN, ..., simplify=TRUE, USE.NAMES=TRUE)
endoapply(X, FUN, ...)
revElements(x, i)
mendoapply(FUN, ..., MoreArgs=NULL)
pc(...)
## Functional programming methods for List objects:
## ------------------------------------------------
list(list("Reduce"), list("List"))(f, x, init, right=FALSE, accumulate=FALSE)
list(list("Filter"), list("List"))(f, x)
list(list("Find"), list("List"))(f, x, right=FALSE, nomatch=NULL)
list(list("Map"), list("List"))(f, ...)
list(list("Position"), list("List"))(f, x, right=FALSE, nomatch=NA_integer_)
## Evaluation of an expression in a List object:
## ---------------------------------------------
list(list("within"), list("List"))(data, expr, ...)
## Constructing list matrices:
## ---------------------------------------------
list(list("rbind"), list("List"))(..., deparse.level=1L)
list(list("cbind"), list("List"))(..., deparse.level=1L)
Arguments
Argument | Description |
---|---|
X, x | A list, data.frame or List object. |
FUN | The function to be applied to each element of X (for endoapply ) or for the elements in ... (for mendoapply ). |
... | For lapply , sapply , and endoapply , optional arguments to FUN . For mendoapply , pc and Map , one or more list-like objects. |
simplify, USE.NAMES | See ?base:: for a description of these arguments. |
MoreArgs | A list of other arguments to FUN . |
i | Index specifying the elements to replace. Can be anything supported by [<- . |
f, init, right, accumulate, nomatch | See ?base:: for a description of these arguments. |
data | A List object. |
expr | Expression to evaluate. |
deparse.level | See ?base:: for a description of this argument. |
Details
list(list("Looping on List objects"), list(" ", " Like the standard ", list(list("lapply")), " function defined in the ", " ", list("base"), " package, the ", list("lapply"), " method for ", list("List"), " objects ", " returns a list of the same length as ", list("X"), ", with each element being ", " the result of applying ", list("FUN"), " to the corresponding element of ", list("X"), ". ", " ", " Like the standard ", list(list("sapply")), " function defined in the ", " ",
list("base"), " package, the ", list("sapply"), " method for ", list("List"), " objects
", " is a user-friendly version of ", list("lapply"), " by default returning a vector ", " or matrix if appropriate. ", " ", " ", list("endoapply"), " and ", list("mendoapply"), " perform the endomorphic equivalents ", " of ", list(list("lapply")), " and ", list(list("mapply")), " by returning ", " objects of the same class as the inputs rather than an ordinary list. ", " ", " ",
list("revElements(x, i)"), " reverses the list elements in ", list("x"), " specified
", " by ", list("i"), ". It's equivalent to, but faster than, doing ", " ", list("x[i] <- endoapply(x[i], rev)"), ". ", " ", " ", list("pc(...)"), " combine list-like objects by concatenating them in an ", " element-wise fashion. It's similar to, but faster than, ", " ", list("mapply(c, ..., SIMPLIFY=FALSE)"), ". With the following differences: ", " ", list(" ", " ", list(), " ",
list("pc()"), " ignores the supplied objects that are NULL.
", " ", list(), " ", list("pc()"), " does not recycle its arguments. All the supplied ", " objects must have the same length. ", " ", list(), " If one of the supplied objects is a ", list("List"), " object, then ", " ", list("pc()"), " returns a ", list("List"), " object. ", " ", list(), " ", list("pc()"), " always returns a homogenous list or ", list("List"), " object, ", " that is, an object where all the list elements have the same type. ",
" "), "
", " ")) list(list("Functional programming methods for List objects"), list(" ", " The R base package defines some higher-order functions that are commonly ", " found in Functional Programming Languages. ", " See ", list("?base::", list("Reduce")), " for the details, and, in particular, ", " for a description of their arguments. ", " The ", list("S4Vectors"), " package provides methods for ", list("List"), " objects, so, ", " in addition to be an ordinary vector or list, the ", list(
"x"), " argument can
", " also be a ", list("List"), " object. ", " ")) list(list("Evaluation of an expression in a List object"), list(" ", " ", list("within"), " evaluates ", list("expr"), " within ", list("as.env(data)"), " via ", " ", list("eval(data)"), ". Similar to ", list("with"), ", except assignments made ", " during evaluation are taken as assignments into ", list("data"), ", i.e., ", " new symbols have their value appended to ", list("data"), ", and assigning ", " new values to existing symbols results in replacement. ", " ")) list(list("Binding Lists into a matrix"), list(" ", " There are methods for ", list("cbind"), " and ", list("rbind"), " that will bind ", " multiple lists together into a basic list matrix. The usual ", " geometric constraints apply. In the future, this might return a List ", " (+ dimensions), but for now the return value is an ordinary list. ", " "))
Value
endoapply
returns an object of the same class as X
,
each element of which is the result of applying FUN
to the
corresponding element of X
.
mendoapply
returns an object of the same class as the first
object specified in list() , each element of which is the result
of applying FUN
to the corresponding elements of list() .
pc
returns a list or List object of the same length as the
input objects.
See ?base::
for the value returned by the
functional programming methods.
See ?base::
for the value returned by
within
.
cbind
and rbind
return a list matrix.
Seealso
The List class.
base::
andbase::
for the defaultlapply
andmapply
methods.base::
for the default functional programming methods.base::
for the defaultwithin
method.base::
andbase::
for the default matrix binding methods.
Author
P. Aboyoun and H. Pagès
Examples
a <- data.frame(x = 1:10, y = rnorm(10))
b <- data.frame(x = 1:10, y = rnorm(10))
endoapply(a, function(x) (x - mean(x))/sd(x))
mendoapply(function(e1, e2) (e1 - mean(e1)) * (e2 - mean(e2)), a, b)
x <- list(a=11:13, b=26:21, c=letters)
y <- list(-(5:1), c("foo", "bar"), 0.25)
pc(x, y)
library(IRanges)
x <- IntegerList(a=11:13, b=26:21, c=31:36, d=4:2)
y <- NumericList(-(5:1), 1:2, numeric(0), 0.25)
pc(x, y)
Reduce("+", x)
Filter(is.unsorted, x)
pos1 <- Position(is.unsorted, x)
stopifnot(identical(Find(is.unsorted, x), x[[pos1]]))
pos2 <- Position(is.unsorted, x, right=TRUE)
stopifnot(identical(Find(is.unsorted, x, right=TRUE), x[[pos2]]))
y <- x * 1000L
Map("c", x, y)
rbind(x, y)
cbind(x, y)
Pairs_class()
Pairs objects
Description
Pairs
is a Vector
that stores two parallel vectors (any
object that can be a column in a DataFrame ). It
provides conveniences for performing binary operations on the vectors,
as well as for converting between an equivalent list
representation. Virtually all of the typical R vector operations
should behave as expected.
A typical use case is representing the pairing from a
findOverlaps
call, for which
findOverlapPairs
is a shortcut.
Seealso
Hits-class , a typical way to define a pairing.
findOverlapPairs
in the IRanges package, which generates an instance of this class based on overlaps.setops-methods in the IRanges package, for set operations on Pairs objects.
Author
Michael Lawrence
Examples
p <- Pairs(1:10, Rle(1L, 10), score=rnorm(10), names=letters[1:10])
identical(first(p), 1:10)
mcols(p)$score
p%in% p[1:5]
as.data.frame(p)
z <- zipup(p)
first(p) <- Rle(1:10)
identical(zipdown(z), p)
Rle_class()
Rle objects
Description
The Rle class is a general container for storing an atomic vector
that is stored in a run-length encoding format. It is based on the
rle
function from the base package.
Seealso
Rle-utils , Rle-runstat , and aggregate for more operations on Rle objects.
Author
P. Aboyoun
Examples
x <- Rle(10:1, 1:10)
x
runLength(x)
runValue(x)
nrun(x)
diff(x)
unique(x)
sort(x)
x[c(1,3,5,7,9)]
x > 4
x2 <- Rle(LETTERS[c(21:26, 25:26)], 8:1)
table(x2)
y <- Rle(c(TRUE,TRUE,FALSE,FALSE,TRUE,FALSE,TRUE,TRUE,TRUE))
y
as.vector(y)
rep(y, 10)
c(y, x > 5)
Rle_runstat()
Fixed-width running window summaries
Description
The runsum
, runmean
, runmed
, runwtsum
,
runq
functions calculate the sum, mean, median, weighted sum,
and order statistic for fixed width running windows.
Usage
runsum(x, k, endrule = c("drop", "constant"), ...)
runmean(x, k, endrule = c("drop", "constant"), ...)
list(list("smoothEnds"), list("Rle"))(y, k = 3)
list(list("runmed"), list("Rle"))(x, k, endrule = c("median", "keep", "drop", "constant"),
algorithm = NULL, print.level = 0)
runwtsum(x, k, wt, endrule = c("drop", "constant"), ...)
runq(x, k, i, endrule = c("drop", "constant"), ...)
Arguments
Argument | Description |
---|---|
x,y | The data object. |
k | An integer indicating the fixed width of the running window. Must be odd when endrule != "drop" . |
|endrule
| A character string indicating how the values at the beginning and the end (of the data) should be treated. list("
", " ", list(list(list(""median"")), list("see ", list(list("runmed")), ";")), "
", " ", list(list(list(""keep"")), list("see ", list(list("runmed")), ";")), "
", " ", list(list(list(""drop"")), list("do not extend the running statistics to
", " be the same length as the underlying vectors;")), "
", " ", list(list(list(""constant"")), list("copies running statistic to the
", " first values and analogously for the last ones making the
", " smoothed ends ", |
list("constant"), ".")), "
", " ")
|wt
| A numeric vector of length k
that provides the weights to use. |
|i
| An integer in [0, k] indicating which order statistic to calculate. |
|algorithm,print.level
| See ?stats::
for a description of these arguments. |
|list()
| Additional arguments passed to methods. Specifically, na.rm
. When na.rm = TRUE
, the NA
and NaN
values are removed. When na.rm = FALSE
, NA
is returned if either NA
or NaN
are in the specified window. |
Details
The runsum
, runmean
, runmed
, runwtsum
,
and runq
functions provide efficient methods for calculating
the specified numeric summary by performing the looping in compiled code.
Value
An object of the same class as x
.
Seealso
runmed
, Rle-class , RleList-class
Author
P. Aboyoun and V. Obenchain
Examples
x <- Rle(1:10, 1:10)
runsum(x, k = 3)
runsum(x, k = 3, endrule = "constant")
runmean(x, k = 3)
runwtsum(x, k = 3, wt = c(0.25, 0.5, 0.25))
runq(x, k = 5, i = 3, endrule = "constant")
## Missing and non-finite values
x <- Rle(c(1, 2, NA, 0, 3, Inf, 4, NaN))
runsum(x, k = 2)
runsum(x, k = 2, na.rm = TRUE)
runmean(x, k = 2, na.rm = TRUE)
runwtsum(x, k = 2, wt = c(0.25, 0.5), na.rm = TRUE)
runq(x, k = 2, i = 2, na.rm = TRUE) ## max value in window
## The .naive_runsum() function demonstrates the semantics of
## runsum(). This test ensures the behavior is consistent with
## base::sum().
.naive_runsum <- function(x, k, na.rm=FALSE)
sapply(0:(length(x)-k),
function(offset) sum(x[1:k + offset], na.rm=na.rm))
x0 <- c(1, Inf, 3, 4, 5, NA)
x <- Rle(x0)
target1 <- .naive_runsum(x0, 3, na.rm = TRUE)
target2 <- .naive_runsum(x, 3, na.rm = TRUE)
stopifnot(target1 == target2)
current <- as.vector(runsum(x, 3, na.rm = TRUE))
stopifnot(target1 == current)
## runmean() and runwtsum() :
x <- Rle(c(2, 1, NA, 0, 1, -Inf))
runmean(x, k = 3)
runmean(x, k = 3, na.rm = TRUE)
runwtsum(x, k = 3, wt = c(0.25, 0.50, 0.25))
runwtsum(x, k = 3, wt = c(0.25, 0.50, 0.25), na.rm = TRUE)
## runq() :
runq(x, k = 3, i = 1, na.rm = TRUE) ## smallest value in window
runq(x, k = 3, i = 3, na.rm = TRUE) ## largest value in window
## When na.rm = TRUE, it is possible the number of non-NA
## values in the window will be less than the 'i' specified.
## Here we request the 4th smallest value in the window,
## which tranlates to the value at the 4/5 (0.8) percentile.
x <- Rle(c(1, 2, 3, 4, 5))
runq(x, k=length(x), i=4, na.rm=TRUE)
## The same request on a Rle with two missing values
## finds the value at the 0.8 percentile of the vector
## at the new length of 3 after the NA's have been removed.
## This translates to round((0.8) * 3).
x <- Rle(c(1, 2, 3, NA, NA))
runq(x, k=length(x), i=4, na.rm=TRUE)
Rle_utils()
Common operations on Rle objects
Description
Common operations on Rle objects.
Seealso
Rle objects
Author
P. Aboyoun
Examples
x <- Rle(10:1, 1:10)
x
sqrt(x)
x^2 + 2 * x + 1
range(x)
sum(x)
mean(x)
z <- c("the", "quick", "red", "fox", "jumps", "over", "the", "lazy", "brown", "dog")
z <- Rle(z, seq_len(length(z)))
chartr("a", "@", z)
toupper(z)
S4Vectors_internals()
S4Vectors internals
Description
Objects, classes and methods defined in the S4Vectors package that are not intended to be used directly.
SimpleList_class()
SimpleList objects
Description
The (non-virtual) SimpleList class extends the List virtual class.
Details
The SimpleList class is the simplest, most generic concrete implementation of the List abstraction. It provides an implementation that subclasses can easily extend.
In a SimpleList object the list elements are stored internally in an ordinary list.
Seealso
List objects for the parent class.
The CompressedList class defined in the IRanges package for a more efficient alternative to SimpleList.
The SimpleIntegerList class defined in the IRanges package for a SimpleList subclass example.
The DataFrame class for another SimpleList subclass example.
Examples
## Displaying a SimpleList object:
x1 <- SimpleList(a=letters, i=Rle(22:20, 4:2))
class(x1)
## The "Simple" prefix is removed from the real class name of the
## object:
x1
library(IRanges)
x2 <- IntegerList(11:12, integer(0), 3:-2, compress=FALSE)
class(x2)
## The "Simple" prefix is removed from the real class name of the
## object:
x2
## This is controlled by internal helper classNameForDisplay():
classNameForDisplay(x2)
Vector_class()
Vector objects
Description
The Vector virtual class serves as the heart of the S4Vectors package and has over 90 subclasses. It serves a similar role as vector in base R.
The Vector class supports the storage of global and element-wise metadata:
The global metadata annotates the object as a whole: this metadata is accessed via the
metadata
accessor and is represented as an ordinary list;The element-wise metadata annotates individual elements of the object: this metadata is accessed via the
mcols
accessor (mcols
stands for metadata columns ) and is represented as a DataTable object (i.e. as an instance of a concrete subclass of DataTable , e.g. a DataFrame object), with a row for each element and a column for each metadata variable. Note that the element-wise metadata can also beNULL
.
To be functional, a class that inherits from Vector must define at
least a length
and a "["
method.
Seealso
Vector-comparison for comparing, ordering, and tabulating vector-like objects.
Vector-setops for set operations on vector-like objects.
Vector-merge for merging vector-like objects.
List for a direct Vector extension that serves a similar role as list in base R.
extractList for grouping elements of a vector-like object into a list-like object.
DataTable which is the type of objects returned by the
mcols
accessor.The Annotated class, which Vector extends.
Examples
showClass("Vector") # shows (some of) the known subclasses
Vector_comparison()
Compare, order, tabulate vector-like objects
Description
Generic functions and methods for comparing, ordering, and tabulating vector-like objects.
Usage
## Element-wise (aka "parallel") comparison of 2 Vector objects
## ------------------------------------------------------------
pcompare(x, y)
list(list("=="), list("Vector,Vector"))(e1, e2)
list(list("=="), list("Vector,ANY"))(e1, e2)
list(list("=="), list("ANY,Vector"))(e1, e2)
list(list("<="), list("Vector,Vector"))(e1, e2)
list(list("<="), list("Vector,ANY"))(e1, e2)
list(list("<="), list("ANY,Vector"))(e1, e2)
list(list("!="), list("Vector,Vector"))(e1, e2)
list(list("!="), list("Vector,ANY"))(e1, e2)
list(list("!="), list("ANY,Vector"))(e1, e2)
list(list(">="), list("Vector,Vector"))(e1, e2)
list(list(">="), list("Vector,ANY"))(e1, e2)
list(list(">="), list("ANY,Vector"))(e1, e2)
list(list("<"), list("Vector,Vector"))(e1, e2)
list(list("<"), list("Vector,ANY"))(e1, e2)
list(list("<"), list("ANY,Vector"))(e1, e2)
list(list(">"), list("Vector,Vector"))(e1, e2)
list(list(">"), list("Vector,ANY"))(e1, e2)
list(list(">"), list("ANY,Vector"))(e1, e2)
## selfmatch()
## -----------
selfmatch(x, ...)
## duplicated() & unique()
## -----------------------
list(list("duplicated"), list("Vector"))(x, incomparables=FALSE, ...)
list(list("unique"), list("Vector"))(x, incomparables=FALSE, ...)
## %in%
## ----
list(list("%in%"), list("Vector,Vector"))(x, table)
list(list("%in%"), list("Vector,ANY"))(x, table)
list(list("%in%"), list("ANY,Vector"))(x, table)
## findMatches() & countMatches()
## ------------------------------
findMatches(x, table, select=c("all", "first", "last"), ...)
countMatches(x, table, ...)
## rank()
## ------
list(list("rank"), list("Vector"))(x, na.last = TRUE, ties.method = c("average",
"first", "last", "random", "max", "min"), by)
## sort()
## ------
list(list("sort"), list("Vector"))(x, decreasing=FALSE, na.last=NA, by)
## table()
## -------
list(list("table"), list("Vector"))(...)
Arguments
Argument | Description |
---|---|
x, y, e1, e2, table | Vector-like objects. |
incomparables | The duplicated method for Vector objects does NOT support this argument. The unique method for Vector objects, which is implemented on top of duplicated , propagates this argument to its call to duplicated . See ?base:: and ?base:: for more information about this argument. |
select | Only select="all" is supported at the moment. Note that you can use match if you want to do select="first" . Otherwise you're welcome to request this on the Bioconductor mailing list. |
ties.method | See ?base:: . |
decreasing, na.last | See ?base:: . |
by | A formula referencing the metadata columns by which to sort, e.g., ~ x + y sorts by column x , breaking ties with column y . |
... | A Vector object for table (the table method for Vector objects can only take one input object). Otherwise, extra arguments supported by specific methods. In particular: |
The default
selfmatch
method, which is implemented on top ofmatch
, propagates the extra arguments to its call tomatch
.The
duplicated
method for Vector objects, which is implemented on top ofselfmatch
, accepts extra argumentfromLast
and propagates the other extra arguments to its call toselfmatch
. See?base::
for more information about this argument.The
unique
method for Vector objects, which is implemented on top ofduplicated
, propagates the extra arguments to its call toduplicated
.The default
findMatches
andcountMatches
methods, which are implemented on top ofmatch
andselfmatch
, propagate the extra arguments to their calls tomatch
andselfmatch
.The
sort
method for Vector objects, which is implemented on top oforder
, only accepts extra argumentna.last
and propagates it to its call toorder
.
Details
Doing pcompare(x, y)
on 2 vector-like objects x
and y
of length 1 must return an integer less than, equal to, or greater than zero
if the single element in x
is considered to be respectively less than,
equal to, or greater than the single element in y
.
If x
or y
have a length != 1, then they are typically expected
to have the same length so pcompare(x, y)
can operate element-wise,
that is, in that case it returns an integer vector of the same length
as x
and y
where the i-th element is the result of compairing
x[i]
and y[i]
. If x
and y
don't have the same
length and are not zero-length vectors, then the shortest is first
recycled to the length of the longest. If one of them is a zero-length
vector then pcompare(x, y)
returns a zero-length integer vector.
selfmatch(x, ...)
is equivalent to match(x, x, ...)
. This
is actually how the default method is implemented. However note that
selfmatch(x, ...)
will typically be more efficient than
match(x, x, ...)
on vector-like objects for which a specific
selfmatch
method is implemented.
findMatches
is an enhanced version of match
which, by default
(i.e. if select="all"
), returns all the matches in a Hits
object.
countMatches
returns an integer vector of the length of x
containing the number of matches in table
for each element
in x
.
Value
For pcompare
: see Details section above.
For selfmatch
: an integer vector of the same length as x
.
For duplicated
, unique
, and %in%
: see
?BiocGenerics::
,
?BiocGenerics::
,
and ?`` . For
findMatches: a [Hits](#hits) object by default (i.e. if
select="all"). For
countMatches: an integer vector of the length of
xcontaining the number of matches in
tablefor each element in
x. For
sort: see
?BiocGenerics::. For
table: a 1D array of integer values promoted to the
"table"class. See
?BiocGeneric::for more information. ## Seealso * The [Vector](#vector) class. * [Hits-comparison](#hits-comparison) for comparing and ordering hits. * [Vector-setops](#vector-setops) for set operations on vector-like objects. * [Vector-merge](#vector-merge) for merging vector-like objects. * [IntegerRanges-comparison](#integerranges-comparison) in the IRanges package for comparing and ordering ranges. * [
==](#==) and [
%in%](#%in%) in the base package, and
BiocGenerics::,
BiocGenerics::,
BiocGenerics::,
BiocGenerics::,
BiocGenerics::,
BiocGenerics::in the BiocGenerics package for general information about the comparison/ordering operators and functions. * The [Hits](#hits) class. *
BiocGeneric::in the BiocGenerics package. ## Note The following notes are for developers who want to implement comparing, ordering, and tabulating methods for their own [Vector](#vector) subclass: * The 6 traditional binary comparison operators are:
==,
!=,
<=,
>=,
<, and
>. The list("S4Vectors") package provides the following methods for these operators: list(" ", "setMethod("==", c("Vector", "Vector"), ", " function(e1, e2) { pcompare(e1, e2) == 0L } ", ") ", "setMethod("<=", c("Vector", "Vector"), ", " function(e1, e2) { pcompare(e1, e2) <= 0L } ", ") ", "setMethod("!=", c("Vector", "Vector"), ", " function(e1, e2) { !(e1 == e2) } ", ") ", "setMethod(">=", c("Vector", "Vector"), ", " function(e1, e2) { e2 <= e1 } ", ") ", "setMethod("<", c("Vector", "Vector"), ", " function(e1, e2) { !(e2 <= e1) } ", ") ", "setMethod(">", c("Vector", "Vector"), ", " function(e1, e2) { !(e1 <= e2) } ", ") ", " ") With these definitions, the 6 binary operators work out-of-the-box on [Vector](#vector) objects for which
pcompareworks the expected way. If
pcompareis not implemented, then it's enough to implement
==and
<=methods to have the 4 remaining operators (
!=,
>=,
<, and
>) work out-of-the-box. * The list("S4Vectors") package provides no
pcomparemethod for [Vector](#vector) objects. Specific
pcomparemethods need to be implemented for specific [Vector](#vector) subclasses (e.g. for [Hits](#hits) and [IntegerRanges](#integerranges) objects). These specific methods must obey the rules described in the Details section above. * The
duplicated,
unique, and
%in%methods for [Vector](#vector) objects are implemented on top of
selfmatch,
duplicated, and
match, respectively, so they work out-of-the-box on [Vector](#vector) objects for which
selfmatch,
duplicated, and
matchwork the expected way. * Also the default
findMatchesand
countMatchesmethods are implemented on top of
matchand
selfmatchso they work out-of-the-box on [Vector](#vector) objects for which those things work the expected way. * However, since
selfmatchitself is also implemented on top of
match, then having
matchwork the expected way is actually enough to get
selfmatch,
duplicated,
unique,
%in%,
findMatches, and
countMatcheswork out-of-the-box on [Vector](#vector) objects. * The
sortmethod for [Vector](#vector) objects is implemented on top of
order, so it works out-of-the-box on [Vector](#vector) objects for which
orderworks the expected way. * The
tablemethod for [Vector](#vector) objects is implemented on top of
selfmatch,
order, and
as.character, so it works out-of-the-box on a [Vector](#vector) object for which those things work the expected way. * The list("S4Vectors") package provides no
matchor
ordermethods for [Vector](#vector) objects. Specific methods need to be implemented for specific [Vector](#vector) subclasses (e.g. for [Hits](#hits) and [IntegerRanges](#integerranges) objects). ## Author Hervé Pagès ## Examples ```r ## --------------------------------------------------------------------- ## A. SIMPLE EXAMPLES ## --------------------------------------------------------------------- y <- c(16L, -3L, -2L, 15L, 15L, 0L, 8L, 15L, -2L) selfmatch(y) x <- c(unique(y), 999L) findMatches(x, y) countMatches(x, y) ## See ?
IntegerRanges-comparison` for more examples (on IntegerRanges
## objects). You might need to load the IRanges package first.
## ---------------------------------------------------------------------
## B. FOR DEVELOPERS: HOW TO IMPLEMENT THE BINARY COMPARISON OPERATORS
## FOR YOUR Vector SUBCLASS
## ---------------------------------------------------------------------
## The answer is: don't implement them. Just implement pcompare() and the
## binary comparison operators will work out-of-the-box. Here is an
## example:
## (1) Implement a simple Vector subclass.
setClass("Raw", contains="Vector", representation(data="raw"))
setMethod("length", "Raw", function(x) length(x@data))
setMethod("[", "Raw",
function(x, i, j, ..., drop) { x@data <- x@data[i]; x }
)
x <- new("Raw", data=charToRaw("AB.x0a-BAA+C"))
stopifnot(identical(length(x), 12L))
stopifnot(identical(x[7:3], new("Raw", data=charToRaw("-a0x."))))
## (2) Implement a "pcompare" method for Raw objects.
setMethod("pcompare", c("Raw", "Raw"),
function(x, y) {as.integer(x@data) - as.integer(y@data)}
)
stopifnot(identical(which(x == x[1]), c(1L, 9L, 10L)))
stopifnot(identical(x[x < x[5]], new("Raw", data=charToRaw(".-+"))))
```
Vector_merge()
Merge vector-like objects
Description
A merge
method for vector-like objects.
Usage
list(list("merge"), list("Vector,Vector"))(x, y, ..., all=FALSE, all.x=NA, all.y=NA, sort=TRUE)
Arguments
Argument | Description |
---|---|
x, y, ... | Vector-like objects, typically all of the same class and typically not list-like objects (even though some list-like objects like IntegerRanges and DNAStringSet are supported). Duplicated elements in each object are removed with a warning. |
all | TRUE or FALSE . Whether the vector elements in the result should be the union (when all=TRUE ) or intersection (when all=FALSE ) of the vector elements in x , y , ... . |
all.x, all.y | To be used only when merging 2 objects (binary merge). Both all.x and all.y must be single logicals. If any of them is NA , then it's set to the value of all . Setting both of them to TRUE or both of them to FALSE is equivalent to setting all to TRUE or to FALSE , respectively (see above). If all.x is TRUE and all.y is FALSE then the vector elements in the result will be the unique elements in x . If all.x is FALSE and all.y is TRUE then the vector elements in the result will be the unique elements in y . |
sort | Whether to sort the merged result. |
Details
This merge
method acts much like merge.data.frame
,
except for 3 important differences:
The matching is based on the vector values, not arbitrary columns in a table.
Self merging is a no-op if
sort=FALSE
(or object already sorted) and if the object has no duplicates.This
merge
method accepts an arbitrary number of vector-like objects (n-ary merge).
If some of the objects to merge are list-like objects not supported by
the method described here, then the merging is simply done by calling
base::merge()
on the objects. This might succeed or not...
Value
A vector-like object of the same class as the input objects (if they all have the same class) containing the merged vector values and metadata columns.
Seealso
The Vector class.
Vector-comparison for comparing and ordering vector-like objects.
Vector-setops for set operations on vector-like objects.
Examples
library(GenomicRanges)
x <- GRanges(c("chr1:1-1000", "chr2:2000-3000"),
score=c(0.45, 0.1), a1=c(5L, 7L), a2=c(6, 8))
y <- GRanges(c("chr2:150-151", "chr1:1-10", "chr2:2000-3000"),
score=c(0.7, 0.82, 0.1), b1=c(0L, 5L, 1L), b2=c(1, -2, 1))
merge(x, y)
merge(x, y, all=TRUE)
merge(x, y, all.x=TRUE)
merge(x, y, all.y=TRUE)
## Shared metadata columns must agree:
mcols(x)$score[2] <- 0.11
#merge(x, y) # error!
## NAs agree with anything:
mcols(x)$score[2] <- NA
merge(x, y)
Vector_setops()
Set operations on vector-like objects
Description
Perform set operations on Vector objects.
Usage
list(list("union"), list("Vector,Vector"))(x, y)
list(list("intersect"), list("Vector,Vector"))(x, y)
list(list("setdiff"), list("Vector,Vector"))(x, y)
list(list("setequal"), list("Vector,Vector"))(x, y)
Arguments
Argument | Description |
---|---|
x, y | Vector-like objects. |
Details
The union
, intersect
, and setdiff
methods for
Vector objects return a Vector object containing respectively
the union, intersection, and (asymmetric!) difference of the 2 sets of
vector elements in x
and y
.
The setequal
method for Vector objects checks for list("set
", " equality") between x
and y
.
They're defined as follow:
list(" setMethod("union", c("Vector", "Vector"),
", " function(x, y) unique(c(x, y))
", " )
", " setMethod("intersect", c("Vector", "Vector"),
", " function(x, y) unique(x[x %in% y])
", " )
", " setMethod("setdiff", c("Vector", "Vector"),
", " function(x, y) unique(x[!(x %in% y)])
", " )
", " setMethod("setequal", c("Vector", "Vector"),
", " function(x, y) all(x %in% y) && all(y %in% x)
", " )
", " ")
so they work out-of-the-box on Vector objects for which c
,
unique
, and %in%
are defined.
Value
union
returns a Vector object obtained by appending to x
the elements in y
that are not already in x
.
intersect
returns a Vector object obtained by keeping only
the elements in x
that are also in y
.
setdiff
returns a Vector object obtained by dropping from
x
the elements that are in y
.
setequal
returns TRUE
if x
and y
contain the
same sets of vector elements and FALSE
otherwise.
union
, intersect
, and setdiff
propagate the names and
metadata columns of their first argument ( x
).
Seealso
Vector-comparison for comparing and ordering vector-like objects.
Vector-merge for merging vector-like objects.
Vector objects.
BiocGenerics::
,BiocGenerics::
, andBiocGenerics::
in the BiocGenerics package for general information about these generic functions.
Author
Hervé Pagès
Examples
## See ?`Hits-setops` for some examples.
aggregate_methods()
Compute summary statistics of subsets of vector-like objects
Description
The S4Vectors package defines aggregate
methods
for Vector , Rle , and List objects.
Usage
list(list("aggregate"), list("Vector"))(x, by, FUN, start=NULL, end=NULL, width=NULL,
frequency=NULL, delta=NULL, ..., simplify=TRUE)
list(list("aggregate"), list("Rle"))(x, by, FUN, start=NULL, end=NULL, width=NULL,
frequency=NULL, delta=NULL, ..., simplify=TRUE)
list(list("aggregate"), list("List"))(x, by, FUN, start=NULL, end=NULL, width=NULL,
frequency=NULL, delta=NULL, ..., simplify=TRUE)
Arguments
Argument | Description |
---|---|
x | A Vector , Rle , or List object. |
by | An object with start , end , and width methods. If x is a List object, the by parameter can be a IntegerRangesList object to aggregate within the list elements rather than across them. When by is a IntegerRangesList object, the output is either a SimpleAtomicList object, if possible, or a SimpleList object, if not. |
FUN | The function, found via match.fun , to be applied to each subset of x . |
start, end, width | The start, end, and width of the subsets. If by is missing, then two of the three must be supplied and have the same length. |
frequency, delta | Optional arguments that specify the sampling frequency and increment within the subsets (in the same fashion as window from the stats package does). |
... | Optional arguments to FUN . |
simplify | A logical value specifying whether the result should be simplified to a vector or matrix if possible. |
Details
Subsets of x
can be specified either via the by
argument
or via the start
, end
, width
, frequency
, and
delta
arguments.
For example, if start
and end
are specified, then:
list("
", " aggregate(x, FUN=FUN, start=start, end=end, ..., simplify=simplify)
")
is equivalent to:
list("
", " sapply(seq_along(start),
", " function(i) FUN(x[start[i]:end[i]], ...), simplify=simplify)
")
(replace x[start[i]:end[i]]
with 2D-style subsetting
x[start[i]:end[i], ]
if x
is a DataFrame object).
Seealso
The
aggregate
function in the stats package.The
start
,end
, andwidth
generic functions defined in the BiocGenerics package.
Examples
x <- Rle(10:2, 1:9)
aggregate(x, x > 4, mean)
aggregate(x, FUN=mean, start=1:26, width=20)
## Note that aggregate() works on a DataFrame object the same way it
## works on an ordinary data frame:
aggregate(DataFrame(state.x77), list(Region=state.region), mean)
aggregate(weight ~ feed, data=DataFrame(chickwts), mean)
library(IRanges)
by <- IRanges(start=1:26, width=20, names=LETTERS)
aggregate(x, by, is.unsorted)
character_utils()
Some utility functions to operate on strings
Description
Some low-level string utilities to operate on ordinary character vectors. For more advanced string manipulations, see the Biostrings package.
Usage
unstrsplit(x, sep="")
## more to come...
Arguments
Argument | Description |
---|---|
x | A list-like object where each list element is a character vector, or a character vector (identity). |
sep | A single string containing the separator. |
Details
unstrsplit(x, sep)
is equivalent to (but much faster than)
sapply(x, paste0, collapse=sep)
. It performs the reverse
transformation of strsplit
, that is,
if x
is a character vector with no NAs and sep
a single
string, then unstrsplit(strsplit(x, split=sep, fixed=TRUE), sep)
is identical to x
. A notable exception to this though is when
strsplit
finds a match at the end of a string, in which case the
last element of the output (which should normally be an empty string)
is not returned (see ?strsplit
for the details).
Value
A character vector with one string per list element in x
.
Seealso
- The
strsplit
function in the base package.
Author
Hervé Pagès
Examples
x <- list(A=c("abc", "XY"), B=NULL, C=letters[1:4])
unstrsplit(x)
unstrsplit(x, sep=",")
unstrsplit(x, sep=" => ")
data(islands)
x <- names(islands)
y <- strsplit(x, split=" ", fixed=TRUE)
x2 <- unstrsplit(y, sep=" ")
stopifnot(identical(x, x2))
## But...
names(x) <- x
y <- strsplit(x, split="in", fixed=TRUE)
x2 <- unstrsplit(y, sep="in")
y[x != x2]
## In other words: strsplit() behavior sucks :-/
expand_methods()
Unlist the list-like columns of a DataFrame object
Description
expand
transforms a DataFrame object into a new
DataFrame object where the columns specified by the user are
unlisted. The transformed DataFrame object has the same colnames
as the original but typically more rows.
Usage
list(list("expand"), list("DataFrame"))(x, colnames, keepEmptyRows = FALSE, recursive = TRUE)
Arguments
Argument | Description |
---|---|
x | A DataFrame object with list-like columns or a Vector object with list-like metadata columns (i.e. with list-like columns in mcols(x) ). |
colnames | A character or numeric vector containing the names or indices of the list-like columns to unlist. The order in which columns are unlisted is controlled by the column order in this vector. This defaults to all of the recursive (list-like) columns in x . |
keepEmptyRows | A logical indicating if rows containing empty list elements in the specified colnames should be retained or dropped. When TRUE , list elements are replaced with NA and all rows are kept. When FALSE , rows with empty list elements in the colnames columns are dropped. |
recursive | If TRUE , expand each column recursively, with the result representing their cartesian product. If FALSE , expand all of the columns in parallel, which requires that they all share the same skeleton. |
Value
A DataFrame object that has been expanded row-wise to match the length of the unlisted columns.
Seealso
- DataFrame objects.
Examples
library(IRanges)
aa <- CharacterList("a", paste0("d", 1:2), paste0("b", 1:3), c(), "c")
bb <- CharacterList(paste0("sna", 1:2),"foo", paste0("bar",1:3),c(),"hica")
df <- DataFrame(aa=aa, bb=bb, cc=11:15)
## Expand by all list-like columns (aa, bb), dropping rows with empty
## list elements:
expand(df)
## Expand the aa column only:
expand(df, colnames="aa", keepEmptyRows=TRUE)
expand(df, colnames="aa", keepEmptyRows=FALSE)
## Expand the aa and then the bb column:
expand(df, colnames=c("aa","bb"), keepEmptyRows=TRUE)
expand(df, colnames=c("aa","bb"), keepEmptyRows=FALSE)
## Expand the aa and dd column in parallel:
df$dd <- relist(seq_along(unlist(aa)), aa)
expand(df, colnames=c("aa","dd"), recursive=FALSE)
integer_utils()
Some utility functions to operate on integer vectors
Description
Some low-level utility functions to operate on ordinary integer vectors.
Usage
isSequence(x, of.length=length(x))
toListOfIntegerVectors(x, sep=",")
## more to come...
Arguments
Argument | Description |
---|---|
x | For isSequence() : An integer vector. For toListOfIntegerVectors() : A character vector where each element is a string containing comma-separated integers in decimal representation. Alternatively x can be a list of raw vectors, in which case it's treated like if it was sapply(x, rawToChar) . |
of.length | The expected length of the integer sequence. |
sep | The separator represented as a single-letter string. |
Details
isSequence()
returns TRUE
or FALSE
depending
on whether x
is identical to seq_len(of.length)
or not.
toListOfIntegerVectors()
is a fast and memory-efficient
implementation of
lapply(strsplit(x, sep, fixed=TRUE), as.integer)
but, unlike the above code, it will raise an error if the input contains NAs or strings that don't represent integer values.
Value
A list parallel to x
where each list element is an integer
vector.
Seealso
Author
Hervé Pagès
Examples
## ---------------------------------------------------------------------
## isSequence()
## ---------------------------------------------------------------------
isSequence(1:5) # TRUE
isSequence(5:1) # FALSE
isSequence(0:5) # FALSE
isSequence(integer(0)) # TRUE
isSequence(1:5, of.length=5) # TRUE (the expected length)
isSequence(1:5, of.length=6) # FALSE (not the expected length)
## ---------------------------------------------------------------------
## toListOfIntegerVectors()
## ---------------------------------------------------------------------
x <- c("1116,0,-19",
" +55291 , 2476,",
"19184,4269,5659,6470,6721,7469,14601",
"7778889, 426900, -4833,5659,6470,6721,7096",
"19184 , -99999")
y <- toListOfIntegerVectors(x)
y
## When it doesn't choke on an NA or string that doesn't represent
## an integer value, toListOfIntegerVectors() is equivalent to
## the function below but is faster and more memory-efficient:
toListOfIntegerVectors2 <- function(x, sep=",")
{
lapply(strsplit(x, sep, fixed=TRUE), as.integer)
}
y2 <- toListOfIntegerVectors2(x)
stopifnot(identical(y, y2))
isSorted()
Test if a vector-like object is sorted
Description
isSorted
and isStrictlySorted
test if a vector-like object
is sorted or strictly sorted, respectively.
isConstant
tests if a vector-like or array-like object is constant.
Currently only isConstant
methods for vectors or arrays of type
integer or double are implemented.
Usage
isSorted(x)
isStrictlySorted(x)
isConstant(x)
Arguments
Argument | Description |
---|---|
x | A vector-like object. Can also be an array-like object for isConstant . |
Details
Vector-like objects of length 0 or 1 are always considered to be sorted, strictly sorted, and constant.
Strictly sorted and constant objects are particular cases of sorted objects.
isStrictlySorted(x)
is equivalent to
isSorted(x) && !anyDuplicated(x)
Value
A single logical i.e. TRUE
, FALSE
or NA
.
Seealso
duplicated
andunique
.
Author
Hervé Pagès
Examples
## ---------------------------------------------------------------------
## A. isSorted() and isStrictlySorted()
## ---------------------------------------------------------------------
x <- 1:10
isSorted(x) # TRUE
isSorted(-x) # FALSE
isSorted(rev(x)) # FALSE
isSorted(-rev(x)) # TRUE
isStrictlySorted(x) # TRUE
x2 <- rep(x, each=2)
isSorted(x2) # TRUE
isStrictlySorted(x2) # FALSE
## ---------------------------------------------------------------------
## B. "isConstant" METHOD FOR integer VECTORS
## ---------------------------------------------------------------------
## On a vector with no NAs:
stopifnot(isConstant(rep(-29L, 10000)))
## On a vector with NAs:
stopifnot(!isConstant(c(0L, NA, -29L)))
stopifnot(is.na(isConstant(c(-29L, -29L, NA))))
## On a vector of length <= 1:
stopifnot(isConstant(NA_integer_))
## ---------------------------------------------------------------------
## C. "isConstant" METHOD FOR numeric VECTORS
## ---------------------------------------------------------------------
## This method does its best to handle rounding errors and special
## values NA, NaN, Inf and -Inf in a way that "makes sense".
## Below we only illustrate handling of rounding errors.
## Here values in 'x' are "conceptually" the same:
x <- c(11/3,
2/3 + 4/3 + 5/3,
50 + 11/3 - 50,
7.00001 - 1000003/300000)
## However, due to machine rounding errors, they are not *strictly*
## equal:
duplicated(x)
unique(x)
## only *nearly* equal:
all.equal(x, rep(11/3, 4)) # TRUE
## 'isConstant(x)' uses 'all.equal()' internally to decide whether
## the values in 'x' are all the same or not:
stopifnot(isConstant(x))
## This is not perfect though:
isConstant((x - 11/3) * 1e8) # FALSE on Intel Pentium paltforms
# (but this is highly machine dependent!)
shiftApply_methods()
Apply a function over subsequences of 2 vector-like objects
Description
shiftApply
loops and applies a function overs subsequences
of vector-like objects X
and Y
.
Usage
shiftApply(SHIFT, X, Y, FUN, ..., OFFSET=0L, simplify=TRUE, verbose=FALSE)
Arguments
Argument | Description |
---|---|
SHIFT | A non-negative integer vector of shift values. |
X, Y | The vector-like objects to shift. |
FUN | The function, found via match.fun , to be applied to each set of shifted vectors. |
... | Further arguments for FUN . |
OFFSET | A non-negative integer offset to maintain throughout the shift operations. |
simplify | A logical value specifying whether or not the result should be simplified to a vector or matrix if possible. |
verbose | A logical value specifying whether or not to print the i indices to track the iterations. |
Details
Let i
be the indices in SHIFT
,
X_i = window(X, 1 + OFFSET, length(X) - SHIFT[i])
, and
Y_i = window(Y, 1 + SHIFT[i], length(Y) - OFFSET)
.
shiftApply
calculates the set of FUN(X_i, Y_i, ...)
values
and returns the results in a convenient form.
Seealso
Examples
set.seed(0)
lambda <- c(rep(0.001, 4500), seq(0.001, 10, length = 500),
seq(10, 0.001, length = 500))
xRle <- Rle(rpois(1e7, lambda))
yRle <- Rle(rpois(1e7, lambda[c(251:length(lambda), 1:250)]))
cor(xRle, yRle)
shifts <- seq(235, 265, by=3)
corrs <- shiftApply(shifts, yRle, xRle, FUN=cor)
cor(xRle, yRle)
shiftApply(249:251, yRle, xRle,
FUN=function(x, y) var(x, y) / (sd(x) * sd(y)))
show_utils()
Display utilities
Description
Low-level utility functions and classes defined in the S4Vectors package to support display of vector-like objects. They are not intended to be used directly.
split_methods()
Divide a vector-like object into groups
Description
split
divides the data in a vector-like object x
into the
groups defined by f
.
NOTE: This man page is for the split
methods defined in the
S4Vectors package. See ?base::
for the
default method (defined in the base package).
Usage
list(list("split"), list("Vector,ANY"))(x, f, drop=FALSE)
list(list("split"), list("ANY,Vector"))(x, f, drop=FALSE)
list(list("split"), list("Vector,Vector"))(x, f, drop=FALSE)
list(list("split"), list("list,Vector"))(x, f, drop=FALSE, ...)
Arguments
Argument | Description |
---|---|
x, f | 2 vector-like objects of the same length. f will typically be a factor, but not necessarily. |
drop | Logical indicating if levels that do not occur should be dropped (if f is a factor). |
... | Arguments passed to base:: (see Details below). |
Details
The first 3 methods just delegate to the
IRanges::
function defined in the
IRanges package.
The method for list
does:
split(x, as.vector(f), drop=drop, ...)
Value
All these methods behave like base::
except that the
first 3 methods return a List object instead of an ordinary list.
Seealso
The
split
function in the base package.The
splitAsList
function in the IRanges package.
Examples
## On an Rle object:
x <- Rle(101:105, 6:2)
split(x, c("B", "B", "A", "B", "A"))
## On a DataFrame object:
groups <- c("group1", "group2")
DF <- DataFrame(
a=letters[1:10],
i=101:110,
group=rep(factor(groups, levels=groups), c(3, 7))
)
split(DF, DF$group)
subsetting_utils()
Subsetting utilities
Description
Low-level utility functions and classes defined in the S4Vectors package to support subsetting of vector-like objects. They are not intended to be used directly.
zip_methods()
Convert between parallel vectors and lists
Description
The zipup
and zipdown
functions convert between two
parallel vectors and a list of doublets (elements of length 2). The
metaphor, borrowed from Python's zip
, is that of a zipper. The
zipup
function interleaves the elements of the parallel vectors
into a list of doublets. The inverse operation is zipdown
,
which returns a Pairs object.
Usage
zipup(x, y, ...)
zipdown(x, ...)
Arguments
Argument | Description |
---|---|
x,y | For zipup , any vector-like object. For zipdown , a doublet list. |
list() | Arguments passed to methods. |
Value
For zipup
, a list-like object, where every element is of length 2.
For zipdown
, a Pairs object.
Seealso
- Pairs objects.
Examples
z <- zipup(1:10, Rle(1L, 10))
pairs <- zipdown(z)