bioconductor v3.9.0 IRanges
Provides efficient low-level and highly reusable S4
Summary
Functions
Lists of Atomic Vectors in Natural and Rle Form
Common operations on AtomicList objects
CompressedHitsList objects
CompressedList objects
List of DataFrames
Common operations on DataFrame objects
Grouping objects
Examples of basic manipulation of Hits objects
IPosRanges objects
Comparing and ordering ranges
Memory-efficient representation of integer positions
List of IRanges and NormalIRanges
IRanges and NormalIRanges objects
The IRanges constructor and supporting functions
IRanges internals
IRanges utility functions
IntegerRangesList objects
IntegerRanges objects
List objects (old man page)
MaskCollection objects
Nested Containment List objects
Data on ranges
Selection of ranges and columns
List of RleViews
The RleViews class
Rle objects (old man page)
Vector objects (old man page)
List of Views
Views objects
Coverage of a set of ranges
Group elements of a vector-like object into a list-like object
Extract list fragments from a list-like object
Finding overlapping ranges
Inter range transformations of an IntegerRanges, Views, IntegerRangesList, or MaskCollection object
Intra range transformations of an IRanges, IPos, Views, RangesList, or MaskCollection object
Split elements belonging to multiple groups
Finding the nearest range neighbor
Squeeze the ranges out of a range-based object
Read a mask from a file
reverse
2 methods that should be documented somewhere else
Set operations on IntegerRanges and IntegerRangesList objects
Slice a vector-like or list-like object
Summarize views on a vector-like object with numeric values
Functions
AtomicList_class()
Lists of Atomic Vectors in Natural and Rle Form
Description
An extension of List that holds only atomic vectors in either a natural or run-length encoded form.
Details
The lists of atomic vectors are LogicalList, IntegerList, NumericList, ComplexList, CharacterList, and RawList. There is also an RleList class for run-length encoded versions of these atomic vector types.
Each of the above mentioned classes is virtual with Compressed and Simple non-virtual representations.
Seealso
AtomicList-utils for common operations on AtomicList objects.
List objects in the S4Vectors package for the parent class.
Author
P. Aboyoun
Examples
int1 <- c(1L,2L,3L,5L,2L,8L)
int2 <- c(15L,45L,20L,1L,15L,100L,80L,5L)
collection <- IntegerList(int1, int2)
## names
names(collection) <- c("one", "two")
names(collection)
names(collection) <- NULL # clear names
names(collection)
names(collection) <- "one"
names(collection) # c("one", NA)
## extraction
collection[[1]] # int1
collection[["1"]] # NULL, does not exist
collection[["one"]] # int1
collection[[NA_integer_]] # NULL
## subsetting
collection[numeric()] # empty
collection[NULL] # empty
collection[] # identity
collection[c(TRUE, FALSE)] # first element
collection[2] # second element
collection[c(2,1)] # reversed
collection[-1] # drop first
collection$one
## replacement
collection$one <- int2
collection[[2]] <- int1
## concatenating
col1 <- IntegerList(one = int1, int2)
col2 <- IntegerList(two = int2, one = int1)
col3 <- IntegerList(int2)
append(col1, col2)
append(col1, col2, 0)
col123 <- c(col1, col2, col3)
col123
## revElements
revElements(col123)
revElements(col123, 4:5)
AtomicList_utils()
Common operations on AtomicList objects
Description
Common operations on AtomicList objects.
Seealso
AtomicList objects.
Author
P. Aboyoun
Examples
## group generics
int1 <- c(1L,2L,3L,5L,2L,8L)
int2 <- c(15L,45L,20L,1L,15L,100L,80L,5L)
col1 <- IntegerList(one = int1, int2)
2 * col1
col1 + col1
col1 > 2
sum(col1) # equivalent to (but faster than) 'sapply(col1, sum)'
mean(col1) # equivalent to 'sapply(col1, mean)'
CompressedHitsList_class()
CompressedHitsList objects
Description
An efficient representation of HitsList objects. See ?HitsList for more information about HitsList objects.
Seealso
HitsList objects.
Note
This class is highly experimental. It has not been well tested and may disappear at any time.
Author
Michael Lawrence
CompressedList_class()
CompressedList objects
Description
Like the SimpleList class defined in the S4Vectors package, the CompressedList class extends the List virtual class.
Details
Unlike the SimpleList class, CompressedList is virtual, that is, it cannot be instantiated. Many concrete (i.e. non-virtual) CompressedList subclasses are defined and documented in this package (e.g. CompressedIntegerList , CompressedCharacterList , CompressedRleList , etc...), as well as in other packages (e.g. GRangesList in the GenomicRanges package, GAlignmentsList in the GenomicAlignments package, etc...). It's easy for developers to extend CompressedList to create a new CompressedList subclass and there is generally very little work involved to make this new subclass fully operational.
In a CompressedList object the list elements are concatenated together in a single vector-like object. The partitioning of this single vector-like object (i.e. the information about where each original list element starts and ends) is also kept in the CompressedList object. This internal representation is generally more memory efficient than SimpleList , especially if the object has many list elements (e.g. thousands or millions). Also it makes it possible to implement many basic list operations very efficiently.
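The concatenated layout described above can be observed from the outside. The following is a minimal sketch, assuming the IRanges package is attached; it uses unlist() to obtain the elements laid end to end and PartitioningByEnd() to recover where each list element ends. It says nothing about the slot names used internally.

```r
library(IRanges)

x <- IntegerList(11:12, integer(0), 3:-2, compress=TRUE)

## All list elements laid end to end in a single integer vector:
flat <- unlist(x)

## Position in 'flat' where each list element ends. The 2nd element is
## empty, so its end repeats the end of the 1st element:
breaks <- end(PartitioningByEnd(x))

flat    # 11 12 3 2 1 0 -1 -2
breaks  # 2 2 8
```

The values in 'breaks' are the cumulative sums of the element lengths (2, 0, 6).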
Many objects like LogicalList, IntegerList, CharacterList, RleList, etc... exist in 2 flavors: CompressedList and SimpleList. Each flavor is incarnated by a concrete subclass: CompressedLogicalList and SimpleLogicalList for virtual class LogicalList, CompressedIntegerList and SimpleIntegerList for virtual class IntegerList, etc... It's easy to switch from one representation to the other with as(x, "CompressedList") and as(x, "SimpleList"). Also the constructor functions for those virtual classes have a switch that lets the user choose the representation at construction time e.g. CharacterList(..., compress=TRUE) or CharacterList(..., compress=FALSE). See below for more information.
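As an illustration of the two flavors, the sketch below builds the same IntegerList in both representations and then switches representation after construction. It assumes the IRanges package is attached; the variable names are arbitrary.

```r
library(IRanges)

x_c <- IntegerList(1:3, 10:11, compress=TRUE)   # compressed flavor
x_s <- IntegerList(1:3, 10:11, compress=FALSE)  # simple flavor

is(x_c, "CompressedIntegerList")  # TRUE
is(x_s, "SimpleIntegerList")      # TRUE

## Both flavors are IntegerList objects holding the same values:
is(x_c, "IntegerList") && is(x_s, "IntegerList")
all(unlist(x_c) == unlist(x_s))

## Switching representation after construction:
x_s2 <- as(x_c, "SimpleList")
is(x_s2, "SimpleList")
```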
Seealso
The List class defined and documented in the S4Vectors package for the parent class.
The SimpleList class defined and documented in the S4Vectors package for an alternative to CompressedList.
The CompressedIntegerList class for a CompressedList subclass example.
Examples
## Displaying a CompressedList object:
x <- IntegerList(11:12, integer(0), 3:-2, compress=TRUE)
class(x)
## The "Compressed" prefix is removed from the real class name of the
## object:
x
## This is controlled by internal helper classNameForDisplay():
classNameForDisplay(x)
DataFrameList_class()
List of DataFrames
Description
Represents a list of DataFrame objects.
The SplitDataFrameList class contains the additional restriction that all the columns be of the same name and type. Internally it is stored as a list of DataFrame objects and extends List.
Seealso
DataFrame
Author
Michael Lawrence
DataFrame_utils()
Common operations on DataFrame objects
Description
Common operations on DataFrame objects.
Seealso
DataTable and Vector
Author
Michael Lawrence
Examples
## split
sw <- DataFrame(swiss)
swsplit <- split(sw, sw[["Education"]])
## rbind
do.call(rbind, as.list(swsplit))
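## cbind
score <- c(1L, 3L, NA)    # illustrative values, not part of the original page
counts <- c(10L, 2L, NA)  # illustrative values, not part of the original page
cbind(DataFrame(score), DataFrame(counts))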
Grouping_class()
Grouping objects
Description
We call "grouping" an arbitrary mapping from a collection of NO objects to a collection of NG groups, or, more formally, a bipartite graph between integer sets [1, NO] and [1, NG]. Objects mapped to a given group are said to belong to, or to be assigned to, or to be in that group. Additionally, the objects in each group are ordered. So for example the 2 following groupings are considered different:

  Grouping 1: NG = 3, NO = 5
    group   objects
      1 :   4, 2
      2 :
      3 :   4

  Grouping 2: NG = 3, NO = 5
    group   objects
      1 :   2, 4
      2 :
      3 :   4

There is no restriction on the mapping e.g. any object can be mapped to 0, 1, or more groups, and can be mapped twice to the same group. Also some or all of the groups can be empty.
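To make the role of ordering concrete, here is a small sketch in base R that uses plain lists as stand-ins for the two groupings in the example above. The list representation and the variable names are assumptions made for illustration only; they are not the Grouping class itself.

```r
## Plain-list stand-ins for the two groupings (NG = 3, NO = 5).
## Element i holds the objects assigned to group i, in order.
grouping1 <- list(c(4L, 2L), integer(0), 4L)
grouping2 <- list(c(2L, 4L), integer(0), 4L)

## Every group has the same members in both groupings...
same_members <- identical(lapply(grouping1, sort), lapply(grouping2, sort))

## ...but the order within group 1 differs, so the groupings differ.
same_grouping <- identical(grouping1, grouping2)

same_members   # TRUE
same_grouping  # FALSE
```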
The Grouping class is a virtual class that formalizes the most general kind of grouping. More specific groupings (e.g. "many-to-one groupings" or "block-groupings") are formalized via specific Grouping subclasses.
This man page documents the core Grouping API, and 3 important Grouping subclasses: ManyToOneGrouping, GroupingRanges, and Partitioning (the last one deriving from the 2 first).
Seealso
IntegerList-class , IntegerRanges-class , IRanges-class , successiveIRanges , cumsum , diff
Author
Hervé Pagès, Michael Lawrence
Examples
showClass("Grouping") # shows (some of) the known subclasses
## ---------------------------------------------------------------------
## A. H2LGrouping OBJECTS
## ---------------------------------------------------------------------
high2low <- c(NA, NA, 2, 2, NA, NA, NA, 6, NA, 1, 2, NA, 6, NA, NA, 2)
h2l <- H2LGrouping(high2low)
h2l
## The core Grouping API:
length(h2l)
nobj(h2l) # same as 'length(h2l)' for H2LGrouping objects
h2l[[1]]
h2l[[2]]
h2l[[3]]
h2l[[4]]
h2l[[5]]
grouplengths(h2l) # same as 'unname(sapply(h2l, length))'
grouplengths(h2l, 5:2)
members(h2l, 5:2) # all the members are put together and sorted
togroup(h2l)
togroup(h2l, 5:2)
togrouplength(h2l) # same as 'grouplengths(h2l, togroup(h2l))'
togrouplength(h2l, 5:2)
## The List API:
as.list(h2l)
sapply(h2l, length)
## ---------------------------------------------------------------------
## B. Dups OBJECTS
## ---------------------------------------------------------------------
dups1 <- as(h2l, "Dups")
dups1
duplicated(dups1) # same as 'duplicated(togroup(dups1))'
### The purpose of a Dups object is to describe the groups of duplicated
### elements in a vector-like object:
x <- c(2, 77, 4, 4, 7, 2, 8, 8, 4, 99)
x_high2low <- high2low(x)
x_high2low # same length as 'x'
dups2 <- Dups(x_high2low)
dups2
togroup(dups2)
duplicated(dups2)
togrouplength(dups2) # frequency for each element
table(x)
## ---------------------------------------------------------------------
## C. Partitioning OBJECTS
## ---------------------------------------------------------------------
pbe1 <- PartitioningByEnd(c(4, 7, 7, 8, 15), names=LETTERS[1:5])
pbe1 # the 3rd partition is empty
## The core Grouping API:
length(pbe1)
nobj(pbe1)
pbe1[[1]]
pbe1[[2]]
pbe1[[3]]
grouplengths(pbe1) # same as 'unname(sapply(pbe1, length))'
# and 'width(pbe1)'
togroup(pbe1)
togrouplength(pbe1) # same as 'grouplengths(pbe1, togroup(pbe1))'
names(pbe1)
## The IntegerRanges core API:
start(pbe1)
end(pbe1)
width(pbe1)
## The List API:
as.list(pbe1)
sapply(pbe1, length)
## Replacing the names:
names(pbe1)[3] <- "empty partition"
pbe1
## Coercion to an IRanges object:
as(pbe1, "IRanges")
## Other examples:
PartitioningByEnd(c(0, 0, 19), names=LETTERS[1:3])
PartitioningByEnd() # no partition
PartitioningByEnd(integer(9)) # all partitions are empty
x <- c(1L, 5L, 5L, 6L, 8L)
pbe2 <- PartitioningByEnd(x, NG=10L)
stopifnot(identical(togroup(pbe2), x))
pbw2 <- PartitioningByWidth(x, NG=10L)
stopifnot(identical(togroup(pbw2), x))
## ---------------------------------------------------------------------
## D. RELATIONSHIP BETWEEN Partitioning OBJECTS AND successiveIRanges()
## ---------------------------------------------------------------------
mywidths <- c(4, 3, 0, 1, 7)
## The 3 following calls produce the same ranges:
ir <- successiveIRanges(mywidths) # IRanges instance.
pbe <- PartitioningByEnd(cumsum(mywidths)) # PartitioningByEnd instance.
pbw <- PartitioningByWidth(mywidths) # PartitioningByWidth instance.
stopifnot(identical(as(ir, "PartitioningByEnd"), pbe))
stopifnot(identical(as(ir, "PartitioningByWidth"), pbw))
Hits_class_leftovers()
Examples of basic manipulation of Hits objects
Description
IMPORTANT NOTE - 4/29/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for Hits objects located in the S4Vectors package.
Details
The as.data.frame method coerces a Hits object to a two column data.frame with one row for each hit, where the value in the first column is the index of an element in the query and the value in the second column is the index of an element in the subject.
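A short sketch of this coercion, assuming the IRanges package is attached. The query and subject are the same as in the Examples section of this page; only the shape of the result is checked here.

```r
library(IRanges)

query   <- IRanges(c(1, 4, 9), c(5, 7, 10))
subject <- IRanges(c(2, 2, 10), c(2, 3, 12))
hits <- findOverlaps(query, subject)

df <- as.data.frame(hits)
ncol(df)                  # 2: query index and subject index
nrow(df) == length(hits)  # TRUE: one row per hit
df
```

Here the 1st query range overlaps the 1st and 2nd subject ranges, the 2nd query range overlaps nothing, and the 3rd query range overlaps the 3rd subject range, so the data.frame has 3 rows.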
Seealso
The Hits class defined and documented in the S4Vectors package.
Examples
query <- IRanges(c(1, 4, 9), c(5, 7, 10))
subject <- IRanges(c(2, 2, 10), c(2, 3, 12))
hits <- findOverlaps(query, subject)
as.matrix(hits)
as.data.frame(hits)
as.table(hits) # hits per query
as.table(t(hits)) # hits per subject
## Turn a Hits object into an IntegerList object with one list element
## per element in the original query.
as(hits, "IntegerList")
as(hits, "List") # same as as(hits, "IntegerList")
## Turn a Hits object into a PartitioningByEnd object that describes
## the grouping of hits by query.
as(hits, "PartitioningByEnd")
as(hits, "Partitioning") # same as as(hits, "PartitioningByEnd")
## ---------------------------------------------------------------------
## remapHits()
## ---------------------------------------------------------------------
hits2 <- remapHits(hits,
Rnodes.remapping=factor(c("e", "e", "d"), letters[1:5]))
hits2
hits3 <- remapHits(hits,
Rnodes.remapping=c(5, 5, 4), new.nRnode=5)
hits3
stopifnot(identical(hits2, hits3))
IPosRanges_class()
IPosRanges objects
Description
The IPosRanges virtual class is a general container for storing a vector of ranges of integer positions.
Details
An IPosRanges object is a vector-like object where each element describes a "range of integer positions".
A "range of integer values" is a finite set of consecutive integer values. Each range can be fully described with exactly 2 integer values which can be arbitrarily picked up among the 3 following values: its "start" i.e. its smallest (or first, or leftmost) value; its "end" i.e. its greatest (or last, or rightmost) value; and its "width" i.e. the number of integer values in the range. For example the set of integer values that are greater than or equal to -20 and less than or equal to 400 is the range that starts at -20 and has a width of 421. In other words, a range is a closed, one-dimensional interval with integer end points and on the domain of integers.
The starting point (or "start") of a range can be any integer (see start below) but its "width" must be a non-negative integer (see width below). The ending point (or "end") of a range is equal to its "start" plus its "width" minus one (see end below).
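The relationship between start, end, and width can be checked on the range from -20 to 400 mentioned above. This is a minimal sketch assuming the IRanges package is attached; IRanges is used here as a concrete IPosRanges derivative.

```r
library(IRanges)

## The range of integers from -20 to 400, specified by start and end:
r1 <- IRanges(start=-20, end=400)
width(r1)   # 421

## The same range, specified by start and width:
r2 <- IRanges(start=-20, width=421)
end(r2)     # 400, i.e. start + width - 1

r1 == r2    # TRUE
```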
An "empty" range is a range that contains no value i.e. a range that has a width of zero. Depending on the context, it can be interpreted either as just the empty set of integers or, more precisely, as the position between its "end" and its "start" (note that for an empty range, the "end" equals the "start" minus one).
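For instance, the following sketch (assuming the IRanges package is attached) builds a zero-width range and confirms that its end sits one position before its start:

```r
library(IRanges)

## A zero-width range anchored at start 5:
e <- IRanges(start=5, width=0)

width(e)                 # 0
end(e)                   # 4, i.e. start - 1
end(e) == start(e) - 1   # TRUE
```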
The length of an IPosRanges object is the number of ranges in it, not the number of integer values in its ranges.
An IPosRanges object is considered empty iff all its ranges are empty.
IPosRanges objects have a vector-like semantic i.e. they only support single subscript subsetting (unlike, for example, standard R data frames which can be subsetted by row and by column).
The IPosRanges class itself is a virtual class. The following classes derive directly from it: IRanges , IPos , NCList , and GroupingRanges .
Seealso
IRanges objects ( NormalIRanges objects are documented in the same man page).
The IPos class, a memory-efficient IPosRanges derivative for representing integer positions (i.e. integer ranges of width 1).
IPosRanges-comparison for comparing and ordering ranges.
findOverlaps-methods for finding/counting overlapping ranges.
intra-range-methods and inter-range-methods for intra range and inter range transformations of IntegerRanges derivatives.
coverage-methods for computing the coverage of a set of ranges.
setops-methods for set operations on ranges.
nearest-methods for finding the nearest range neighbor.
Author
H. Pagès and M. Lawrence
Examples
## ---------------------------------------------------------------------
## Basic manipulation
## ---------------------------------------------------------------------
x <- IRanges(start=c(2:-1, 13:15), width=c(0:3, 2:0))
x
length(x)
start(x)
width(x)
end(x)
isEmpty(x)
as.matrix(x)
as.data.frame(x)
## Subsetting:
x[4:2] # 3 ranges
x[-1] # 6 ranges
x[FALSE] # 0 range
x0 <- x[width(x) == 0] # 2 ranges
isEmpty(x0)
## Use the replacement methods to resize the ranges:
width(x) <- width(x) * 2 + 1
x
end(x) <- start(x) # equivalent to width(x) <- 1
x
width(x) <- c(2, 0, 4)
x
start(x)[3] <- end(x)[3] - 2 # resize the 3rd range
x
## Name the elements:
names(x)
names(x) <- c("range1", "range2")
x
x[is.na(names(x))] # 5 ranges
x[!is.na(names(x))] # 2 ranges
ir <- IRanges(c(1,5), c(3,10))
ir*1 # no change
ir*c(1,2) # zoom second range by 2X
ir*-2 # zoom out 2X
IPosRanges_comparison()
Comparing and ordering ranges
Description
Methods for comparing and/or ordering the ranges in IPosRanges derivatives (e.g. IRanges , IPos , or NCList objects).
Usage
## match() & selfmatch()
## ---------------------
## S4 method for signature 'IPosRanges,IPosRanges'
match(x, table, nomatch=NA_integer_, incomparables=NULL,
      method=c("auto", "quick", "hash"))
## S4 method for signature 'IPosRanges'
selfmatch(x, method=c("auto", "quick", "hash"))
## order() and related methods
## ----------------------------
## S4 method for signature 'IPosRanges'
is.unsorted(x, na.rm=FALSE, strictly=FALSE)
## S4 method for signature 'IPosRanges'
order(..., na.last=TRUE, decreasing=FALSE,
      method=c("auto", "shell", "radix"))
## Generalized parallel comparison of 2 IPosRanges derivatives
## -----------------------------------------------------------
## S4 method for signature 'IPosRanges,IPosRanges'
pcompare(x, y)
rangeComparisonCodeToLetter(code)
Arguments
Argument | Description |
---|---|
x, table, y | IPosRanges derivatives e.g. IRanges , IPos , or NCList objects. |
nomatch | The value to be returned in the case when no match is found. It is coerced to an integer . |
incomparables | Not supported. |
method | For match and selfmatch : Use a Quicksort-based ( method="quick" ) or a hash-based ( method="hash" ) algorithm. The latter tends to give better performance, except maybe for some pathological input that we've not encountered so far. When method="auto" is specified, the most efficient algorithm will be used, that is, the hash-based algorithm if length(x) <= 2^29 , otherwise the Quicksort-based algorithm. For order : The method argument is ignored. |
na.rm | Ignored. |
strictly | Logical indicating if the check should be for strictly increasing values. |
... | One or more IPosRanges derivatives. The 2nd and following objects are used to break ties. |
na.last | Ignored. |
decreasing | TRUE or FALSE . |
code | A vector of codes as returned by pcompare . |
Details
Two ranges of an IPosRanges derivative are considered equal iff they share the same start and width. duplicated() and unique() on an IPosRanges derivative conform to this definition.
Note that with this definition, 2 empty ranges are generally not equal (they need to share the same start to be considered equal). This means that, when it comes to comparing ranges, an empty range is interpreted as a position between its end and start. For example, a typical use case is comparison of insertion points defined along a string (like a DNA sequence) and represented as empty ranges.
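The sketch below illustrates this with two hypothetical insertion points represented as empty ranges. It assumes the IRanges package is attached; the variable names are arbitrary.

```r
library(IRanges)

## Two insertion points, each represented as a zero-width range:
ins1 <- IRanges(start=5, width=0)
ins2 <- IRanges(start=9, width=0)

ins1 == ins1   # TRUE:  same start, same width
ins1 == ins2   # FALSE: both empty, but the starts differ

## Both insertion points survive deduplication:
length(unique(c(ins1, ins2, ins1)))   # 2
```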
The "natural order" for the elements of an IPosRanges derivative is to order them (a) first by start and (b) then by width. This way, the space of integer ranges is totally ordered.
pcompare(), ==, !=, <=, >=, < and > on IPosRanges derivatives behave according to this "natural order".
is.unsorted(), order(), sort(), rank() on IPosRanges derivatives also behave according to this "natural order".
Finally, note that some inter range transformations like reduce or disjoin also use this "natural order" implicitly when operating on IPosRanges derivatives.
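As a quick check of the "natural order", the sketch below compares order() on an IRanges object with a base R ordering by start and then width, and verifies that the sign of the pcompare() codes agrees with the comparison operators. It assumes the IRanges package is attached; the ranges are made up for illustration.

```r
library(IRanges)

x <- IRanges(start=c(5, 1, 5, 1), width=c(2, 7, 1, 3))

## order() sorts by start first, then by width:
order(x)   # 4 2 3 1
identical(order(x), order(start(x), width(x)))

## The sign of a pcompare() code agrees with the comparison operators:
y <- IRanges(start=5, width=2)
codes <- pcompare(x, y)
all((codes < 0) == (x < y))
all((codes == 0) == (x == y))
```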
pcompare(x, y):
  Performs element-wise (aka "parallel") comparison of 2 IPosRanges objects x and y, that is, returns an integer vector where the i-th element is a code describing how x[i] is qualitatively positioned with respect to y[i].
  Here is a summary of the 13 predefined codes (and their letter equivalents) and their meanings:

      -6 a: x[i]: .oooo.......         6 m: x[i]: .......oooo.
            y[i]: .......oooo.              y[i]: .oooo.......

      -5 b: x[i]: ..oooo......         5 l: x[i]: ......oooo..
            y[i]: ......oooo..              y[i]: ..oooo......

      -4 c: x[i]: ...oooo.....         4 k: x[i]: .....oooo...
            y[i]: .....oooo...              y[i]: ...oooo.....

      -3 d: x[i]: ...oooooo...         3 j: x[i]: .....oooo...
            y[i]: .....oooo...              y[i]: ...oooooo...

      -2 e: x[i]: ..oooooooo..         2 i: x[i]: ....oooo....
            y[i]: ....oooo....              y[i]: ..oooooooo..

      -1 f: x[i]: ...oooo.....         1 h: x[i]: ...oooooo...
            y[i]: ...oooooo...              y[i]: ...oooo.....

       0 g: x[i]: ...oooooo...
            y[i]: ...oooooo...

  Note that this way of comparing ranges is a refinement over the standard ranges comparison defined by the ==, !=, <=, >=, < and > operators. In particular a code that is < 0, = 0, or > 0, corresponds to x[i] < y[i], x[i] == y[i], or x[i] > y[i], respectively.
  The pcompare method for IPosRanges derivatives is guaranteed to return predefined codes only but methods for other objects (e.g. for GenomicRanges objects) can return non-predefined codes. Like for the predefined codes, the sign of any non-predefined code must tell whether x[i] is less than, or greater than y[i].

rangeComparisonCodeToLetter(code):
  Translate the codes returned by pcompare. The 13 predefined codes are translated as follows: -6 -> a; -5 -> b; -4 -> c; -3 -> d; -2 -> e; -1 -> f; 0 -> g; 1 -> h; 2 -> i; 3 -> j; 4 -> k; 5 -> l; 6 -> m. Any non-predefined code is translated to X. The translated codes are returned in a factor with 14 levels: a, b, ..., l, m, X.

match(x, table, nomatch=NA_integer_, method=c("auto", "quick", "hash")):
  Returns an integer vector of the length of x, containing the index of the first matching range in table (or nomatch if there is no matching range) for each range in x.

selfmatch(x, method=c("auto", "quick", "hash")):
  Equivalent to, but more efficient than, match(x, x, method=method).

duplicated(x, fromLast=FALSE, method=c("auto", "quick", "hash")):
  Determines which elements of x are equal to elements with smaller subscripts, and returns a logical vector indicating which elements are duplicates. duplicated(x) is equivalent to, but more efficient than, duplicated(as.data.frame(x)) on an IPosRanges derivative. See duplicated in the base package for more details.

unique(x, fromLast=FALSE, method=c("auto", "quick", "hash")):
  Removes duplicate ranges from x. unique(x) is equivalent to, but more efficient than, unique(as.data.frame(x)) on an IPosRanges derivative. See unique in the base package for more details.

x %in% table:
  A shortcut for finding the ranges in x that match any of the ranges in table. Returns a logical vector of length equal to the number of ranges in x.

findMatches(x, table, method=c("auto", "quick", "hash")):
  An enhanced version of match that returns all the matches in a Hits object.

countMatches(x, table, method=c("auto", "quick", "hash")):
  Returns an integer vector of the length of x containing the number of matches in table for each element in x.

order(...):
  Returns a permutation which rearranges its first argument (an IPosRanges derivative) into ascending order, breaking ties by further arguments (also IPosRanges derivatives).

sort(x):
  Sorts x. See sort in the base package for more details.

rank(x, na.last=TRUE, ties.method=c("average", "first", "random", "max", "min")):
  Returns the sample ranks of the ranges in x. See rank in the base package for more details.
Seealso
The IPosRanges class.
Vector-comparison in the S4Vectors package for general information about comparing, ordering, and tabulating vector-like objects.
GenomicRanges-comparison in the GenomicRanges package for comparing and ordering genomic ranges.
findOverlaps for finding overlapping ranges.
intra-range-methods and inter-range-methods for intra and inter range transformations.
setops-methods for set operations on IRanges objects.
Author
Hervé Pagès
Examples
## ---------------------------------------------------------------------
## A. ELEMENT-WISE (AKA "PARALLEL") COMPARISON OF 2 IPosRanges
## DERIVATIVES
## ---------------------------------------------------------------------
x0 <- IRanges(1:11, width=4)
x0
y0 <- IRanges(6, 9)
pcompare(x0, y0)
pcompare(IRanges(4:6, width=6), y0)
pcompare(IRanges(6:8, width=2), y0)
pcompare(x0, y0) < 0 # equivalent to 'x0 < y0'
pcompare(x0, y0) == 0 # equivalent to 'x0 == y0'
pcompare(x0, y0) > 0 # equivalent to 'x0 > y0'
rangeComparisonCodeToLetter(-10:10)
rangeComparisonCodeToLetter(pcompare(x0, y0))
## Handling of zero-width ranges (a.k.a. empty ranges):
x1 <- IRanges(11:17, width=0)
x1
pcompare(x1, x1[4])
pcompare(x1, IRanges(12, 15))
## Note that x1[2] and x1[6] are empty ranges on the edge of non-empty
## range IRanges(12, 15). Even though -1 and 3 could also be considered
## valid codes for describing these configurations, pcompare()
## considers x1[2] and x1[6] to be *adjacent* to IRanges(12, 15), and
## thus returns codes -5 and 5:
pcompare(x1[2], IRanges(12, 15)) # -5
pcompare(x1[6], IRanges(12, 15)) # 5
x2 <- IRanges(start=c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L),
width=c( 4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L))
x2
which(width(x2) == 0) # 3 empty ranges
x2[2] == x2[2] # TRUE
x2[2] == x2[5] # FALSE
x2 == x2[4]
x2 >= x2[3]
## ---------------------------------------------------------------------
## B. match(), selfmatch(), %in%, duplicated(), unique()
## ---------------------------------------------------------------------
table <- x2[c(2:4, 7:8)]
match(x2, table)
x2 %in% table
duplicated(x2)
unique(x2)
## ---------------------------------------------------------------------
## C. findMatches(), countMatches()
## ---------------------------------------------------------------------
findMatches(x2, table)
countMatches(x2, table)
x2_levels <- unique(x2)
countMatches(x2_levels, x2)
## ---------------------------------------------------------------------
## D. order() AND RELATED METHODS
## ---------------------------------------------------------------------
is.unsorted(x2)
order(x2)
sort(x2)
rank(x2, ties.method="first")
IPos_class()
Memory-efficient representation of integer positions
Description
The IPos class is a container for storing a set of integer positions where most of the positions are typically (but not necessarily) adjacent. Because integer positions can be seen as integer ranges of width 1, the IPos class extends the IntegerRanges virtual class. Note that even though an IRanges object can be used for storing integer positions, using an IPos object will be much more memory-efficient, especially when the object contains long runs of adjacent positions in ascending order.
Usage
IPos(pos_runs) # constructor function
Arguments
Argument | Description |
---|---|
pos_runs | An IRanges object (or any other IntegerRanges derivative) where each range is interpreted as a run of adjacent ascending positions. If pos_runs is not an IntegerRanges derivative, IPos() first tries to coerce it to one with as(pos_runs, "IntegerRanges", strict=FALSE) . |
Value
An IPos object.
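As a small illustration of how runs expand into positions, the sketch below builds an IPos object from two runs and inspects its length and positions. It assumes the IRanges package is attached; the runs are made up for illustration.

```r
library(IRanges)

## Two runs of adjacent positions: 1-3 and 12-13
pos_runs <- IRanges(start=c(1, 12), end=c(3, 13))
ipos <- IPos(pos_runs)

length(ipos)   # 5, i.e. sum(width(pos_runs))
pos(ipos)      # 1 2 3 12 13
```

The length of the resulting object is the total number of positions, not the number of runs.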
Seealso
The GPos class in the GenomicRanges package for a memory-efficient representation of genomic positions (i.e. genomic ranges of width 1).
IntegerRanges and IRanges objects.
IPosRanges-comparison for comparing and ordering integer ranges and/or positions.
findOverlaps-methods for finding overlapping integer ranges and/or positions.
nearest-methods for finding the nearest integer range and/or position.
Note
Like for any Vector derivative, the length of an IPos object cannot exceed .Machine$integer.max (i.e. 2^31 - 1 on most platforms). IPos() will return an error if pos_runs contains too many integer positions.
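The limit can be inspected in base R, and the number of positions a set of runs would expand to can be computed before calling the constructor. This is only a sketch: the run widths below are made-up values and the check is not part of the IPos API.

```r
## Largest representable R integer, which bounds the length of any vector-like object:
.Machine$integer.max               # 2147483647
.Machine$integer.max == 2^31 - 1   # TRUE

## Hypothetical run widths; their sum is the number of positions the
## runs would expand to:
run_widths <- c(125000, 441001)
n_positions <- sum(run_widths)
n_positions <= .Machine$integer.max   # TRUE: small enough
```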
Author
Hervé Pagès; based on ideas borrowed from Georg Stricker georg.stricker@in.tum.de and Julien Gagneur gagneur@in.tum.de
Examples
## ---------------------------------------------------------------------
## BASIC EXAMPLES
## ---------------------------------------------------------------------
## Example 1:
ipos1 <- IPos(c("44-53", "5-10", "2-5"))
ipos1
length(ipos1)
pos(ipos1) # same as 'start(ipos1)' and 'end(ipos1)'
as.character(ipos1)
as.data.frame(ipos1)
as(ipos1, "IRanges")
as.data.frame(as(ipos1, "IRanges"))
ipos1[9:17]
## Example 2:
pos_runs <- IRanges(c(1, 6, 12, 17), c(5, 10, 16, 20))
ipos2 <- IPos(pos_runs)
ipos2
## Example 3:
ipos3A <- ipos3B <- IPos(c("1-15000", "15400-88700"))
npos <- length(ipos3A)
mcols(ipos3A)$sample <- Rle("sA")
sA_counts <- sample(10, npos, replace=TRUE)
mcols(ipos3A)$counts <- sA_counts
mcols(ipos3B)$sample <- Rle("sB")
sB_counts <- sample(10, npos, replace=TRUE)
mcols(ipos3B)$counts <- sB_counts
ipos3 <- c(ipos3A, ipos3B)
ipos3
## ---------------------------------------------------------------------
## MEMORY USAGE
## ---------------------------------------------------------------------
## Coercion to IRanges works...
ipos4 <- IPos(c("1-125000", "135000-575000"))
ir4 <- as(ipos4, "IRanges")
ir4
## ... but is generally not a good idea:
object.size(ipos4)
object.size(ir4) # 1739 times bigger than the IPos object!
## Shuffling the order of the positions impacts memory usage:
ipos4s <- sample(ipos4)
object.size(ipos4s)
## AN IMPORTANT NOTE: In the worst situations, IPos still performs as
## well as an IRanges object.
object.size(as(ipos4s, "IRanges")) # same size as 'ipos4s'
## Best case scenario is when the object is strictly sorted (i.e.
## positions are in strict ascending order).
## This can be checked with:
is.unsorted(ipos4, strictly=TRUE) # 'ipos4' is strictly sorted
## ---------------------------------------------------------------------
## USING MEMORY-EFFICIENT METADATA COLUMNS
## ---------------------------------------------------------------------
## In order to keep memory usage as low as possible, it is recommended
## to use a memory-efficient representation of the metadata columns that
## we want to set on the object. Rle's are particularly well suited for
## this, especially if the metadata columns contain long runs of
## identical values. This is the case for example if we want to use an
## IPos object to represent the coverage of sequencing reads along a
## chromosome.
## Example 5:
library(pasillaBamSubset)
library(Rsamtools) # for the BamFile() constructor function
bamfile1 <- BamFile(untreated1_chr4())
bamfile2 <- BamFile(untreated3_chr4())
ipos5 <- IPos(IRanges(1, seqlengths(bamfile1)[["chr4"]]))
library(GenomicAlignments) # for "coverage" method for BamFile objects
cov1 <- coverage(bamfile1)$chr4
cov2 <- coverage(bamfile2)$chr4
mcols(ipos5) <- DataFrame(cov1, cov2)
ipos5
object.size(ipos5) # lightweight
## Keep only the positions where coverage is at least 10 in one of the
## 2 samples:
ipos5[mcols(ipos5)$cov1 >= 10 | mcols(ipos5)$cov2 >= 10]
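The relationship between an IPos object and its IRanges coercion can be checked on a tiny input. The following is a minimal sketch, assuming the IRanges package is attached; the names `runs`, `ipos` and `ir` are illustrative only and do not come from the man page.

```r
library(IRanges)

# Two runs of consecutive positions: 1..5 and 11..13
runs <- IRanges(start = c(1, 11), end = c(5, 13))
ipos <- IPos(runs)

# An IPos object has one element per integer position, not one per run
length(ipos)   # 8
pos(ipos)      # 1 2 3 4 5 11 12 13

# Coercion to IRanges expands every position into its own width-1 range,
# which is why the coerced object is so much bigger on long runs
ir <- as(ipos, "IRanges")
stopifnot(length(ir) == length(ipos), all(width(ir) == 1L))
```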
IRangesList_class()
List of IRanges and NormalIRanges
Description
IRangesList and NormalIRangesList objects for storing IRanges and NormalIRanges objects respectively.
Seealso
IntegerRangesList , the parent of this class, for more functionality.
intra-range-methods and inter-range-methods for intra and inter range transformations of IRangesList objects.
setops-methods for set operations on IRangesList objects.
Author
Michael Lawrence
Examples
range1 <- IRanges(start=c(1,2,3), end=c(5,2,8))
range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5))
named <- IRangesList(one = range1, two = range2)
length(named) # 2
names(named) # "one" and "two"
named[[1]] # range1
unnamed <- IRangesList(range1, range2)
names(unnamed) # NULL
x <- IRangesList(start=list(c(1,2,3), c(15,45,20,1)),
end=list(c(5,2,8), c(15,100,80,5)))
as.list(x)
IRanges_class()
IRanges and NormalIRanges objects
Description
The IRanges class is a simple implementation of the IntegerRanges container where 2 integer vectors of the same length are used to store the start and width values. See the IntegerRanges virtual class for a formal definition of IntegerRanges objects and for their methods (all of them should work for IRanges objects).
Some subclasses of the IRanges class are: NormalIRanges, Views , etc...
A NormalIRanges object is just an IRanges object that is guaranteed to be "normal". See the Normality section in the man page for IntegerRanges objects for the definition and properties of "normal" IntegerRanges objects.
Seealso
IRanges-constructor , IRanges-utils ,
intra-range-methods for intra range transformations,
inter-range-methods for inter range transformations,
Author
Hervé Pagès
Examples
showClass("IRanges") # shows (some of) the known subclasses
## ---------------------------------------------------------------------
## A. MANIPULATING IRanges OBJECTS
## ---------------------------------------------------------------------
## All the methods defined for IntegerRanges objects work on IRanges
## objects.
## See ?IntegerRanges for some examples.
## Also see ?`IRanges-utils` and ?`setops-methods` for additional
## operations on IRanges objects.
## Concatenating IRanges objects
ir1 <- IRanges(c(1, 10, 20), width=5)
mcols(ir1) <- DataFrame(score=runif(3))
ir2 <- IRanges(c(101, 110, 120), width=10)
mcols(ir2) <- DataFrame(score=runif(3))
ir3 <- IRanges(c(1001, 1010, 1020), width=20)
mcols(ir3) <- DataFrame(value=runif(3))
some.iranges <- c(ir1, ir2)
## all.iranges <- c(ir1, ir2, ir3) ## This will raise an error
all.iranges <- c(ir1, ir2, ir3, ignore.mcols=TRUE)
stopifnot(is.null(mcols(all.iranges)))
## ---------------------------------------------------------------------
## B. A NOTE ABOUT PERFORMANCE
## ---------------------------------------------------------------------
## Using an IRanges object for storing a big set of ranges is more
## efficient than using a standard R data frame:
N <- 2000000L # nb of ranges
W <- 180L # width of each range
start <- 1L
end <- 50000000L
set.seed(777)
range_starts <- sort(sample(end-W+1L, N))
range_widths <- rep.int(W, N)
## Instantiation is faster
system.time(x <- IRanges(start=range_starts, width=range_widths))
system.time(y <- data.frame(start=range_starts, width=range_widths))
## Subsetting is faster
system.time(x16 <- x[c(TRUE, rep.int(FALSE, 15))])
system.time(y16 <- y[c(TRUE, rep.int(FALSE, 15)), ])
## Internal representation is more compact
object.size(x16)
object.size(y16)
IRanges_constructor()
The IRanges constructor and supporting functions
Description
The IRanges function is a constructor that can be used to create IRanges instances.
solveUserSEW0 and solveUserSEW are utility functions that solve a set of user-supplied start/end/width values.
Usage
## IRanges constructor:
IRanges(start=NULL, end=NULL, width=NULL, names=NULL)
## Supporting functions (not for the end user):
solveUserSEW0(start=NULL, end=NULL, width=NULL)
solveUserSEW(refwidths, start=NA, end=NA, width=NA,
rep.refwidths=FALSE,
translate.negative.coord=TRUE,
allow.nonnarrowing=FALSE)
Arguments
Argument | Description |
---|---|
start, end, width | For IRanges and solveUserSEW0 : NULL , or vector of integers (possibly with NAs). For solveUserSEW : vector of integers (possibly with NAs). |
names | A character vector or NULL . |
refwidths | Vector of non-NA non-negative integers containing the reference widths. |
rep.refwidths | TRUE or FALSE . Use of rep.refwidths=TRUE is supported only when refwidths is of length 1. |
translate.negative.coord, allow.nonnarrowing | TRUE or FALSE . |
Seealso
Author
Hervé Pagès
Examples
## ---------------------------------------------------------------------
## A. USING THE IRanges() CONSTRUCTOR
## ---------------------------------------------------------------------
IRanges(start=11, end=rep.int(20, 5))
IRanges(start=11, width=rep.int(20, 5))
IRanges(-2, 20) # only one range
IRanges(start=c(2, 0, NA), end=c(NA, NA, 14), width=11:0)
IRanges() # IRanges instance of length zero
IRanges(names=character())
## With logical input:
x <- IRanges(c(FALSE, TRUE, TRUE, FALSE, TRUE)) # logical vector input
isNormal(x) # TRUE
x <- IRanges(Rle(1:30) %% 5 <= 2) # logical Rle input
isNormal(x) # TRUE
## ---------------------------------------------------------------------
## B. USING solveUserSEW()
## ---------------------------------------------------------------------
refwidths <- c(5:3, 6:7)
refwidths
solveUserSEW(refwidths)
solveUserSEW(refwidths, start=4)
solveUserSEW(refwidths, end=3, width=2)
solveUserSEW(refwidths, start=-3)
solveUserSEW(refwidths, start=-3, width=2)
solveUserSEW(refwidths, end=-4)
## The start/end/width arguments are recycled:
solveUserSEW(refwidths, start=c(3, -4, NA), end=c(-2, NA))
## Using 'rep.refwidths=TRUE':
solveUserSEW(10, start=-(1:6), rep.refwidths=TRUE)
solveUserSEW(10, end=-(1:6), width=3, rep.refwidths=TRUE)
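Since a range satisfies width = end - start + 1, supplying any two of start, end and width is enough for the constructor to derive the third. The sketch below illustrates this with the constructor only; the variable names (`ir_se`, `ir_sw`, `ir_ew`) are illustrative and not part of the package.

```r
library(IRanges)

# The same two ranges, described in three different ways
ir_se <- IRanges(start = c(1, 10), end = c(5, 12))    # start + end
ir_sw <- IRanges(start = c(1, 10), width = c(5, 3))   # start + width
ir_ew <- IRanges(end = c(5, 12), width = c(5, 3))     # end + width

# All three hold identical start and width values
stopifnot(identical(start(ir_se), start(ir_sw)),
          identical(start(ir_se), start(ir_ew)),
          identical(width(ir_se), width(ir_sw)),
          identical(width(ir_se), width(ir_ew)))

# The defining relation between start, end and width
stopifnot(all(end(ir_se) == start(ir_se) + width(ir_se) - 1L))
```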
IRanges_internals()
IRanges internals
Description
Objects, classes and methods defined in the IRanges package that are not intended to be used directly.
IRanges_utils()
IRanges utility functions
Description
Utility functions for creating or modifying IRanges objects.
Usage
## Create an IRanges instance:
successiveIRanges(width, gapwidth=0, from=1)
breakInChunks(totalsize, nchunk, chunksize)
## Turn a logical vector into a set of ranges:
whichAsIRanges(x)
## Coercion:
asNormalIRanges(x, force=TRUE)
Arguments
Argument | Description |
---|---|
width | A vector of non-negative integers (with no NAs) specifying the widths of the ranges to create. |
gapwidth | A single integer or an integer vector with one less element than the width vector specifying the widths of the gaps separating one range from the next one. |
from | A single integer specifying the starting position of the first range. |
totalsize | A single non-negative integer. The total size of the object to break. |
nchunk | A single positive integer. The number of chunks. |
chunksize | A single positive integer. The size of the chunks (last chunk might be smaller). |
x | A logical vector for whichAsIRanges . An IRanges object for asNormalIRanges . |
force | TRUE or FALSE . Should x be turned into a NormalIRanges object even if isNormal(x) is FALSE ? |
Details
successiveIRanges returns an IRanges instance containing the ranges that have the widths specified in the width vector and are separated by the gaps specified in gapwidth. The first range starts at position from. When gapwidth=0 and from=1 (the defaults), the returned IRanges can be seen as a partitioning of the 1:sum(width) interval. See ?Partitioning for more details on this.
breakInChunks returns a PartitioningByEnd object describing the "chunks" that result from breaking a vector-like object of length totalsize into the chunks described by nchunk or chunksize.
whichAsIRanges returns an IRanges instance containing all of the ranges where x is TRUE.
If force=TRUE (the default), then asNormalIRanges will turn x into a NormalIRanges instance by reordering and reducing the set of ranges if necessary (i.e. only if isNormal(x) is FALSE, otherwise the set of ranges will be untouched). If force=FALSE, then asNormalIRanges will turn x into a NormalIRanges instance only if isNormal(x) is TRUE, otherwise it will raise an error. Note that when force=FALSE, the returned object is guaranteed to contain exactly the same set of ranges as x. as(x, "NormalIRanges") is equivalent to asNormalIRanges(x, force=TRUE).
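The behaviours described in this section can be tried on small inputs. This is a minimal sketch, assuming the IRanges package is attached; all variable names are illustrative, and the values in the comments follow from the definitions above.

```r
library(IRanges)

# successiveIRanges: widths 3, 2, 4 separated by gaps of 1, starting at 10
ir <- successiveIRanges(width = c(3L, 2L, 4L), gapwidth = 1L, from = 10L)
start(ir)   # 10 14 17
end(ir)     # 12 15 20

# breakInChunks: a length-10 object cut into chunks of size 4
chunks <- breakInChunks(10, chunksize = 4)
width(chunks)   # 4 4 2 (the last chunk is smaller)

# whichAsIRanges: each run of TRUE values becomes one range
w <- whichAsIRanges(c(FALSE, TRUE, TRUE, FALSE, TRUE))
start(w)    # 2 5
width(w)    # 2 1

# asNormalIRanges: unordered, overlapping ranges are reordered and merged
x <- IRanges(start = c(10, 1, 3), end = c(12, 4, 6))
isNormal(x)                # FALSE
nx <- asNormalIRanges(x)   # force=TRUE is the default
start(nx)   # 1 10
end(nx)     # 6 12

# With force=FALSE a non-normal input raises an error instead
res <- tryCatch(asNormalIRanges(x, force = FALSE),
                error = function(e) "error")
```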
Seealso
IRanges objects.
Partitioning objects.
equisplit for splitting a list-like object into a specified number of partitions.
intra-range-methods and inter-range-methods for intra range and inter range transformations.
setops-methods for performing set operations on IRanges objects.
Author
Hervé Pagès
Examples
vec <- as.integer(c(19, 5, 0, 8, 5))
successiveIRanges(vec)
breakInChunks(600999, chunksize=50000) # chunks of size 50000 (last
# chunk is smaller)
whichAsIRanges(vec >= 5)
x <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L),
width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L))
asNormalIRanges(x) # 3 non-empty ranges ordered from left to right and
# separated by gaps of width >= 1.
## More on normality:
example(`IRanges-class`)
isNormal(x16) # FALSE
if (interactive())
x16 <- asNormalIRanges(x16) # Error!
whichFirstNotNormal(x16) # 57
isNormal(x16[1:56]) # TRUE
xx <- asNormalIRanges(x16[1:56])
class(xx)
max(xx)
min(xx)
IntegerRangesList_class()
IntegerRangesList objects
Description
The IntegerRangesList virtual class is a general container for storing a list of IntegerRanges objects.
Most users are probably more interested in the IRangesList container, an IntegerRangesList derivative for storing a list of IRanges objects.
Details
The place of IntegerRangesList in the Vector class hierarchy:
                              Vector
                                 ^
                                 |
                               List
                                 ^
                                 |
                            RangesList
                              ^     ^
                             /       \
                            /         \
                           /           \
                          /             \
                         /               \
                        /                 \
        IntegerRangesList                 GenomicRangesList
                ^                                 ^
                |                                 |
           IRangesList                       GRangesList
             ^     ^                           ^     ^
            /       \                         /       \
           /         \                       /         \
          /           \                     /           \
 SimpleIRangesList     \           SimpleGRangesList     \
              CompressedIRangesList             CompressedGRangesList
Note that the Vector class hierarchy has many more classes. In particular Vector, List, RangesList, and IntegerRangesList have other subclasses not shown here.
Seealso
IRangesList objects.
IntegerRanges and IRanges objects.
Author
M. Lawrence & H. Pagès
Examples
## ---------------------------------------------------------------------
## Basic manipulation
## ---------------------------------------------------------------------
range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8))
range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5))
named <- IRangesList(one = range1, two = range2)
length(named) # 2
start(named) # same as start(c(range1, range2))
names(named) # "one" and "two"
named[[1]] # range1
unnamed <- IRangesList(range1, range2)
names(unnamed) # NULL
# edit the width of the ranges in the list
edited <- named
width(edited) <- rep(c(3,2), elementNROWS(named))
edited
# same as list(range1, range2)
as.list(IRangesList(range1, range2))
# coerce to data.frame
as.data.frame(named)
IRangesList(range1, range2)
## zoom in 2X
collection <- IRangesList(one = range1, range2)
collection * 2
IntegerRanges_class()
IntegerRanges objects
Description
The IntegerRanges virtual class is a general container for storing ranges on the space of integers.
Details
TODO
List_class_leftovers()
List objects (old man page)
Description
IMPORTANT NOTE - 9/4/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for List objects located in the S4Vectors package.
Details
The only thing left here is the documentation of the stack method for List objects. In the code snippets below, x is a List object.
stack(x, index.var = "name", value.var = "value") : As with stack on a list, constructs a DataFrame with two columns: one for the unlisted values, the other indicating the name of the element from which each value was obtained. index.var specifies the column name for the index (source name) column and value.var specifies the column name for the values.
Seealso
- The List class defined and documented in the S4Vectors package.
Examples
starts <- IntegerList(c(1, 5), c(2, 8))
ends <- IntegerList(c(3, 8), c(5, 9))
rgl <- IRangesList(start=starts, end=ends)
rangeDataFrame <- stack(rgl, "space", "ranges")
MaskCollection_class()
MaskCollection objects
Description
The MaskCollection class is a container for storing a collection of masks that can be used to mask regions in a sequence.
Details
In the context of the Biostrings package, a mask is a set of regions in a sequence that need to be excluded from some computation. For example, when calling alphabetFrequency or matchPattern on a chromosome sequence, you might want to exclude some regions like the centromere or the repeat regions. This can be achieved by putting one or several masks on the sequence before calling alphabetFrequency on it.
A MaskCollection object is a vector-like object that represents such a set of masks. Like standard R vectors, it has a "length" which is the number of masks contained in it. But unlike standard R vectors, it also has a "width" which determines the length of the sequences it can be "put on". For example, a MaskCollection object of width 20000 can only be put on an XString object of 20000 letters.
Each mask in a MaskCollection object x is just a finite set of integers that are >= 1 and <= width(x). When "put on" a sequence, these integers indicate the positions of the letters to mask. Internally, each mask is represented by a NormalIRanges object.
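The distinction between the "length" and the "width" of a MaskCollection, and the NormalIRanges representation of each mask, can be seen on a single small mask. This is a minimal sketch that reuses only the Mask() constructor and the `[[` accessor shown in the examples of this page; the names `m` and `nir` are illustrative.

```r
library(IRanges)

# One mask on a sequence of 20 letters, masking positions 3..6 and 12..13
m <- Mask(mask.width = 20, start = c(3, 12), width = c(4, 2))

length(m)   # 1: the collection holds a single mask
width(m)    # 20: it can only be put on a sequence of 20 letters

# The mask itself is stored as a NormalIRanges object
nir <- m[[1]]
is(nir, "NormalIRanges")   # TRUE
sum(width(nir))            # 6 masked positions in total
```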
Seealso
NormalIRanges-class, read.Mask, MaskedXString-class, reverse, alphabetFrequency, matchPattern
Author
Hervé Pagès
Examples
## Making a MaskCollection object:
mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
mymasks <- append(append(mask1, mask2), mask3)
mymasks
length(mymasks)
width(mymasks)
collapse(mymasks)
## Names and descriptions:
names(mymasks) <- c("A", "B", "C") # names should be short and unique...
mymasks
mymasks[c("C", "A")] # ...to make subsetting by names easier
desc(mymasks) <- c("you can be", "more verbose", "here")
mymasks[-2]
## Activate/deactivate masks:
active(mymasks)["B"] <- FALSE
mymasks
collapse(mymasks)
active(mymasks) <- FALSE # deactivate all masks
mymasks
active(mymasks)[-1] <- TRUE # reactivate all masks except mask 1
active(mymasks) <- !active(mymasks) # toggle all masks
## Other advanced operations:
mymasks[[2]]
length(mymasks[[2]])
mymasks[[2]][-3]
append(mymasks[-2], gaps(mymasks[2]))
NCList_class()
Nested Containment List objects
Description
The NCList class is a container for storing the Nested Containment List representation of an IntegerRanges object. Preprocessing an IntegerRanges object as a Nested Containment List allows efficient overlap-based operations like findOverlaps.
The NCLists class is a container for storing a collection of NCList objects. An NCLists object is typically the result of preprocessing each list element of an IntegerRangesList object as a Nested Containment List. Like with NCList, the NCLists object can then be used for efficient overlap-based operations.
To preprocess an IntegerRanges or IntegerRangesList object, simply call the NCList or NCLists constructor function on it.
Usage
NCList(x, circle.length=NA_integer_)
NCLists(x, circle.length=NA_integer_)
Arguments
Argument | Description |
---|---|
x | The IntegerRanges or IntegerRangesList object to preprocess. |
circle.length | Use only if the space (or spaces if x is an IntegerRangesList object) on top of which the ranges in x are defined needs (need) to be considered circular. If that's the case, then use circle.length to specify the length(s) of the circular space(s). For NCList , circle.length must be a single positive integer (or NA if the space is linear). For NCLists , it must be an integer vector parallel to x (i.e. same length) and with positive or NA values (NAs indicate linear spaces). |
Details
The GenomicRanges package also defines the GNCList constructor and class for preprocessing and representing a vector of genomic ranges as a data structure based on Nested Containment Lists.
Some important differences between the new findOverlaps/countOverlaps implementation based on Nested Containment Lists (BioC >= 3.1) and the old implementation based on Interval Trees (BioC < 3.1):
- With the new implementation, the hits returned by findOverlaps are not fully ordered (i.e. ordered by queryHits and subjectHits) anymore, but only partially ordered (i.e. ordered by queryHits only). Other than that, and except for the 2 particular situations mentioned below, the 2 implementations produce the same output. However, the new implementation is faster and more memory efficient.
- With the new implementation, either the query or the subject can be preprocessed with NCList for an IntegerRanges object (replacement for IntervalTree), NCLists for an IntegerRangesList object (replacement for IntervalForest), and GNCList for a GenomicRanges object (replacement for GIntervalTree). However, for a one-time use, it is NOT advised to explicitly preprocess the input. This is because findOverlaps or countOverlaps will take care of it and do a better job at it (by preprocessing only what's needed when it's needed, and releasing memory as they go).
- With the new implementation, countOverlaps on IntegerRanges or GenomicRanges objects doesn't call findOverlaps in order to collect all the hits in a growing Hits object and count them only at the end. Instead, the counting happens at the C level and the hits are not kept. This reduces memory usage considerably when there are a lot of hits.
- When minoverlap=0, zero-width ranges are now interpreted as insertion points and considered to overlap with ranges that contain them. With the old algorithm, zero-width ranges were always ignored. This is the 1st situation where the new and old implementations produce different outputs.
- When using select="arbitrary", the new implementation will generally not select the same hits as the old implementation. This is the 2nd situation where the new and old implementations produce different outputs.
- The new implementation supports preprocessing of a GenomicRanges object with ranges defined on circular sequences (e.g. on the mitochondrial chromosome). See GNCList in the GenomicRanges package for some examples.
- Objects preprocessed with NCList, NCLists, and GNCList are serializable (with save) for later use. Not a typical thing to do though, because preprocessing is very cheap (i.e. very fast and memory efficient).
Value
An NCList object for the NCList constructor and an NCLists object for the NCLists constructor.
Seealso
The GNCList constructor and class defined in the GenomicRanges package.
findOverlaps for finding/counting interval overlaps between two range-based objects.
IntegerRanges and IntegerRangesList objects.
Author
Hervé Pagès
References
Alexander V. Alekseyenko and Christopher J. Lee -- Nested Containment List (NCList): a new algorithm for accelerating interval query of genome alignment and interval databases. Bioinformatics (2007) 23 (11): 1386-1393. doi: 10.1093/bioinformatics/btl647
Examples
## The example below is for illustration purpose only and does NOT
## reflect typical usage. This is because, for a one-time use, it is
## NOT advised to explicitly preprocess the input for findOverlaps()
## or countOverlaps(). These functions will take care of it and do a
## better job at it (by preprocessing only what's needed when it's
## needed, and releasing memory as they go).
query <- IRanges(c(1, 4, 9), c(5, 7, 10))
subject <- IRanges(c(2, 2, 10), c(2, 3, 12))
## Either the query or the subject of findOverlaps() can be preprocessed:
ppsubject <- NCList(subject)
hits1 <- findOverlaps(query, ppsubject)
hits1
ppquery <- NCList(query)
hits2 <- findOverlaps(ppquery, subject)
hits2
## Note that 'hits1' and 'hits2' contain the same hits but not in the
## same order.
stopifnot(identical(sort(hits1), sort(hits2)))
RangedData_class()
Data on ranges
Description
IMPORTANT NOTE: RangedData objects are deprecated in BioC 3.9! The use of RangedData objects has been discouraged in favor of GRanges or GRangesList objects since BioC 2.12, that is, since 2014. The GRanges and GRangesList classes are defined in the GenomicRanges package. See ?GRanges and ?GenomicRanges (after loading the GenomicRanges package) for more information about these classes.
PLEASE MIGRATE YOUR CODE TO USE GRanges OR GRangesList OBJECTS INSTEAD OF RangedData OBJECTS AS SOON AS POSSIBLE. Don't hesitate to ask on the bioc-devel mailing list ( https://bioconductor.org/help/support/#bioc-devel ) if you need help with this.
RangedData supports storing data, i.e. a set of variables, on a set of ranges spanning multiple spaces (e.g. chromosomes). Although the data is split across spaces, it can still be treated as one cohesive dataset when desired and extends DataTable.
Details
A RangedData object consists of two primary components: an IntegerRangesList holding the ranges over multiple spaces and a parallel SplitDataFrameList, holding the split data. There is also a universe slot for denoting the source (e.g. the genome) of the ranges and/or data.
There are two different modes of interacting with a RangedData. The first mode treats the object as a contiguous "data frame" annotated with range information. The accessors start, end, and width get the corresponding fields in the ranges as atomic integer vectors, undoing the division over the spaces. The [[ and matrix-style [, extraction and subsetting functions unroll the data in the same way. [[<- does the inverse. The number of rows is defined as the total number of ranges and the number of columns is the number of variables in the data. It is often convenient and natural to treat the data this way, at least when the data is small and there is no need to distinguish the ranges by their space.
The other mode is to treat the RangedData as a list, with an element (a virtual IntegerRanges / DataFrame pair) for each space. The length of the object is defined as the number of spaces and the value returned by the names accessor gives the names of the spaces. The list-style [ subset function behaves analogously.
Seealso
DataTable , the parent of this class, with more utilities.
Author
Michael Lawrence
Examples
ranges <- IRanges(c(1,2,3),c(4,5,6))
filter <- c(1L, 0L, 1L)
score <- c(10L, 2L, NA)
## constructing RangedData instances
## no variables
rd <- RangedData()
rd <- RangedData(ranges)
ranges(rd)
## one variable
rd <- RangedData(ranges, score)
rd[["score"]]
## multiple variables
rd <- RangedData(ranges, filter, vals = score)
rd[["vals"]] # same as rd[["score"]] above
rd$vals
rd[["filter"]]
rd <- RangedData(ranges, score + score)
rd[["score...score"]] # names made valid
## split some data over chromosomes
range2 <- IRanges(start=c(15,45,20,1), end=c(15,100,80,5))
both <- c(ranges, range2)
score <- c(score, c(0L, 3L, NA, 22L))
filter <- c(filter, c(0L, 1L, NA, 0L))
chrom <- paste("chr", rep(c(1,2), c(length(ranges), length(range2))), sep="")
rd <- RangedData(both, score, filter, space = chrom)
rd[["score"]] # identical to score
rd[1][["score"]] # identical to score[1:3]
## subsetting
## list style: [i]
rd[numeric()] # these three are all empty
rd[logical()]
rd[NULL]
rd[] # missing, full instance returned
rd[FALSE] # logical, supports recycling
rd[c(FALSE, FALSE)] # same as above
rd[TRUE] # like rd[]
rd[c(TRUE, FALSE)]
rd[1] # numeric index
rd[c(1,2)]
rd[-2]
## matrix style: [i,j]
rd[,NULL] # no columns
rd[NULL,] # no rows
rd[,1]
rd[,1:2]
rd[,"filter"]
rd[1,] # now by the rows
rd[c(1,3),]
rd[1:2, 1] # row and column
rd[c(1:2,1,3),1] ## repeating rows
## dimnames
colnames(rd)[2] <- "foo"
colnames(rd)
rownames(rd) <- head(letters, nrow(rd))
rownames(rd)
## space names
names(rd)
names(rd)[1] <- "chr1"
## variable replacement
count <- c(1L, 0L, 2L)
rd <- RangedData(ranges, count, space = c(1, 2, 1))
## adding a variable
score <- c(10L, 2L, NA)
rd[["score"]] <- score
rd[["score"]] # same as 'score'
## replacing a variable
count2 <- c(1L, 1L, 0L)
rd[["count"]] <- count2
## numeric index also supported
rd[[2]] <- score
rd[[2]] # gets 'score'
## removing a variable
rd[[2]] <- NULL
ncol(rd) # is only 1
rd$score2 <- score
## combining
rd <- RangedData(ranges, score, space = c(1, 2, 1))
c(rd[1], rd[2]) # equal to 'rd'
rd2 <- RangedData(ranges, score)
RangedSelection_class()
Selection of ranges and columns
Description
A RangedSelection represents a query against a table of interval data in terms of ranges and column names. The ranges select any table row with an overlapping interval. Note that the intervals are always returned, even if no columns are selected.
Details
Traditionally, tabular data structures have supported the subset function, which allows one to select a subset of the rows and columns from the table. In that case, the rows and columns are specified by two separate arguments. As querying interval data sources, especially those external to R, such as binary indexed files and databases, is increasingly common, there is a need to encapsulate the row and column specifications into a single data structure, mostly for the sake of interface cleanliness. The RangedSelection class fills that role.
Author
Michael Lawrence
Examples
rl <- IRangesList(chr1 = IRanges(c(1, 5), c(3, 6)))
RangedSelection(rl)
as(rl, "RangedSelection") # same as above
RangedSelection(rl, "score")
RleViewsList_class()
List of RleViews
Description
An extension of ViewsList that holds only RleViews objects. Useful for storing coverage vectors over a set of spaces (e.g. chromosomes), each of which requires a separate RleViews object.
Details
For more information on methods available for RleViewsList objects consult the man pages for ViewsList-class and view-summarization-methods .
Seealso
ViewsList-class , view-summarization-methods
Author
P. Aboyoun
Examples
## Rle objects
subject1 <- Rle(c(3L,2L,18L,0L), c(3,2,1,5))
set.seed(0)
subject2 <- Rle(c(0L,5L,2L,0L,3L), c(8,5,2,7,4))
## Views
rleViews1 <- Views(subject1, 3:0, 5:8)
rleViews2 <- Views(subject2, subject2 > 0)
## RleList and IntegerRangesList objects
rleList <- RleList(subject1, subject2)
rangesList <- IRangesList(IRanges(3:0, 5:8), IRanges(subject2 > 0))
## methods for construction
method1 <- RleViewsList(rleViews1, rleViews2)
method2 <- RleViewsList(rleList = rleList, rangesList = rangesList)
identical(method1, method2)
## calculation over the views
viewSums(method1)
RleViews_class()
The RleViews class
Description
The RleViews class is the basic container for storing a set of views (start/end locations) on the same Rle object.
Details
An RleViews object contains a set of views (start/end locations) on the same Rle object called "the subject vector" or simply "the subject". Each view is defined by its start and end locations: both are integers such that start <= end.
An RleViews object is in fact a particular case of a Views object (the RleViews class contains the Views class) so it can be manipulated in a similar manner: see the man page for Views objects for more information.
Note that two views can overlap and that a view can be "out of limits" i.e. it can start before the first element of the subject or/and end after its last element.
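As a small illustration of views on an Rle subject, including an "out of limits" view, consider the following sketch. It assumes the IRanges package is attached and uses only functions that appear elsewhere on this page (Views, viewSums, trim, width) plus viewMaxs from the view summarization methods; the variable names are illustrative.

```r
library(IRanges)

# Subject values: 3 3 3 2 2 18 0 0 0 0 0
subject <- Rle(c(3L, 2L, 18L, 0L), c(3, 2, 1, 5))

# Two views fully inside the subject
v <- Views(subject, start = c(1, 4), end = c(3, 6))
viewSums(v)   # 9 22
viewMaxs(v)   # 3 18

# A view that starts before the first element of the subject
out <- Views(subject, start = -1, end = 2)
width(out)         # 4 (positions -1, 0, 1, 2)
width(trim(out))   # 2 (only positions 1 and 2 remain)
```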
Seealso
Views-class , Rle-class , view-summarization-methods
Author
P. Aboyoun
Examples
subject <- Rle(rep(c(3L, 2L, 18L, 0L), c(3,2,1,5)))
myViews <- Views(subject, 3:0, 5:8)
myViews
subject(myViews)
length(myViews)
start(myViews)
end(myViews)
width(myViews)
myViews[[2]]
set.seed(0)
vec <- Rle(sample(0:2, 20, replace = TRUE))
vec
Views(vec, vec > 0)
Rle_class_leftovers()
Rle objects (old man page)
Description
IMPORTANT NOTE - 7/3/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for Rle objects located in the S4Vectors package.
Seealso
The Rle class defined and documented in the S4Vectors package.
Examples
x <- Rle(10:1, 1:10)
x
Vector_class_leftovers()
Vector objects (old man page)
Description
IMPORTANT NOTE - 4/29/2014: This man page is being refactored. Most of the things that used to be documented here have been moved to the man page for Vector objects located in the S4Vectors package.
Seealso
The Vector class defined and documented in the S4Vectors package.
ViewsList_class()
List of Views
Description
An extension of List that holds only Views objects.
Details
ViewsList is a virtual class. Specialized subclasses like e.g. RleViewsList are useful for storing coverage vectors over a set of spaces (e.g. chromosomes), each of which requires a separate RleViews object.
As a List subclass, ViewsList inherits all the methods available for List objects. It also presents an API that is very similar to that of Views , where operations are vectorized over the elements and generally return lists.
Seealso
List-class , RleViewsList-class .
Author
P. Aboyoun and H. Pagès
Examples
showClass("ViewsList")
Views_class()
Views objects
Description
The Views virtual class is a general container for storing a set of views on an arbitrary Vector object, called the "subject".
Its primary purpose is to introduce concepts and provide some facilities that can be shared by the concrete classes that derive from it.
Some direct subclasses of the Views class are: RleViews , XIntegerViews (defined in the XVector package), XStringViews (defined in the Biostrings package), etc...
Seealso
IRanges-class , Vector-class , IRanges-utils , XVector .
Some direct subclasses of the Views class: RleViews-class , XIntegerViews-class , XDoubleViews-class , XStringViews-class .
Author
Hervé Pagès
Examples
showClass("Views") # shows (some of) the known subclasses
## Create a set of 4 views on an Rle subject of length 10:
subject <- Rle(3:-6)
v1 <- Views(subject, start=4:1, end=4:7)
## Extract the 2nd view:
v1[[2]]
## Some views can be "out of limits"
v2 <- Views(subject, start=4:-1, end=6)
trim(v2)
subviews(v2, end=-2)
## See ?`XIntegerViews-class` in the XVector package for more examples.
coverage_methods()
Coverage of a set of ranges
Description
For each position in the space underlying a set of ranges, counts the number of ranges that cover it.
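A worked example may help fix the definition. The sketch below computes the coverage of three small ranges and shows the effect of the shift and width arguments described in the Arguments section; the object names are illustrative, and the values in the comments were derived by hand from the definition.

```r
library(IRanges)

# Ranges 1-4, 3-5 and 8-9: positions 3 and 4 are covered twice,
# positions 6 and 7 are not covered at all
x <- IRanges(start = c(1, 3, 8), end = c(4, 5, 9))

cvg <- coverage(x)   # returned as an Rle
as.integer(cvg)      # 1 1 2 2 1 0 0 1 1

# 'width' fixes the length of the returned vector
cvg12 <- coverage(x, width = 12)
length(cvg12)        # 12

# 'shift' moves every range to the right before counting
cvg_shifted <- coverage(x, shift = 2)
as.integer(cvg_shifted)   # 0 0 1 1 2 2 1 0 0 1 1
```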
Usage
coverage(x, shift=0L, width=NULL, weight=1L, ...)
## S4 method for signature 'IntegerRanges'
coverage(x, shift=0L, width=NULL, weight=1L,
method=c("auto", "sort", "hash"))
## S4 method for signature 'IntegerRangesList'
coverage(x, shift=0L, width=NULL, weight=1L,
method=c("auto", "sort", "hash"))
Arguments
Argument | Description |
---|---|
x | An IntegerRanges, Views, or IntegerRangesList object. See ?`coverage-methods` in the GenomicRanges package for coverage methods for other objects. |
shift, weight | shift specifies how much each range in x should be shifted before the coverage is computed. A positive shift value will shift the corresponding range in x to the right, and a negative value to the left. NAs are not allowed. weight assigns a weight to each range in x. If x is an IntegerRanges or Views object: each of these arguments must be an integer or numeric vector parallel to x (will get recycled if necessary). Alternatively, each of these arguments can also be specified as a single string naming a metadata column in x (i.e. a column in mcols(x)) to be used as the shift (or weight) vector. Note that when x is an IPos object, each of these arguments can only be a single number. If x is an IntegerRangesList object: each of these arguments must be a numeric vector or list-like object of the same length as x (will get recycled if necessary). If it's a numeric vector, it's first turned into a list with as.list. After recycling, each list element shift[[i]] (or weight[[i]]) must be an integer or numeric vector parallel to x[[i]] (will get recycled if necessary). If weight is an integer vector or list-like object of integer vectors, the coverage vector(s) will be returned as integer-Rle object(s). If it's a numeric vector or list-like object of numeric vectors, the coverage vector(s) will be returned as numeric-Rle object(s). |
width | Specifies the length of the returned coverage vector(s). If x is an IntegerRanges object: width must be NULL (the default), an NA, or a single non-negative integer. After being shifted, the ranges in x are always clipped on the left to keep only their positive portion, i.e. their intersection with the [1, +inf) interval. If width is a single non-negative integer, then they're also clipped on the right to keep only their intersection with the [1, width] interval. In that case coverage returns a vector of length width. Otherwise, it returns a vector that extends to the last position in the underlying space covered by the shifted ranges. If x is a Views object: same as for an IntegerRanges object, except that, if width is NULL, then it's treated as if it was length(subject(x)). If x is an IntegerRangesList object: width must be NULL or an integer vector parallel to x (i.e. with one element per list element in x). If not NULL, the vector must contain NAs or non-negative integers and it will get recycled to the length of x if necessary. If NULL, it is replaced with NA and recycled to the length of x. Finally width[i] is used to compute the coverage vector for x[[i]] and is therefore treated as explained above (when x is an IntegerRanges object). |
method | If method is set to "sort", then x is sorted before the coverage is calculated. If method is set to "hash", then x is hashed directly to a vector of length width without prior sorting. The "hash" method is faster than the "sort" method when x is large (i.e. contains a lot of ranges). When x is small and width is big (e.g. x represents a small set of reads aligned to a big chromosome), then method="sort" is faster and uses less memory than method="hash". Using method="auto" selects the best method based on length(x) and width. |
... | Further arguments to be passed to or from other methods. |
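The "shift, then clip, then count" behaviour described for shift, weight, and width can be sketched in base R with a difference array. This is only an illustration under simplifying assumptions (plain numeric start/width vectors instead of an IntegerRanges object, and an ordinary numeric vector instead of an Rle as the result); `sketch_coverage` and its `out_width` argument are hypothetical names, not part of IRanges.

```r
## Hypothetical base R sketch of weighted coverage.
## 'start' and 'width' describe the ranges; 'shift' and 'weight' are
## recycled to the number of ranges; 'out_width' plays the role of 'width'
## in coverage().
sketch_coverage <- function(start, width, shift = 0, weight = 1,
                            out_width = NULL) {
    n <- length(start)
    shift  <- rep_len(shift, n)
    weight <- rep_len(weight, n)
    first <- start + shift
    last  <- first + width - 1
    first <- pmax(first, 1)                # clip on the left to [1, +inf)
    if (is.null(out_width))
        out_width <- max(c(0, last))       # extend to the last covered position
    last <- pmin(last, out_width)          # clip on the right to [1, out_width]
    delta <- numeric(out_width + 1)
    for (k in which(first <= last)) {      # ranges that survive clipping
        delta[first[k]]    <- delta[first[k]]    + weight[k]
        delta[last[k] + 1] <- delta[last[k] + 1] - weight[k]
    }
    cumsum(delta)[seq_len(out_width)]
}

starts <- c(-2, 6, 9, -4, 1, 0, -6, 10)
widths <- c( 5, 0, 6,  1, 4, 3,  2,  3)

cvg_plain   <- sketch_coverage(starts, widths)
cvg_shifted <- sketch_coverage(starts, widths, shift = 7, out_width = 27)
cvg_plain
cvg_shifted
```

With `shift = 7` no range is clipped, so the sum of the sketch's coverage vector equals the sum of the range widths.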
Value
If x is an IntegerRanges or Views object: An integer- or numeric-Rle object depending on whether weight is an integer or numeric vector.
If x is an IntegerRangesList object: An RleList object with one coverage vector per list element in x, and with the names of x propagated to it. The i-th coverage vector can be either an integer- or numeric-Rle object, depending on the type of weight[[i]] (after weight has gone through as.list and recycling, as described previously).
Seealso
coverage-methods in the GenomicRanges package for more coverage methods.
The slice function for slicing the Rle or RleList object returned by coverage.
IntegerRanges, IPos, IntegerRangesList, Rle, and RleList objects.
Author
H. Pagès and P. Aboyoun
Examples
## ---------------------------------------------------------------------
## A. COVERAGE OF AN IRanges OBJECT
## ---------------------------------------------------------------------
x <- IRanges(start=c(-2L, 6L, 9L, -4L, 1L, 0L, -6L, 10L),
width=c( 5L, 0L, 6L, 1L, 4L, 3L, 2L, 3L))
coverage(x)
coverage(x, shift=7)
coverage(x, shift=7, width=27)
coverage(x, shift=c(-4, 2)) # 'shift' gets recycled
coverage(x, shift=c(-4, 2), width=12)
coverage(x, shift=-max(end(x)))
coverage(restrict(x, 1, 10))
coverage(reduce(x), shift=7)
coverage(gaps(shift(x, 7), start=1, end=27))
## With weights:
coverage(x, weight=as.integer(10^(0:7))) # integer-Rle
coverage(x, weight=c(2.8, -10)) # numeric-Rle, 'weight' gets recycled
## ---------------------------------------------------------------------
## B. COVERAGE OF AN IPos OBJECT
## ---------------------------------------------------------------------
pos_runs <- IRanges(c(1, 5, 9), c(10, 8, 15))
ipos <- IPos(pos_runs)
coverage(ipos)
## ---------------------------------------------------------------------
## C. COVERAGE OF AN IRangesList OBJECT
## ---------------------------------------------------------------------
x <- IRangesList(A=IRanges(3*(4:-1), width=1:3), B=IRanges(2:10, width=5))
cvg <- coverage(x)
cvg
stopifnot(identical(cvg[[1]], coverage(x[[1]])))
stopifnot(identical(cvg[[2]], coverage(x[[2]])))
coverage(x, width=c(50, 9))
coverage(x, width=c(NA, 9))
coverage(x, width=9) # 'width' gets recycled
## Each list element in 'shift' and 'weight' gets recycled to the length
## of the corresponding element in 'x'.
weight <- list(as.integer(10^(0:5)), -0.77)
cvg2 <- coverage(x, weight=weight)
cvg2 # 1st coverage vector is an integer-Rle, 2nd is a numeric-Rle
identical(mapply(coverage, x=x, weight=weight), as.list(cvg2))
## ---------------------------------------------------------------------
## D. SOME MATHEMATICAL PROPERTIES OF THE coverage() FUNCTION
## ---------------------------------------------------------------------
## PROPERTY 1: The coverage vector is not affected by reordering the
## input ranges:
set.seed(24)
x <- IRanges(sample(1000, 40, replace=TRUE), width=17:10)
cvg0 <- coverage(x)
stopifnot(identical(coverage(sample(x)), cvg0))
## Of course, if the ranges are shifted and/or assigned weights, then
## this doesn't hold anymore, unless the 'shift' and/or 'weight'
## arguments are reordered accordingly.
## PROPERTY 2: The coverage of the concatenation of 2 IntegerRanges
## objects 'x' and 'y' is the sum of the 2 individual coverage vectors:
y <- IRanges(sample(-20:280, 36, replace=TRUE), width=28)
stopifnot(identical(coverage(c(x, y), width=100),
coverage(x, width=100) + coverage(y, width=100)))
## Note that, because adding 2 vectors in R recycles the shortest to
## the length of the longest, the following is generally FALSE:
identical(coverage(c(x, y)), coverage(x) + coverage(y)) # FALSE
## It would only be TRUE if the 2 coverage vectors that we add had the
## same length, which would only happen by chance. By using the same
## 'width' value when we computed the 2 coverages previously, we made
## sure they had the same length.
## Because of properties 1 & 2, we have:
x1 <- x[c(TRUE, FALSE)] # pick up 1st, 3rd, 5th, etc... ranges
x2 <- x[c(FALSE, TRUE)] # pick up 2nd, 4th, 6th, etc... ranges
cvg1 <- coverage(x1, width=100)
cvg2 <- coverage(x2, width=100)
stopifnot(identical(coverage(x, width=100), cvg1 + cvg2))
## PROPERTY 3: Multiplying the weights by a scalar has the effect of
## multiplying the coverage vector by the same scalar:
weight <- runif(40)
cvg3 <- coverage(x, weight=weight)
stopifnot(all.equal(coverage(x, weight=-2.68 * weight), -2.68 * cvg3))
## Because of properties 1 & 2 & 3, we have:
stopifnot(identical(coverage(x, width=100, weight=c(5L, -11L)),
5L * cvg1 - 11L * cvg2))
## PROPERTY 4: Using the sum of 2 weight vectors produces the same
## result as using the 2 weight vectors separately and summing the
## 2 results:
weight2 <- 10 * runif(40) + 3.7
stopifnot(all.equal(coverage(x, weight=weight + weight2),
cvg3 + coverage(x, weight=weight2)))
## PROPERTY 5: Repeating any input range N number of times is
## equivalent to multiplying its assigned weight by N:
times <- sample(0:10L, length(x), replace=TRUE)
stopifnot(all.equal(coverage(rep(x, times), weight=rep(weight, times)),
coverage(x, weight=weight * times)))
## In particular, if 'weight' is not supplied:
stopifnot(identical(coverage(rep(x, times)), coverage(x, weight=times)))
## PROPERTY 6: If none of the input ranges actually gets clipped during
## the "shift and clip" process, then:
##
## sum(cvg) = sum(width(x) * weight)
##
stopifnot(sum(cvg3) == sum(width(x) * weight))
## In particular, if 'weight' is not supplied:
stopifnot(sum(cvg0) == sum(width(x)))
## Note that this property is sometimes used in the context of a
## ChIP-Seq analysis to estimate "the number of reads in a peak", that
## is, the number of short reads that belong to a peak in the coverage
## vector computed from the genomic locations (a.k.a. genomic ranges)
## of the aligned reads. Because of property 6, the number of reads in
## a peak is approximately the area under the peak divided by the short
## read length.
## PROPERTY 7: If 'weight' is not supplied, then disjoining or reducing
## the ranges before calling coverage() has the effect of "shaving" the
## coverage vector at elevation 1:
table(cvg0)
shaved_cvg0 <- cvg0
runValue(shaved_cvg0) <- pmin(runValue(cvg0), 1L)
table(shaved_cvg0)
stopifnot(identical(coverage(disjoin(x)), shaved_cvg0))
stopifnot(identical(coverage(reduce(x)), shaved_cvg0))
## ---------------------------------------------------------------------
## E. SOME SANITY CHECKS
## ---------------------------------------------------------------------
dummy_coverage <- function(x, shift=0L, width=NULL)
{
    y <- IRanges:::unlist_as_integer(shift(x, shift))
    if (is.null(width))
        width <- max(c(0L, y))
    Rle(tabulate(y, nbins=width))
}
check_real_vs_dummy <- function(x, shift=0L, width=NULL)
{
    res1 <- coverage(x, shift=shift, width=width)
    res2 <- dummy_coverage(x, shift=shift, width=width)
    stopifnot(identical(res1, res2))
}
check_real_vs_dummy(x)
check_real_vs_dummy(x, shift=7)
check_real_vs_dummy(x, shift=7, width=27)
check_real_vs_dummy(x, shift=c(-4, 2))
check_real_vs_dummy(x, shift=c(-4, 2), width=12)
check_real_vs_dummy(x, shift=-max(end(x)))
## With a set of distinct single positions:
x3 <- IRanges(sample(50000, 20000), width=1)
stopifnot(identical(sort(start(x3)), which(coverage(x3) != 0L)))
extractList()
Group elements of a vector-like object into a list-like object
Description
relist and split are 2 common ways of grouping the elements of a vector-like object into a list-like object. The IRanges and S4Vectors packages define relist and split methods that operate on a Vector object and return a List object.
Note that the split methods defined in S4Vectors delegate to the splitAsList function defined in IRanges and documented below.
Because relist and split both impose restrictions on the kind of grouping that they support (e.g. every element in the input object needs to go in a group and can only go in one group), the IRanges package introduces the extractList generic function for performing arbitrary groupings.
Usage
## relist()
## --------
## S4 method for signature 'ANY,List'
relist(flesh, skeleton)
## S4 method for signature 'Vector,list'
relist(flesh, skeleton)
## splitAsList()
## -------------
splitAsList(x, f, drop=FALSE, ...)
## extractList()
## -------------
extractList(x, i)
## regroup()
## ---------
regroup(x, g)
Arguments
Argument | Description |
---|---|
flesh, x | A vector-like object. |
skeleton | A list-like object. Only the "shape" (i.e. element lengths) of skeleton matters. Its exact content is ignored. |
f | An atomic vector or a factor (possibly in Rle form). |
drop | Logical indicating if levels that do not occur should be dropped (if f is a factor). |
i | A list-like object. Unlike for skeleton, the content here matters (see Details section below). Note that i can be an IntegerRanges object (a particular type of list-like object), and, in that case, extractList is particularly fast (this is a common use case). |
g | A Grouping or an object coercible to one. For regroup , g groups the elements of x . |
... | Arguments to pass to methods. |
Details
relist, split, and extractList have in common that they return a list-like object where each list element has the same class as the original vector-like object. Thus they need to be able to select the appropriate List concrete subclass to use for this returned value. This selection is performed by relistToClass and is based only on the class of the original object.
By default, extractList(x, i) is equivalent to:
    relist(x[unlist(i)], i)
An exception is made when x is a data-frame-like object. In that case x is subsetted along the rows, that is, extractList(x, i) is equivalent to:
    relist(x[unlist(i), ], i)
This is more or less how the default method is implemented, except for some optimizations when i is an IntegerRanges object.
relist and split (or splitAsList) can be seen as special cases of extractList:
    relist(flesh, skeleton) is equivalent to
    extractList(flesh, PartitioningByEnd(skeleton))

    split(x, f) is equivalent to
    extractList(x, split(seq_along(f), f))
It is good practice to use extractList only for cases not covered by relist or split. Whenever possible, using relist or split is preferred as they will always perform more efficiently. In addition their names carry meaning and are familiar to most R users/developers so they'll make your code easier to read/understand.
Note that the transformation performed by relist or split is always reversible (via unlist and unsplit, respectively), but not the transformation performed by extractList (in general).
The regroup function splits the elements of unlist(x) into a list according to the grouping g. Each element of unlist(x) inherits its group from its parent element of x. regroup is different from relist and split, because x is already grouped, and the goal is to combine groups.
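The equivalences stated above can be tried out on ordinary base R vectors and lists. The sketch below is an assumption-laden illustration, not the package's default method: `sketch_extractList` is a hypothetical helper built on utils::relist, and it returns an ordinary list rather than a List object.

```r
## Hypothetical base R stand-in for the default behaviour described above:
## relist(x[unlist(i)], i)
sketch_extractList <- function(x, i) {
    utils::relist(x[unlist(i, use.names = FALSE)], i)
}

x <- c(11, 12, 13, 14, 15)

## An arbitrary grouping: element 2 goes in both groups, element 5 is
## repeated, and elements 3 and 4 are not used at all. Neither relist()
## nor split() can express this, and it cannot be undone with unlist().
i <- list(first = c(1L, 2L), second = c(2L, 5L, 5L))
arbitrary <- sketch_extractList(x, i)
arbitrary

## split(x, f) expressed as a special case of the same operation:
f <- c("b", "a", "b", "a", "a")
via_split   <- split(x, f)
via_extract <- sketch_extractList(x, split(seq_along(f), f))
stopifnot(isTRUE(all.equal(via_split, via_extract)))
```

The result has the same names and the same element lengths as `i`, which is the "shape" property that extractList guarantees.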
Value
The relist methods behave like utils::relist except that they return a List object. If skeleton has names, then they are propagated to the returned value.
splitAsList behaves like base::split except that the former returns a List object instead of an ordinary list.
extractList returns a list-like object parallel to i and with the same "shape" as i (i.e. same element lengths). If i has names, then they are propagated to the returned value.
All these functions return a list-like object where the list elements have the same class as x. relistToClass gives the exact class of the returned object.
Seealso
The unlist and relist functions in the base and utils packages, respectively.
The split methods defined in the S4Vectors package.
Vector, List, Rle, and DataFrame objects in the S4Vectors package.
relistToClass is documented in the man page for List objects.
IntegerRanges objects.
Author
Hervé Pagès
Examples
## On an Rle object:
x <- Rle(101:105, 6:2)
i <- IRanges(6:10, 16:12, names=letters[1:5])
extractList(x, i)
## On a DataFrame object:
df <- DataFrame(X=x, Y=LETTERS[1:20])
extractList(df, i)
extractListFragments()
Extract list fragments from a list-like object
Description
Utilities for extracting list fragments from a list-like object.
Usage
extractListFragments(x, aranges, use.mcols=FALSE,
msg.if.incompatible=INCOMPATIBLE_ARANGES_MSG)
equisplit(x, nchunk, chunksize, use.mcols=FALSE)
Arguments
Argument | Description |
---|---|
x | The list-like object from which to extract the list fragments. Can be any List derivative for extractListFragments. Can also be an ordinary list if extractListFragments is called with use.mcols=TRUE. Can be any List derivative that supports relist() for equisplit. |
aranges | An IntegerRanges derivative containing the absolute ranges (i.e. the ranges along unlist(x)) of the list fragments to extract. The ranges in aranges must be compatible with the cumulated length of all the list elements in x, that is, start(aranges) and end(aranges) must be >= 1 and <= sum(elementNROWS(x)), respectively. Also please note that only IntegerRanges objects that are disjoint and sorted are supported at the moment. |
use.mcols | Whether to propagate the metadata columns on x (if any) or not. Must be TRUE or FALSE (the default). If set to FALSE, instead of having the metadata columns propagated from x, the object returned by extractListFragments has metadata columns revmap and revmap2, and the object returned by equisplit has metadata column revmap. |
msg.if.incompatible | The error message to use if aranges is not compatible with the cumulated length of all the list elements in x . |
nchunk | The number of chunks. Must be a single positive integer. |
chunksize | The size of the chunks (last chunk might be smaller). Must be a single positive integer. |
Details
A list fragment of list-like object x is a window in one of its list elements.
extractListFragments is a low-level utility that extracts list fragments from list-like object x according to the absolute ranges in aranges.
equisplit fragments and splits list-like object x into a specified number of partitions with equal (total) width. This is useful for instance to ensure balanced loading of workers in parallel evaluation. For example, if x is a GRanges object, each partition is also a GRanges object and the set of all partitions is returned as a GRangesList object.
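The notion of "absolute ranges" along unlist(x) can be illustrated in base R. The sketch below only shows how an absolute range is cut at list element boundaries; it works on an ordinary list, returns an ordinary list, and does not reproduce the revmap/revmap2 metadata columns. `sketch_fragments` is a hypothetical helper, not an IRanges function.

```r
## Hypothetical helper: cut each absolute range [a_start, a_end] at the
## boundaries of the list elements of 'x' and return the resulting windows.
sketch_fragments <- function(x, a_start, a_end) {
    len    <- lengths(x, use.names = FALSE)
    ends   <- cumsum(len)            # absolute end of each list element
    starts <- ends - len + 1L        # absolute start of each list element
    out <- list()
    parent <- integer(0)
    for (j in seq_along(a_start)) {
        for (k in seq_along(x)) {
            lo <- max(a_start[j], starts[k])
            hi <- min(a_end[j], ends[k])
            if (lo <= hi) {
                ## Convert absolute positions to positions within x[[k]].
                out[[length(out) + 1L]] <- x[[k]][(lo:hi) - starts[k] + 1L]
                parent <- c(parent, k)
            }
        }
    }
    names(out) <- names(x)[parent]   # each fragment is named after its parent
    out
}

x <- list(a = 101:109, b = 5:-5)     # 'a' spans positions 1-9, 'b' 10-20
frags <- sketch_fragments(x, a_start = c(2, 8), a_end = c(3, 14))
frags
```

The second absolute range (8 to 14) straddles the boundary between `a` and `b`, so it produces two fragments, one in each list element.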
Value
An object of the same class as x for extractListFragments.
An object of the class given by relistToClass for equisplit.
Seealso
IRanges and IRangesList objects.
Partitioning objects.
IntegerList objects.
breakInChunks for breaking a vector-like object in chunks.
GRanges and GRangesList objects defined in the GenomicRanges package.
List objects defined in the S4Vectors package.
intra-range-methods and inter-range-methods for intra range and inter range transformations.
Author
Hervé Pagès
Examples
## ---------------------------------------------------------------------
## A. extractListFragments()
## ---------------------------------------------------------------------
x <- IntegerList(a=101:109, b=5:-5)
x
aranges <- IRanges(start=c(2, 4, 8, 17, 17), end=c(3, 6, 14, 16, 19))
aranges
extractListFragments(x, aranges)
x2 <- IRanges(c(1, 101, 1001, 10001), width=c(10, 5, 0, 12),
names=letters[1:4])
mcols(x2)$label <- LETTERS[1:4]
x2
aranges <- IRanges(start=13, end=20)
extractListFragments(x2, aranges)
extractListFragments(x2, aranges, use.mcols=TRUE)
aranges2 <- PartitioningByWidth(c(3, 9, 13, 0, 2))
extractListFragments(x2, aranges2)
extractListFragments(x2, aranges2, use.mcols=TRUE)
x2b <- as(x2, "IntegerList")
extractListFragments(x2b, aranges2)
x2c <- as.list(x2b)
extractListFragments(x2c, aranges2, use.mcols=TRUE)
## ---------------------------------------------------------------------
## B. equisplit()
## ---------------------------------------------------------------------
## equisplit() first calls breakInChunks() internally to create a
## PartitioningByWidth object that contains the absolute ranges of the
## chunks, then calls extractListFragments() on 'x' to extract the
## fragments of 'x' that correspond to these absolute ranges. Finally
## the IRanges object returned by extractListFragments() is split into
## an IRangesList object where each list element corresponds to a chunk.
equisplit(x2, nchunk=2)
equisplit(x2, nchunk=2, use.mcols=TRUE)
equisplit(x2, chunksize=5)
library(GenomicRanges)
gr <- GRanges(c("chr1", "chr2"), IRanges(1, c(100, 1e5)))
equisplit(gr, nchunk=2)
equisplit(gr, nchunk=1000)
## ---------------------------------------------------------------------
## C. ADVANCED extractListFragments() EXAMPLES
## ---------------------------------------------------------------------
## === C1. Fragment list-like object into length 1 fragments ===
## First we construct a Partitioning object where all the partitions
## have a width of 1:
x2_cumlen <- nobj(PartitioningByWidth(x2)) # Equivalent to
# length(unlist(x2)) except
# that it doesn't unlist 'x2'
# so is much more efficient.
aranges1 <- PartitioningByEnd(seq_len(x2_cumlen))
aranges1
## Then we use it to fragment 'x2':
extractListFragments(x2, aranges1)
extractListFragments(x2b, aranges1)
extractListFragments(x2c, aranges1, use.mcols=TRUE)
## === C2. Fragment a Partitioning object ===
partitioning2 <- PartitioningByEnd(x2b) # same as PartitioningByEnd(x2)
extractListFragments(partitioning2, aranges2)
## Note that when the 1st arg is a Partitioning derivative, then
## swapping the 1st and 2nd arguments in the call to extractListFragments()
## doesn't change the returned partitioning:
extractListFragments(aranges2, partitioning2)
## ---------------------------------------------------------------------
## D. SANITY CHECKS
## ---------------------------------------------------------------------
## If 'aranges' is 'PartitioningByEnd(x)' or 'PartitioningByWidth(x)'
## and 'x' has no zero-length list elements, then
## 'extractListFragments(x, aranges, use.mcols=TRUE)' is a no-op.
check_no_ops <- function(x) {
    aranges <- PartitioningByEnd(x)
    stopifnot(identical(
        extractListFragments(x, aranges, use.mcols=TRUE), x
    ))
    aranges <- PartitioningByWidth(x)
    stopifnot(identical(
        extractListFragments(x, aranges, use.mcols=TRUE), x
    ))
}
check_no_ops(x2[lengths(x2) != 0])
check_no_ops(x2b[lengths(x2b) != 0])
check_no_ops(x2c[lengths(x2c) != 0])
check_no_ops(gr)
findOverlaps_methods()
Finding overlapping ranges
Description
Various methods for finding/counting interval overlaps between two "range-based" objects: a query and a subject.
NOTE: This man page describes the methods that operate on IntegerRanges and IntegerRangesList derivatives. See ?`findOverlaps,GenomicRanges,GenomicRanges-method` in the GenomicRanges package for methods that operate on GenomicRanges or GRangesList objects.
Usage
findOverlaps(query, subject, maxgap=-1L, minoverlap=0L,
             type=c("any", "start", "end", "within", "equal"),
             select=c("all", "first", "last", "arbitrary"),
             ...)
countOverlaps(query, subject, maxgap=-1L, minoverlap=0L,
              type=c("any", "start", "end", "within", "equal"),
              ...)
overlapsAny(query, subject, maxgap=-1L, minoverlap=0L,
            type=c("any", "start", "end", "within", "equal"),
            ...)
query %over% subject
query %within% subject
query %outside% subject
subsetByOverlaps(x, ranges, maxgap=-1L, minoverlap=0L,
                 type=c("any", "start", "end", "within", "equal"),
                 invert=FALSE,
                 ...)
overlapsRanges(query, subject, hits=NULL, ...)
poverlaps(query, subject, maxgap = 0L, minoverlap = 1L,
          type = c("any", "start", "end", "within", "equal"),
          ...)
mergeByOverlaps(query, subject, ...)
findOverlapPairs(query, subject, ...)
Arguments
Argument | Description |
---|---|
query, subject, x, ranges | Each of them can be an IntegerRanges (e.g. IRanges, Views) or IntegerRangesList (e.g. IRangesList, ViewsList) derivative. In addition, if subject or ranges is an IntegerRanges object, query or x can be an integer vector to be converted to width-one ranges. If query (or x) is an IntegerRangesList object, then subject (or ranges) must also be an IntegerRangesList object. If both arguments are list-like objects with names, each list element from the 2nd argument is paired with the list element from the 1st argument with the matching name, if any. Otherwise, list elements are paired by position. The overlap is then computed between the pairs as described below. If subject is omitted, query is queried against itself. In this case, and only this case, the drop.self and drop.redundant arguments are allowed. By default, the result will contain hits for each range against itself, and if there is a hit from A to B, there is also a hit for B to A. If drop.self is TRUE, all self matches are dropped. If drop.redundant is TRUE, only one of A->B and B->A is returned. |
maxgap | A single integer >= -1. If type is set to "any", maxgap is interpreted as the maximum gap that is allowed between 2 ranges for the ranges to be considered as overlapping. The gap between 2 ranges is the number of positions that separate them. The gap between 2 adjacent ranges is 0. By convention when one range has its start or end strictly inside the other (i.e. non-disjoint ranges), the gap is considered to be -1. If type is set to anything else, maxgap has a special meaning that depends on the particular type. See type below for more information. |
minoverlap | A single non-negative integer. Only ranges with a minimum of minoverlap overlapping positions are considered to be overlapping. When type is "any", at least one of maxgap and minoverlap must be set to its default value. |
type | By default, any overlap is accepted. By specifying the type parameter, one can select for specific types of overlap. The types correspond to operations in Allen's Interval Algebra (see references). If type is "start" or "end", the intervals are required to have matching starts or ends, respectively. Specifying "equal" as the type returns the intersection of the "start" and "end" matches. If type is "within", the query interval must be wholly contained within the subject interval. Note that all matches must additionally satisfy the minoverlap constraint described above. The maxgap parameter has special meaning with the special overlap types. For "start", "end", and "equal", it specifies the maximum difference in the starts, ends or both, respectively. For "within", it is the maximum amount by which the subject may be wider than the query. If maxgap is set to -1 (the default), it's replaced internally by 0. |
select | If query is an IntegerRanges derivative: When select is "all" (the default), the results are returned as a Hits object. Otherwise the returned value is an integer vector parallel to query (i.e. same length) containing the first, last, or arbitrary overlapping interval in subject, with NA indicating intervals that did not overlap any intervals in subject. If query is an IntegerRangesList derivative: When select is "all" (the default), the results are returned as a HitsList object. Otherwise the returned value depends on the drop argument. When select != "all" && !drop, an IntegerList is returned, where each element of the result corresponds to a space in query. When select != "all" && drop, an integer vector is returned containing indices that are offset to align with the unlisted query. |
invert | If TRUE, keep only the ranges in x that do not overlap ranges. |
hits | The Hits or HitsList object returned by findOverlaps, or NULL. If NULL then hits is computed by calling findOverlaps(query, subject, ...) internally (the extra arguments passed to overlapsRanges are passed to findOverlaps). |
... | Further arguments to be passed to or from other methods. drop: Supported only when query is an IntegerRangesList derivative. FALSE by default. See select argument above for the details. drop.self, drop.redundant: When subject is omitted, the drop.self and drop.redundant arguments (both FALSE by default) are allowed. See query and subject arguments above for the details. |
Details
A common type of query that arises when working with intervals is finding which intervals in one set overlap those in another.
The simplest approach is to call the findOverlaps function on an IntegerRanges or other object with range information (aka "range-based object").
Value
For findOverlaps: see select argument above.
For countOverlaps: the overlap hit count for each range in query using the specified findOverlaps parameters. For IntegerRangesList objects, it returns an IntegerList object.
overlapsAny finds the ranges in query that overlap any of the ranges in subject. For IntegerRanges derivatives, it returns a logical vector of length equal to the number of ranges in query. For IntegerRangesList derivatives, it returns a LogicalList object where each element of the result corresponds to a space in query.
%over% and %within% are convenience wrappers for the 2 most common use cases. Currently defined as `"%over%" <- function(query, subject) overlapsAny(query, subject)` and `"%within%" <- function(query, subject) overlapsAny(query, subject, type="within")`. %outside% is simply the inverse of %over%.
subsetByOverlaps returns the subset of x that has an overlap hit with a range in ranges using the specified findOverlaps parameters.
When hits is a Hits (or HitsList) object, overlapsRanges(query, subject, hits) returns an IntegerRanges (or IntegerRangesList) object of the same shape as hits holding the regions of intersection between the overlapping ranges in objects query and subject, which should be the same query and subject used in the call to findOverlaps that generated hits. Same shape means same length when hits is a Hits object, and same length and same elementNROWS when hits is a HitsList object.
poverlaps compares query and subject in parallel (like e.g., pmin) and returns a logical vector indicating whether each pair of ranges overlaps. Integer vectors are treated as width-one ranges.
mergeByOverlaps computes the overlap between query and subject according to the arguments in ... . It then extracts the corresponding hits from each object and returns a DataFrame containing one column for the query and one for the subject, as well as any mcols that were present on either object. The query and subject columns are named by quoting and deparsing the corresponding argument.
findOverlapPairs is like mergeByOverlaps, except it returns a formal Pairs object that provides useful downstream conveniences, such as finding the intersection of the overlapping ranges with pintersect.
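The gap convention described for maxgap (with type="any") and the select="first" return value can be sketched in base R. This is a brute-force illustration under simplifying assumptions (non-empty ranges given as plain start/end vectors, all pairs compared); it is not how findOverlaps is implemented, and `sketch_find_overlaps` and `sketch_select_first` are hypothetical helpers.

```r
## Hypothetical helper: all (query, subject) pairs whose gap is <= maxgap.
## Gap = number of positions separating 2 ranges; 0 for adjacent ranges;
## -1 by convention for ranges that share at least one position.
sketch_find_overlaps <- function(q_start, q_end, s_start, s_end,
                                 maxgap = -1) {
    gap <- outer(seq_along(q_start), seq_along(s_start), function(i, j) {
        raw <- pmax(q_start[i], s_start[j]) - pmin(q_end[i], s_end[j]) - 1
        pmax(raw, -1)
    })
    hits <- which(gap <= maxgap, arr.ind = TRUE)
    hits <- hits[order(hits[, 1], hits[, 2]), , drop = FALSE]
    dimnames(hits) <- list(NULL, c("queryHits", "subjectHits"))
    hits
}

## Hypothetical helper: one value per query, NA when the query has no hit.
## Relies on 'hits' being sorted by query, then by subject.
sketch_select_first <- function(hits, n_query) {
    hits[match(seq_len(n_query), hits[, "queryHits"]), "subjectHits"]
}

## Same ranges as the first findOverlaps() example in the Examples section.
q_start <- c(1, 4, 9);  q_end <- c(5, 7, 10)
s_start <- c(2, 2, 10); s_end <- c(2, 3, 12)

hits_default  <- sketch_find_overlaps(q_start, q_end, s_start, s_end)
hits_adjacent <- sketch_find_overlaps(q_start, q_end, s_start, s_end,
                                      maxgap = 0)
first_hit <- sketch_select_first(hits_default, length(q_start))
hits_default
hits_adjacent
first_hit
```

With maxgap=0 the second query range (4 to 7) additionally hits the second subject range (2 to 3), because the two are adjacent.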
Seealso
Hits and HitsList objects in the S4Vectors package for representing a set of hits between 2 vector-like or list-like objects.
findOverlaps,GenomicRanges,GenomicRanges-method in the GenomicRanges package for methods that operate on GRanges or GRangesList objects.
The NCList class and constructor.
The IntegerRanges, Views, IntegerRangesList, and ViewsList classes.
The IntegerList and LogicalList classes.
Author
Michael Lawrence and Hervé Pagès
References
Allen's Interval Algebra:
James F. Allen: Maintaining knowledge about temporal intervals. In:
Communications of the ACM 26(11), 1983. ACM Press, pp. 832-843, ISSN 0001-0782
Examples
## ---------------------------------------------------------------------
## findOverlaps()
## ---------------------------------------------------------------------
query <- IRanges(c(1, 4, 9), c(5, 7, 10))
subject <- IRanges(c(2, 2, 10), c(2, 3, 12))
findOverlaps(query, subject)
## at most one hit per query
findOverlaps(query, subject, select="first")
findOverlaps(query, subject, select="last")
findOverlaps(query, subject, select="arbitrary")
## including adjacent ranges in the result
findOverlaps(query, subject, maxgap=0L)
query <- IRanges(c(1, 4, 9), c(5, 7, 10))
subject <- IRanges(c(2, 2), c(5, 4))
## one IRanges object with itself
findOverlaps(query)
## single points as query
subject <- IRanges(c(1, 6, 13), c(4, 9, 14))
findOverlaps(c(3L, 7L, 10L), subject, select="first")
## special overlap types
query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6))
subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4))
findOverlaps(query, subject, type="start")
findOverlaps(query, subject, type="start", maxgap=1L)
findOverlaps(query, subject, type="end", select="first")
ov <- findOverlaps(query, subject, type="within", maxgap=1L)
ov
## Using pairs to find intersection of overlapping ranges
hits <- findOverlaps(query, subject)
p <- Pairs(query, subject, hits=hits)
pintersect(p)
## Shortcut
p <- findOverlapPairs(query, subject)
pintersect(p)
## ---------------------------------------------------------------------
## overlapsAny()
## ---------------------------------------------------------------------
overlapsAny(query, subject, type="start")
overlapsAny(query, subject, type="end")
query %over% subject # same as overlapsAny(query, subject)
query %within% subject # same as overlapsAny(query, subject,
                       #                     type="within")
## ---------------------------------------------------------------------
## overlapsRanges()
## ---------------------------------------------------------------------
## Extract the regions of intersection between the overlapping ranges:
overlapsRanges(query, subject, ov)
## ---------------------------------------------------------------------
## Using IntegerRangesList objects
## ---------------------------------------------------------------------
query <- IRanges(c(1, 4, 9), c(5, 7, 10))
qpartition <- factor(c("a","a","b"))
qlist <- split(query, qpartition)
subject <- IRanges(c(2, 2, 10), c(2, 3, 12))
spartition <- factor(c("a","a","b"))
slist <- split(subject, spartition)
## at most one hit per query
findOverlaps(qlist, slist, select="first")
findOverlaps(qlist, slist, select="last")
findOverlaps(qlist, slist, select="arbitrary")
query <- IRanges(c(1, 5, 3, 4), width=c(2, 2, 4, 6))
qpartition <- factor(c("a","a","b","b"))
qlist <- split(query, qpartition)
subject <- IRanges(c(1, 3, 5, 6), width=c(4, 4, 5, 4))
spartition <- factor(c("a","a","b","b"))
slist <- split(subject, spartition)
overlapsAny(qlist, slist, type="start")
overlapsAny(qlist, slist, type="end")
qlist %over% slist
subsetByOverlaps(qlist, slist)
countOverlaps(qlist, slist)
inter_range_methods()
Inter range transformations of an IntegerRanges, Views, IntegerRangesList, or MaskCollection object
Description
Range-based transformations are grouped in 2 categories:
* Intra range transformations (e.g. `shift()`) transform each range individually (and independently of the other ranges). They return an object parallel to the input object, that is, where the i-th range corresponds to the i-th range in the input. Those transformations are described in the intra-range-methods man page (see ``?`intra-range-methods` ``).
* Inter range transformations (e.g. `reduce()`) transform all the ranges together as a set to produce a new set of ranges. They return an object that is generally NOT parallel to the input object. Those transformations are described below.

## Usage

```r
## range()
## -------

## S4 method for signature 'IntegerRanges'
range(x, ..., with.revmap=FALSE, na.rm=FALSE)

## S4 method for signature 'IntegerRangesList'
range(x, ..., with.revmap=FALSE, na.rm=FALSE)

## reduce()
## --------
reduce(x, drop.empty.ranges=FALSE, ...)

## S4 method for signature 'IntegerRanges'
reduce(x, drop.empty.ranges=FALSE, min.gapwidth=1L,
       with.revmap=FALSE, with.inframe.attrib=FALSE)

## S4 method for signature 'Views'
reduce(x, drop.empty.ranges=FALSE, min.gapwidth=1L,
       with.revmap=FALSE, with.inframe.attrib=FALSE)

## S4 method for signature 'IntegerRangesList'
reduce(x, drop.empty.ranges=FALSE, min.gapwidth=1L,
       with.revmap=FALSE, with.inframe.attrib=FALSE)

## gaps()
## ------
gaps(x, start=NA, end=NA)

## disjoin(), isDisjoint(), and disjointBins()
## -------------------------------------------
disjoin(x, ...)

## S4 method for signature 'IntegerRanges'
disjoin(x, with.revmap=FALSE)

## S4 method for signature 'IntegerRangesList'
disjoin(x, with.revmap=FALSE)

isDisjoint(x, ...)
disjointBins(x, ...)
```

## Arguments

|Argument |Description|
|------------- |----------------|
|`x` | An [IntegerRanges](#integerranges) or [IntegerRangesList](#integerrangeslist) object for `range`, `disjoin`, `isDisjoint`, and `disjointBins`. An [IntegerRanges](#integerranges), [Views](#views), or [IntegerRangesList](#integerrangeslist) object for `reduce` and `gaps`. |
|`...` | For `range`, additional [IntegerRanges](#integerranges) or [IntegerRangesList](#integerrangeslist) objects to consider. |
|`na.rm` | Ignored. |
|`drop.empty.ranges` | `TRUE` or `FALSE`. Should empty ranges be dropped? |
|`min.gapwidth` | Ranges separated by a gap of at least `min.gapwidth` positions are not merged. |
|`with.revmap` | `TRUE` or `FALSE`. Should the mapping from output to input ranges be stored in the returned object? If yes, then it is stored as metadata column `revmap` of type [IntegerList](#integerlist). |
|`with.inframe.attrib` | `TRUE` or `FALSE`. For internal use. |
|`start`, `end` | If `x` is an [IntegerRanges](#integerranges) or [Views](#views) object: a single integer or `NA`. Use these arguments to specify the interval of reference, i.e. which interval the returned gaps should be relative to. If `x` is an [IntegerRangesList](#integerrangeslist) object: integer vectors containing the coordinate bounds for each [IntegerRangesList](#integerrangeslist) top-level element. |

## Details

Unless specified otherwise, when
`x` is an [IntegerRangesList](#integerrangeslist) object, any transformation described here is equivalent to applying the transformation to each [IntegerRangesList](#integerrangeslist) top-level element separately.

### reduce

`reduce` first orders the ranges in `x` from left to right, then merges the overlapping or adjacent ones.

### range

`range` first concatenates `x` and the objects in `...` together. If the `IRanges` object resulting from this concatenation contains at least 1 range, then `range` returns an `IRanges` instance with a single range, from the minimum start to the maximum end of the concatenated object. Otherwise (i.e. if the concatenated object contains no range), `IRanges()` is returned (i.e. an `IRanges` instance of length 0).

When passing more than 1 `IntegerRangesList` object to `range()`, they are first merged into a single `IntegerRangesList` object: by name if all objects have names, otherwise, if they are all of the same length, by position. Else, an exception is thrown.

### gaps

`gaps` returns the "normal" `IRanges` object representing the set of integers that remain after the set of integers represented by `x` has been removed from the interval specified by the `start` and `end` arguments.

If `x` is a `Views` object, then `start=NA` and `end=NA` are interpreted as `start=1` and `end=length(subject(x))`, respectively, so, if `start` and `end` are not specified, then gaps are extracted with respect to the entire subject.

### isDisjoint

An `IntegerRanges` object `x` is considered to be "disjoint" if its ranges are non-overlapping. `isDisjoint` tests whether the object is "disjoint" or not.

Note that a "normal" `IntegerRanges` object is always "disjoint" but the opposite is not true. See `?isNormal` for more information about normal `IntegerRanges` objects.

About empty ranges. `isDisjoint` handles empty ranges (a.k.a. zero-width ranges) as follows: single empty range A is considered to overlap with single range B iff it's contained in B without being on the edge of B (in which case it would be ambiguous whether A is contained in or adjacent to B). More precisely, single empty range A is considered to overlap with single range B iff

    start(B) < start(A) and end(A) < end(B)

Because A is an empty range it verifies `end(A) = start(A) - 1` so the above is equivalent to:

    start(B) < start(A) <= end(B)

and also equivalent to:

    start(B) <= end(A) < end(B)

Finally, it is also equivalent to:

    pcompare(A, B) == 2

See ``?`IPosRanges-comparison` `` for the meaning of the codes returned by the `pcompare` function.

### disjoin

`disjoin` returns a disjoint object, by finding the union of the end points in `x`. In other words, the result consists of a range for every interval, of maximal length, over which the set of overlapping ranges in `x` is the same and at least of size 1.

### disjointBins

`disjointBins` segregates `x` into a set of bins so that the ranges in each bin are disjoint. Lower-indexed bins are filled first. The method returns an integer vector indicating the bin index for each range.

## Value

If
`x` is an [IntegerRanges](#integerranges) object:

* `range`, `reduce`, `gaps`, and `disjoin` return an [IRanges](#iranges) instance.
* `isDisjoint` returns `TRUE` or `FALSE`.
* `disjointBins` returns an integer vector parallel to `x`, that is, where the i-th element corresponds to the i-th element in `x`.

If `x` is a [Views](#views) object: `reduce` and `gaps` return a [Views](#views) object on the same subject as `x` but with modified views.

If `x` is an [IntegerRangesList](#integerrangeslist) object:

* `range`, `reduce`, `gaps`, and `disjoin` return an [IntegerRangesList](#integerrangeslist) object parallel to `x`.
* `isDisjoint` returns a logical vector parallel to `x`.
* `disjointBins` returns an [IntegerList](#integerlist) object parallel to `x`.

## Seealso

* [intra-range-methods](#intra-range-methods) for intra range transformations.
* The [IntegerRanges](#integerranges), [Views](#views), [IntegerRangesList](#integerrangeslist), and [MaskCollection](#maskcollection) classes.
* The [inter-range-methods](#inter-range-methods) man page in the GenomicRanges package for inter range transformations of genomic ranges.
* [setops-methods](#setops-methods) for set operations on [IRanges](#iranges) objects.
* [`endoapply`](#endoapply) in the S4Vectors package.

## Author

H. Pagès, M. Lawrence, and P. Aboyoun

## Examples

```r
## ---------------------------------------------------------------------
## range()
## ---------------------------------------------------------------------

## On an IntegerRanges object:
x <- IRanges(start=c(-2, 6, 9, -4, 1, 0, -6, 3, 10),
             width=c( 5, 0, 6,  1, 4, 3,  2, 0,  3))
range(x)

## On an IntegerRangesList object (XVector package required):
range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8))
range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5))
range3 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0))  # with empty ranges
collection <- IRangesList(one=range1, range2, range3)
if (require(XVector)) {
    range(collection)
}

irl1 <- IRangesList(a=IRanges(c(1, 2),c(4, 3)), b=IRanges(c(4, 6),c(10, 7)))
irl2 <- IRangesList(c=IRanges(c(0, 2),c(4, 5)), a=IRanges(c(4, 5),c(6, 7)))
range(irl1, irl2)  # matched by names
names(irl2) <- NULL
range(irl1, irl2)  # now by position

## ---------------------------------------------------------------------
## reduce()
## ---------------------------------------------------------------------

## On an IntegerRanges object:
reduce(x)
y <- reduce(x, with.revmap=TRUE)
mcols(y)$revmap  # an IntegerList

reduce(x, drop.empty.ranges=TRUE)
y <- reduce(x, drop.empty.ranges=TRUE, with.revmap=TRUE)
mcols(y)$revmap

## Use the mapping from reduced to original ranges to split the DataFrame
## of original metadata columns by reduced range:
ir0 <- IRanges(c(11:13, 2, 7:6), width=3)
mcols(ir0) <- DataFrame(id=letters[1:6], score=1:6)
ir <- reduce(ir0, with.revmap=TRUE)
ir
revmap <- mcols(ir)$revmap
revmap
relist(mcols(ir0)[unlist(revmap), ], revmap)  # a SplitDataFrameList

## On an IntegerRangesList object. These 4 are the same:
res1 <- reduce(collection)
res2 <- IRangesList(one=reduce(range1), reduce(range2), reduce(range3))
res3 <- do.call(IRangesList, lapply(collection, reduce))
res4 <- endoapply(collection, reduce)

stopifnot(identical(res2, res1))
stopifnot(identical(res3, res1))
stopifnot(identical(res4, res1))

reduce(collection, drop.empty.ranges=TRUE)

## ---------------------------------------------------------------------
## gaps()
## ---------------------------------------------------------------------

## On an IntegerRanges object:
x0 <- IRanges(start=c(-2, 6, 9, -4, 1, 0, -6, 10),
              width=c( 5, 0, 6,  1, 4, 3,  2,  3))
gaps(x0)
gaps(x0, start=-6, end=20)

## On a Views object:
subject <- Rle(1:-3, 6:2)
v <- Views(subject, start=c(8, 3), end=c(14, 4))
gaps(v)

## On an IntegerRangesList object. These 4 are the same:
res1 <- gaps(collection)
res2 <- IRangesList(one=gaps(range1), gaps(range2), gaps(range3))
res3 <- do.call(IRangesList, lapply(collection, gaps))
res4 <- endoapply(collection, gaps)

stopifnot(identical(res2, res1))
stopifnot(identical(res3, res1))
stopifnot(identical(res4, res1))

## On a MaskCollection object:
mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
mymasks <- append(append(mask1, mask2), mask3)
mymasks
gaps(mymasks)

## ---------------------------------------------------------------------
## disjoin()
## ---------------------------------------------------------------------

## On an IntegerRanges object:
ir <- IRanges(c(1, 1, 4, 10), c(6, 3, 8, 10))
disjoin(ir)  # IRanges(c(1, 4, 7, 10), c(3, 6, 8, 10))
disjoin(ir, with.revmap=TRUE)

## On an IntegerRangesList object:
disjoin(collection)
disjoin(collection, with.revmap=TRUE)

## ---------------------------------------------------------------------
## isDisjoint()
## ---------------------------------------------------------------------

## On an IntegerRanges object:
isDisjoint(IRanges(c(2,5,1), c(3,7,3)))  # FALSE
isDisjoint(IRanges(c(2,9,5), c(3,9,6)))  # TRUE
isDisjoint(IRanges(1, 5))  # TRUE

## Handling of empty ranges:
x <- IRanges(c(11, 16, 11, -2, 11), c(15, 29, 10, 10, 10))
stopifnot(isDisjoint(x))

## Sliding an empty range along a non-empty range:
sapply(11:17,
       function(i) pcompare(IRanges(i, width=0), IRanges(12, 15)))

sapply(11:17,
       function(i) isDisjoint(c(IRanges(i, width=0), IRanges(12, 15))))

## On an IntegerRangesList object:
isDisjoint(collection)

## ---------------------------------------------------------------------
## disjointBins()
## ---------------------------------------------------------------------

## On an IntegerRanges object:
disjointBins(IRanges(1, 5))  # 1L
disjointBins(IRanges(c(3, 1, 10), c(5, 12, 13)))  # c(2L, 1L, 2L)

## On an IntegerRangesList object:
disjointBins(collection)
```
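To make the definitions in the Details section concrete, here is a minimal sketch that applies each inter range transformation to one small, hand-picked `IRanges` object. The input `x`, the helper `as_text()` (a hypothetical convenience function, not part of IRanges), and the values in the comments are illustrative assumptions derived by hand from the descriptions above, not output copied from the package documentation.

```r
library(IRanges)

## Hypothetical helper (not part of IRanges): render ranges as "start-end".
as_text <- function(r) paste(start(r), end(r), sep="-")

## Four ranges: the first two overlap, the last two stand alone.
x <- IRanges(start=c(1, 3, 10, 20), end=c(4, 6, 12, 25))

## range(): one range from the minimum start to the maximum end.
rg <- range(x)
as_text(rg)        # "1-25"

## reduce(): overlapping or adjacent ranges are merged.
red <- reduce(x)
as_text(red)       # "1-6" "10-12" "20-25"
adj <- reduce(IRanges(start=c(1, 5), end=c(4, 8)))
as_text(adj)       # "1-8" (adjacent ranges are merged too)

## gaps(): integers left over once 'x' is removed from the reference
## interval (by default, the span between the ranges themselves).
g1 <- gaps(x)
as_text(g1)        # "7-9" "13-19"
g2 <- gaps(x, start=1, end=30)
as_text(g2)        # "7-9" "13-19" "26-30"

## disjoin(): split at every end point so that no two pieces overlap.
dj <- disjoin(x)
as_text(dj)        # "1-2" "3-4" "5-6" "10-12" "20-25"

## isDisjoint(): FALSE for the raw input, TRUE once it has been reduced.
d_raw <- isDisjoint(x)
d_red <- isDisjoint(red)

## disjointBins(): lowest-indexed bin in which each range fits.
bins <- disjointBins(x)
bins               # 1 2 1 1
```

Only `reduce()`, `gaps()`, and `disjoin()` change the number of ranges here, which is why their results are generally not parallel to `x`; `disjointBins()` is the exception in that it returns one bin index per input range.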
intra_range_methods()
Intra range transformations of an IRanges, IPos, Views, RangesList, or MaskCollection object
Description
Range-based transformations are grouped in 2 categories:
* Intra range transformations (e.g. `shift()`) transform each range individually (and independently of the other ranges). They return an object parallel to the input object, that is, where the i-th range corresponds to the i-th range in the input. Those transformations are described below.
* Inter range transformations (e.g. `reduce()`) transform all the ranges together as a set to produce a new set of ranges. They return an object that is generally NOT parallel to the input object. Those transformations are described in the inter-range-methods man page (see ``?`inter-range-methods` ``).

Except for `threebands()`, all the transformations described in this man page are endomorphisms that operate on a single "range-based" object, that is, they transform the ranges contained in the input object and return them in an object of the same class as the input object.

## Usage

```r
shift(x, shift=0L, use.names=TRUE)

narrow(x, start=NA, end=NA, width=NA, use.names=TRUE)

resize(x, width, fix="start", use.names=TRUE, ...)

flank(x, width, start=TRUE, both=FALSE, use.names=TRUE, ...)

promoters(x, upstream=2000, downstream=200, use.names=TRUE, ...)

reflect(x, bounds, use.names=TRUE)

restrict(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE)

threebands(x, start=NA, end=NA, width=NA)
```

## Arguments

|Argument |Description|
|------------- |----------------|
|`x` | An [IRanges](#iranges), [IPos](#ipos), [Views](#views), [RangesList](#rangeslist), or [MaskCollection](#maskcollection) object. |
|`shift` | An integer vector containing the shift information. Recycled as necessary so that each element corresponds to a range in `x`. Can also be a list-like object parallel to `x` if `x` is a [RangesList](#rangeslist) object. |
|`use.names` | `TRUE` or `FALSE`. Should names be preserved? |
|`start`, `end` | If `x` is an [IRanges](#iranges), [IPos](#ipos) or [Views](#views) object: a vector of integers for all functions except `flank`. For `restrict`, the supplied `start` and `end` arguments must be vectors of integers, possibly with NAs, that specify the restriction interval(s). Recycled as necessary so that each element corresponds to a range in `x`. Same thing for `narrow` and `threebands`, except that here `start` and `end` must contain coordinates relative to the ranges in `x`. See the Details section below. For `flank`, `start` is a logical indicating whether `x` should be flanked at the start (`TRUE`) or the end (`FALSE`). Recycled as necessary so that each element corresponds to a range in `x`. Can also be list-like objects parallel to `x` if `x` is a [RangesList](#rangeslist) object. |
|`width` | If `x` is an [IRanges](#iranges), [IPos](#ipos) or [Views](#views) object: for `narrow` and `threebands`, a vector of integers, possibly with NAs. See the SEW (Start/End/Width) interface for the details (`?solveUserSEW`). For `resize` and `flank`, the width of the resized or flanking regions. Note that if `both` is `TRUE`, this is effectively doubled. Recycled as necessary so that each element corresponds to a range in `x`. Can also be a list-like object parallel to `x` if `x` is a [RangesList](#rangeslist) object. |
|`fix` | If `x` is an [IRanges](#iranges), [IPos](#ipos) or [Views](#views) object: a character vector or character-Rle of length 1 or `length(x)` containing the values `"start"`, `"end"`, and `"center"` denoting what to use as an anchor for each element in `x`. Can also be a list-like object parallel to `x` if `x` is a [RangesList](#rangeslist) object. |
|`...` | Additional arguments for methods. |
|`both` | If `TRUE`, extends the flanking region `width` positions into the range. The resulting range thus straddles the end point, with `width` positions on either side. |
|`upstream`, `downstream` | Vectors of non-NA non-negative integers. Recycled as necessary so that each element corresponds to a range in `x`. Can also be list-like objects parallel to `x` if `x` is a [RangesList](#rangeslist) object. `upstream` defines the number of nucleotides toward the 5' end and `downstream` defines the number toward the 3' end, relative to the transcription start site. Promoter regions are formed by merging the upstream and downstream ranges. Default values for `upstream` and `downstream` were chosen based on our current understanding of gene regulation. On average, promoter regions in the mammalian genome are 5000 bp upstream and downstream of the transcription start site. |
|`bounds` | An [IRanges](#iranges) object to serve as the reference bounds for the reflection, see below. |
|`keep.all.ranges` | `TRUE` or `FALSE`. Should ranges that don't overlap with the restriction interval(s) be kept? Note that "don't overlap" means that they end strictly before `start - 1` or start strictly after `end + 1`. Ranges that end at `start - 1` or start at `end + 1` are always kept and their width is set to zero in the returned [IRanges](#iranges) object. |

## Details

Unless specified otherwise, when
`x` is a [RangesList](#rangeslist) object, any transformation described here is equivalent to applying the transformation to each list element in `x`.

### shift

`shift` shifts all the ranges in `x` by the amount specified by the `shift` argument.

### narrow

`narrow` narrows the ranges in `x`, i.e. each range in the returned `IntegerRanges` object is a subrange of the corresponding range in `x`. The supplied start/end/width values are solved by a call to `solveUserSEW(width(x), start=start, end=end, width=width)` and therefore must be compliant with the rules of the SEW (Start/End/Width) interface (see `?solveUserSEW` for the details). Then each subrange is derived from the original range according to the solved start/end/width values for this range. Note that those solved values are interpreted relative to the original range.

### resize

`resize` resizes the ranges to the specified width where either the start, end, or center is used as an anchor.

### flank

`flank` generates flanking ranges for each range in `x`. If `start` is `TRUE` for a given range, the flanking occurs at the start, otherwise the end. The widths of the flanks are given by the `width` parameter. The widths can be negative, in which case the flanking region is reversed so that it represents a prefix or suffix of the range in `x`. The `flank` operation is illustrated below for a call of the form `flank(x, 3, TRUE)`, where `x` indicates a range in `x` and `-` indicates the resulting flanking region:

    ---xxxxxxx

If `start` were `FALSE`:

       xxxxxxx---

For negative width, i.e. `flank(x, -3, FALSE)`, where `*` indicates the overlap between `x` and the result:

       xxxx***

If `both` is `TRUE`, then, for all ranges in `x`, the flanking regions are extended into (or out of, if width is negative) the range, so that the result straddles the given endpoint and has twice the width given by `width`. This is illustrated below for `flank(x, 3, both=TRUE)`:

    ---***xxxx

### promoters

`promoters` generates promoter ranges for each range in `x` relative to the transcription start site (TSS), where TSS is `start(x)`. The promoter range is expanded around the TSS according to the `upstream` and `downstream` arguments. `upstream` represents the number of nucleotides in the 5' direction and `downstream` the number in the 3' direction. The full range is defined as (start(x) - upstream) to (start(x) + downstream - 1). For documentation on using `promoters` on a `GRanges` object see ``?`promoters,GenomicRanges-method` `` in the GenomicRanges package.

### reflect

`reflect` "reflects" or reverses each range in `x` relative to the corresponding range in `bounds`, which is recycled as necessary. Reflection preserves the width of a range, but shifts it such that the distance from the left bound to the start of the range becomes the distance from the end of the range to the right bound. This is illustrated below, where `x` represents a range in `x` and `[` and `]` indicate the bounds:

    [..xxx.....]
    becomes
    [.....xxx..]

### restrict

`restrict` restricts the ranges in `x` to the interval(s) specified by the `start` and `end` arguments.

### threebands

`threebands` extends the capability of `narrow` by returning the 3 ranges objects associated with the narrowing operation. The returned value `y` is a list of 3 ranges objects named `"left"`, `"middle"` and `"right"`. The middle component is obtained by calling `narrow` with the same arguments (except that names are dropped). The left and right components are also instances of the same class as `x` and they contain what has been removed on the left and right sides (respectively) of the original ranges during the narrowing.

Note that the original object `x` can be reconstructed from the left and right bands with `punion(y$left, y$right, fill.gap=TRUE)`.

## Seealso

* [inter-range-methods](#inter-range-methods) for inter range transformations.
* The [IRanges](#iranges), [IPos](#ipos), [Views](#views), [RangesList](#rangeslist), and [MaskCollection](#maskcollection) classes.
* The [intra-range-methods](#intra-range-methods) man page in the GenomicRanges package for intra range transformations of genomic ranges.
* [setops-methods](#setops-methods) for set operations on [IRanges](#iranges) objects.
* [`endoapply`](#endoapply) in the S4Vectors package.

## Author

H. Pagès, M. Lawrence, and P. Aboyoun

## Examples

```r
## ---------------------------------------------------------------------
## shift()
## ---------------------------------------------------------------------

## On an IRanges object:
ir1 <- successiveIRanges(c(19, 5, 0, 8, 5))
ir1
shift(ir1, shift=-3)

## On an IRangesList object:
range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8))
range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5))
range3 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0))  # with empty ranges
collection <- IRangesList(one=range1, range2, range3)
shift(collection, shift=5)  # same as endoapply(collection, shift, shift=5)

## Sanity check:
res1 <- shift(collection, shift=5)
res2 <- endoapply(collection, shift, shift=5)
stopifnot(identical(res1, res2))

## ---------------------------------------------------------------------
## narrow()
## ---------------------------------------------------------------------

## On an IRanges object:
ir2 <- ir1[width(ir1) != 0]
narrow(ir2, start=4, end=-2)
narrow(ir2, start=-4, end=-2)
narrow(ir2, end=5, width=3)
narrow(ir2, start=c(3, 4, 2, 3), end=c(12, 5, 7, 4))

## On an IRangesList object:
narrow(collection[-3], start=2)
narrow(collection[-3], end=-2)

## On a MaskCollection object:
mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
mymasks <- append(append(mask1, mask2), mask3)
mymasks
narrow(mymasks, start=8)

## ---------------------------------------------------------------------
## resize()
## ---------------------------------------------------------------------

## On an IRanges object:
resize(ir2, 200)
resize(ir2, 2, fix="end")

## On an IRangesList object:
resize(collection, width=200)

## ---------------------------------------------------------------------
## flank()
## ---------------------------------------------------------------------

## On an IRanges object:
ir3 <- IRanges(c(2,5,1), c(3,7,3))
flank(ir3, 2)
flank(ir3, 2, start=FALSE)
flank(ir3, 2, start=c(FALSE, TRUE, FALSE))
flank(ir3, c(2, -2, 2))
flank(ir3, 2, both = TRUE)
flank(ir3, 2, start=FALSE, both=TRUE)
flank(ir3, -2, start=FALSE, both=TRUE)

## On an IRangesList object:
flank(collection, width=10)

## ---------------------------------------------------------------------
## promoters()
## ---------------------------------------------------------------------

## On an IRanges object:
ir4 <- IRanges(20:23, width=3)
promoters(ir4, upstream=0, downstream=0)  ## no change
promoters(ir4, upstream=0, downstream=1)  ## start value only
promoters(ir4, upstream=1, downstream=0)  ## single upstream nucleotide

## On an IRangesList object:
promoters(collection, upstream=5, downstream=2)

## ---------------------------------------------------------------------
## reflect()
## ---------------------------------------------------------------------

## On an IRanges object:
bounds <- IRanges(c(0, 5, 3), c(10, 6, 9))
reflect(ir3, bounds)

## reflect() does not yet support IRangesList objects!

## ---------------------------------------------------------------------
## restrict()
## ---------------------------------------------------------------------

## On an IRanges object:
restrict(ir1, start=12, end=34)
restrict(ir1, start=20)
restrict(ir1, start=21)
restrict(ir1, start=21, keep.all.ranges=TRUE)

## On an IRangesList object:
restrict(collection, start=2, end=8)

## ---------------------------------------------------------------------
## threebands()
## ---------------------------------------------------------------------

## On an IRanges object:
z <- threebands(ir2, start=4, end=-2)
ir2b <- punion(z$left, z$right, fill.gap=TRUE)
stopifnot(identical(ir2, ir2b))
threebands(ir2, start=-5)

## threebands() does not support IRangesList objects.
```
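The following sketch walks through the intra range transformations on a two-range input small enough to verify by hand against the Details section. The input `x`, the helper `as_text()` (hypothetical, not part of IRanges), and the values in the comments are assumptions worked out from the definitions above rather than output taken from the package documentation.

```r
library(IRanges)

## Hypothetical helper (not part of IRanges): render ranges as "start-end".
as_text <- function(r) paste(start(r), end(r), sep="-")

x <- IRanges(start=c(10, 20), end=c(14, 29))   # widths 5 and 10

## shift(): move every range by the same offset.
as_text(shift(x, 3))                   # "13-17" "23-32"

## narrow(): start/end are relative to each range; a negative end counts
## from the right, so end=-2 drops the last position.
as_text(narrow(x, start=2, end=-2))    # "11-13" "21-28"

## resize(): new width, anchored at the start (default) or at the end.
as_text(resize(x, 3))                  # "10-12" "20-22"
as_text(resize(x, 3, fix="end"))       # "12-14" "27-29"

## flank(): region just outside the start or the end; with both=TRUE it
## straddles the start and has twice the requested width.
as_text(flank(x, 3))                   # "7-9" "17-19"
as_text(flank(x, 3, start=FALSE))      # "15-17" "30-32"
as_text(flank(x, 3, both=TRUE))        # "7-12" "17-22"

## promoters(): (start - upstream) to (start + downstream - 1).
as_text(promoters(x, upstream=5, downstream=2))   # "5-11" "15-21"

## restrict(): clip the ranges to the interval [12, 25].
as_text(restrict(x, start=12, end=25)) # "12-14" "20-25"

## reflect(): the [..xxx.....] picture from the Details section, with
## bounds 1-10 and a range at positions 3-5.
as_text(reflect(IRanges(3, 5), IRanges(1, 10)))   # "6-8"
```

Every call returns as many ranges as it was given, in the same order, which is what "parallel to the input object" means in the Description.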
multisplit()
Split elements belonging to multiple groups
Description
This is like split, except that elements can belong to multiple groups, in which case they are repeated to appear in multiple elements of the return value.
Usage
multisplit(x, f)
Arguments
Argument | Description |
---|---|
x | The object to split, like a vector. |
f | A list-like object of vectors, the same length as x, where each element indicates the groups to which the corresponding element of x belongs. |
Value
A list-like object, with an element for each unique value in the unlisted f, containing the elements in x where the corresponding element in f contained that value. Just try it.
Author
Michael Lawrence
Examples
multisplit(1:3, list(letters[1:2], letters[2:3], letters[2:4]))
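To see what the call above produces, here is a minimal sketch that converts the result to an ordinary list. The group contents in the comments are worked out by hand from the description (element 1 belongs to groups a and b, element 2 to b and c, element 3 to b, c and d) and are an assumption rather than output copied from the package.

```r
library(IRanges)

## Each element of 'f' lists the groups the matching element of 'x' is in.
x <- 1:3
f <- list(letters[1:2], letters[2:3], letters[2:4])

res <- multisplit(x, f)

## One element per distinct group label; elements of 'x' that belong to
## several groups are repeated.
as.list(res)
## $a: 1
## $b: 1 2 3
## $c: 2 3
## $d: 3
```

With plain `split()`, each element of `x` could land in only one group; here element 3 appears three times because `f[[3]]` names three groups.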
nearest_methods()
Finding the nearest range neighbor
Description
The nearest, precede, follow, distance and distanceToNearest methods for IntegerRanges objects and subclasses.
Usage
S4 methods for signature 'IntegerRanges,IntegerRanges_OR_missing':
nearest(x, subject, select = c("arbitrary", "all"))
precede(x, subject, select = c("first", "all"))
follow(x, subject, select = c("last", "all"))
distanceToNearest(x, subject, select = c("arbitrary", "all"))
S4 method for signature 'IntegerRanges,IntegerRanges':
distance(x, y)
S4 method for signature 'Pairs,missing':
distance(x, y)
Arguments
Argument | Description |
---|---|
x | The query IntegerRanges object, or (for distance()) a Pairs object containing both the query (first) and the subject (second). |
subject | The subject IntegerRanges object, within which the nearest neighbors are found. Can be missing, in which case x is also the subject. |
select | Logic for handling ties. By default, all the methods select a single interval (arbitrary for nearest, the first by order in subject for precede, and the last for follow). To get all matchings, as a Hits object, use "all". |
y | For the distance method, an IntegerRanges object. Cannot be missing. If x and y are not the same length, the shortest will be recycled to match the length of the longest. |
hits | The hits between x and subject |
... | Additional arguments for methods |
Details
nearest: The conventional nearest neighbor finder. Returns an integer vector containing the index of the nearest neighbor range in subject for each range in x. If there is no nearest neighbor (if subject is empty), NAs are returned.
Here is roughly how it proceeds, for a range xi in x:
* Find the ranges in subject that overlap xi. If a single range si in subject overlaps xi, si is returned as the nearest neighbor of xi. If there are multiple overlaps, one of the overlapping ranges is chosen arbitrarily.
* If no ranges in subject overlap with xi, then the range in subject with the shortest distance from its end to the start of xi or its start to the end of xi is returned.
precede: For each range in x, precede returns the index of the interval in subject that is directly preceded by the query range. Overlapping ranges are excluded. NA is returned when there are no qualifying ranges in subject.
follow: The opposite of precede, this function returns the index of the range in subject that a query range in x directly follows. Overlapping ranges are excluded. NA is returned when there are no qualifying ranges in subject.
distanceToNearest: Returns the distance for each range in x to its nearest neighbor in subject.
distance: Returns the distance for each range in x to the range in y.
The distance method differs from others documented on this page in that it is symmetric; y cannot be missing. If x and y are not the same length, the shorter will be recycled to match the length of the longer. The select argument is not available for distance because comparisons are made in a pair-wise fashion. The return value has the length of the longer of x and y.
The distance calculation changed in BioC 2.12 to accommodate zero-width ranges in a consistent and intuitive manner. The new distance can be explained by a block model where a range is represented by a series of blocks of size 1. Blocks are adjacent to each other and there is no gap between them. A visual representation of IRanges(4,7) would be
    +-----+-----+-----+-----+
       4     5     6     7
The distance between two consecutive blocks is 0L (prior to Bioconductor 2.12 it was 1L). The new distance calculation now returns the size of the gap between two ranges.
This change to distance affects the notion of overlaps in that we no longer say:
    x and y overlap <=> distance(x, y) == 0
Instead we say
    x and y overlap => distance(x, y) == 0
or
    x and y overlap or are adjacent <=> distance(x, y) == 0
selectNearest: Selects the hits that have the minimum distance within those for each query range. Ties are possible and can be broken with breakTies.
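The block model for distance can be sketched in plain base R, without loading IRanges. The helper name block_distance and its start/end arguments are hypothetical and chosen for illustration; the sketch only restates the gap-size rule described above and is not presented as the package's implementation.

```r
# Hypothetical helper (not part of IRanges): size of the gap between
# two ranges given as inclusive start/end coordinates.
block_distance <- function(start_x, end_x, start_y, end_y) {
  max_start <- pmax(start_x, start_y)
  min_end <- pmin(end_x, end_y)
  # Overlapping or adjacent ranges leave no gap, hence the floor at 0.
  pmax(max_start - min_end - 1L, 0L)
}

block_distance(1L, 5L, 6L, 10L)   # adjacent ranges
block_distance(1L, 5L, 3L, 7L)    # overlapping ranges
block_distance(1L, 5L, 9L, 12L)   # positions 6, 7 and 8 lie in between
```

The adjacent pair and the overlapping pair both give 0, which is why a zero distance no longer implies an overlap.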
Value
For nearest, precede and follow, an integer vector of indices in subject, or a Hits object if select="all".
For distanceToNearest, a Hits object with an elementMetadata column of the distance between the pair. Access the distance with the mcols accessor.
For distance, an integer vector of distances between the ranges in x and y.
For selectNearest, a Hits object, sorted by query.
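To make the index-or-NA return value of precede and follow concrete, here is a minimal base R sketch of the selection rules as described on this page (overlaps excluded, first match in subject order for precede, last for follow). The helpers precede_sketch and follow_sketch are hypothetical, take plain coordinate vectors rather than IntegerRanges objects, and ignore zero-width edge cases.

```r
# Hypothetical helpers (not the IRanges implementation).

# Index of the closest subject range that starts strictly after the query ends.
precede_sketch <- function(query_end, subject_start) {
  vapply(query_end, function(e) {
    candidates <- which(subject_start > e)
    if (length(candidates) == 0L) return(NA_integer_)
    # which.min() keeps the first candidate in subject order on ties.
    candidates[which.min(subject_start[candidates])]
  }, integer(1))
}

# Index of the closest subject range that ends strictly before the query starts.
follow_sketch <- function(query_start, subject_end) {
  vapply(query_start, function(s) {
    candidates <- which(subject_end < s)
    if (length(candidates) == 0L) return(NA_integer_)
    best <- candidates[subject_end[candidates] == max(subject_end[candidates])]
    best[length(best)]  # last candidate in subject order on ties
  }, integer(1))
}

# Coordinates of the query and subject used in the precede()/follow() examples.
precede_sketch(query_end = c(3, 7, 10), subject_start = c(3, 2, 10))
follow_sketch(query_start = c(1, 3, 9), subject_end = c(3, 13, 12))
```

On these coordinates the sketch reproduces the results annotated in the Examples section for precede(query, subject) and follow(query, subject).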
Seealso
The IntegerRanges and Hits classes.
The GenomicRanges and GRanges classes in the GenomicRanges package.
findOverlaps for finding just the overlapping ranges.
GenomicRanges methods for precede, follow, nearest, distance, and distanceToNearest are documented at ?nearest-methods or ?precede,GenomicRanges,GenomicRanges-method.
Author
M. Lawrence
Examples
## ------------------------------------------
## precede() and follow()
## ------------------------------------------
query <- IRanges(c(1, 3, 9), c(3, 7, 10))
subject <- IRanges(c(3, 2, 10), c(3, 13, 12))
precede(query, subject) # c(3L, 3L, NA)
precede(IRanges(), subject) # integer()
precede(query, IRanges()) # rep(NA_integer_, 3)
precede(query) # c(3L, 3L, NA)
follow(query, subject) # c(NA, NA, 1L)
follow(IRanges(), subject) # integer()
follow(query, IRanges()) # rep(NA_integer_, 3)
follow(query) # c(NA, NA, 2L)
## ------------------------------------------
## nearest()
## ------------------------------------------
query <- IRanges(c(1, 3, 9), c(2, 7, 10))
subject <- IRanges(c(3, 5, 12), c(3, 6, 12))
nearest(query, subject) # c(1L, 1L, 3L)
nearest(query) # c(2L, 1L, 2L)
## ------------------------------------------
## distance()
## ------------------------------------------
## adjacent
distance(IRanges(1,5), IRanges(6,10)) # 0L
## overlap
distance(IRanges(1,5), IRanges(3,7)) # 0L
## zero-width
sapply(-3:3, function(i) distance(shift(IRanges(4,3), i), IRanges(4,3)))
range_squeezers()
Squeeze the ranges out of a range-based object
Description
S4 generic functions for squeezing the ranges out of a range-based object.
These are analogous to the range squeezers granges and grglist defined in the GenomicRanges package, except that ranges returns the ranges in an IRanges object (instead of a GRanges object for granges), and rglist returns them in an IRangesList object (instead of a GRangesList object for grglist).
Usage
ranges(x, use.names=TRUE, use.mcols=FALSE, ...)
rglist(x, use.names=TRUE, use.mcols=FALSE, ...)
Arguments
Argument | Description |
---|---|
x | An object containing ranges, e.g. an IntegerRanges, GenomicRanges, RangedSummarizedExperiment, GAlignments, GAlignmentPairs, or GAlignmentsList object, or a Pairs object containing ranges. |
use.names | TRUE (the default) or FALSE. Whether or not the names on x (accessible with names(x)) should be propagated to the returned object. |
use.mcols | TRUE or FALSE (the default). Whether or not the metadata columns on x (accessible with mcols(x)) should be propagated to the returned object. |
... | Additional arguments, for use in specific methods. |
Details
Various packages (e.g. IRanges, GenomicRanges, SummarizedExperiment, GenomicAlignments, etc.) define and document various range squeezing methods for various types of objects.
Note that these functions can be seen as object getters or as functions performing coercion.
For some objects (e.g. GAlignments and GAlignmentPairs objects defined in the GenomicAlignments package), as(x, "IRanges") and as(x, "IRangesList") are equivalent to ranges(x, use.names=TRUE, use.mcols=TRUE) and rglist(x, use.names=TRUE, use.mcols=TRUE), respectively.
Value
An IRanges object for ranges.
An IRangesList object for rglist.
If x is a vector-like object (e.g. GAlignments), the returned object is expected to be parallel to x, that is, the i-th element in the output corresponds to the i-th element in the input.
If use.names is TRUE, then the names on x (if any) are propagated to the returned object.
If use.mcols is TRUE, then the metadata columns on x (if any) are propagated to the returned object.
Seealso
IRanges and IRangesList objects.
RangedSummarizedExperiment objects in the SummarizedExperiment package.
GAlignments, GAlignmentPairs, and GAlignmentsList objects in the GenomicAlignments package.
Author
H. Pagès
Examples
## See ?GAlignments in the GenomicAlignments package for examples of
## "ranges" and "rglist" methods.
readMask()
Read a mask from a file
Description
read.agpMask and read.gapMask extract the AGAPS mask from an NCBI "agp" file or a UCSC "gap" file, respectively.
read.liftMask extracts the AGAPS mask from a UCSC "lift" file (i.e. a file containing offsets of contigs within sequences).
read.rmMask extracts the RM mask from a RepeatMasker .out file.
read.trfMask extracts the TRF mask from a Tandem Repeats Finder .bed file.
Usage
read.agpMask(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE)
read.gapMask(file, seqname="?", mask.width=NA, gap.types=NULL, use.gap.types=FALSE)
read.liftMask(file, seqname="?", mask.width=NA)
read.rmMask(file, seqname="?", mask.width=NA, use.IDs=FALSE)
read.trfMask(file, seqname="?", mask.width=NA)
Arguments
Argument | Description |
---|---|
file | Either a character string naming a file or a connection open for reading. |
seqname | The name of the sequence for which the mask must be extracted. If no sequence is specified (i.e. seqname="?") then an error is raised and the sequence names found in the file are displayed. If the file doesn't contain any information for the specified sequence, then a warning is issued and an empty mask of width mask.width is returned. |
mask.width | The width of the mask to return, i.e. the length of the sequence this mask will be put on. See ?`MaskCollection-class` for more information about the width of a MaskCollection object. |
gap.types | NULL or a character vector containing gap types. Use this argument to filter the assembly gaps that are to be extracted from the "agp" or "gap" file based on their type. Most common gap types are "contig", "clone", "centromere", "telomere", "heterochromatin", "short_arm" and "fragment". With gap.types=NULL, all the assembly gaps described in the file are extracted. With gap.types="?", an error is raised and the gap types found in the file for the specified sequence are displayed. |
use.gap.types | Whether or not the gap types provided in the "agp" or "gap" file should be used to name the ranges constituting the returned mask. See ?`IRanges-class` for more information about the names of an IRanges object. |
use.IDs | Whether or not the repeat IDs provided in the RepeatMasker .out file should be used to name the ranges constituting the returned mask. See ?`IRanges-class` for more information about the names of an IRanges object. |
Seealso
MaskCollection-class, IRanges-class
Examples
## ---------------------------------------------------------------------
## A. Extract a mask of assembly gaps ("AGAPS" mask) with read.agpMask()
## ---------------------------------------------------------------------
## Note: The hs_b36v3_chrY.agp file was obtained by downloading,
## extracting and renaming the hs_ref_chrY.agp.gz file from
##
## ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/
## hs_ref_chrY.agp.gz 5 KB 24/03/08 04:33:00 PM
##
## on May 9, 2008.
chrY_length <- 57772954
file1 <- system.file("extdata", "hs_b36v3_chrY.agp", package="IRanges")
mask1 <- read.agpMask(file1, seqname="chrY", mask.width=chrY_length,
use.gap.types=TRUE)
mask1
mask1[[1]]
mask11 <- read.agpMask(file1, seqname="chrY", mask.width=chrY_length,
gap.types=c("centromere", "heterochromatin"))
mask11[[1]]
## ---------------------------------------------------------------------
## B. Extract a mask of assembly gaps ("AGAPS" mask) with read.liftMask()
## ---------------------------------------------------------------------
## Note: The hg18liftAll.lft file was obtained by downloading,
## extracting and renaming the liftAll.zip file from
##
## http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/
## liftAll.zip 03-Feb-2006 11:35 5.5K
##
## on May 8, 2008.
file2 <- system.file("extdata", "hg18liftAll.lft", package="IRanges")
mask2 <- read.liftMask(file2, seqname="chr1")
mask2
if (interactive()) {
## contigs 7 and 8 for chrY are adjacent
read.liftMask(file2, seqname="chrY")
## displays the sequence names found in the file
read.liftMask(file2)
## specify an unknown sequence name
read.liftMask(file2, seqname="chrZ", mask.width=300)
}
## ---------------------------------------------------------------------
## C. Extract a RepeatMasker ("RM") or Tandem Repeats Finder ("TRF")
## mask with read.rmMask() or read.trfMask()
## ---------------------------------------------------------------------
## Note: The ce2chrM.fa.out and ce2chrM.bed files were obtained by
## downloading, extracting and renaming the chromOut.zip and
## chromTrf.zip files from
##
## http://hgdownload.cse.ucsc.edu/goldenPath/ce2/bigZips/
## chromOut.zip 21-Apr-2004 09:05 2.6M
## chromTrf.zip 21-Apr-2004 09:07 182K
##
## on May 7, 2008.
## Before you can extract a mask with read.rmMask() or read.trfMask(), you
## need to know the length of the sequence that you're going to put the
## mask on:
if (interactive()) {
library(BSgenome.Celegans.UCSC.ce2)
chrM_length <- seqlengths(Celegans)[["chrM"]]
## Read the RepeatMasker .out file for chrM in ce2:
file3 <- system.file("extdata", "ce2chrM.fa.out", package="IRanges")
RMmask <- read.rmMask(file3, seqname="chrM", mask.width=chrM_length)
RMmask
## Read the Tandem Repeats Finder .bed file for chrM in ce2:
file4 <- system.file("extdata", "ce2chrM.bed", package="IRanges")
TRFmask <- read.trfMask(file4, seqname="chrM", mask.width=chrM_length)
TRFmask
desc(TRFmask) <- paste(desc(TRFmask), "[period<=12]")
TRFmask
## Put the 2 masks on chrM:
chrM <- Celegans$chrM
masks(chrM) <- RMmask # this would drop all current masks, if any
masks(chrM) <- append(masks(chrM), TRFmask)
chrM
}
reverse_methods()
reverse
Description
A generic function for reversing vector-like or list-like objects. This man page describes methods for reversing a character vector, a Views object, or a MaskCollection object.
Note that reverse is similar to but not the same as rev.
Usage
reverse(x, ...)
Arguments
Argument | Description |
---|---|
x | A vector-like or list-like object. |
... | Additional arguments to be passed to or from methods. |
Details
On a character vector or a Views object, reverse reverses each element individually, without modifying the top-level order of the elements. More precisely, each individual string of a character vector is reversed.
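The difference between element-wise reversal and rev can be illustrated on a character vector with base R alone. The helper reverse_each below is hypothetical and only mimics the behavior described above for character vectors; it is not the reverse method itself.

```r
# Hypothetical helper (not part of IRanges): reverse the characters of each
# string while keeping the strings in their original order.
reverse_each <- function(x) {
  vapply(strsplit(x, "", fixed = TRUE),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

x <- c("Hi!", "How are you?")
reverse_each(x)  # each string reversed, order of the strings unchanged
rev(x)           # order of the strings reversed, strings unchanged
```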
Value
An object of the same class and length as the original object.
Seealso
reverse-methods, Views-class, MaskCollection-class, endoapply, rev
Examples
## On a character vector:
reverse(c("Hi!", "How are you?"))
rev(c("Hi!", "How are you?"))
## On a Views object:
v <- successiveViews(Rle(c(-0.5, 12.3, 4.88), 4:2), 1:4)
v
reverse(v)
rev(v)
## On a MaskCollection object:
mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
mymasks <- append(append(mask1, mask2), mask3)
reverse(mymasks)
seqapply()
2 methods that should be documented somewhere else
Description
unsplit method for List object and split<- method for Vector object.
Usage
## S4 method for signature 'List'
unsplit(value, f, drop = FALSE)
## S4 replacement method for signature 'Vector'
split(x, f, drop = FALSE, ...) <- value
Arguments
Argument | Description |
---|---|
value | The List object to unsplit. |
f | A factor or list of factors. |
drop | Whether to drop empty elements from the returned list. |
x | The Vector object whose elements are replaced group by group (for split<-). |
... | Additional arguments for the split<- method. |
Details
unsplit unlists value, where the order of the returned vector is as if value were originally created by splitting that vector on the factor f.
split(x, f, drop = FALSE) <- value: Virtually splits x by the factor f, replaces the elements of the resulting list with the elements from the list value, and restores x to its original form. Note that this works for any Vector, even though split itself is not universally supported.
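The base R functions unsplit and split<- follow the same pattern on ordinary vectors, which may help to see what the List and Vector methods are meant to do. The sketch below uses base R only; that the S4 methods behave analogously is an assumption based on the description above.

```r
x <- c(10, 20, 30, 40, 50)
f <- factor(c("a", "b", "a", "b", "a"))

parts <- split(x, f)           # list(a = c(10, 30, 50), b = c(20, 40))
restored <- unsplit(parts, f)  # elements return to their original positions

# Replace the elements group by group; split<- puts them back in place.
y <- x
split(y, f) <- list(a = c(1, 2, 3), b = c(7, 8))
y
```

After the replacement, the values for group "a" occupy positions 1, 3 and 5 of y, and the values for group "b" occupy positions 2 and 4.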
Author
Michael Lawrence
setops_methods()
Set operations on IntegerRanges and IntegerRangesList objects
Description
Performs set operations on IntegerRanges and IntegerRangesList objects.
Usage
## Vector-wise set operations
## --------------------------
## S4 method for signature 'IntegerRanges,IntegerRanges'
union(x, y)
## S4 method for signature 'Pairs,missing'
union(x, y, ...)
## S4 method for signature 'IntegerRanges,IntegerRanges'
intersect(x, y)
## S4 method for signature 'Pairs,missing'
intersect(x, y, ...)
## S4 method for signature 'IntegerRanges,IntegerRanges'
setdiff(x, y)
## S4 method for signature 'Pairs,missing'
setdiff(x, y, ...)
## Element-wise (aka "parallel") set operations
## --------------------------------------------
## S4 method for signature 'IntegerRanges,IntegerRanges'
punion(x, y, fill.gap=FALSE)
## S4 method for signature 'Pairs,missing'
punion(x, y, ...)
## S4 method for signature 'IntegerRanges,IntegerRanges'
pintersect(x, y, resolve.empty=c("none", "max.start", "start.x"))
## S4 method for signature 'Pairs,missing'
pintersect(x, y, ...)
## S4 method for signature 'IntegerRanges,IntegerRanges'
psetdiff(x, y)
## S4 method for signature 'Pairs,missing'
psetdiff(x, y, ...)
## S4 method for signature 'IntegerRanges,IntegerRanges'
pgap(x, y)
Arguments
Argument | Description |
---|---|
x, y | Objects representing ranges. |
fill.gap | Logical indicating whether or not to force a union by using the rule start = min(start(x), start(y)), end = max(end(x), end(y)). |
resolve.empty | One of "none", "max.start", or "start.x" denoting how to handle ambiguous empty ranges formed by intersections. "none" - throw an error if an ambiguous empty range is formed, "max.start" - associate the maximum start value with any ambiguous empty range, and "start.x" - associate the start value of x with any ambiguous empty range. (See Details section below for the definition of an ambiguous range.) |
... | The methods for Pairs objects pass any extra argument to the internal call to punion(first(x), last(x), ...), pintersect(first(x), last(x), ...), etc. |
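The fill.gap rule quoted in the table can be written out on plain coordinate vectors. This is a base R sketch of that rule only; the helper punion_fill_gap is hypothetical and does not call punion.

```r
# Hypothetical helper (not part of IRanges): element-wise union of ranges
# under the fill.gap=TRUE rule, on inclusive start/end coordinates.
punion_fill_gap <- function(start_x, end_x, start_y, end_y) {
  stopifnot(length(start_x) == length(start_y))
  data.frame(start = pmin(start_x, start_y),
             end   = pmax(end_x, end_y))
}

# The first pair overlaps; the second pair is separated by positions 14 to 19,
# which the rule fills in.
filled <- punion_fill_gap(start_x = c(1L, 10L), end_x = c(5L, 13L),
                          start_y = c(3L, 20L), end_y = c(8L, 25L))
filled
```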
Details
The union, intersect and setdiff methods for IntegerRanges objects return a "normal" IntegerRanges object representing the union, intersection and (asymmetric!) difference of the sets of integers represented by x and y.
punion, pintersect, psetdiff and pgap are generic functions that compute the element-wise (aka "parallel") union, intersection, (asymmetric!) difference and gap between each element in x and its corresponding element in y. Methods for IntegerRanges objects are defined. For these methods, x and y must have the same length (i.e. same number of ranges). They return an IntegerRanges object parallel to x and y (i.e. where the i-th range corresponds to the i-th range in x and in y) and represents the union/intersection/difference/gap of/between the corresponding x[i] and y[i].
If x is a Pairs object, then y should be missing, and the operation is performed between the members of each pair.
By default, pintersect will throw an error when an "ambiguous empty range" is formed. An ambiguous empty range can occur three different ways: 1) when corresponding non-empty range elements x and y have an empty intersection, 2) if the position of an empty range element does not fall within the corresponding limits of a non-empty range element, or 3) if two corresponding empty range elements do not have the same position. For example if empty range element [22,21] is intersected with non-empty range element [1,10], an error will be produced; but if it is intersected with the range [22,28], it will produce [22,21]. As mentioned in the Arguments section above, this behavior can be changed using the resolve.empty argument.
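The vector-wise operations treat each object as a set of integers and return the result in normal form. A base R sketch of that idea follows; the helpers expand_ranges and collapse_to_normal are hypothetical, work on plain integer vectors, and are not how the package computes its results.

```r
# Hypothetical helpers (not part of IRanges).

# Expand inclusive start/end pairs into the set of integers they cover.
expand_ranges <- function(start, end) {
  pieces <- Map(function(s, e) if (e >= s) seq.int(s, e) else integer(0),
                start, end)
  as.integer(unlist(pieces))
}

# Collapse a set of integers into sorted, non-overlapping, non-adjacent ranges.
collapse_to_normal <- function(pos) {
  pos <- sort(unique(pos))
  if (length(pos) == 0L) {
    return(data.frame(start = integer(0), end = integer(0)))
  }
  new_run <- c(TRUE, diff(pos) > 1L)   # a run starts after every gap
  last_in_run <- c(new_run[-1], TRUE)  # a run ends just before the next one starts
  data.frame(start = pos[new_run], end = pos[last_in_run])
}

x <- expand_ranges(c(1L, 8L), c(4L, 9L))    # [1,4] and [8,9]
y <- expand_ranges(c(3L, 10L), c(6L, 12L))  # [3,6] and [10,12]

collapse_to_normal(union(x, y))      # adjacent pieces [8,9] and [10,12] merge
collapse_to_normal(intersect(x, y))
collapse_to_normal(setdiff(x, y))    # differs from setdiff(y, x)
collapse_to_normal(setdiff(y, x))
```

The two setdiff calls give different results, which is the asymmetry flagged in the text above.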
Value
On IntegerRanges objects, union, intersect, and setdiff return an IRanges instance that is guaranteed to be normal (see isNormal) but is NOT promoted to NormalIRanges.
On IntegerRanges objects, punion, pintersect, psetdiff, and pgap return an object of the same class and length as their first argument.
Seealso
pintersect is similar to narrow, except the end points are absolute, not relative. pintersect is also similar to restrict, except ranges outside of the restriction become empty and are not discarded.
setops-methods in the GenomicRanges package for set operations on genomic ranges.
findOverlaps-methods for finding/counting overlapping ranges.
intra-range-methods and inter-range-methods for intra range and inter range transformations.
IntegerRanges and IntegerRangesList objects. In particular, normality of an IntegerRanges object is discussed in the man page for IntegerRanges objects.
mendoapply in the S4Vectors package.
Author
H. Pagès and M. Lawrence
Examples
x <- IRanges(c(1, 5, -2, 0, 14), c(10, 9, 3, 11, 17))
subject <- Rle(1:-3, 6:2)
y <- Views(subject, start=c(14, 0, -5, 6, 18), end=c(20, 2, 2, 8, 20))
## Vector-wise operations:
union(x, ranges(y))
union(ranges(y), x)
intersect(x, ranges(y))
intersect(ranges(y), x)
setdiff(x, ranges(y))
setdiff(ranges(y), x)
## Element-wise (aka "parallel") operations:
try(punion(x, ranges(y)))
punion(x[3:5], ranges(y)[3:5])
punion(x, ranges(y), fill.gap=TRUE)
try(pintersect(x, ranges(y)))
pintersect(x[3:4], ranges(y)[3:4])
pintersect(x, ranges(y), resolve.empty="max.start")
psetdiff(ranges(y), x)
try(psetdiff(x, ranges(y)))
start(x)[4] <- -99
end(y)[4] <- 99
psetdiff(x, ranges(y))
pgap(x, ranges(y))
## On IntegerRangesList objects:
irl1 <- IRangesList(a=IRanges(c(1,2),c(4,3)), b=IRanges(c(4,6),c(10,7)))
irl2 <- IRangesList(c=IRanges(c(0,2),c(4,5)), a=IRanges(c(4,5),c(6,7)))
union(irl1, irl2)
intersect(irl1, irl2)
setdiff(irl1, irl2)
slice_methods()
Slice a vector-like or list-like object
Description
slice is a generic function that creates views on a vector-like or list-like object that contain the elements that are within the specified bounds.
Usage
slice(x, lower=-Inf, upper=Inf, ...)
## S4 method for signature 'Rle'
slice(x, lower=-Inf, upper=Inf,
      includeLower=TRUE, includeUpper=TRUE, rangesOnly=FALSE)
## S4 method for signature 'RleList'
slice(x, lower=-Inf, upper=Inf,
      includeLower=TRUE, includeUpper=TRUE, rangesOnly=FALSE)
Arguments
Argument | Description |
---|---|
x | An Rle or RleList object, or any object coercible to an Rle object. |
lower, upper | The lower and upper bounds for the slice. |
includeLower, includeUpper | Logical indicating whether or not the specified boundary is open or closed. |
rangesOnly | A logical indicating whether or not to drop the original data from the output. |
... | Additional arguments to be passed to specific methods. |
Details
slice is useful for finding areas of absolute maxima (peaks), absolute minima (troughs), or fluctuations within specified limits.
One or more view summarization methods can be used on the result of slice. See ?view-summarization-methods.
Value
The method for Rle objects returns an RleViews object if rangesOnly=FALSE or an IRanges object if rangesOnly=TRUE.
The method for RleList objects returns an RleViewsList object if rangesOnly=FALSE or an IRangesList object if rangesOnly=TRUE.
Seealso
view-summarization-methods for summarizing the views returned by slice.
slice-methods in the XVector package for more slice methods.
coverage for computing the coverage across a set of ranges.
The Rle, RleList, RleViews, and RleViewsList classes.
Author
P. Aboyoun
Examples
## Views derived from coverage
x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L),
             width=c(5L, 6L, 3L, 4L, 3L, 3L))
cvg <- coverage(x)
slice(cvg, lower=2)
slice(cvg, lower=2, rangesOnly=TRUE)
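What slicing with a lower bound selects can be traced by hand in base R. The sketch below rebuilds a per-position count from the same starts and widths as the example above, then finds the maximal runs of positions whose value lies within closed bounds (matching the includeLower=TRUE, includeUpper=TRUE defaults). The helper slice_sketch is hypothetical and returns a plain data frame of start/end pairs, not a Views or IRanges object.

```r
# Base R stand-in for a coverage vector: count how many ranges cover
# each position.
starts <- c(1L, 9L, 4L, 1L, 5L, 10L)
widths <- c(5L, 6L, 3L, 4L, 3L, 3L)
covered <- unlist(Map(function(s, w) seq.int(s, length.out = w), starts, widths))
cvg <- tabulate(covered)

# Hypothetical helper (not part of IRanges): maximal runs of positions whose
# value lies within [lower, upper], as inclusive start/end pairs.
slice_sketch <- function(x, lower = -Inf, upper = Inf) {
  inside <- x >= lower & x <= upper
  runs <- rle(inside)
  run_end <- cumsum(runs$lengths)
  run_start <- run_end - runs$lengths + 1L
  data.frame(start = run_start[runs$values], end = run_end[runs$values])
}

peaks <- slice_sketch(cvg, lower = 2)
peaks
```

Run-length encoding the logical mask mirrors the reason slice is defined on Rle objects: runs of equal values are found without visiting each position separately.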
view_summarization_methods()
Summarize views on a vector-like object with numeric values
Description
viewApply applies a function on each view of a Views or ViewsList object.
viewMins, viewMaxs, viewSums, viewMeans calculate respectively the minima, maxima, sums, and means of the views in a Views or ViewsList object.
Usage
viewApply(X, FUN, ..., simplify = TRUE)
viewMins(x, na.rm=FALSE)
## S4 method for signature 'Views'
min(x, ..., na.rm = FALSE)
viewMaxs(x, na.rm=FALSE)
## S4 method for signature 'Views'
max(x, ..., na.rm = FALSE)
viewSums(x, na.rm=FALSE)
## S4 method for signature 'Views'
sum(x, ..., na.rm = FALSE)
viewMeans(x, na.rm=FALSE)
## S4 method for signature 'Views'
mean(x, ...)
viewWhichMins(x, na.rm=FALSE)
## S4 method for signature 'Views'
which.min(x)
viewWhichMaxs(x, na.rm=FALSE)
## S4 method for signature 'Views'
which.max(x)
viewRangeMins(x, na.rm=FALSE)
viewRangeMaxs(x, na.rm=FALSE)
Arguments
Argument | Description |
---|---|
X | A Views object. |
FUN | The function to be applied to each view in X . |
... | Additional arguments to be passed on. |
simplify | A logical value specifying whether or not the result should be simplified to a vector or matrix if possible. |
x | An RleViews or RleViewsList object. |
na.rm | Logical indicating whether or not to include missing values in the results. |
Details
The viewMins, viewMaxs, viewSums, and viewMeans functions provide efficient methods for calculating the specified numeric summary by performing the looping in compiled code.
The viewWhichMins, viewWhichMaxs, viewRangeMins, and viewRangeMaxs functions provide efficient methods for finding the locations of the minima and maxima.
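For orientation, the per-view summaries amount to applying an ordinary summary function to the slice of the subject that each view covers. The base R sketch below does this with an R-level loop on a plain numeric vector; the helper summarize_views and the example vector and view coordinates are hypothetical, and the package functions perform the equivalent looping in compiled code on Rle-based views.

```r
# Example subject vector and two views given as inclusive start/end pairs
# (illustrative values, not produced by IRanges here).
subject <- c(2, 2, 2, 3, 3, 2, 1, 0, 1, 2, 2, 2, 1, 1)
view_start <- c(1L, 10L)
view_end <- c(6L, 12L)

# Hypothetical helper (not part of IRanges): apply FUN to each view's values.
summarize_views <- function(x, start, end, FUN) {
  vapply(seq_along(start),
         function(i) FUN(x[start[i]:end[i]]),
         numeric(1))
}

view_mins  <- summarize_views(subject, view_start, view_end, min)
view_maxs  <- summarize_views(subject, view_start, view_end, max)
view_sums  <- summarize_views(subject, view_start, view_end, sum)
view_means <- summarize_views(subject, view_start, view_end, mean)
```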
Value
For all the functions in this man page (except viewRangeMins and viewRangeMaxs): A numeric vector of the length of x if x is an RleViews object, or a List object of the length of x if it's an RleViewsList object.
For viewRangeMins and viewRangeMaxs: An IRanges object if x is an RleViews object, or an IRangesList object if it's an RleViewsList object.
Seealso
view-summarization-methods in the XVector package for more view summarization methods.
The RleViews and RleViewsList classes.
Note
For convenience, methods for min, max, sum, mean, which.min and which.max are provided as wrappers around the corresponding view* functions (which might be deprecated at some point).
Author
P. Aboyoun
Examples
## Views derived from coverage
x <- IRanges(start=c(1L, 9L, 4L, 1L, 5L, 10L),
width=c(5L, 6L, 3L, 4L, 3L, 3L))
cvg <- coverage(x)
cvg_views <- slice(cvg, lower=2)
viewApply(cvg_views, diff)
viewMins(cvg_views)
viewMaxs(cvg_views)
viewSums(cvg_views)
viewMeans(cvg_views)
viewWhichMins(cvg_views)
viewWhichMaxs(cvg_views)
viewRangeMins(cvg_views)
viewRangeMaxs(cvg_views)