bioconductor v3.9.0 Gviz
Genomic data analysis requires integrated visualization
Summary
Functions
AlignedReadTrack class and methods (NOTE: THIS IS STILL IN DEVELOPMENT AND SUBJECT TO CHANGE)
AlignmentsTrack class and methods
AnnotationTrack class and methods
BiomartGeneRegionTrack class and methods
CustomTrack class and methods
DataTrack class and methods
DisplayPars class and method
GdObject class and methods
GeneRegionTrack class and methods
GenomeAxisTrack class and methods
HighlightTrack class and methods
IdeogramTrack class and methods
ImageMap class and methods
NumericTrack class and methods
OverlayTrack class and methods
RangeTrack class and methods
ReferenceTrack class and methods
SequenceTrack class and methods
StackedTrack class and methods
Meta-constructor for GenomeGraph tracks fetched directly from the various UCSC data sources.
Dynamic content based on the available resolution
Data sets
Export GenomeGraph tracks to an annotation file representation.
Grouping of annotation features
The main plotting function for one or several GenomeGraph tracks.
Setting display parameters to control the look and feel of the plots
Functions
AlignedReadTrack_class()
AlignedReadTrack class and methods (NOTE: THIS IS STILL IN DEVELOPMENT AND SUBJECT TO CHANGE)
Description
A class to represent short sequences that have been aligned to a reference genome as they are typically generated in a next generation sequencing experiment.
Usage
AlignedReadTrack(range=NULL, start=NULL, end=NULL, width=NULL, chromosome, strand, genome,
stacking="squish", name="AlignedReadTrack", coverageOnly=FALSE, ...)
Arguments
Argument | Description |
---|---|
range | An object of class GRanges, or a data.frame which will be coerced into one, in which case it needs to contain at least the three columns start and end (the start and end coordinates for the track items) and strand (the strand information for the track items, which may be provided in the form + for the Watson strand, - for the Crick strand or * for either one of the two). Alternatively, the range argument may be missing, in which case the relevant information has to be provided as individual function arguments (see below). |
start, end, width | Integer vectors giving the start and end coordinates for the individual track items, or their width. Two of the three need to be specified, and they have to be of equal length or of length one, in which case the single value will be recycled. Otherwise, the usual R recycling rules for vectors do not apply. |
strand | Character vector, the strand information for the individual track items. Needs to be of the same length as the start, end or width vectors, or of length 1. Please note that grouped items need to be on the same strand, and erroneous entries will raise an error. |
chromosome | The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier. Please note that at this stage only syntactic checking takes place, i.e., the argument value needs to be a single integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. |
genome | The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier; however, this is not formally checked at this point. |
stacking | The stacking type for overlapping items of the track. One of c(hide, dense, squish, pack, full). Currently, only hide (do not show the track items), squish (make best use of the available space) and dense (no stacking at all) are implemented. |
name | Character scalar of the track's name used in the title panel when plotting. |
coverageOnly | Instead of storing individual reads, just compute the coverage and store the resulting coverage vector. |
... | Additional items which will all be interpreted as further display parameters. |
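The rule for start, end and width described in the table can be illustrated in base R. This is a minimal sketch, not Gviz code: the helper complete_ranges is hypothetical, and it assumes the closed-interval convention used by IRanges, where width = end - start + 1.

```r
# Hypothetical helper (not part of Gviz): given exactly two of start, end and
# width, derive the third. Assumes closed intervals: width = end - start + 1.
complete_ranges <- function(start = NULL, end = NULL, width = NULL) {
  args <- list(start = start, end = end, width = width)
  given <- args[!vapply(args, is.null, logical(1))]
  if (length(given) != 2L) stop("specify exactly two of start, end and width")

  # Length-one vectors are recycled; anything else must match in length.
  n <- max(lengths(given))
  ok <- vapply(given, function(v) length(v) %in% c(1L, n), logical(1))
  if (!all(ok)) stop("vectors must be of equal length or of length one")
  given <- lapply(given, rep_len, length.out = n)

  # Fill in whichever of the three was not supplied.
  if (is.null(start)) given$start <- given$end - given$width + 1
  if (is.null(end))   given$end   <- given$start + given$width - 1
  if (is.null(width)) given$width <- given$end - given$start + 1
  as.data.frame(given[c("start", "end", "width")])
}

# A single width is recycled across three start positions.
complete_ranges(start = c(100, 150, 180), width = 24)
```

A width vector of length two combined with three start positions is rejected by this sketch, mirroring the statement that the usual R recycling rules do not apply.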
Value
The return value of the constructor function is a new object of class AlignedReadTrack.
Seealso
AnnotationTrack
DataTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
RangeTrack
StackedTrack
Author
Florian Hahne
Examples
## Construct from individual arguments
arTrack <- AlignedReadTrack(start=runif(1000, 100, 200), width=24,
genome="mm9", chromosome=7, strand=sample(c("+", "-"), 1000, TRUE))
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(arTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
plotTracks(arTrack)
## Track names
names(arTrack)
names(arTrack) <- "foo"
plotTracks(arTrack)
## Subsetting and splitting
subTrack <- subset(arTrack, from=110, to=130)
length(subTrack)
subTrack[1:2]
split(arTrack, strand(arTrack))
## Accessors
start(arTrack)
end(arTrack)
width(arTrack)
position(arTrack)
width(subTrack) <- 30
strand(arTrack)
strand(subTrack) <- "-"
chromosome(arTrack)
chromosome(subTrack) <- "chrX"
genome(arTrack)
genome(subTrack) <- "mm9"
range(arTrack)
ranges(arTrack)
coverage(arTrack)
## Annotation
values(arTrack)
## Stacking
stacking(arTrack)
stacking(arTrack) <- "dense"
## coercion
as(arTrack, "data.frame")
AlignmentsTrack_class()
AlignmentsTrack class and methods
Description
A class to represent short sequences that have been aligned to a reference genome as they are typically generated in next generation sequencing experiments.
Usage
AlignmentsTrack(range=NULL, start=NULL, end=NULL, width=NULL, strand, chromosome, genome,
stacking="squish", id, cigar, mapq, flag, isize, groupid, status, md, seqs,
name="AlignmentsTrack", isPaired=TRUE, importFunction, referenceSequence, ...)
Arguments
Argument | Description |
---|---|
range | An optional meta argument to handle the different input types. If the range argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below). The different input options for range are: (1) A character string: the path to a BAM file containing the read alignments. To be precise, this will result in the instantiation of a ReferenceAlignmentsTrack object, but for the user this implementation detail should be of no concern. (2) A GRanges object: the genomic ranges of the individual reads as well as the optional additional metadata columns id, cigar, mapq, flag, isize, groupid, status, md and seqs (see the description of the individual function parameters below for details). Calling the constructor on a GRanges object without further arguments, e.g. AlignmentsTrack(range=obj), is equivalent to calling the coerce method as(obj, "AlignmentsTrack"). (3) An IRanges object: almost identical to the GRanges case, except that the chromosome and strand information as well as all additional metadata has to be provided in the separate chromosome, strand, feature, group or id arguments, because it cannot be directly encoded in an IRanges object. Note that none of those inputs are mandatory, and if not provided explicitly the more or less reasonable default values chromosome=NA and strand="*" are used. (4) A data.frame object: the data.frame needs to contain at least the two mandatory columns start and end with the range coordinates. It may also contain a chromosome and a strand column with the chromosome and strand information for each range. If missing, it will be drawn from the separate chromosome or strand arguments. In addition, the id, cigar, mapq, flag, isize, groupid, status, md and seqs data can be provided as additional columns. The above comments about potential default values also apply here. |
start, end, width | Integer vectors giving the start and end coordinates for the individual track items, or their width. Two of the three need to be specified, and they have to be of equal length or of length one, in which case the single value will be recycled. Otherwise, the usual R recycling rules for vectors do not apply here. |
id | Character vector of read identifiers. Those identifiers have to be unique, i.e., each range representing a read needs to have a unique id. |
cigar | A character vector of valid CIGAR strings describing details of the alignment. Typically those include alignment gaps or insertions and deletions, but also hard and soft clipped read regions. If missing, a fully mapped read without gaps or indels is assumed. Needs to be of the same length as the provided genomic coordinates, or of length 1. |
mapq | A numeric vector of read mapping qualities. Needs to be of the same length as the provided genomic coordinates, or of length 1. |
flag | A numeric vector of flag values. Needs to be of the same length as the provided genomic coordinates, or of length 1. Currently not used. |
isize | A numeric vector of empirical insert sizes. This only applies if the reads are paired. Needs to be of the same length as the provided genomic coordinates, or of length 1. Currently not used. |
groupid | A factor (or vector that can be coerced into one) defining the read pairs. Reads with the same groupid are considered to be mates. Please note that each read group may only have one or two members. Needs to be of the same length as the provided genomic coordinates, or of length 1. |
status | A factor describing the mapping status of a read. Has to be one of mated, unmated or ambiguous. Needs to be of the same length as the provided genomic coordinates, or of length 1. |
md | A character vector describing the mapping details. This is effectively an alternative to the CIGAR encoding, and it removes the dependency on a reference sequence to figure out read mismatches. Needs to be of the same length as the provided genomic coordinates, or of length 1. Currently not used. |
seqs | DNAStringSet of read sequences. |
strand | Character vector, the strand information for the reads. It may be provided in the form + for the Watson strand, - for the Crick strand or * for either one of the two. Needs to be of the same length as the provided genomic coordinates, or of length 1. Please note that paired reads need to be on opposite strands, and erroneous entries will raise an error. |
chromosome | The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE). Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. If not provided here, the constructor will try to construct the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA. Please note that by definition all objects in the Gviz package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<- replacement method in order to change to a different active chromosome. |
genome | The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier; however, this is not formally checked at this point. If not provided here, the constructor will try to extract this information from the provided input, and eventually will fall back to the default value of NA. |
stacking | The stacking type for overlapping items of the track. One of c(hide, dense, squish, pack, full). Currently, only squish (make best use of the available space), dense (no stacking, collapse overlapping ranges), and hide (do not show any track items at all) are implemented. |
name | Character scalar of the track's name used in the title panel when plotting. |
isPaired | A logical scalar to determine whether the reads are paired or not. While this may be used to render paired-end data as single-end, the opposite will typically not have any effect because the appropriate groupid settings will not be present. Thus setting isPaired to TRUE can usually be used to autodetect the pairing state of the input data. |
importFunction | A user-defined function to be used to import the data from a file. This only applies when the range argument is a character string with the path to the input data file. The function needs to accept an argument x containing the file path and a second argument selection with the desired plotting ranges. It has to return a proper GRanges object with all the necessary metadata columns set. A single default import function is already implemented in the package for BAM files. |
referenceSequence | An optional SequenceTrack object containing the reference sequence against which the reads have been aligned. This is only needed when mismatch information has to be added to the plot (i.e., the showMismatches display parameter is TRUE) because this is normally not encoded in the BAM file. If not provided through this argument, the plotTracks function will detect the presence of a SequenceTrack object in the track list and use that as the reference sequence. |
... | Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details. |
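The groupid constraint described in the table (reads sharing a groupid are mates, and a group holds at most two reads) can be checked before calling the constructor. The following is a minimal base R sketch; check_groupid is a hypothetical helper and not part of Gviz, and the constructor may well perform its own validation.

```r
# Hypothetical pre-flight check (not part of Gviz): every read group may only
# have one or two members, because reads sharing a groupid are treated as mates.
check_groupid <- function(groupid) {
  counts <- table(factor(groupid))
  too_big <- names(counts)[counts > 2]
  if (length(too_big)) {
    stop("groups with more than two reads: ", paste(too_big, collapse = ", "))
  }
  invisible(TRUE)
}

# Two mate pairs and one read without a mate pass the check.
check_groupid(c("pairA", "pairA", "pairB", "pairB", "single1"))
```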
Value
The return value of the constructor function is a new object of class AlignmentsTrack or ReferenceAlignmentsTrack.
Seealso
AnnotationTrack
DataTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
RangeTrack
SequenceTrack
StackedTrack
Author
Florian Hahne
Examples
## Creating objects
afrom <- 2960000
ato <- 3160000
alTrack <- AlignmentsTrack(system.file(package="Gviz", "extdata",
"gapped.bam"), isPaired=TRUE)
plotTracks(alTrack, from=afrom, to=ato, chromosome="chr12")
## Omit the coverage or the pile-ups part
plotTracks(alTrack, from=afrom, to=ato, chromosome="chr12",
type="coverage")
plotTracks(alTrack, from=afrom, to=ato, chromosome="chr12",
type="pileup")
## Including sequence information with the constructor
if(require(BSgenome.Hsapiens.UCSC.hg19)){
    strack <- SequenceTrack(Hsapiens, chromosome="chr21")
    afrom <- 44945200
    ato <- 44947200
    alTrack <- AlignmentsTrack(system.file(package="Gviz", "extdata",
        "snps.bam"), isPaired=TRUE, referenceSequence=strack)
    plotTracks(alTrack, chromosome="chr21", from=afrom, to=ato)
    ## Including sequence information in the track list
    alTrack <- AlignmentsTrack(system.file(package="Gviz", "extdata",
        "snps.bam"), isPaired=TRUE)
    plotTracks(c(alTrack, strack), chromosome="chr21", from=44946590,
        to=44946660)
}
AnnotationTrack_class()
AnnotationTrack class and methods
Description
A fairly generic track object for arbitrary genomic range annotations, with the option of grouped track items. The extended DetailsAnnotationTrack provides a more flexible interface to add user-defined custom information for each range.
Usage
AnnotationTrack(range=NULL, start=NULL, end=NULL, width=NULL, feature,
group, id, strand, chromosome, genome,
stacking="squish", name="AnnotationTrack", fun,
selectFun, importFunction, stream=FALSE, ...)
Arguments
Argument | Description |
---|---|
range | An optional meta argument to handle the different input types. If the range argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below). The different input options for range are: (1) A GRanges object: the genomic ranges for the Annotation track as well as the optional additional metadata columns feature, group and id (see the description of the individual function parameters below for details). Calling the constructor on a GRanges object without further arguments, e.g. AnnotationTrack(range=obj), is equivalent to calling the coerce method as(obj, "AnnotationTrack"). (2) A GRangesList object: this is very similar to the previous case, except that the grouping information that is part of the list structure is preserved in the AnnotationTrack, i.e., all the elements within one list item receive the same group id. For consistency, there is also a coercion method from GRangesList, as(obj, "AnnotationTrack"). (3) An IRanges object: almost identical to the GRanges case, except that the chromosome and strand information as well as all additional metadata has to be provided in the separate chromosome, strand, feature, group or id arguments, because it cannot be directly encoded in an IRanges object. Note that none of those inputs are mandatory, and if not provided explicitly the more or less reasonable default values chromosome=NA and strand="*" are used. (4) A data.frame object: the data.frame needs to contain at least the two mandatory columns start and end with the range coordinates. It may also contain a chromosome and a strand column with the chromosome and strand information for each range. If missing, it will be drawn from the separate chromosome or strand arguments. In addition, the feature, group and id data can be provided as additional columns. The above comments about potential default values also apply here. (5) A character scalar: in this case the value of the range argument is considered to be a file path to an annotation file on disk. A range of file types are supported by the Gviz package as identified by the file extension. See the importFunction documentation below for further details. |
start, end, width | Integer vectors giving the start and end coordinates for the individual track items, or their width. Two of the three need to be specified, and they have to be of equal length or of length one, in which case the single value will be recycled. Otherwise, the usual R recycling rules for vectors do not apply here. |
feature | Factor (or other vector that can be coerced into one), giving the feature types for the individual track items. When plotting the track to the device, if a display parameter with the same name as the value of feature is set, this will be used as the track item's fill color. See grouping for details. Needs to be of the same length as the provided genomic coordinates, or of length 1. |
group | Factor (or other vector that can be coerced into one), giving the group memberships for the individual track items. When plotting to the device, all items in the same group will be connected. See grouping for details. Needs to be of the same length as the provided genomic coordinates, or of length 1. |
id | Character vector of track item identifiers. When plotting to the device, its value will be used as the identifier tag if the display parameter showFeatureId=TRUE. Needs to be of the same length as the provided genomic ranges, or of length 1. |
strand | Character vector, the strand information for the individual track items. It may be provided in the form + for the Watson strand, - for the Crick strand or * for either one of the two. Needs to be of the same length as the provided genomic coordinates, or of length 1. Please note that grouped items need to be on the same strand, and erroneous entries will raise an error. |
chromosome | The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE). Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. If not provided here, the constructor will try to construct the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA. Please note that by definition all objects in the Gviz package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<- replacement method in order to change to a different active chromosome. |
genome | The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier; however, this is not formally checked at this point. If not provided here, the constructor will try to extract this information from the provided input, and eventually will fall back to the default value of NA. |
stacking | The stacking type for overlapping items of the track. One of c(hide, dense, squish, pack, full). Currently, only squish (make best use of the available space), dense (no stacking, collapse overlapping ranges), and hide (do not show any track items at all) are implemented. |
name | Character scalar of the track's name used in the title panel when plotting. |
fun | A function that is called for each entry in the AnnotationTrack object. See the 'Details' and 'Examples' sections for further information. When called internally by the plotting machinery, a number of arguments are automatically passed on to this function, and the user needs to make sure that they can all be digested (i.e., either have all of them as formal named function arguments, or gobble up everything that is not needed in ...). These arguments are: start (the genomic start coordinate of the range item); end (the genomic end coordinate of the range item); strand (the strand information for the range item); chromosome (the chromosome of the range item); identifier (the identifier of the range item, i.e., the result of calling identifier(DetailsAnnotationTrack, lowest=TRUE); typically those identifiers are passed on to the object constructor during instantiation as the id argument); index (a counter enumerating the ranges; the AnnotationTrack object is sorted internally for visibility, and the index argument refers to the index of plotting); GdObject (a reference to the currently plotted DetailsAnnotationTrack object); GdObject.original (a reference to the DetailsAnnotationTrack before any processing like item collapsing has taken place; essentially, this is the track object as it exists in your working environment). Additional arguments can be passed to the plotting function by means of the detailsFunArgs argument (see below). Note that the plot must use grid graphics (e.g. functions in the 'lattice' package or low-level grid functions). To access a data object such as a matrix or data frame within the function you can either store it as a variable in the global environment or, to avoid name space conflicts, make it part of the function environment by means of a closure. Alternatively, you may want to explicitly stick it into an environment or pass it along in the detailsFunArgs list. To figure out in your custom plotting function which annotation element is currently being plotted you can either use the identifier, which has to be unique for each range element, or the genomic position (start/end/strand/chromosome), e.g. if the data is stored in a GRanges object. |
selectFun | A function that is called for each entry in the AnnotationTrack object with exactly the same arguments as in fun. The purpose of this function is to decide for each track element whether details should be drawn, and consequently it has to return a single logical scalar. If the return value is TRUE, details will be drawn for the item; if it is FALSE, the details strip for the item is omitted. |
importFunction | A user-defined function to be used to import the data from a file. This only applies when the range argument is a character string with the path to the input data file. The function needs to accept an argument x containing the file path and has to return a proper GRanges object with all the necessary metadata columns set. A set of default import functions is already implemented in the package for a number of different file types, and one of these defaults will be picked automatically based on the extension of the input file name. If the extension cannot be mapped to any of the existing import functions, an error is raised asking for a user-defined import function via this argument. Currently the following file types can be imported with the default functions: gff, gff1, gff2, gff3, bed, bam. |
stream | A logical flag indicating that the user-provided import function can deal with indexed files and knows how to process the additional selection argument when accessing the data on disk. This causes the constructor to return a ReferenceAnnotationTrack object which will grab the necessary data on the fly during each plotting operation. |
... | Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details. |
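The closure approach mentioned for fun, together with a deterministic selectFun, might look like the sketch below. It assumes that the Gviz and lattice packages are installed; the names make_details and first_two, the item identifiers and the random signal matrix are all made up for illustration.

```r
library(Gviz)
library(lattice)  # the details plot must be drawn with grid-based graphics

# Closure factory (hypothetical name): the signal matrix lives in the
# environment of the returned function instead of the global environment.
make_details <- function(mat, groups) {
  force(mat)
  force(groups)
  function(identifier, ...) {
    # Look up the row that belongs to the range item currently being plotted.
    d <- data.frame(signal = mat[identifier, ], group = groups)
    print(densityplot(~signal, groups = group, data = d, main = identifier,
                      scales = list(draw = FALSE, x = list(draw = TRUE)),
                      ylab = "", xlab = ""),
          newpage = FALSE, prefix = "plot")
  }
}

# selectFun receives the same arguments as fun and returns a single logical;
# here details are requested only for the first two items in plotting order.
first_two <- function(index, ...) index <= 2

# Illustrative data: one row of 100 random values per range item.
ids <- paste("item", 1:4)
m <- matrix(rgamma(400, 1), ncol = 100, dimnames = list(ids, NULL))
m[, 51:100] <- m[, 51:100] + 0:3

deTrack <- AnnotationTrack(start = c(2000000, 2070000, 2100000, 2160000),
                           end = c(2050000, 2130000, 2150000, 2170000),
                           chromosome = 7, genome = "hg19", id = ids,
                           name = "details via closure",
                           fun = make_details(m, rep(c("grp1", "grp2"), each = 50)),
                           selectFun = first_two, details.ratio = 1)
plotTracks(deTrack)
```

Because the matrix is captured when make_details is called, later changes to a global variable of the same name do not affect the plot.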
Value
The return value of the constructor function is a new object of class AnnotationTrack or of class DetailsAnnotationTrack, depending on the constructor arguments. Typically the user will not have to be troubled with this distinction and can rely on the constructor to make the right choice.
Seealso
DisplayPars
GRanges
GRangesList
GdObject
IRanges
ImageMap
RangeTrack
StackedTrack
Author
Florian Hahne, Arne Mueller
Examples
## An empty object
AnnotationTrack()
## Construct from individual arguments
st <- c(2000000, 2070000, 2100000, 2160000)
ed <- c(2050000, 2130000, 2150000, 2170000)
str <- c("-", "+", "-", "-")
gr <- c("Group1","Group2","Group1", "Group3")
annTrack <- AnnotationTrack(start=st, end=ed, strand=str, chromosome=7, genome="hg19", feature="test",
group=gr, id=paste("annTrack item", 1:4), name="generic annotation", stacking="squish")
## Or from a data.frame
df <- data.frame(start=st, end=ed, strand=str, id=paste("annTrack item", 1:4), feature="test",
group=gr)
annTrack <- AnnotationTrack(range=df, genome="hg19", chromosome=7, name="generic annotation",
stacking="squish")
## Or from a GRanges object
gr <- GRanges(seqnames="chr7", ranges=IRanges(start=df$start, end=df$end), strand=str)
genome(gr) <- "hg19"
mcols(gr) <- df[,-(1:3)]
annTrack <- AnnotationTrack(range=gr, name="generic annotation", stacking="squish")
## Finally from a GRangesList
grl <- split(gr, values(gr)$group)
AnnotationTrack(grl)
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(annTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
plotTracks(annTrack)
## Track names
names(annTrack)
names(annTrack) <- "foo"
plotTracks(annTrack)
## Subsetting and splitting
subTrack <- subset(annTrack, to=2155000)
length(subTrack)
subTrack[1:2]
split(annTrack, c(1,2,1,2))
## Accessors
start(annTrack)
end(annTrack)
width(annTrack)
position(annTrack)
width(subTrack) <- width(subTrack)+1000
strand(annTrack)
strand(subTrack) <- "-"
chromosome(annTrack)
chromosome(subTrack) <- "chrX"
genome(annTrack)
genome(subTrack) <- "mm9"
range(annTrack)
ranges(annTrack)
## Annotation
identifier(annTrack)
identifier(annTrack, "lowest")
identifier(subTrack) <- "bar"
feature(annTrack)
feature(subTrack) <- "foo"
values(annTrack)
## Grouping
group(annTrack)
group(subTrack) <- "Group 1"
chromosome(subTrack) <- "chr7"
plotTracks(subTrack)
## Stacking
stacking(annTrack)
stacking(annTrack) <- "dense"
plotTracks(annTrack)
## coercion
as(annTrack, "data.frame")
as(annTrack, "UCSCData")
## HTML image map
coords(annTrack)
tags(annTrack)
annTrack <- plotTracks(annTrack)$foo
coords(annTrack)
tags(annTrack)
## DetailsAnnotationTrack
library(lattice) # need to use grid graphics
## generate two random distributions per row (probe/feature)
## the difference between the distributions increases from probe 1 to 4
m <- matrix(c(rgamma(400, 1)), ncol=100)
m[,51:100] <- m[,51:100] + 0:3
## rownames must be accessible by AnnotationTrack element identifier
rownames(m) <- identifier(annTrack, "lowest")
## create a lattice density plot for the values (signals) of the two groups
## as the chart must be placed into a pre-set grid view port we have to use
## print without calling plot.new! Note, use a common prefix for all lattice.
## Avoid wasting space by removing y-axis decorations.
## Note, in this example 'm' will be found in the environment the 'details'
## function is defined in. To avoid overwriting 'm' you should use a closure
## or environment to access 'm'.
details <- function(identifier, ...) {
    d <- data.frame(signal=m[identifier,], group=rep(c("grp1","grp2"), each=50))
    print(densityplot(~signal, groups=group, data=d, main=identifier,
                      scales=list(draw=FALSE, x=list(draw=TRUE)), ylab="", xlab=""),
          newpage=FALSE, prefix="plot")
}
deTrack <- AnnotationTrack(range=gr, genome="hg19", chromosome=7,
name="generic annotation with details per entry", stacking="squish",
fun=details, details.ratio=1)
plotTracks(deTrack)
set.seed(1234)
deTrack <- AnnotationTrack(range=gr, genome="hg19", chromosome=7,
name="generic annotation with details per entry",
stacking="squish",fun=details,
details.ratio=1, selectFun=function(...){sample(c(FALSE, TRUE), 1)})
plotTracks(deTrack)
BiomartGeneRegionTrack_class()
BiomartGeneRegionTrack class and methods
Description
A class to hold gene model data for a genomic region fetched dynamically from EBI's Biomart Ensembl data source.
Usage
BiomartGeneRegionTrack(start, end, biomart, chromosome, strand, genome,
stacking="squish", filters=list(), featureMap=NULL,
name="BiomartGeneRegionTrack", symbol=NULL, gene=NULL, entrez=NULL,
transcript=NULL, ...)
Arguments
Argument | Description |
---|---|
start | An integer scalar with the genomic start coordinates for the gene model range. |
end | An integer scalar with the genomic end coordinates for the gene model range. |
biomart | An optional Mart object providing access to the EBI Biomart webservice. As default the appropriate Ensembl data source is selected based on the provided genome and chromosome. |
strand | Character scalar, the strand for which to fetch gene information from Biomart. One of +, -, or +-. |
chromosome | The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier. Please note that at this stage only syntactic checking takes place, i.e., the argument value needs to be a single integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. |
genome | The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. If no mapping from genome to Biomart Ensembl data source is possible, the biomart argument needs to be provided by the user. |
stacking | The stacking type for overlapping items of the track. One of c(hide, dense, squish, pack, full). Currently, only hide (don't show the track items), squish (make best use of the available space) and dense (no stacking at all) are implemented. |
filters | A list of additional filters to be applied in the Biomart query. See getBM for details. |
featureMap | Named character vector or list to map between the fields in the Biomart data base and the features as they are used to construct the track. If multiple values are provided in a single list item, the package will use the first one that is defined in the selected Biomart. |
name | Character scalar of the track's name used in the title panel when plotting. |
symbol, transcript, gene, entrez | Character vector giving one or several gene symbols, Ensembl transcript identifiers, Ensembl gene identifiers, or ENTREZ gene identifiers, respectively. The genomic locus of their gene model will be fetched from Biomart instead of providing explicit start and end coordinates. |
... | Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details. |
Details
A track containing all gene models in a particular region as fetched from EBI's Biomart service. Usually the user does not have to take care of the Biomart connection, which will be established automatically based on the provided genome and chromosome information. However, for full flexibility a valid Mart object may be passed on to the constructor. Please note that this assumes a connection to one of the Ensembl gene data sources, mapping the available query data back to the internal object slots.
Value
The return value of the constructor function is a new object of class BiomartGeneRegionTrack.
Seealso
AnnotationTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
Mart
RangeTrack
StackedTrack
Author
Florian Hahne
References
EBI Biomart webservice at http://www.biomart.org .
Examples
## Load some sample data
data(bmTrack)
## Construct the object
bmTrack <- BiomartGeneRegionTrack(start=26682683, end=26711643,
chromosome=7, genome="mm9")
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(bmTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
plotTracks(bmTrack)
## Track names
names(bmTrack)
names(bmTrack) <- "foo"
plotTracks(bmTrack)
## Subsetting and splitting
subTrack <- subset(bmTrack, from=26700000, to=26705000)
length(subTrack)
subTrack <- bmTrack[transcript(bmTrack)=="ENSMUST00000144140"]
split(bmTrack, transcript(bmTrack))
## Accessors
start(bmTrack)
end(bmTrack)
width(bmTrack)
position(bmTrack)
width(subTrack) <- width(subTrack)+100
strand(bmTrack)
strand(subTrack) <- "-"
chromosome(bmTrack)
chromosome(subTrack) <- "chrX"
genome(bmTrack)
genome(subTrack) <- "hg19"
range(bmTrack)
ranges(bmTrack)
## Annotation
identifier(bmTrack)
identifier(bmTrack, "lowest")
identifier(subTrack) <- "bar"
feature(bmTrack)
feature(subTrack) <- "foo"
exon(bmTrack)
exon(subTrack) <- letters[1:2]
gene(bmTrack)
gene(subTrack) <- "bar"
symbol(bmTrack)
symbol(subTrack) <- "foo"
transcript(bmTrack)
transcript(subTrack) <- c("foo", "bar")
chromosome(subTrack) <- "chr7"
plotTracks(subTrack)
values(bmTrack)
## Grouping
group(bmTrack)
group(subTrack) <- "Group 1"
transcript(subTrack)
plotTracks(subTrack)
## Stacking
stacking(bmTrack)
stacking(bmTrack) <- "dense"
plotTracks(bmTrack)
## coercion
as(bmTrack, "data.frame")
as(bmTrack, "UCSCData")
## HTML image map
coords(bmTrack)
tags(bmTrack)
bmTrack <- plotTracks(bmTrack)$foo
coords(bmTrack)
tags(bmTrack)
CustomTrack_class()
CustomTrack class and methods
Description
A fully customizable track object to be populated via a user-defined plotting function.
Usage
CustomTrack(plottingFunction=function(GdObject, prepare=FALSE, ...){}, variables=list(), name="CustomTrack", ...)
Arguments
Argument | Description |
---|---|
plottingFunction | A user-defined function to be executed once the track coordinates have been properly set up. The function needs to accept two mandatory arguments: GdObject , the CustomTrack object to be plotted, and prepare , a logical flag indicating whether the function has been called in preparation mode or in drawing mode. It also needs to return the input GdObject , potentially with modifications. |
variables | A list of additional variables for the user-defined plotting function. |
name | Character scalar of the track's name. |
... | Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details. |
Details
A track to allow for any sort of plotting, with the currently displayed genomic location set. Essentially this acts as a simple callback into the Gviz plotting machinery after all the track panels and coordinates have been set up. It is entirely up to the user what to plot in the track, and whether to use the predefined coordinate system at all. The only prerequisite is that all plotting operations need to utilize Grid graphics.
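As a rough sketch of such a callback: the function below accepts the two mandatory arguments, draws with grid only in drawing mode, and returns the track object in both modes. The function name plotLabel and the label text are assumptions for illustration; the constructor call is guarded so the snippet also runs where Gviz is not installed.

```r
## Plotting callback with the two mandatory arguments GdObject and prepare.
plotLabel <- function(GdObject, prepare = FALSE, ...) {
  if (!prepare) {
    ## drawing mode: any grid call is allowed here
    grid::grid.text("custom content")
  }
  ## return the (possibly modified) track object in both modes
  invisible(GdObject)
}

## Only build the track when Gviz is available
if (requireNamespace("Gviz", quietly = TRUE)) {
  cTrack <- Gviz::CustomTrack(plottingFunction = plotLabel, name = "custom")
}
```

The resulting track can then be handed to plotTracks together with other tracks; in preparation mode the callback is expected to leave the device untouched, which is why all drawing sits behind the prepare check.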
Seealso
DisplayPars
GdObject
ImageMap
Author
Florian Hahne
DataTrack_class()
DataTrack class and methods
Description
A class to store numeric data values along genomic coordinates. Multiple samples as well as sample groupings are supported, with the restriction of equal genomic coordinates for a single observation across samples.
Usage
DataTrack(range=NULL, start=NULL, end=NULL, width=NULL, data, chromosome, strand, genome,
name="DataTrack", importFunction, stream=FALSE, ...)
Arguments
Argument | Description |
---|---|
range | An optional meta argument to handle the different input types. If the range argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below). The different input options for range are: (1) A GRanges object: essentially all the necessary information to create a DataTrack can be contained in a single GRanges object. The track's coordinates are taken from the start, end and seqnames slots, the genome information from the genome slot, and the numeric data values can be extracted from additional metadata columns (please note that non-numeric columns are being ignored with a warning). As a matter of fact, calling the constructor on a GRanges object without further arguments, e.g. DataTrack(range=obj), is equivalent to calling the coerce method as(obj, "DataTrack"). Alternatively, the GRanges object may only contain the coordinate information, in which case the numeric data part is expected to be present in the separate data argument, and the ranges have to match the dimensions of the data matrix. If data is not NULL, this will always take precedence over anything defined in the range argument. See below for details. (2) An IRanges object: this is very similar to the above case, except that the numeric data part now always has to be provided in the separate data argument. Also the chromosome information must be provided in the chromosome argument, because neither of the two can be directly encoded in an IRanges object. (3) A data.frame object: the data.frame needs to contain at least the two mandatory columns start and end with the range coordinates. It may also contain a chromosome column with the chromosome information for each range. If missing it will be drawn from the separate chromosome argument. All additional numeric columns will be interpreted as data columns, unless the data argument is explicitly provided. (4) A character scalar: in this case the value of the range argument is considered to be a file path to an annotation file on disk. A range of file types are supported by the Gviz package as identified by the file extension. See the importFunction documentation below for further details. |
start, end, width | Integer vectors, giving the start and the end coordinates for the individual track items, or their width. Two of the three need to be specified, and have to be of equal length or of length one, in which case the single value will be recycled accordingly. Otherwise, the usual R recycling rules for vectors do not apply and the function will throw an error. |
data | A numeric matrix of data points with the number of columns equal to the number of coordinates in range, or a numeric vector of appropriate length that will be coerced into such a one-row matrix. Each individual row is supposed to contain data for a given sample, where the coordinates for each single observation are constant across samples. Depending on the plotting type of the data (see 'Details' and 'Display Parameters' sections), sample grouping or data aggregation may be available. Alternatively, this can be a character vector of column names that point into the element metadata of the range object for subsetting. Naturally, this is only supported when the range argument is of class GRanges. |
strand | Character vector, the strand information for the individual track items. Currently this has to be unique for the whole track and doesn't really have any visible consequences, but we might decide to make DataTracks strand-specific at a later stage. |
chromosome | The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE). Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. If not provided here, the constructor will try to construct the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA. Please note that by definition all objects in the Gviz package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<- replacement method in order to change to a different active chromosome. |
genome | The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. If not provided here the constructor will try to extract this information from the provided input, and eventually will fall back to the default value of NA. |
name | Character scalar of the track's name used in the title panel when plotting. |
importFunction | A user-defined function to be used to import the data from a file. This only applies when the range argument is a character string with the path to the input data file. The function needs to accept an argument file containing the file path and has to return a proper GRanges object with the data part attached as numeric metadata columns. Essentially the process is equivalent to constructing a DataTrack directly from a GRanges object in that non-numeric columns will be dropped, and further subsetting can be achieved by means of the data argument. A set of default import functions is already implemented in the package for a number of different file types, and one of these defaults will be picked automatically based on the extension of the input file name. If the extension cannot be mapped to any of the existing import functions, an error is raised asking for a user-defined import function. Currently the following file types can be imported with the default functions: wig, bigWig/bw, bedGraph and bam. Some file types support indexing by genomic coordinates (e.g., bigWig and bam), and it makes sense to only load the part of the file that is needed for plotting. To this end, the Gviz package defines the derived ReferenceDataTrack class, which supports streaming data from the file system. The user typically does not have to deal with this distinction but may rely on the constructor function to make the right choice as long as the default import functions are used. However, once a user-defined import function has been provided and if this function adds support for indexed files, you will have to make the constructor aware of this fact by setting the stream argument to TRUE. Please note that in this case the import function needs to accept a second mandatory argument selection which is a GRanges object containing the dimensions of the plotted genomic range. As before, the function has to return an appropriate GRanges object. |
stream | A logical flag indicating that the user-provided import function can deal with indexed files and knows how to process the additional selection argument when accessing the data on disk. This causes the constructor to return a ReferenceDataTrack object which will grab the necessary data on the fly during each plotting operation. |
... | Additional items which will all be interpreted as further display parameters. |
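The importFunction and stream arguments can be illustrated with a small streaming importer. This is only a sketch: it assumes that the rtracklayer package is installed and that its import function is used to read a bigWig file restricted to the requested region; the function name importBigWigRegion and the file name are hypothetical.

```r
## Hypothetical streaming importer for an indexed bigWig file.
## 'file' is the path handed over by the constructor, 'selection' is the
## GRanges object describing the genomic region that is being plotted.
importBigWigRegion <- function(file, selection) {
  ## rtracklayer returns a GRanges object with a numeric 'score' column,
  ## which is the shape of result the constructor expects
  rtracklayer::import(file, format = "BigWig", which = selection)
}

## Assumed usage; "signal.bw" is a placeholder path:
## dTrack <- DataTrack(range = "signal.bw", importFunction = importBigWigRegion,
##                     stream = TRUE, genome = "hg19", name = "signal")
```

With stream=TRUE the importer is called again for each plotting operation, so it should read only the selected region rather than the whole file.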
Details
Depending on the setting of the type display parameter, the data can be plotted in various different forms as well as combinations thereof. Supported plotting types are:
p: simple xy-plot.
l: lines plot. In the case of multiple samples this plotting type is not overly useful since the points in the data matrix are connected in column-wise order. Type a might be more appropriate in these situations.
b: combination of xy-plot and lines plot.
a: lines plot of the column-wise average values.
s: sort and connect data points along the x-axis.
S: sort and connect data points along the y-axis.
g: add grid lines. To ensure a consistent look and feel across multiple tracks, grid lines should preferentially be added by using the grid display parameter.
r: add a regression line to the plot.
h: histogram-like vertical lines centered in the middle of the coordinate ranges.
smooth: add a loess fit to the plot. The following display parameters can be used to control the loess calculation: span, degree, family, evaluation. See panel.loess for details.
histogram: plot data as a histogram, where the width of the histogram bars reflects the width of the genomic ranges in the range slot.
mountain: plot a smoothed version of the data relative to a baseline, as defined by the baseline display parameter. The following display parameters can be used to control the smoothing: span, degree, family, evaluation. See panel.loess for details. The layout of the plot can be further customized via the following display parameters: col.mountain, lwd.mountain, lty.mountain, fill.mountain.
polygon: plot data as a polygon (similar to mountain-type but without smoothing). Data are plotted relative to a baseline, as defined by the baseline display parameter. The layout of the plot can be further customized via the following display parameters: col.mountain, lwd.mountain, lty.mountain, fill.mountain.
boxplot: plot the data as box-and-whisker plots. The layout of the plot can be further customized via the following display parameters: box.ratio, box.width, varwidth, notch, notch.frac, levels.fos, stats, coef, do.out. See panel.bwplot for details.
gradient: collapse the data across samples and plot this average value as a color-coded gradient. Essentially this is similar to the heatmap-type plot of a single sample. The layout of the plot can be further customized via the display parameters ncolor and gradient which control the number of gradient colors as well as the gradient base colors, respectively.
heatmap: plot the color-coded values for all samples in the form of a heatmap. The data for individual samples can be visually separated by setting the separator display parameter. Its value is taken as the amount of spacing in pixels in between two heatmap rows. The layout of the plot can be further customized via the display parameters ncolor and gradient which control the number of gradient colors as well as the gradient base colors, respectively.
horizon: plot continuous data by cutting the y range into segments and overplotting them with color representing the magnitude and direction of deviation. This is particularly useful when comparing multiple samples, in which case the horizon strips are stacked. See horizonplot for details. Please note that the origin and horizonscale arguments of the Lattice horizonplot function are available as display parameters horizon.origin and horizon.scale.
For some of the above plotting types the groups display parameter can be used to indicate sample sub-groupings. Its value should be a factor with one element per sample. In most cases, the groups are shown in different plotting colors and data aggregation operations are done in a stratified fashion.
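A short sketch of how the type and groups display parameters might be combined: four samples are split into two groups and drawn as group-wise average lines plus individual points. The group labels and the random data are assumptions for illustration, and the Gviz calls are guarded so the snippet also runs where the package is not installed.

```r
set.seed(42)
dat <- matrix(runif(400), nrow = 4)                      # 4 samples x 100 positions
grp <- factor(rep(c("control", "treated"), each = 2))    # one label per sample (row)

if (requireNamespace("Gviz", quietly = TRUE)) {
  dTrack <- Gviz::DataTrack(start = seq(1, 1000, length.out = 100), width = 10,
                            data = dat, chromosome = "chr1", genome = "mm9",
                            name = "random data")
  ## group-wise average lines ("a") on top of the individual points ("p")
  Gviz::plotTracks(dTrack, type = c("a", "p"), groups = grp)
}
```

Because the factor has one entry per row of the data matrix, each sample is assigned to exactly one group.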
The window display parameter can be used to aggregate the data prior to plotting. Its value is taken as the number of equal-sized windows along the genomic coordinates of the track for which to compute average values. The special value auto can be used to automatically determine a reasonable number of windows, which can be particularly useful when plotting very large genomic regions with many data points.
The aggregation parameter can be set to define the aggregation function to be used when averaging in windows or across collapsed items. It takes the form of either a function which should condense a numeric vector into a single number, or one of the predefined options as character scalars "mean", "median" or "sum" for mean, median or summation, respectively. Defaults to computing mean values for each sample. Note that the predefined options can be much faster because they are optimized to work on large numeric tables.
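The idea behind window-based aggregation can be sketched in base R. This illustrates the concept only and is not how Gviz implements it internally; the helper name aggregateWindows and the toy data are assumptions.

```r
## Split the covered coordinate range into nWindows equal-sized windows and
## condense each sample's values per window with FUN.
aggregateWindows <- function(data, position, nWindows, FUN = mean) {
  breaks <- seq(min(position), max(position), length.out = nWindows + 1)
  win <- cut(position, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  win <- factor(win, levels = seq_len(nWindows))
  ## one aggregated value per sample (row) and window (column)
  t(apply(data, 1, function(x) tapply(x, win, FUN)))
}

pos <- 1:100
dat <- rbind(s1 = 1:100, s2 = rep(2, 100))
agg <- aggregateWindows(dat, pos, nWindows = 4)
agg["s1", ]  # window means 13, 38, 63, 88
```

Passing FUN = median or FUN = sum mirrors the other predefined aggregation options; in Gviz itself the corresponding display parameters would be set on the track or in the plotTracks call.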
Value
The return value of the constructor function is a new object of class DataTrack or ReferenceDataTrack.
Seealso
DataTrack
DisplayPars
GRanges
GdObject
IRanges
ImageMap
NumericTrack
RangeTrack
Author
Florian Hahne
Examples
## Object construction:
## An empty object
DataTrack()
## from individual arguments
dat <- matrix(runif(400), nrow=4)
dtTrack <- DataTrack(start=seq(1,1000, len=100), width=10, data=dat,
chromosome=1, genome="mm9", name="random data")
## from GRanges
library(GenomicRanges)
gr <- GRanges(seqnames="chr1", ranges=IRanges(seq(1,1000, len=100),
width=10))
values(gr) <- t(dat)
dtTrack <- DataTrack(range=gr, genome="mm9", name="random data")
## from IRanges
dtTrack <- DataTrack(range=ranges(gr), data=dat, genome="mm9",
name="random data", chromosome=1)
## from a data.frame
df <- as.data.frame(gr)
colnames(df)[1] <- "chromosome"
dtTrack <- DataTrack(range=df, genome="mm9", name="random data")
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(dtTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
plotTracks(dtTrack)
## Track names
names(dtTrack)
names(dtTrack) <- "foo"
plotTracks(dtTrack)
## Subsetting and splitting
subTrack <- subset(dtTrack, from=100, to=300)
length(subTrack)
subTrack[1:2,]
subTrack[,1:2]
split(dtTrack, rep(1:2, each=50))
## Accessors
start(dtTrack)
end(dtTrack)
width(dtTrack)
position(dtTrack)
width(subTrack) <- width(subTrack)-5
strand(dtTrack)
strand(subTrack) <- "-"
chromosome(dtTrack)
chromosome(subTrack) <- "chrX"
genome(dtTrack)
genome(subTrack) <- "mm9"
range(dtTrack)
ranges(dtTrack)
## Data
values(dtTrack)
score(dtTrack)
## coercion
as(dtTrack, "data.frame")
DisplayPars_class()
DisplayPars class and method
Description
All tracks within this package are highly customizable. The DisplayPars class facilitates this and provides a unified API to the customization parameters.
Usage
DisplayPars(...)
availableDisplayPars(class)
Arguments
Argument | Description |
---|---|
... | All named arguments are stored in the object's environment as individual parameters, regardless of their type. |
class | A valid track object class name, or the object itself, in which case the class is derived directly from it. |
Details
The individual parameters in a DisplayPars object are stored as pointers in an environment. The upside is that the whole track object does not have to be copied when changing parameters, and parameters can be updated without the need to explicitly reassign the track to a symbol (i.e., updating of parameters happens in place). The downside is that upon copying of track objects, the parameter environment needs to be reinstantiated.
The default display parameters for a track object class can be queried using the availableDisplayPars function.
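The consequence of storing parameters in an environment can be shown with plain R environments. The snippet is an analogy under that assumption and does not reproduce the DisplayPars implementation; the helper copyEnv is hypothetical.

```r
## A plain environment stands in for the parameter store.
pars <- new.env()
assign("col", "red", envir = pars)

alias <- pars                            # copies the reference, not the contents
assign("col", "blue", envir = alias)
get("col", envir = pars)                 # "blue": both names share one environment

## A true copy needs a freshly instantiated environment
copyEnv <- function(e) list2env(as.list(e, all.names = TRUE), envir = new.env())
indep <- copyEnv(pars)
assign("col", "green", envir = indep)
get("col", envir = pars)                 # still "blue"
```

This is why an update through one reference is visible through every other reference to the same object, and why an independent copy requires a new environment.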
Value
The return value of the constructor function is a new object of class DisplayPars.
availableDisplayPars returns a list of the default display parameters.
Author
Florian Hahne
Examples
## Construct object
dp <- DisplayPars(col="red", lwd=2, transformation=log2)
dp
## Query parameters
displayPars(dp)
displayPars(dp, "col")
getPar(dp, c("col", "transformation"))
## Modify parameters
displayPars(dp) <- list(lty=1, fontsize=3)
setPar(dp, "pch", 20)
dp
## Default parameters
availableDisplayPars("GenomeAxisTrack")
GdObject_class()
GdObject class and methods
Description
The virtual parent class for all track items in the Gviz package. This class definition contains all the common entities that are needed for a track to be plotted. During object instantiation for any of the sub-classes inheriting from GdObject, this class' global initializer has to be called in order to ensure that all necessary settings are present.
Seealso
AnnotationTrack
DisplayPars
GeneRegionTrack
ImageMap
Author
Florian Hahne
GeneRegionTrack_class()
GeneRegionTrack class and methods
Description
A class to hold gene model data for a genomic region.
Usage
GeneRegionTrack(range=NULL, rstarts=NULL, rends=NULL, rwidths=NULL,
strand, feature, exon, transcript, gene, symbol,
chromosome, genome, stacking="squish",
name="GeneRegionTrack", start=NULL, end=NULL,
importFunction, stream=FALSE, ...)
Arguments
Argument | Description |
---|
|range
| An optional meta argument to handle the different input types. If the range
argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below). The different input options for range
are: list("
", "
", " ", list(list(), list("A ", list("TxDb"), " object: all the necessary gene model
", " information including exon locations, transcript groupings and
", " associated gene ids are contained in ", list("TxDb"), "
", " objects, and the coercion between the two is almost completely
", " automated. If desired, the data to be fetched from the
", " ", list("TxDb"), " object can be restricted using the
", " constructor's ", list("chromosome"), ", ", list("start"), " and ", |
list("end"), "
", " arguments. See below for details. A direct coercion method ", " ", list("as(obj, "GeneRegionTrack")"), " is also available. A nice ", " added benefit of this input option is that the UTR and coding ", " region information that is part of the original ", " ", list("TxDb"), " object is retained in the ", " ", list("GeneRegionTrack"), ".")), " ", " ", " ", list(list(), list("A ", list("GRanges"), " object: the genomic ranges for the ", " ", list("GeneRegion"),
" track as well as the optional additional
", " metadata columns ", list("feature"), ", ", " ", list("transcript"), ", ", list("gene"), ", ", list("exon"), " and ", list("symbol"), " ", " (see description of the individual function parameters below for ", " details). Calling the constructor on a ", list("GRanges"), " object ", " without further arguments, e.g. ", " ", list("GeneRegionTrack(range=obj)"), " is equivalent to calling the ", " coerce method ", list("as(obj, "GeneRegionTrack")"),
".")), "
", " ", " ", list(list(), list("A ", list("GRangesList"), " object: this is very similar to the ", " previous case, except that the grouping information that is part ", " of the list structure is preserved in the ", " ", list("GeneRegionTrack"), ". I.e., all the elements within one list ", " item receive the same group id. For consistancy, there is also a ", " coercion method from ", list("GRangesLists"), " ", list("as(obj, ", " "GeneRegionTrack")"), ". Please note that unless the necessary ",
" information about gene ids, symbols, etc. is present in the
", " individual ", list("GRanges"), " meta data slots, the object will not ", " be particularly useful, because all the identifiers will be set ", " to a common default value.")), " ", " ", " ", list(list(), list("An ", list(list("IRanges")), " object: almost identical ", " to the ", list("GRanges"), " case, except that the chromosome and ", " strand information as well as all additional data has to be ", " provided in the separate ",
list("chromosome"), ", ", list("strand"), ",
", " ", list("feature"), ", ", list("transcript"), ", ", list("symbol"), ", ", list("exon"), " or ", " ", list("gene"), " arguments, because it can not be directly encoded in ", " an ", list("IRanges"), " object. Note that only the former two are ", " mandatory (if not provided explicitely the more or less ", " reasonable default values ", list("chromosome=NA"), " and ", " ", list("strand=*"), " are used, but not providing information about ",
" the gene-to-transcript relationship or the human-readble symbols
", " renders a lot of the class' functionality useles.")), " ", " ", " ", list(list(), list("A ", list("data.frame"), " object: the ", list("data.frame"), " needs to ", " contain at least the two mandatory columns ", list("start"), " and ", " ", list("end"), " with the range coordinates. It may also contain a ", " ", list("chromosome"), " and a ", list("strand"), " column with the chromosome ", " and strand information for each range. If missing, this ",
" information will be drawn from the constructor's
", " ", list("chromosome"), " or ", list("strand"), " arguments. In addition, the ", " ", list("feature"), ", ", list("exon"), ", ", list("transcript"), ", ", list("gene"), " and ", " ", list("symbol"), " data can be provided as columns in the ", " ", list("data.frame"), ". The above comments about potential default ", " values also apply here.")), " ", " ", " ", list(list(), list("A ", list("character"), " scalar: in this case the value of the ",
" ", list("range"), " argument is considered to be a file path to an
", " annotation file on disk. A range of file types are supported by
", " the ", list("Gviz"), " package as identified by the file extension. See
", " the ", list("importFunction"), " documentation below for further
", " details.")), "
", "
", " ")
|start, end
| An integer scalar with the genomic start or end coordinate for the gene model range. If those are missing, the default value will automatically be the smallest (or largest) value, respectively in rstarts
and rends
for the currently active chromosome. When building a GeneRegionTrack
from a TxDb
object, these arguments can be used to subset the desired annotation data by genomic coordinates. Please note this in that case the chromosome
parameter must also be set.|
|rstarts
| An integer vector of the start coordinates for the actual gene model items, i.e., for the individual exons. The relationship between exons is handled via the gene
and transcript
factors. Alternatively, this can be a vector of comma-separated lists of integer coordinates, one vector item for each transcript, and each comma-separated element being the start location of a single exon within that transcript. Those lists will be exploded upon object instantiation and all other annotation arguments will be recycled accordingly to regenerate the exon/transcript/gene relationship structure. This implies the approriate number of items in all annotation and coordinates arguments.|
|rends
| An integer vector of the end coordinates for the actual gene model items. Both rstarts
and rends
have to be of equal length.|
|rwidths
| An integer vector of widths for the actual gene model items. This can be used instead of either rstarts
or rends
to specify the range coordinates.|
|feature
| Factor (or other vector that can be coerced into one), giving the feature types for the individual track exons. When plotting the track to the device, if a display parameter with the same name as the value of feature
is set, this will be used as the track item's fill color. Additionally, the feature type defines whether an element in the GeneRegionTrack
is considered to be coding or non-coding. The details section as well as the section about the thinBoxFeature
display parameter further below has more information on this. See also grouping
for details.|
|exon
| Character vector of exon identifiers. It's values will be used as the identifier tag when plotting to the device if the display parameter showExonId=TRUE
.|
|strand
| Character vector, the strand information for the individual track exons. It may be provided in the form +
for the Watson strand, -
for the Crick strand or *
for either one of the two. Please note that all items within a single gene or transcript model need to be on the same strand, and erroneous entries will raise an error.|
|transcript
| Factor (or other vector that can be coerced into one), giving the transcript memberships for the individual track exons. All items with the same transcript identifier will be visually connected when plotting to the device. See grouping
for details. Will be used as labels when showId=TRUE
, and geneSymbol=FALSE
.|
|gene
| Factor (or other vector that can be coerced into one), giving the gene memberships for the individual track exons.|
|symbol
| A factor with human-readable gene name aliases which will be used as labels when showId=TRUE
, and geneSymbol=TRUE
.|
|chromosome
| The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE)
. Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx
, where x
may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. If not provided here, the constructor will try to build the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA
. Please note that by definition all objects in the Gviz
package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<-
replacement method in order to change to a different active chromosome. When creating a GeneRegionTrack
from a TxDb
object, the value of this parameter can be used to subset the data to fetch only transcripts from a single chromosome.|
|genome
| The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. If not provided here the constructor will try to extract this information from the provided inputs, and eventually will fall back to the default value of NA
.|
|stacking
| The stacking type for overlapping items of the track. One of c(hide, dense, squish, pack, full)
. Currently, only hide (don't show the track items), squish (make best use of the available space) and dense (no stacking at all) are implemented.|
|name
| Character scalar of the track's name used in the title panel when plotting.|
|importFunction
| A user-defined function to be used to import the data from a file. This only applies when the range
argument is a character string with the path to the input data file. The function needs to accept an argument x
containing the file path and has to return a proper GRanges
object with all the necessary metadata columns set. A set of default import functions is already implemented in the package for a number of different file types, and one of these defaults will be picked automatically based on the extension of the input file name. If the extension cannot be mapped to any of the existing import functions, an error is raised asking for a user-defined import function via this argument. Currently the following file types can be imported with the default functions: gff
, gff1
, gff2
, gff3
, gtf
.|
|stream
| A logical flag indicating that the user-provided import function can deal with indexed files and knows how to process the additional selection
argument when accessing the data on disk. This causes the constructor to return a ReferenceGeneRegionTrack
object which will grab the necessary data on the fly during each plotting operation.|
|...
| Additional items which will all be interpreted as further display parameters. See settings
and the "Display Parameters" section below for details.|
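The comma-separated form of rstarts and rends described in the argument table can be pictured with a few lines of base R. This is only a sketch of the idea (split each per-transcript string into exon coordinates, then recycle the annotation once per exon), not the constructor's internal code; all input values are made up for illustration.

```r
## Hypothetical input: one element per transcript, exons as comma-separated lists
rstarts    <- c("100,300,500", "100,400")
rends      <- c("150,350,600", "150,450")
transcript <- c("tx1", "tx2")
gene       <- c("geneA", "geneA")

## Split each element into its individual exon coordinates
startList <- strsplit(rstarts, ",", fixed=TRUE)
endList   <- strsplit(rends, ",", fixed=TRUE)
nExons    <- lengths(startList)

## Both coordinate arguments need the same number of exons per transcript
stopifnot(identical(nExons, lengths(endList)))

## Recycle the per-transcript annotation once per exon
exons <- data.frame(
    start      = as.integer(unlist(startList)),
    end        = as.integer(unlist(endList)),
    transcript = rep(transcript, nExons),
    gene       = rep(gene, nExons),
    stringsAsFactors = FALSE
)
exons
```

The resulting table has one row per exon, which is the shape the integer-vector form of rstarts and rends expects.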
Details
A track containing all gene models in a particular region. The data
are usually fetched dynamically from an online data store, but it is
also possible to manually construct objects from local
data. Connections to particular online data sources should be
implemented as sub-classes, and GeneRegionTrack
is just the
common denominator that is being used for plotting later on. There
are several levels of data associated to a GeneRegionTrack
:
exon level: identifiers are stored in the exon column of the GRanges object in the range slot. Data may be extracted using the exon method.
transcript level: identifiers are stored in the transcript column of the GRanges object. Data may be extracted using the transcript method.
gene level: identifiers are stored in the gene column of the GRanges object, more human-readable versions in the symbol column. Data may be extracted using the gene or the symbol methods.
transcript-type level: information is stored in the feature column of the GRanges object. If a display parameter of the same name is specified, the software will use its value for the coloring.
GeneRegionTrack
objects also know about coding regions and
non-coding regions (e.g., UTRs) in a transcript, and will indicate
those by using different shapes (wide boxes for all coding regions,
thinner boxes for non-coding regions). This is achieved by setting the
feature
values of the object for non-coding elements to one of
the options that are provided in the thinBoxFeature
display
parameters. All other elements are considered to be coding elements.
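The coding/non-coding rule above amounts to a simple membership test on the feature values. The following base R sketch illustrates that rule only; the thinBoxFeature values used here are hypothetical stand-ins, and the actual defaults should be looked up in the display parameter documentation.

```r
## Hypothetical thinBoxFeature setting (not necessarily the package default)
thinBoxFeature <- c("utr5", "utr3", "ncRNA")

## Feature types of four exons of an imaginary transcript
feature <- c("utr5", "protein_coding", "protein_coding", "utr3")

## Elements listed in thinBoxFeature are drawn as thin boxes,
## everything else is treated as coding and drawn as wide boxes
boxType <- ifelse(feature %in% thinBoxFeature, "thin (non-coding)", "wide (coding)")
data.frame(feature, boxType)
```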
Value
The return value of the constructor function is a new object of class
GeneRegionTrack
.
Seealso
AnnotationTrack
DisplayPars
GRanges
GdObject
IRanges
ImageMap
RangeTrack
StackedTrack
TxDb
Author
Florian Hahne, Steve Lianoglou
Examples
## The empty object
GeneRegionTrack()
## Load some sample data
data(cyp2b10)
## Construct the object
grTrack <- GeneRegionTrack(start=26682683, end=26711643,
rstarts=cyp2b10$start, rends=cyp2b10$end, chromosome=7, genome="mm9",
transcript=cyp2b10$transcript, gene=cyp2b10$gene, symbol=cyp2b10$symbol,
feature=cyp2b10$feature, exon=cyp2b10$exon,
name="Cyp2b10", strand=cyp2b10$strand)
## Directly from the data.frame
grTrack <- GeneRegionTrack(cyp2b10)
## From a TxDb object
if(require(GenomicFeatures)){
samplefile <- system.file("extdata", "hg19_knownGene_sample.sqlite", package="GenomicFeatures")
txdb <- loadDb(samplefile)
GeneRegionTrack(txdb)
GeneRegionTrack(txdb, chromosome="chr6", start=35000000, end=40000000)
}
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(grTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
plotTracks(grTrack)
## Track names
names(grTrack)
names(grTrack) <- "foo"
plotTracks(grTrack)
## Subsetting and splitting
subTrack <- subset(grTrack, from=26700000, to=26705000)
length(subTrack)
subTrack <- grTrack[transcript(grTrack)=="ENSMUST00000144140"]
split(grTrack, transcript(grTrack))
## Accessors
start(grTrack)
end(grTrack)
width(grTrack)
position(grTrack)
width(subTrack) <- width(subTrack)+100
strand(grTrack)
strand(subTrack) <- "-"
chromosome(grTrack)
chromosome(subTrack) <- "chrX"
genome(grTrack)
genome(subTrack) <- "hg19"
range(grTrack)
ranges(grTrack)
## Annotation
identifier(grTrack)
identifier(grTrack, "lowest")
identifier(subTrack) <- "bar"
feature(grTrack)
feature(subTrack) <- "foo"
exon(grTrack)
exon(subTrack) <- letters[1:2]
gene(grTrack)
gene(subTrack) <- "bar"
symbol(grTrack)
symbol(subTrack) <- "foo"
transcript(grTrack)
transcript(subTrack) <- c("foo", "bar")
chromosome(subTrack) <- "chr7"
plotTracks(subTrack)
values(grTrack)
## Grouping
group(grTrack)
group(subTrack) <- "Group 1"
transcript(subTrack)
plotTracks(subTrack)
## Collapsing transcripts
plotTracks(grTrack, collapseTranscripts=TRUE, showId=TRUE,
extend.left=10000, shape="arrow")
## Stacking
stacking(grTrack)
stacking(grTrack) <- "dense"
plotTracks(grTrack)
## coercion
as(grTrack, "data.frame")
as(grTrack, "UCSCData")
## HTML image map
coords(grTrack)
tags(grTrack)
grTrack <- plotTracks(grTrack)$foo
coords(grTrack)
tags(grTrack)
GenomeAxisTrack_class()
GenomeAxisTrack class and methods
Description
A class representing a customizable genomic axis.
Usage
GenomeAxisTrack(range=NULL, name="Axis", id, ...)
Arguments
Argument | Description |
---|---|
range | Optional GRanges or IRanges object to highlight certain regions on the axis. |
name | Character scalar of the track's name used in the title panel when plotting. |
id | A character vector of the same length as range containing identifiers for the ranges. If missing, the constructor will try to extract the ids from names(range) . |
... | Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details. |
Details
A GenomeAxisTrack
can be customized using the familiar display
parameters. By providing a GRanges
or IRanges
object to
the constructor, ranges on the axis can be further highlighted.
With the scale
display parameter, a small scale indicator can
be shown instead of the entire genomic axis. The scale can either be
provided as a fraction of the plotting region (it will be rounded to
the nearest human readable absolute value) or as an absolute value and
is always displayed in bp, kb, mb or gb units. Note that most display
parameters for the GenomeAxisTrack
are ignored when a scale is
used instead of the full axis. In particular, only the parameters
exponent
, alpha
, lwd
, col
, cex
,
distFromAxis
and labelPos
are used.
Value
The return value of the constructor function is a new object of class
GenomeAxisTrack
.
Seealso
AnnotationTrack
DisplayPars
GRanges
GdObject
IRanges
ImageMap
RangeTrack
StackedTrack
Author
Florian Hahne
Examples
## Construct object
axTrack <- GenomeAxisTrack(name="Axis",
range=IRanges(start=c(100, 300, 800), end=c(150, 400, 1000)))
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(axTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
plotTracks(axTrack, from=0, to=1100)
## Track names
names(axTrack)
names(axTrack) <- "foo"
## Subsetting and splitting
subTrack <- subset(axTrack, from=0, to=500)
length(subTrack)
subTrack[1]
split(axTrack, c(1,1,2))
## Accessors
start(axTrack)
end(axTrack)
width(axTrack)
strand(axTrack)
range(axTrack)
ranges(axTrack)
## Annotation
values(axTrack)
## Grouping
group(axTrack)
## HTML image map
coords(axTrack)
tags(axTrack)
axTrack <- plotTracks(axTrack)$foo
coords(axTrack)
tags(axTrack)
## adding an axis to another track
data(cyp2b10)
grTrack <- GeneRegionTrack(start=26682683, end=26711643,
rstarts=cyp2b10$start, rends=cyp2b10$end, chromosome=7, genome="mm9",
transcript=cyp2b10$transcript, gene=cyp2b10$gene, symbol=cyp2b10$symbol,
name="Cyp2b10", strand=cyp2b10$strand)
plotTracks(list(grTrack, GenomeAxisTrack()))
plotTracks(list(grTrack, GenomeAxisTrack(scale=0.1)))
plotTracks(list(grTrack, GenomeAxisTrack(scale=5000)))
plotTracks(list(grTrack, GenomeAxisTrack(scale=0.5, labelPos="below")))
HighlightTrack_class()
HighlightTrack class and methods
Description
A container for other track objects from the Gviz package that allows for the addition of a common highlighting area across tracks.
Usage
HighlightTrack(trackList=list(), range=NULL, start=NULL, end=NULL, width=NULL, chromosome, genome,
name="HighlightTrack", ...)
Arguments
Argument | Description |
---|---|
trackList | A list of Gviz track objects that all have to inherit from class GdObject . |
|range
| An optional meta argument to handle the different input types. If the range
argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below). The different input options for range
are:
A GRanges object: the genomic ranges for the highlighting regions.
An IRanges object: almost identical to the GRanges case, except that the chromosome information has to be provided in the separate chromosome argument, because it cannot be directly encoded in an IRanges object.
A data.frame object: the data.frame needs to contain at least the two mandatory columns start and end with the range coordinates. It may also contain a chromosome column with the chromosome information for each range. If missing, this information will be drawn from the constructor's chromosome argument.|
|start, end
| An integer scalar with the genomic start or end coordinates for the highlighting range. Can also be supplied as part of the range
argument.|
|width
| An integer vector of widths for highlighting ranges. This can be used instead of either start
or end
to specify the range coordinates.|
|chromosome
| The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE)
. Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx
, where x
may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the track's genome. If not provided here, the constructor will try to build the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA
. Please note that by definition all objects in the Gviz
package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<-
replacement method in order to change to a different active chromosome.|
|genome
| The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. If not provided here the constructor will try to extract this information from the provided inputs, and eventually will fall back to the default value of NA
.|
|name
| Character scalar of the track's name. This is not really used and only exists for completeness.|
|...
| All additional parameters are ignored.|
Details
A track to conceptually group other Gviz track objects into a meta
track for the sole purpose of overlaying all the contained tracks with
the same highlighting region as defined by the object's genomic
ranges. During rendering the contained tracks will be treated as if
they had been provided to the plotTracks
function as individual
objects.
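A short usage sketch may help here, since the entry ships without examples. It assumes the cyp2b10 sample data used elsewhere on this page and wraps an axis and a gene model track so that both share one highlighted region; the chosen coordinates are arbitrary.

```r
if(require(Gviz)){
    ## Sample gene model shipped with the package
    data(cyp2b10)
    grTrack <- GeneRegionTrack(start=26682683, end=26711643,
        rstarts=cyp2b10$start, rends=cyp2b10$end, chromosome=7, genome="mm9",
        transcript=cyp2b10$transcript, gene=cyp2b10$gene, symbol=cyp2b10$symbol,
        name="Cyp2b10", strand=cyp2b10$strand)
    axTrack <- GenomeAxisTrack()

    ## Both contained tracks are overlaid with the same highlighting region
    hlTrack <- HighlightTrack(trackList=list(axTrack, grTrack),
        start=26700000, end=26705000, chromosome=7)
    plotTracks(hlTrack)
}
```

The contained tracks are drawn as if they had been passed to plotTracks individually; only the highlighting box is added on top.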
Seealso
DisplayPars
GRanges
GdObject
IRanges
ImageMap
OverlayTrack
RangeTrack
Author
Florian Hahne
IdeogramTrack_class()
IdeogramTrack class and methods
Description
A class to represent the schematic display of a chromosome, also known as an ideogram. The respective information is typically directly fetched from UCSC.
Usage
IdeogramTrack(chromosome=NULL, genome, name=NULL, bands=NULL, ...)
Arguments
Argument | Description |
---|---|
chromosome | The chromosome for which to create the ideogram. Has to be a valid UCSC chromosome identifier of the form chrx , or a single integer or numeric character unless options(ucscChromosomeNames=FALSE) . The user has to make sure that the respective chromosome is indeed defined for the track's genome. |
genome | The genome on which to create the ideogram. This has to be a valid UCSC genome identifier if the ideogram data is to be fetched from the UCSC repository. |
name | Character scalar of the track's name used in the title panel when plotting. Defaults to the selected chromosome. |
bands | A data.frame with the cytoband information for all available chromosomes on the genome similar to the data that would be fetched from UCSC. The table needs to contain the mandatory columns chrom , chromStart , chromEnd , name and gieStain with the chromosome name, cytoband start and end coordinates, cytoband name and coloring information, respectively. This can be used when no connection to the internet is available or when the cytoband information has been cached locally to avoid the somewhat slow connection to UCSC. |
... | Additional items which will all be interpreted as further display parameters. |
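The bands argument can be illustrated with a hand-made cytoband table. The sketch below assumes a heavily simplified, made-up set of three bands on chr7; real tables contain many more rows and stain types. Only the five mandatory columns named in the argument description are taken from the documentation.

```r
## Hypothetical, heavily simplified cytoband table for a single chromosome
bands <- data.frame(
    chrom      = "chr7",
    chromStart = c(0, 20000000, 40000000),
    chromEnd   = c(20000000, 40000000, 60000000),
    name       = c("qA1", "qA2", "qA3"),
    gieStain   = c("gpos100", "gneg", "gpos33"),
    stringsAsFactors = FALSE
)

if(require(Gviz)){
    ## With bands supplied, no connection to UCSC should be needed
    idTrack <- IdeogramTrack(chromosome="chr7", genome="mm9", bands=bands)
    idTrack
}
```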
Details
Ideograms are schematic depictions of chromosomes, including
chromosome band information and centromere location. The relevant data
for various species is stored in the UCSC database. The initializer
method of the class will automatically fetch the respective data for a
given genome and chromosome from UCSC and fill the appropriate object
slots. When plotting IdeogramTracks
, the current genomic
location is indicated on the chromosome by a colored box.
The Gviz.ucscUrl
option controls which URL is being used to
connect to UCSC. For instance, one could switch to the European UCSC
mirror by calling
options(Gviz.ucscUrl="http://genome-euro.ucsc.edu/cgi-bin/")
.
Value
The return value of the constructor function is a new object of class
IdeogramTrack
.
Seealso
AnnotationTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
RangeTrack
StackedTrack
data.frame
Note
When fetching ideogram data from UCSC the results are cached for
faster access. See clearSessionCache
for details on how to delete
these cached items.
Author
Florian Hahne
Examples
## Load some sample data
data(idTrack)
## Construct the object
idTrack <- IdeogramTrack(chromosome=7, genome="mm9")
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(idTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
plotTracks(idTrack, from=5000000, to=9000000)
## Track names
names(idTrack)
names(idTrack) <- "foo"
plotTracks(idTrack, from=5000000, to=9000000)
## Accessors
chromosome(idTrack)
chromosome(idTrack) <- "chrX"
genome(idTrack)
genome(idTrack) <- "hg19"
range(idTrack)
ranges(idTrack)
## Annotation
values(idTrack)
## coercion
as(idTrack, "data.frame")
ImageMap_class()
ImageMap class and methods
Description
HTML image map information for annotation tracks.
Author
Florian Hahne
NumericTrack_class()
NumericTrack class and methods
Description
The virtual parent class for all track items in the Gviz package designed to contain numeric data. This class merely exists for dispatching purpose.
Seealso
AnnotationTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
RangeTrack
Author
Florian Hahne
OverlayTrack_class()
OverlayTrack class and methods
Description
A container for other track objects from the Gviz package that allows for overlays of their content on the same region of the plot.
Usage
OverlayTrack(trackList=list(), name="OverlayTrack", ...)
Arguments
Argument | Description |
---|---|
trackList | A list of Gviz track objects that all have to inherit from class GdObject . |
name | Character scalar of the track's name. This is not really used and only exists for completeness. |
... | All additional parameters are ignored. |
Details
A track to conceptually group other Gviz track objects into a meta track in order to merge them into a single overlay visualization. Only the first track in the supplied list will be used when setting up the track title and axis; for all the other tracks only the panel content is plotted.
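As a usage sketch, two DataTrack objects with made-up random values can be merged into one overlay. The track names, colors and coordinates below are arbitrary choices for illustration; as described above, only the first track contributes the title and axis.

```r
if(require(Gviz)){
    set.seed(1)
    ## 100 consecutive bins of width 10 on an arbitrary chromosome
    starts <- seq(1, 991, by=10)
    dTrack1 <- DataTrack(start=starts, end=starts+9, data=runif(100),
        chromosome="chr1", genome="hg19", name="Signal")
    dTrack2 <- DataTrack(start=starts, end=starts+9, data=runif(100),
        chromosome="chr1", genome="hg19", name="Second signal")

    ## Draw both as lines in different colors so the overlay stays readable
    displayPars(dTrack1) <- list(type="l", col="darkblue")
    displayPars(dTrack2) <- list(type="l", col="darkred")

    ovTrack <- OverlayTrack(trackList=list(dTrack1, dTrack2))
    plotTracks(ovTrack, from=1, to=1000)
}
```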
Seealso
DisplayPars
GRanges
GdObject
HighlightTrack
IRanges
ImageMap
RangeTrack
Author
Florian Hahne
RangeTrack_class()
RangeTrack class and methods
Description
The virtual parent class for all track items in the Gviz package that contain some form of genomic ranges.
Seealso
AnnotationTrack
DataTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
Author
Florian Hahne
ReferenceTrack_class()
ReferenceTrack class and methods
Description
A class that allows for on-demand streaming of data off the file system.
Usage
availableDefaultMapping(file, trackType)
Arguments
Argument | Description |
---|---|
file | A character scalar with a file name or just a file extension. |
trackType | A character scalar with one of the available track types in the package. |
Details
The availableDefaultMapping
function can be used to find out
whether the package defines a mapping scheme between one of the many
supported input file types and the metadata columns of
the track's GRanges
objects.
Seealso
AnnotationTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
RangeTrack
Author
Florian Hahne
SequenceTrack_class()
SequenceTrack class and methods
Description
A track class to represent genomic sequences. The two child classes
SequenceDNAStringSetTrack
and SequenceBSgenomeTrack
do
most of the work, however in practice they are of no particular
relevance to the user.
Usage
SequenceTrack(sequence, chromosome, genome, name="SequenceTrack",
importFunction, stream=FALSE, ...)
Arguments
Argument | Description |
---|
|sequence
| A meta argument to handle the different input types, making the construction of a SequenceTrack
as flexible as possible. The different input options for sequence
are:
An object of class DNAStringSet: the individual DNAStrings are considered to be the different chromosome sequences.
An object of class BSgenome: the Gviz package tries to follow the BSgenome philosophy in that the respective chromosome sequences are only realized once they are first accessed.
A character scalar: in this case the value of the sequence argument is considered to be a file path to an annotation file on disk. A range of file types are supported by the Gviz package as identified by the file extension. See the importFunction documentation below for further details.|
|chromosome
| The currently active chromosome of the track. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE)
. Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx
, where x
may be any possible string. The user has to make sure that sequences for the respective chromosomes are indeed part of the object. If not provided here, the constructor will set it to the first available sequence. Please note that by definition all objects in the Gviz
package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<-
replacement method in order to change to a different active chromosome.|
|genome
| The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. For a SequenceBSgenomeTrack
object, the genome information is extracted from the input BSgenome
package. For a DNAStringSet
it has to be provided or the constructor will fall back to the default value of NA
.|
|name
| Character scalar of the track's name used in the title panel when plotting.|
|importFunction
| A user-defined function to be used to import the sequence data from a file. This only applies when the sequence
argument is a character string with the path to the input data file. The function needs to accept an argument file
containing the file path and has to return a proper DNAStringSet
object with the sequence information per chromosome. A set of default import functions is already implemented in the package for a number of different file types, and one of these defaults will be picked automatically based on the extension of the input file name. If the extension cannot be mapped to any of the existing import functions, an error is raised asking for a user-defined import function. Currently the following file types can be imported with the default functions: fa/fasta
and 2bit
. Both file types support indexing by genomic coordinates, and it makes sense to only load the part of the file that is needed for plotting. To this end, the Gviz
package defines the derived ReferenceSequenceTrack
class, which supports streaming data from the file system. The user typically does not have to deal with this distinction but may rely on the constructor function to make the right choice as long as the default import functions are used. However, once a user-defined import function has been provided and if this function adds support for indexed files, you will have to make the constructor aware of this fact by setting the stream
argument to TRUE
. Please note that in this case the import function needs to accept a second mandatory argument selection
which is a GRanges
object containing the dimensions of the plotted genomic range. As before, the function has to return an appropriate DNAStringSet
object.|
|stream
| A logical flag indicating that the user-provided import function can deal with indexed files and knows how to process the additional selection
argument when accessing the data on disk. This causes the constructor to return a ReferenceSequenceTrack
object which will grab the necessary data on the fly during each plotting operation.|
|...
| Additional items which will all be interpreted as further display parameters. See settings
and the "Display Parameters" section below for details.|
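The importFunction contract described in the argument table (an argument named file, a DNAStringSet as return value) can be sketched as follows. The function name importFasta and the tiny FASTA file are hypothetical; the sketch only exercises the import function itself, with the constructor call shown as a comment.

```r
if(require(Biostrings)){
    ## Hypothetical user-defined import function: receives the file path in
    ## 'file' and returns a DNAStringSet with one entry per chromosome
    importFasta <- function(file){
        readDNAStringSet(file)
    }

    ## Small FASTA file written to a temporary location for illustration
    fa <- tempfile(fileext=".fa")
    writeLines(c(">chr1", "ACGTACGTNN", ">chr2", "TTGACA"), fa)

    seqs <- importFasta(fa)
    names(seqs)
    width(seqs)

    ## The function would then be handed to the constructor, e.g.
    ## SequenceTrack(fa, importFunction=importFasta)
}
```

An import function meant to be used with stream=TRUE would additionally need the selection argument described above; that variant is not shown here.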
Value
The return value of the constructor function is a new object of class
SequenceDNAStringSetTrack
, SequenceBSgenomeTrack
or
ReferenceSequenceTrack
, depending on the constructor
arguments. Typically the user will not have to be troubled with this
distinction and can rely on the constructor to make the right choice.
Seealso
AnnotationTrack
BSgenome
DNAString
DNAStringSet
DataTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
Author
Florian Hahne
Examples
## An empty object
SequenceTrack()
## Construct from DNAStringSet
library(Biostrings)
letters <- c("A", "C", "T", "G", "N")
set.seed(999)
seqs <- DNAStringSet(c(chr1=paste(sample(letters, 100000, TRUE),
collapse=""), chr2=paste(sample(letters, 200000, TRUE), collapse="")))
sTrack <- SequenceTrack(seqs, genome="hg19")
sTrack
## Construct from BSGenome object
if(require(BSgenome.Hsapiens.UCSC.hg19)){
sTrack <- SequenceTrack(Hsapiens)
sTrack
}
## Set active chromosome
chromosome(sTrack)
chromosome(sTrack) <- "chr2"
head(seqnames(sTrack))
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(sTrack) <- list(fontfamily=font, fontfamily.title=font)
}
## Plotting
## Sequences
plotTracks(sTrack, from=199970, to=200000)
## Boxes
plotTracks(sTrack, from=199800, to=200000)
## Line
plotTracks(sTrack, from=1, to=200000)
## Force boxes
plotTracks(sTrack, from=199970, to=200000, noLetters=TRUE)
## Direction indicator
plotTracks(sTrack, from=199970, to=200000, add53=TRUE)
## Sequence complement
plotTracks(sTrack, from=199970, to=200000, add53=TRUE, complement=TRUE)
## Colors
plotTracks(sTrack, from=199970, to=200000, add53=TRUE, fontcolor=c(A=1,
C=1, G=1, T=1, N=1))
## Track names
names(sTrack)
names(sTrack) <- "foo"
## Accessors
genome(sTrack)
genome(sTrack) <- "mm9"
length(sTrack)
## Sequence extraction
subseq(sTrack, start=100000, width=20)
## beyond the stored sequence range
subseq(sTrack, start=length(sTrack), width=20)
StackedTrack_class()
StackedTrack class and methods
Description
The virtual parent class for all track types in the Gviz package which contain potentially overlapping annotation items that have to be stacked when plotted.
Seealso
AnnotationTrack
DisplayPars
GRanges
GdObject
GeneRegionTrack
IRanges
ImageMap
RangeTrack
Author
Florian Hahne
UcscTrack()
Meta-constructor for GenomeGraph tracks fetched directly from the various UCSC data sources.
Description
The UCSC data base provides a wealth of annotation information. This
function can be used to access UCSC, to retrieve the data available
there and to return it as an annotation track object amenable to
plotting with plotTracks
.
clearSessionCache
can be called to remove all cached items
from the session which are generated when connecting with the UCSC
data base.
Usage
UcscTrack(track, table=NULL, trackType=c("AnnotationTrack",
"GeneRegionTrack", "DataTrack", "GenomeAxisTrack"), genome, chromosome,
name=NULL, from, to, ...)
clearSessionCache()
Arguments
Argument | Description |
---|---|
track | Character, the name of the track to fetch from UCSC. To find out about available tracks please consult the online table browser at http://genome.ucsc.edu/cgi-bin/hgTables?command=start . |
table | Character, the name of the table to fetch from UCSC, or NULL , in which case the default selection of tables is used. To find out about available tables for a given track please consult the online table browser at http://genome.ucsc.edu/cgi-bin/hgTables?command=start . |
trackType | Character, one of c("AnnotationTrack", "GeneRegionTrack", "DataTrack", "GenomeAxisTrack") . The function will try to coerce the downloaded data into an object of this class. See below for details. |
genome | Character, a valid UCSC genome identifier for which to fetch the data. |
chromosome | Character, a valid UCSC chromosome identifier for which to fetch the data. |
name | Character, the name to use for the resulting track object. |
from, to | A range of genomic locations for which to fetch data. |
... | All additional named arguments are expected to be either display parameters for the resulting objects, or character scalars of column names in the downloaded UCSC data tables that are matched by name to available arguments in the respective constructor functions as defined by the trackType argument. See Details section for more information. |
Details
The data stored at the UCSC data bases can be of different formats:
gene or transcript model data, simple annotation features like CpG
Island locations or SNPs, or numeric data like conservation or
mapability. This function presents a unified API to download all kinds
of data and to map them back to one of the annotation track objects
defined in this package. The type of object to hold the data has to be
given in the trackType
argument, and subsequently the function
passes all data on to the respective object constructor. All
additional named arguments are considered to be relevant for the
constructor of choice, and single character scalars are replaced by
the respective data columns in the downloaded UCSC tables if
available. For instance, assume the table for track 'foo' contains
the columns 'id', 'type', 'fromLoc' and 'toLoc', giving the feature
identifier, type, start and end location. In order to create an
AnnotationTrack object from that data, we have to
pass the additional named arguments id="id"
,
feature="type"
, start="fromLoc"
and end="toLoc" to the
UcscTrack
function. The complete function call could look like
this:
UcscTrack(track="foo", genome="mm9", chromosome=3, id="id", feature="type", start="fromLoc", end="toLoc")
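The column substitution described above can be pictured in base R. This is a sketch of the mapping idea only, not the function's internal code: the table tab is a made-up stand-in for the downloaded data of the imaginary track 'foo', and fill is an arbitrary display parameter added to show the pass-through case.

```r
## Hypothetical stand-in for the table downloaded for track 'foo'
tab <- data.frame(id=c("f1", "f2"), type=c("CpG", "SNP"),
    fromLoc=c(100L, 500L), toLoc=c(200L, 501L),
    stringsAsFactors=FALSE)

## Additional named arguments as they would be passed to UcscTrack
args <- list(id="id", feature="type", start="fromLoc", end="toLoc",
    fill="darkblue")

## Character scalars naming a column are replaced by that column;
## everything else (here the display parameter 'fill') is passed through
resolved <- lapply(args, function(a){
    if(is.character(a) && length(a)==1 && a %in% colnames(tab)) tab[[a]] else a
})
str(resolved)
```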
To reduce bandwidth usage, the UCSC connection is cached to some
extent. To remove these cached session items, call
clearSessionCache.
The Gviz.ucscUrl option controls which URL is used to
connect to UCSC. For instance, one could switch to the European UCSC
mirror by calling
options(Gviz.ucscUrl="http://genome-euro.ucsc.edu/cgi-bin/")
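As a minimal sketch of the two housekeeping steps described above, the following switches the UCSC URL for the current session, clears the cached session items, and then restores the previous setting. The variable name oldUrl is only illustrative, and no data is fetched here.

```r
library(Gviz)

## Remember the current setting so that it can be restored later
oldUrl <- getOption("Gviz.ucscUrl")

## Point subsequent UCSC queries at the European mirror
options(Gviz.ucscUrl="http://genome-euro.ucsc.edu/cgi-bin/")
newUrl <- getOption("Gviz.ucscUrl")

## Drop the cached session items so that the next UcscTrack call reconnects
clearSessionCache()

## Restore the previous URL
options(Gviz.ucscUrl=oldUrl)
```

Clearing the cache after changing the URL is a cautious choice: it avoids reusing a session that was opened against the previous server.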
Value
An annotation track object as determined by trackType.
Seealso
AnnotationTrack
DataTrack
GeneRegionTrack
GenomeAxisTrack
Author
Florian Hahne
collapsing()
Dynamic content based on the available resolution
Description
When plotting features linearly along genomic coordinates, one frequently runs into the problem of too little resolution to adequately display all details. Most genome browsers try to reasonably reduce the amount of detail that is shown based on the current zoom level.
Details
Most track classes in this package define an internal
collapseTrack method which tries to adjust the plotted content
to the available resolution, aims at reducing overplotting and
prevents rendering issues, e.g. when lines are too thin to be
plotted. This feature can be toggled on or off using the
collapse display parameter (see settings for
details on setting these parameters).
In the simplest case (for
AnnotationTrack objects) this involves expanding
all shown features to a minimum pixel width and height (using
display parameters min.width and min.height) and
collapsing overlapping annotation items (as defined by the parameter
min.distance) into one single item to
prevent overplotting.
For objects of class DataTrack, the data values
underlying collapsed regions will be summarized based on the
summary display parameter. See the class' documentation for
more details.
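The effect of the collapse parameter can be sketched with a small AnnotationTrack. The coordinates, track name and parameter values below are made up for illustration: two features that lie only a few bases apart are plotted in a window of one million bases, once with collapsing enabled and once without.

```r
library(Gviz)

## Two short features, 10 bases apart (illustrative coordinates)
closeTrack <- AnnotationTrack(start=c(100000, 100510), width=c(500, 500),
                              chromosome=7, genome="hg19",
                              name="collapse demo")

## At this zoom level both features are far below one pixel wide.
## With collapsing enabled they are expanded to min.width pixels and
## merged if they are closer than min.distance pixels.
res1 <- plotTracks(closeTrack, from=1, to=1e6, collapse=TRUE,
                   min.width=3, min.distance=5)

## The same plot with the feature switched off
res2 <- plotTracks(closeTrack, from=1, to=1e6, collapse=FALSE)
```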
Seealso
AnnotationTrack
DataTrack
datasets()
Data sets
Description
Some sample data sets used for the illustrative examples and the vignette.
exportTracks()
Export GenomeGraph tracks to an annotation file representation.
Description
This function is still a bit experimental. So far only BED export is supported.
Usage
exportTracks(tracks, range, chromosome, file)
Arguments
Argument | Description |
---|---|
tracks | A list of annotation track objects to be exported into a single BED file. |
range | A numeric vector of length 2. The genomic range to display when opening the file in a browser. |
chromosome | The chromosome to display when opening the file in a browser. |
file | Character, the path to the file to write into. |
Details
FIXME: Need to support wgl exports as well...
Value
The function is called for its side effect of writing to a file.
Author
Florian Hahne
grouping()
Grouping of annotation features
Description
Many annotation tracks are actually composed of a number of grouped sub-features, for instance exons in a gene model. This man page highlights the use of grouping information to build informative annotation plots.
Details
All track objects that inherit from class
AnnotationTrack support the grouping feature. The
information is usually passed on to the constructor function (for
AnnotationTrack via the group argument and for
GeneRegionTrack objects via the exon
argument) or automatically downloaded from an online annotation
repository (BiomartGeneRegionTrack). Group
membership is specified by a factor vector with as many items as there
are annotation items in the track (i.e., the value of
length(track)). Upon plotting, the grouped annotation features
are displayed together and will not be separated in the stacking of
track items.
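A short sketch of how grouping information might be supplied follows. It reuses the coordinates from the plotTracks examples. The variable names are illustrative, and the groupAnnotation display parameter is used here on the assumption that group labels should be shown next to the features.

```r
library(Gviz)

st <- c(2000000, 2070000, 2100000, 2160000)
ed <- c(2050000, 2130000, 2150000, 2170000)
str <- c("-", "+", "-", "-")

## One group label per annotation item; items 1 and 3 share a group
grp <- factor(c("Group1", "Group2", "Group1", "Group3"))

groupedTrack <- AnnotationTrack(start=st, end=ed, strand=str, chromosome=7,
                                genome="hg19", group=grp,
                                name="grouped features")

## The grouping vector has as many entries as there are items in the track
nItems <- length(groupedTrack)
group(groupedTrack)

## Members of the same group are connected and stacked together
res <- plotTracks(groupedTrack, groupAnnotation="group")
```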
Seealso
AnnotationTrack
BiomartGeneRegionTrack
GeneRegionTrack
Author
Florian Hahne
plotTracks()
The main plotting function for one or several GenomeGraph tracks.
Description
plotTracks is the main interface when plotting single track
objects, or lists of tracks linked together across the same genomic
coordinates. Essentially, the resulting plots are very similar to the
graphical output of the UCSC Genome Browser, except for all of the
interactivity.
Usage
plotTracks(trackList, from=NULL, to=NULL, ..., sizes=NULL,
panel.only=FALSE, extend.right=0, extend.left=0, title.width=NULL,
add=FALSE, main, cex.main=2, fontface.main=2, col.main="black",
margin=6, chromosome=NULL, innerMargin=3)
Arguments
Argument | Description |
---|---|
trackList | A list of GenomeGraph track objects, all inheriting from class GdObject . The tracks will all be drawn to the same genomic coordinates, either as defined by the from and to arguments if supplied, or by the maximum range across all individual items in the list. |
from, to | Numeric scalars, giving the range of genomic coordinates to draw the tracks in. Note that to cannot be smaller than from. If NULL, the plotting ranges are derived from the individual tracks. See extend.left and extend.right below for the definition of the final plotting ranges. |
... | Additional arguments which are all interpreted as display parameters to tweak the appearance of the plot. These parameters are global, meaning that they will be used for all tracks in the list where they actually make sense, and they override the track-internal settings. See settings for details on display parameters. |
sizes | A numeric vector of relative vertical sizes for the individual tracks, of length equal to the number of tracks in trackList, or NULL to auto-detect the most appropriate vertical size proportions. |
panel.only | Logical flag, causing the tracks to be plotted as lattice-like panel functions without resetting the plotting canvas and omitting the title pane. This allows embedding tracks into a trellis layout. Usually the function is called for a single track only when panel.only==TRUE. |
extend.right, extend.left | Numeric scalar, extend the plotting range to the right or to the left by a fixed number of bases. The final plotting range is defined as from-extend.left to to+extend.right . |
title.width | An expansion factor for the width of the title panels. This can be used to make more space, e.g. to accommodate more detailed data axes. The default is to use as much space as needed to fit all the annotation text. |
add | Logical flag, add the plot to an existing plotting canvas without re-initialising. |
main | Character scalar, the plot's main header. |
cex.main, fontface.main, col.main | The expansion factor, fontface and color settings for the main header. |
margin | The margin width to add to the plot in pixels. |
innerMargin | The inner margin width to add to the plot in pixels. |
chromosome | Set the chromosome for all the tracks in the track list. |
Details
GenomeGraph tracks are plotted in a vertically stacked layout. Each track panel is split up into a title section containing the track name, as well as an optional axis for tracks containing numeric data, and a data section showing the actual data along genomic coordinates. In that sense, the output is very similar to the UCSC Genome Browser.
The layout of the individual tracks is highly customizable through
so-called "display parameters". See settings for details.
While plotting a track, the software automatically computes HTML image map coordinates based on the current graphics device. These coordinates as well as the associated annotation information can later be used to embed images of the plots in semi-interactive HTML pages. See ImageMap for details.
Value
A list of GenomeGraph tracks, each one augmented by the computed image
map coordinates in the imageMap slot, along with the additional
ImageMap object titles containing information about the
title panels.
Seealso
GdObject
ImageMap
RangeTrack
StackedTrack
Author
Florian Hahne
Examples
library(Gviz)
## Create some tracks to plot
st <- c(2000000, 2070000, 2100000, 2160000)
ed <- c(2050000, 2130000, 2150000, 2170000)
str <- c("-", "+", "-", "-")
gr <- c("Group1","Group2","Group1", "Group3")
annTrack <- AnnotationTrack(start=st, end=ed, strand=str, chromosome=7,
genome="hg19", feature="test", group=gr,
id=paste("annTrack item", 1:4),
name="annotation track foo",
stacking="squish")
ax <- GenomeAxisTrack()
dt <- DataTrack(start=seq(min(st), max(ed), len=10), width=18000,
data=matrix(runif(40), nrow=4), genome="hg19", chromosome=7,
type="histogram", name="data track bar")
## For some annoying reason the postscript device does not know about
## the sans font
if(!interactive())
{
font <- ps.options()$family
displayPars(annTrack) <- list(fontfamily=font, fontfamily.title=font)
displayPars(ax) <- list(fontfamily=font, fontfamily.title=font)
displayPars(dt) <- list(fontfamily=font, fontfamily.title=font)
}
## Now plot the tracks
res <- plotTracks(list(ax, annTrack, dt))
## Plot only a subrange
res <- plotTracks(list(ax, annTrack, dt), from=2080000, to=2156000)
## Extend plotting ranges
res <- plotTracks(list(ax, annTrack, dt), extend.left=200000, extend.right=200000)
## Add a header
res <- plotTracks(list(ax, annTrack, dt), main="A GenomGraphs plot",
col.main="darkgray")
## Change vertical size and title width
res <- plotTracks(list(ax, annTrack, dt), sizes=c(1,1,5))
names(annTrack) <- "foo"
res <- plotTracks(list(ax, annTrack), title.width=0.6)
## Adding and lattice like plots
library(grid)
grid.newpage()
pushViewport(viewport(height=0.5, y=1, just="top"))
grid.rect()
plotTracks(annTrack, add=TRUE)
popViewport(1)
pushViewport(viewport(height=0.5, y=0, just="bottom"))
grid.rect()
plotTracks(dt, add=TRUE)
popViewport(1)
library(lattice)
myPanel <- function(x, ...) plotTracks(annTrack, panel.only=TRUE,
from=min(x), to=max(x), shape="box")
a <- seq(1900000, 2250000, len=40)
xyplot(b~a|c, data.frame(a=a, b=1, c=cut(a, 4)), panel=myPanel,
scales=list(x="free"))
settings()
Setting display parameters to control the look and feel of the plots
Description
The genome track plots in this package are all highly customizable by means of so-called 'display parameters'. This page highlights the use of these parameters and lists all available settings for the different track classes.
Usage
addScheme(scheme, name)
getScheme(name=getOption("Gviz.scheme"))
Arguments
Argument | Description |
---|---|
scheme | A named nested list of display parameters, where the first level of nesting represents Gviz track object classes, and the second level of nesting represents parameters. |
name | A character scalar with the scheme name. |
Details
All of the package's track objects inherit the dp slot from the
GdObject parent class, which is the main
container to store an object's display parameters. Internally, the
content of this slot has to be an object of class
DisplayPars, but the user is usually not exposed
to this low-level implementation. Instead, there are two main
interaction points, namely the individual object constructor functions
and the final plotTracks function. In both cases, all
additional arguments that are not caught by any of the formally
defined function parameters are interpreted as additional
display parameters and are automatically added to the aforementioned
slot. The main difference is that display parameters
passed to the constructor function are specific to an individual
track object, whereas those supplied to the plotTracks function
are applied to all the objects in the plotting list. Not all
display parameters have an effect on the plotting of all track
classes; parameters without an effect for a given class are silently ignored.
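The difference between the two interaction points can be sketched as follows. The track names, coordinates and colors are made up for illustration: fill is passed to one constructor and therefore belongs to that track only, whereas background.title is passed to plotTracks and is applied to every track in the list.

```r
library(Gviz)

## 'fill' is given to the constructor, so it is specific to track a1
a1 <- AnnotationTrack(start=c(100, 300), width=c(50, 50), chromosome=1,
                      genome="hg19", name="a1", fill="darkred")
a2 <- AnnotationTrack(start=c(150, 400), width=c(50, 50), chromosome=1,
                      genome="hg19", name="a2")

## Query the track-specific setting
fillA1 <- displayPars(a1, "fill")

## 'background.title' is given to plotTracks, so it applies to both tracks
res <- plotTracks(list(a1, a2), background.title="darkblue")
```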
One can query the available display parameters for a given class as
well as their default values by calling the
availableDisplayPars function, or by inspecting the man
pages of the individual track classes. The structure of the classes
defined in this package is hierarchical, and so are the available
display parameters, i.e., all objects inherit the parameters defined
in the common GdObject parent class, and so on.
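As a brief illustration, the parameters available for a class can be listed by passing the class name to availableDisplayPars. The variable name dp is arbitrary.

```r
library(Gviz)

## All display parameters defined for AnnotationTrack objects, including
## those inherited from parent classes, together with their defaults
dp <- availableDisplayPars("AnnotationTrack")
dp
```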
Once a track object has been created, the display parameters are still
open for modification. To this end, the displayPars
replacement method is available for all objects inheriting from class
GdObject. The method takes a named list of parameters as input,
e.g.:
displayPars(foo) <- list(col="red", lwd=2)
In the same spirit, the currently set display parameters for the
object foo can be queried using the displayPars method
directly, e.g.:
displayPars(foo)
For track objects inheriting from class
AnnotationTrack, display parameters that are not
formally defined in the class definition or in any of the parent
classes are considered to be valid R color identifiers that are used
to distinguish between different types of annotation features. For
instance, the parameter 'miRNA' will be used to color all annotation
features of class miRNA. The annotation types can be set in the
constructor function of the track object via the feature
argument. For most of the tracks that have been inferred from one of
the online repositories, this classification will usually be
downloaded along with the actual annotation data.
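A small sketch of feature-based coloring follows. The feature types, coordinates and colors are chosen for illustration only: each item is assigned a type via the feature argument, and a display parameter named after the type sets its color.

```r
library(Gviz)

## Three items of two feature types (illustrative values)
featTrack <- AnnotationTrack(start=c(100, 300, 500), width=c(80, 80, 80),
                             chromosome=1, genome="hg19",
                             feature=c("miRNA", "snoRNA", "miRNA"),
                             name="feature colors",
                             miRNA="darkgreen", snoRNA="orange")

## The feature types stored with the track
feature(featTrack)

## The color registered for the 'miRNA' type
miRNAcol <- displayPars(featTrack, "miRNA")

res <- plotTracks(featTrack)
```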
Users might find themselves changing the same parameters over and over
again, and it would make sense to register these modifications in a
central location once and for all. To this end the Gviz package
supports display parameter schemes. A scheme is essentially just a
set of nested named lists, where the names on the first level of
nesting should correspond to track class names, and the names on the
second level to the display parameters to set. The currently active
scheme can be changed by setting the global option
Gviz.scheme, and a new scheme can be registered by using the
addScheme function, providing both the list and the name for
the new scheme. The getScheme function is useful to get the
current scheme as a list structure, for instance to use as a skeleton
for your own custom scheme.
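The workflow described above might look like the following sketch. The scheme name "myScheme" and the chosen colors are only examples, and the previously active scheme is restored at the end.

```r
library(Gviz)

## Remember the active scheme name so that it can be restored
oldScheme <- getOption("Gviz.scheme")

## Use the current scheme as a skeleton and modify a few settings
scheme <- getScheme()
scheme$GeneRegionTrack$fill <- "salmon"
scheme$GeneRegionTrack$col <- NULL

## Register the modified scheme and make it the active one
addScheme(scheme, "myScheme")
options(Gviz.scheme="myScheme")

## Tracks created from here on pick up the new defaults

## Switch back to the previous scheme
options(Gviz.scheme=oldScheme)
```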
In order to make these settings persistent across R sessions one can
create one or several schemes in the global environment in the special
object .GvizSchemes, for instance by putting the necessary code
in the .Rprofile file. This object needs to be a named list of
schemes, and it will be collected when the Gviz package loads. Its
content is then automatically added to the collection of available
schemes.
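A possible .Rprofile entry is sketched below. The scheme name and the parameter values are hypothetical, and it is an assumption of this sketch that a scheme may list only the classes and parameters one wants to change.

```r
## In ~/.Rprofile: a named list of schemes, each one a nested list of
## track class names (first level) and display parameters (second level)
.GvizSchemes <- list(
    myScheme=list(
        GdObject=list(background.title="darkblue"),
        AnnotationTrack=list(fill="salmon", col="gray40")
    )
)
```

After Gviz has been loaded, such a scheme would still have to be activated, for instance with options(Gviz.scheme="myScheme").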
Please note that because display parameters are stored with the track objects, a scheme change only has an effect on those objects that are created after the change has taken place.
Seealso
AnnotationTrack
DataTrack
DisplayPars
GdObject
Author
Florian Hahne